Series comparison

-[Qemu-devel] [PULL 00/42] target-arm queue
+[PULL 00/23] target-arm queue
-First pullreq for arm of the 4.1 series, since I'm back from
+Mostly my decodetree stuff, but also some patches for various
-holiday now. This is mostly my M-profile FPU series and Philippe's
+smaller bugs/features from others.
 devices.h cleanup. I have a pile of other patchsets to work through
 in my to-review folder, but 42 patches is definitely quite
 big enough to send now...
 thanks
 -- PMM
-The following changes since commit 413a99a92c13ec408dcf2adaa87918dc81e890c8:
+The following changes since commit 53550e81e2cafe7c03a39526b95cd21b5194d9b1:
-  Add Nios II semihosting support. (2019-04-29 16:09:51 +0100)
+  Merge remote-tracking branch 'remotes/berrange/tags/qcrypto-next-pull-request' into staging (2020-06-15 16:36:34 +0100)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190429
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200616
-for you to fetch changes up to 437cc27ddfded3bbab6afd5ac1761e0e195edba7:
+for you to fetch changes up to 64b397417a26509bcdff44ab94356a35c7901c79:
-  hw/devices: Move SMSC 91C111 declaration into a new header (2019-04-29 17:57:21 +0100)
+  hw: arm: Set vendor property for IMX SDHCI emulations (2020-06-16 10:32:29 +0100)
 ----------------------------------------------------------------
-target-arm queue:
+ * hw: arm: Set vendor property for IMX SDHCI emulations
- * remove "bag of random stuff" hw/devices.h header
+ * sd: sdhci: Implement basic vendor specific register support
- * implement FPU for Cortex-M and enable it for Cortex-M4 and -M33
+ * hw/net/imx_fec: Convert debug fprintf() to trace events
- * hw/dma: Compile the bcm2835_dma device as common object
+ * target/arm/cpu: adjust virtual time for all KVM arm cpus
- * configure: Remove --source-path option
+ * Implement configurable descriptor size in ftgmac100
- * hw/ssi/xilinx_spips: Avoid variable length array
+ * hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
- * hw/arm/smmuv3: Remove SMMUNotifierNode
+ * target/arm: More Neon decodetree conversion work
 ----------------------------------------------------------------
-Eric Auger (1):
+Erik Smit (1):
-      hw/arm/smmuv3: Remove SMMUNotifierNode
+      Implement configurable descriptor size in ftgmac100
-Peter Maydell (28):
+Guenter Roeck (2):
-      hw/ssi/xilinx_spips: Avoid variable length array
+      sd: sdhci: Implement basic vendor specific register support
-      configure: Remove --source-path option
+      hw: arm: Set vendor property for IMX SDHCI emulations
       target/arm: Make sure M-profile FPSCR RES0 bits are not settable
       hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
       target/arm: Implement dummy versions of M-profile FP-related registers
       target/arm: Disable most VFP sysregs for M-profile
       target/arm: Honour M-profile FP enable bits
       target/arm: Decode FP instructions for M profile
       target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
       target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
       target/arm/helper: don't return early for STKOF faults during stacking
       target/arm: Handle floating point registers in exception entry
       target/arm: Implement v7m_update_fpccr()
       target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
       target/arm: Clean excReturn bits when tail chaining
       target/arm: Allow for floating point in callee stack integrity check
       target/arm: Handle floating point registers in exception return
       target/arm: Move NS TBFLAG from bit 19 to bit 6
       target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
       target/arm: Set FPCCR.S when executing M-profile floating point insns
       target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
       target/arm: New helper function arm_v7m_mmu_idx_all()
       target/arm: New function armv7m_nvic_set_pending_lazyfp()
       target/arm: Add lazy-FP-stacking support to v7m_stack_write()
       target/arm: Implement M-profile lazy FP state preservation
       target/arm: Implement VLSTM for v7M CPUs with an FPU
       target/arm: Implement VLLDM for v7M CPUs with an FPU
       target/arm: Enable FPU for Cortex-M4 and Cortex-M33
-Philippe Mathieu-Daudé (13):
+Jean-Christophe Dubois (2):
-      hw/dma: Compile the bcm2835_dma device as common object
+      hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
-      hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
+      hw/net/imx_fec: Convert debug fprintf() to trace events
       hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
       hw/display/tc6393xb: Remove unused functions
       hw/devices: Move TC6393XB declarations into a new header
       hw/devices: Move Blizzard declarations into a new header
       hw/devices: Move CBus declarations into a new header
       hw/devices: Move Gamepad declarations into a new header
       hw/devices: Move TI touchscreen declarations into a new header
       hw/devices: Move LAN9118 declarations into a new header
       hw/net/ne2000-isa: Add guards to the header
       hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
       hw/devices: Move SMSC 91C111 declaration into a new header
- configure                     |  10 +-
+Peter Maydell (17):
- hw/dma/Makefile.objs          |   2 +-
+      target/arm: Fix missing temp frees in do_vshll_2sh
- include/hw/arm/omap.h         |   6 +-
+      target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
- include/hw/arm/smmu-common.h  |   8 +-
+      target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
- include/hw/devices.h          |  62 ---
+      target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
- include/hw/display/blizzard.h |  22 ++
+      target/arm: Convert Neon 3-reg-diff long multiplies
- include/hw/display/tc6393xb.h |  24 ++
+      target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
- include/hw/input/gamepad.h    |  19 +
+      target/arm: Convert Neon 3-reg-diff polynomial VMULL
- include/hw/input/tsc2xxx.h    |  36 ++
+      target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
- include/hw/misc/cbus.h        |  32 ++
+      target/arm: Add missing TCG temp free in do_2shift_env_64()
- include/hw/net/lan9118.h      |  21 +
+      target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
- include/hw/net/ne2000-isa.h   |   6 +
+      target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
- include/hw/net/smc91c111.h    |  19 +
+      target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
- include/qemu/typedefs.h       |   1 -
+      target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
- target/arm/cpu.h              |  95 ++++-
+      target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
- target/arm/helper.h           |   5 +
+      target/arm: Convert Neon VEXT to decodetree
- target/arm/translate.h        |   3 +
+      target/arm: Convert Neon VTBL, VTBX to decodetree
- hw/arm/aspeed.c               |  13 +-
+      target/arm: Convert Neon VDUP (scalar) to decodetree
  hw/arm/exynos4_boards.c       |   3 +-
  hw/arm/gumstix.c              |   2 +-
  hw/arm/integratorcp.c         |   2 +-
  hw/arm/kzm.c                  |   2 +-
  hw/arm/mainstone.c            |   2 +-
  hw/arm/mps2-tz.c              |   3 +-
  hw/arm/mps2.c                 |   2 +-
  hw/arm/nseries.c              |   7 +-
  hw/arm/palm.c                 |   2 +-
  hw/arm/realview.c             |   3 +-
  hw/arm/smmu-common.c          |   6 +-
  hw/arm/smmuv3.c               |  28 +-
  hw/arm/stellaris.c            |   2 +-
  hw/arm/tosa.c                 |   2 +-
  hw/arm/versatilepb.c          |   2 +-
  hw/arm/vexpress.c             |   2 +-
  hw/display/blizzard.c         |   2 +-
  hw/display/tc6393xb.c         |  18 +-
  hw/input/stellaris_input.c    |   2 +-
  hw/input/tsc2005.c            |   2 +-
  hw/input/tsc210x.c            |   4 +-
  hw/intc/armv7m_nvic.c         | 261 +++++++++++++
  hw/misc/cbus.c                |   2 +-
  hw/net/lan9118.c              |   3 +-
  hw/net/smc91c111.c            |   2 +-
  hw/ssi/xilinx_spips.c         |   6 +-
  target/arm/cpu.c              |  20 +
  target/arm/helper.c           | 873 +++++++++++++++++++++++++++++++++++++++---
  target/arm/machine.c          |  16 +
  target/arm/translate.c        | 150 +++++++-
  target/arm/vfp_helper.c       |   8 +
  MAINTAINERS                   |   7 +
 files changed, 1595 insertions(+), 235 deletions(-)
  delete mode 100644 include/hw/devices.h
  create mode 100644 include/hw/display/blizzard.h
  create mode 100644 include/hw/display/tc6393xb.h
  create mode 100644 include/hw/input/gamepad.h
  create mode 100644 include/hw/input/tsc2xxx.h
  create mode 100644 include/hw/misc/cbus.h
  create mode 100644 include/hw/net/lan9118.h
  create mode 100644 include/hw/net/smc91c111.h
+fangying (1):
+      target/arm/cpu: adjust virtual time for all KVM arm cpus
+ hw/sd/sdhci-internal.h          |    5 +
+ include/hw/sd/sdhci.h           |    5 +
+ target/arm/translate.h          |    1 +
+ target/arm/neon-dp.decode       |  130 +++++
+ hw/arm/fsl-imx25.c              |    6 +
+ hw/arm/fsl-imx6.c               |    6 +
+ hw/arm/fsl-imx6ul.c             |    2 +
+ hw/arm/fsl-imx7.c               |    2 +
+ hw/misc/imx6ul_ccm.c            |   76 ++-
+ hw/net/ftgmac100.c              |   26 +-
+ hw/net/imx_fec.c                |  106 ++--
+ hw/sd/sdhci.c                   |   18 +-
+ target/arm/cpu.c                |    6 +-
+ target/arm/cpu64.c              |    1 -
+ target/arm/kvm.c                |   21 +-
+ target/arm/translate-neon.inc.c | 1148 ++++++++++++++++++++++++++++++++++++++-
+ target/arm/translate.c          |  684 +----------------------
+ hw/net/trace-events             |   18 +
+files changed, 1495 insertions(+), 766 deletions(-)

-[Qemu-devel] [PULL 01/42] hw/arm/smmuv3: Remove SMMUNotifierNode
+Deleted patch
-From: Eric Auger <eric.auger@redhat.com>
-The SMMUNotifierNode struct is not necessary and brings extra
-complexity so let's remove it. We now directly track the SMMUDevices
-which have registered IOMMU MR notifiers.
-This is inspired by the same transformation on intel-iommu
-done in commit b4a4ba0d68f50f218ee3957b6638dbee32a5eeef
-("intel-iommu: remove IntelIOMMUNotifierNode")
-Signed-off-by: Eric Auger <eric.auger@redhat.com>
-Reviewed-by: Peter Xu <peterx@redhat.com>
-Message-id: 20190409160219.19026-1-eric.auger@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/smmu-common.h |  8 ++------
- hw/arm/smmu-common.c         |  6 +++---
- hw/arm/smmuv3.c              | 28 +++++++---------------------
-files changed, 12 insertions(+), 30 deletions(-)
-diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/smmu-common.h
-+++ b/include/hw/arm/smmu-common.h
-@@ -XXX,XX +XXX,XX @@ typedef struct SMMUDevice {
-     AddressSpace       as;
-     uint32_t           cfg_cache_hits;
-     uint32_t           cfg_cache_misses;
-+    QLIST_ENTRY(SMMUDevice) next;
- } SMMUDevice;
--typedef struct SMMUNotifierNode {
--    SMMUDevice *sdev;
--    QLIST_ENTRY(SMMUNotifierNode) next;
--} SMMUNotifierNode;
--
- typedef struct SMMUPciBus {
-     PCIBus       *bus;
-     SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
-@@ -XXX,XX +XXX,XX @@ typedef struct SMMUState {
-     GHashTable *iotlb;
-     SMMUPciBus *smmu_pcibus_by_bus_num[SMMU_PCI_BUS_MAX];
-     PCIBus *pci_bus;
--    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
-+    QLIST_HEAD(, SMMUDevice) devices_with_notifiers;
-     uint8_t bus_num;
-     PCIBus *primary_bus;
- } SMMUState;
-diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/smmu-common.c
-+++ b/hw/arm/smmu-common.c
-@@ -XXX,XX +XXX,XX @@ inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
- /* Unmap all notifiers of all mr's */
- void smmu_inv_notifiers_all(SMMUState *s)
- {
--    SMMUNotifierNode *node;
-+    SMMUDevice *sdev;
--    QLIST_FOREACH(node, &s->notifiers_list, next) {
--        smmu_inv_notifiers_mr(&node->sdev->iommu);
-+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
-+        smmu_inv_notifiers_mr(&sdev->iommu);
-     }
- }
-diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/smmuv3.c
-+++ b/hw/arm/smmuv3.c
-@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
- /* invalidate an asid/iova tuple in all mr's */
- static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
- {
--    SMMUNotifierNode *node;
-+    SMMUDevice *sdev;
--    QLIST_FOREACH(node, &s->notifiers_list, next) {
--        IOMMUMemoryRegion *mr = &node->sdev->iommu;
-+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
-+        IOMMUMemoryRegion *mr = &sdev->iommu;
-         IOMMUNotifier *n;
-         trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova);
-@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
-     SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
-     SMMUv3State *s3 = sdev->smmu;
-     SMMUState *s = &(s3->smmu_state);
--    SMMUNotifierNode *node = NULL;
--    SMMUNotifierNode *next_node = NULL;
-     if (new & IOMMU_NOTIFIER_MAP) {
-         int bus_num = pci_bus_num(sdev->bus);
-@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
-     if (old == IOMMU_NOTIFIER_NONE) {
-         trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
--        node = g_malloc0(sizeof(*node));
--        node->sdev = sdev;
--        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
--        return;
--    }
--
--    /* update notifier node with new flags */
--    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
--        if (node->sdev == sdev) {
--            if (new == IOMMU_NOTIFIER_NONE) {
--                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
--                QLIST_REMOVE(node, next);
--                g_free(node);
--            }
--            return;
--        }
-+        QLIST_INSERT_HEAD(&s->devices_with_notifiers, sdev, next);
-+    } else if (new == IOMMU_NOTIFIER_NONE) {
-+        trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
-+        QLIST_REMOVE(sdev, next);
-     }
- }
---
-.20.1
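
The deleted patch above replaces a separately allocated notifier node with a list link embedded in the device structure itself, so registering a notifier needs no allocation and removal needs no search. A minimal sketch of that pattern in plain C follows. `ToyDevice`, `ToyState` and the `toy_*` helpers are hypothetical stand-ins written for illustration; they do not use QEMU's QLIST macros and are not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with the list linkage embedded in it. */
typedef struct ToyDevice {
    int id;
    struct ToyDevice *next;    /* next device on the list, or NULL */
    struct ToyDevice **pprev;  /* address of the pointer that points at us */
} ToyDevice;

/* Hypothetical owner state holding the list head. */
typedef struct ToyState {
    ToyDevice *devices_with_notifiers;
} ToyState;

/* Insert at the head: no allocation, the device carries its own link. */
static void toy_insert_head(ToyState *s, ToyDevice *dev)
{
    dev->next = s->devices_with_notifiers;
    if (dev->next) {
        dev->next->pprev = &dev->next;
    }
    s->devices_with_notifiers = dev;
    dev->pprev = &s->devices_with_notifiers;
}

/* Remove in O(1): no list walk and no free of a wrapper node. */
static void toy_remove(ToyDevice *dev)
{
    if (dev->next) {
        dev->next->pprev = dev->pprev;
    }
    *dev->pprev = dev->next;
    dev->next = NULL;
    dev->pprev = NULL;
}

/* Walk the list, as an invalidate-all loop would. */
static int toy_count(const ToyState *s)
{
    int n = 0;
    for (const ToyDevice *d = s->devices_with_notifiers; d; d = d->next) {
        n++;
    }
    return n;
}
```

The trade-off in this sketch is that a device can sit on at most one such list at a time, which matches the "registered or not" state the commit message describes.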

-[Qemu-devel] [PULL 02/42] hw/ssi/xilinx_spips: Avoid variable length array
+Deleted patch
-In the stripe8() function we use a variable length array; however
-we know that the maximum length required is MAX_NUM_BUSSES. Use
-a fixed-length array and an assert instead.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
-Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
-Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
-Message-id: 20190328152635.2794-1-peter.maydell@linaro.org
----
- hw/ssi/xilinx_spips.c | 6 ++++--
-file changed, 4 insertions(+), 2 deletions(-)
-diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/ssi/xilinx_spips.c
-+++ b/hw/ssi/xilinx_spips.c
-@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
- static inline void stripe8(uint8_t *x, int num, bool dir)
- {
--    uint8_t r[num];
--    memset(r, 0, sizeof(uint8_t) * num);
-+    uint8_t r[MAX_NUM_BUSSES];
-     int idx[2] = {0, 0};
-     int bit[2] = {0, 7};
-     int d = dir;
-+    assert(num <= MAX_NUM_BUSSES);
-+    memset(r, 0, sizeof(uint8_t) * num);
-+
-     for (idx[0] = 0; idx[0] < num; ++idx[0]) {
-         for (bit[0] = 7; bit[0] >= 0; bit[0]--) {
-             r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0;
---
-.20.1
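
The deleted patch above swaps a variable length array for a fixed-size buffer guarded by an assert. The same idea can be sketched on a simpler function. `TOY_MAX_ITEMS` and `toy_reverse()` are hypothetical names chosen for this illustration; only the buffer-plus-assert pattern is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ITEMS 8 /* hypothetical compile-time upper bound */

/* Reverse the first num bytes of x in place using a bounded scratch buffer. */
static void toy_reverse(uint8_t *x, int num)
{
    uint8_t r[TOY_MAX_ITEMS]; /* fixed size instead of uint8_t r[num] */

    /* The caller contract that the VLA left implicit is now checked. */
    assert(num >= 0 && num <= TOY_MAX_ITEMS);
    memset(r, 0, sizeof(uint8_t) * num);

    for (int i = 0; i < num; i++) {
        r[i] = x[num - 1 - i];
    }
    memcpy(x, r, sizeof(uint8_t) * num);
}
```

With a fixed bound the stack usage is known at compile time, and a caller that violates the bound fails loudly at the assert rather than silently growing the stack frame.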

-[Qemu-devel] [PULL 03/42] configure: Remove --source-path option
+Deleted patch
-Normally configure identifies the source path by looking
-at the location where the configure script itself exists.
-We also provide a --source-path option which lets the user
-manually override this.
-There isn't really an obvious use case for the --source-path
-option, and in commit 927128222b0a91f56c13a in 2017 we
-accidentally added some logic that looks at $source_path
-before the command line option that overrides it has been
-processed.
-The fact that nobody complained suggests that there isn't
-any use of this option and we aren't testing it either;
-remove it. This allows us to move the "make $source_path
-absolute" logic up so that there is no window in the script
-where $source_path is set but not yet absolute.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
-Message-id: 20190318134019.23729-1-peter.maydell@linaro.org
----
- configure | 10 ++--------
-file changed, 2 insertions(+), 8 deletions(-)
-diff --git a/configure b/configure
-index XXXXXXX..XXXXXXX 100755
---- a/configure
-+++ b/configure
-@@ -XXX,XX +XXX,XX @@ ld_has() {
- # default parameters
- source_path=$(dirname "$0")
-+# make source path absolute
-+source_path=$(cd "$source_path"; pwd)
- cpu=""
- iasl="iasl"
- interp_prefix="/usr/gnemul/qemu-%M"
-@@ -XXX,XX +XXX,XX @@ for opt do
-   ;;
-   --cxx=*) CXX="$optarg"
-   ;;
--  --source-path=*) source_path="$optarg"
--  ;;
-   --cpu=*) cpu="$optarg"
-   ;;
-   --extra-cflags=*) QEMU_CFLAGS="$QEMU_CFLAGS $optarg"
-@@ -XXX,XX +XXX,XX @@ if test "$debug_info" = "yes"; then
-     LDFLAGS="-g $LDFLAGS"
- fi
--# make source path absolute
--source_path=$(cd "$source_path"; pwd)
--
- # running configure in the source tree?
- # we know that's the case if configure is there.
- if test -f "./configure"; then
-@@ -XXX,XX +XXX,XX @@ for opt do
-   ;;
-   --interp-prefix=*) interp_prefix="$optarg"
-   ;;
--  --source-path=*)
--  ;;
-   --cross-prefix=*)
-   ;;
-   --cc=*)
-@@ -XXX,XX +XXX,XX @@ $(echo Available targets: $default_target_list | \
-   --target-list-exclude=LIST exclude a set of targets from the default target-list
- Advanced options (experts only):
--  --source-path=PATH       path of source code [$source_path]
-   --cross-prefix=PREFIX    use PREFIX for compile tools [$cross_prefix]
-   --cc=CC                  use C compiler CC [$cc]
-   --iasl=IASL              use ACPI compiler IASL [$iasl]
---
-.20.1

-[Qemu-devel] [PULL 05/42] hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
+[PULL 01/23] target/arm: Fix missing temp frees in do_vshll_2sh
-For M-profile the MVFR* ID registers are memory mapped, in the
+The widenfn() in do_vshll_2sh() does not free the input 32-bit
-range we implement via the NVIC. Allow them to be read.
+TCGv, so we need to do this in the calling code.
 (If the CPU has no FPU, these registers are defined to be RAZ.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-3-peter.maydell@linaro.org
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 ---
- hw/intc/armv7m_nvic.c | 6 ++++++
+ target/arm/translate-neon.inc.c | 2 ++
-file changed, 6 insertions(+)
+file changed, 2 insertions(+)
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/intc/armv7m_nvic.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
+@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
-             return 0;
+     tmp = tcg_temp_new_i64();
-         }
-         return cpu->env.v7m.sfar;
+     widenfn(tmp, rm0);
-+    case 0xf40: /* MVFR0 */
++    tcg_temp_free_i32(rm0);
-+        return cpu->isar.mvfr0;
+     if (a->shift != 0) {
-+    case 0xf44: /* MVFR1 */
+         tcg_gen_shli_i64(tmp, tmp, a->shift);
-+        return cpu->isar.mvfr1;
+         tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
-+    case 0xf48: /* MVFR2 */
+@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
-+        return cpu->isar.mvfr2;
+     neon_store_reg64(tmp, a->vd);
-     default:
-     bad_offset:
+     widenfn(tmp, rm1);
-         qemu_log_mask(LOG_GUEST_ERROR, "NVIC: Bad read offset 0x%x\n", offset);
++    tcg_temp_free_i32(rm1);
      if (a->shift != 0) {
          tcg_gen_shli_i64(tmp, tmp, a->shift);
          tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
 --
 .20.1

-[Qemu-devel] [PULL 09/42] target/arm: Decode FP instructions for M profile
+[PULL 02/23] target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
-Correct the decode of the M-profile "coprocessor and
+Convert the "pre-widening" insns VADDL, VSUBL, VADDW and VSUBW
-floating-point instructions" space:
+in the Neon 3-registers-different-lengths group to decodetree.
- * op0 == 0b11 is always unallocated
+These insns work by widening one or both inputs to double their
- * if the CPU has an FPU then all insns with op1 == 0b101
+size, performing an add or subtract at the doubled size and
-   are floating point and go to disas_vfp_insn()
+then storing the double-size result.
-For the moment we leave VLLDM and VLSTM as NOPs; in
+As usual, rather than copying the loop of the original decoder
-a later commit we will fill in the proper implementation
+(which needs awkward code to avoid problems when source and
-for the case where an FPU is present.
+destination registers overlap) we just unroll the two passes.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-7-peter.maydell@linaro.org
 ---
- target/arm/translate.c | 26 ++++++++++++++++++++++----
+ target/arm/neon-dp.decode       |  43 +++++++++++++
-file changed, 22 insertions(+), 4 deletions(-)
+ target/arm/translate-neon.inc.c | 104 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  16 ++---
 files changed, 151 insertions(+), 12 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_FU_2sh      1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
  # So we have a single decode line and check the cmode/op in the
  # trans function.
  Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 +
 +######################################################################
 +# Within the "two registers, or three registers of different lengths"
 +# grouping ([23,4]=0b10), bits [21:20] are either part of the opcode
 +# decode: 0b11 for VEXT, two-reg-misc, VTBL, and duplicate-scalar;
 +# or they are a size field for the three-reg-different-lengths and
 +# two-reg-and-scalar insn groups (where size cannot be 0b11). This
 +# is slightly awkward for decodetree: we handle it with this
 +# non-exclusive group which contains within it two exclusive groups:
 +# one for the size=0b11 patterns, and one for the size-not-0b11
 +# patterns. This allows us to check that none of the insns within
 +# each subgroup accidentally overlap each other. Note that all the
 +# trans functions for the size-not-0b11 patterns must check and
 +# return false for size==3.
 +######################################################################
 +{
 +  # 0b11 subgroup will go here
 +
 +  # Subgroup for size != 0b11
 +  [
 +    ##################################################################
 +    # 3-reg-different-length grouping:
 +    # 1111 001 U 1 D sz!=11 Vn:4 Vd:4 opc:4 N 0 M 0 Vm:4
 +    ##################################################################
 +
 +    &3diff vm vn vd size
 +
 +    @3diff       .... ... . . . size:2 .... .... .... . . . . .... \
 +                 &3diff vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +    VADDL_S_3d   1111 001 0 1 . .. .... .... 0000 . 0 . 0 .... @3diff
 +    VADDL_U_3d   1111 001 1 1 . .. .... .... 0000 . 0 . 0 .... @3diff
 +
 +    VADDW_S_3d   1111 001 0 1 . .. .... .... 0001 . 0 . 0 .... @3diff
 +    VADDW_U_3d   1111 001 1 1 . .. .... .... 0001 . 0 . 0 .... @3diff
 +
 +    VSUBL_S_3d   1111 001 0 1 . .. .... .... 0010 . 0 . 0 .... @3diff
 +    VSUBL_U_3d   1111 001 1 1 . .. .... .... 0010 . 0 . 0 .... @3diff
 +
 +    VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +    VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +  ]
 +}
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
      }
      return do_1reg_imm(s, a, fn);
  }
 +
 +static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 +                           NeonGenWidenFn *widenfn,
 +                           NeonGenTwo64OpFn *opfn,
 +                           bool src1_wide)
 +{
 +    /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VADDW/VSUBW) */
 +    TCGv_i64 rn0_64, rn1_64, rm_64;
 +    TCGv_i32 rm;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!widenfn || !opfn) {
 +        /* size == 3 case, which is an entirely different insn group */
 +        return false;
 +    }
 +
 +    if ((a->vd & 1) || (src1_wide && (a->vn & 1))) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    rn0_64 = tcg_temp_new_i64();
 +    rn1_64 = tcg_temp_new_i64();
 +    rm_64 = tcg_temp_new_i64();
 +
 +    if (src1_wide) {
 +        neon_load_reg64(rn0_64, a->vn);
 +    } else {
 +        TCGv_i32 tmp = neon_load_reg(a->vn, 0);
 +        widenfn(rn0_64, tmp);
 +        tcg_temp_free_i32(tmp);
 +    }
 +    rm = neon_load_reg(a->vm, 0);
 +
 +    widenfn(rm_64, rm);
 +    tcg_temp_free_i32(rm);
 +    opfn(rn0_64, rn0_64, rm_64);
 +
 +    /*
 +     * Load second pass inputs before storing the first pass result, to
 +     * avoid incorrect results if a narrow input overlaps with the result.
 +     */
 +    if (src1_wide) {
 +        neon_load_reg64(rn1_64, a->vn + 1);
 +    } else {
 +        TCGv_i32 tmp = neon_load_reg(a->vn, 1);
 +        widenfn(rn1_64, tmp);
 +        tcg_temp_free_i32(tmp);
 +    }
 +    rm = neon_load_reg(a->vm, 1);
 +
 +    neon_store_reg64(rn0_64, a->vd);
 +
 +    widenfn(rm_64, rm);
 +    tcg_temp_free_i32(rm);
 +    opfn(rn1_64, rn1_64, rm_64);
 +    neon_store_reg64(rn1_64, a->vd + 1);
 +
 +    tcg_temp_free_i64(rn0_64);
 +    tcg_temp_free_i64(rn1_64);
 +    tcg_temp_free_i64(rm_64);
 +
 +    return true;
 +}
 +
 +#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE)                         \
 +    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
 +    {                                                                   \
 +        static NeonGenWidenFn * const widenfn[] = {                     \
 +            gen_helper_neon_widen_##S##8,                               \
 +            gen_helper_neon_widen_##S##16,                              \
 +            tcg_gen_##EXT##_i32_i64,                                    \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenTwo64OpFn * const addfn[] = {                     \
 +            gen_helper_neon_##OP##l_u16,                                \
 +            gen_helper_neon_##OP##l_u32,                                \
 +            tcg_gen_##OP##_i64,                                         \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_prewiden_3d(s, a, widenfn[a->size],                   \
 +                              addfn[a->size], SRC1WIDE);                \
 +    }
 +
 +DO_PREWIDEN(VADDL_S, s, ext, add, false)
 +DO_PREWIDEN(VADDL_U, u, extu, add, false)
 +DO_PREWIDEN(VSUBL_S, s, ext, sub, false)
 +DO_PREWIDEN(VSUBL_U, u, extu, sub, false)
 +DO_PREWIDEN(VADDW_S, s, ext, add, true)
 +DO_PREWIDEN(VADDW_U, u, extu, add, true)
 +DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
 +DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     case 6: case 7: case 14: case 15:
+                 /* Three registers of different lengths.  */
-         /* Coprocessor.  */
+                 int src1_wide;
-         if (arm_dc_feature(s, ARM_FEATURE_M)) {
+                 int src2_wide;
--            /* We don't currently implement M profile FP support,
+-                int prewiden;
--             * so this entire space should give a NOCP fault, with
+                 /* undefreq: bit 0 : UNDEF if size == 0
--             * the exception of the v8M VLLDM and VLSTM insns, which
+                  *           bit 1 : UNDEF if size == 1
--             * must be NOPs in Secure state and UNDEF in Nonsecure state.
+                  *           bit 2 : UNDEF if size == 2
-+            /* 0b111x_11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx */
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            if (extract32(insn, 24, 2) == 3) {
+                 int undefreq;
-+                goto illegal_op; /* op0 = 0b11 : unallocated */
+                 /* prewiden, src1_wide, src2_wide, undefreq */
-+            }
+                 static const int neon_3reg_wide[16][4] = {
-+
+-                    {1, 0, 0, 0}, /* VADDL */
-+            /*
+-                    {1, 1, 0, 0}, /* VADDW */
-+             * Decode VLLDM and VLSTM first: these are nonstandard because:
+-                    {1, 0, 0, 0}, /* VSUBL */
-+             *  * if there is no FPU then these insns must NOP in
+-                    {1, 1, 0, 0}, /* VSUBW */
-+             *    Secure state and UNDEF in Nonsecure state
++                    {0, 0, 0, 7}, /* VADDL: handled by decodetree */
-+             *  * if there is an FPU then these insns do not have
++                    {0, 0, 0, 7}, /* VADDW: handled by decodetree */
-+             *    the usual behaviour that disas_vfp_insn() provides of
++                    {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
-+             *    being controlled by CPACR/NSACR enable bits or the
++                    {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
-+             *    lazy-stacking logic.
+                     {0, 1, 1, 0}, /* VADDHN */
-              */
+                     {0, 0, 0, 0}, /* VABAL */
-             if (arm_dc_feature(s, ARM_FEATURE_V8) &&
+                     {0, 1, 1, 0}, /* VSUBHN */
-                 (insn & 0xffa00f00) == 0xec200a00) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
-                 /* Just NOP since FP support is not implemented */
+                 };
-                 break;
-             }
+-                prewiden = neon_3reg_wide[op][0];
-+            if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+                 src1_wide = neon_3reg_wide[op][1];
-+                ((insn >> 8) & 0xe) == 10) {
+                 src2_wide = neon_3reg_wide[op][2];
-+                /* FP, and the CPU supports it */
+                 undefreq = neon_3reg_wide[op][3];
-+                if (disas_vfp_insn(s, insn)) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                    goto illegal_op;
+                         } else {
-+                }
+                             tmp = neon_load_reg(rn, pass);
-+                break;
+                         }
-+            }
+-                        if (prewiden) {
-+
+-                            gen_neon_widen(cpu_V0, tmp, size, u);
-             /* All other insns: NOCP */
+-                        }
-             gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+                     }
-                                default_exception_el(s));
+                     if (src2_wide) {
                          neon_load_reg64(cpu_V1, rm + pass);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          } else {
                              tmp2 = neon_load_reg(rm, pass);
                          }
 -                        if (prewiden) {
 -                            gen_neon_widen(cpu_V1, tmp2, size, u);
 -                        }
                      }
                      switch (op) {
                      case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
 --
 .20.1

-[Qemu-devel] [PULL 25/42] target/arm: Add lazy-FP-stacking support to v7m_stack_write()
+[PULL 03/23] target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
-Pushing registers to the stack for v7M needs to handle three cases:
+Convert the narrow-to-high-half insns VADDHN, VSUBHN, VRADDHN,
- * the "normal" case where we pend exceptions
+VRSUBHN in the Neon 3-registers-different-lengths group to
- * an "ignore faults" case where we set FSR bits but
+decodetree.
    do not pend exceptions (this is used when we are
    handling some kinds of derived exception on exception entry)
  * a "lazy FP stacking" case, where different FSR bits
    are set and the exception is pended differently
 Implement this by changing the existing flag argument that
 tells us whether to ignore faults or not into an enum that
 specifies which of the 3 modes we should handle.
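The three-way dispatch this message describes can be sketched roughly as below. This is only an illustration, not the QEMU code: the `StackingMode` names mirror the patch, while `PendAction` and `pend_action_for()` are hypothetical stand-ins for the calls into the NVIC.

```c
/* Mirrors the StackingMode enum introduced by the patch. */
typedef enum StackingMode {
    STACK_NORMAL,
    STACK_IGNFAULTS,
    STACK_LAZYFP,
} StackingMode;

/* Hypothetical: how the fault should be pended after the FSR bits are set. */
typedef enum PendAction {
    PEND_DERIVED,   /* pend as a derived exception */
    PEND_LAZYFP,    /* pend via the lazy-FP path */
    PEND_NONE,      /* fault status recorded, nothing pended */
} PendAction;

/* Hypothetical helper: map a stacking mode to the pend action. */
PendAction pend_action_for(StackingMode mode)
{
    switch (mode) {
    case STACK_NORMAL:
        return PEND_DERIVED;
    case STACK_LAZYFP:
        return PEND_LAZYFP;
    case STACK_IGNFAULTS:
        return PEND_NONE;
    }
    return PEND_NONE;
}
```

An enum makes the third case explicit at each call site, which a second bool argument would not.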
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-23-peter.maydell@linaro.org
 ---
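For the narrow-to-high-half conversion, the per-lane arithmetic for 64-bit source elements can be sketched in plain C as follows. The function names are hypothetical and this is not the TCG code; it only shows what "narrow high" and "narrow round high" compute, assuming wrap-around 64-bit addition.

```c
#include <stdint.h>

/* Truncating narrow: keep the high 32 bits of a 64-bit lane. */
uint32_t narrow_high_u32(uint64_t x)
{
    return (uint32_t)(x >> 32);
}

/* Rounding narrow: add half of the discarded range, then keep the high half. */
uint32_t narrow_round_high_u32(uint64_t x)
{
    return (uint32_t)((x + (UINT64_C(1) << 31)) >> 32);
}

/* One lane of an add-and-narrow (VADDHN-style) operation. */
uint32_t addhn_lane(uint64_t n, uint64_t m)
{
    return narrow_high_u32(n + m);
}

/* One lane of a rounding add-and-narrow (VRADDHN-style) operation. */
uint32_t raddhn_lane(uint64_t n, uint64_t m)
{
    return narrow_round_high_u32(n + m);
}
```

The subtracting variants differ only in replacing `n + m` with `n - m`.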
- target/arm/helper.c | 118 +++++++++++++++++++++++++++++---------------
+ target/arm/neon-dp.decode       |  6 +++
-file changed, 79 insertions(+), 39 deletions(-)
+ target/arm/translate-neon.inc.c | 87 +++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 91 ++++-----------------------------
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+files changed, 104 insertions(+), 80 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
      VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
      VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +
 +    VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
 +    VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
 +
 +    VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
 +    VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_PREWIDEN(VADDW_S, s, ext, add, true)
  DO_PREWIDEN(VADDW_U, u, extu, add, true)
  DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
  DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
 +
 +static bool do_narrow_3d(DisasContext *s, arg_3diff *a,
 +                         NeonGenTwo64OpFn *opfn, NeonGenNarrowFn *narrowfn)
 +{
 +    /* 3-regs different lengths, narrowing (VADDHN/VSUBHN/VRADDHN/VRSUBHN) */
 +    TCGv_i64 rn_64, rm_64;
 +    TCGv_i32 rd0, rd1;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!opfn || !narrowfn) {
 +        /* size == 3 case, which is an entirely different insn group */
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    rn_64 = tcg_temp_new_i64();
 +    rm_64 = tcg_temp_new_i64();
 +    rd0 = tcg_temp_new_i32();
 +    rd1 = tcg_temp_new_i32();
 +
 +    neon_load_reg64(rn_64, a->vn);
 +    neon_load_reg64(rm_64, a->vm);
 +
 +    opfn(rn_64, rn_64, rm_64);
 +
 +    narrowfn(rd0, rn_64);
 +
 +    neon_load_reg64(rn_64, a->vn + 1);
 +    neon_load_reg64(rm_64, a->vm + 1);
 +
 +    opfn(rn_64, rn_64, rm_64);
 +
 +    narrowfn(rd1, rn_64);
 +
 +    neon_store_reg(a->vd, 0, rd0);
 +    neon_store_reg(a->vd, 1, rd1);
 +
 +    tcg_temp_free_i64(rn_64);
 +    tcg_temp_free_i64(rm_64);
 +
 +    return true;
 +}
 +
 +#define DO_NARROW_3D(INSN, OP, NARROWTYPE, EXTOP)                       \
 +    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
 +    {                                                                   \
 +        static NeonGenTwo64OpFn * const addfn[] = {                     \
 +            gen_helper_neon_##OP##l_u16,                                \
 +            gen_helper_neon_##OP##l_u32,                                \
 +            tcg_gen_##OP##_i64,                                         \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenNarrowFn * const narrowfn[] = {                   \
 +            gen_helper_neon_##NARROWTYPE##_high_u8,                     \
 +            gen_helper_neon_##NARROWTYPE##_high_u16,                    \
 +            EXTOP,                                                      \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_narrow_3d(s, a, addfn[a->size], narrowfn[a->size]);   \
 +    }
 +
 +static void gen_narrow_round_high_u32(TCGv_i32 rd, TCGv_i64 rn)
 +{
 +    tcg_gen_addi_i64(rn, rn, 1u << 31);
 +    tcg_gen_extrh_i64_i32(rd, rn);
 +}
 +
 +DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
 +DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
 +DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
 +DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
      }
  }
-+/*
+-static inline void gen_neon_subl(int size)
-+ * What kind of stack write are we doing? This affects how exceptions
+-{
-+ * generated during the stacking are treated.
+-    switch (size) {
-+ */
+-    case 0: gen_helper_neon_subl_u16(CPU_V001); break;
-+typedef enum StackingMode {
+-    case 1: gen_helper_neon_subl_u32(CPU_V001); break;
-+    STACK_NORMAL,
+-    case 2: tcg_gen_sub_i64(CPU_V001); break;
-+    STACK_IGNFAULTS,
+-    default: abort();
-+    STACK_LAZYFP,
+-    }
-+} StackingMode;
+-}
-+
+-
- static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+ static inline void gen_neon_negl(TCGv_i64 var, int size)
 -                            ARMMMUIdx mmu_idx, bool ignfault)
 +                            ARMMMUIdx mmu_idx, StackingMode mode)
  {
-     CPUState *cs = CPU(cpu);
+     switch (size) {
-     CPUARMState *env = &cpu->env;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+             op = (insn >> 8) & 0xf;
-                       &attrs, &prot, &page_size, &fi, NULL)) {
+             if ((insn & (1 << 6)) == 0) {
-         /* MPU/SAU lookup failed */
+                 /* Three registers of different lengths.  */
-         if (fi.type == ARMFault_QEMU_SFault) {
+-                int src1_wide;
--            qemu_log_mask(CPU_LOG_INT,
+-                int src2_wide;
--                          "...SecureFault with SFSR.AUVIOL during stacking\n");
+                 /* undefreq: bit 0 : UNDEF if size == 0
--            env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK | R_V7M_SFSR_SFARVALID_MASK;
+                  *           bit 1 : UNDEF if size == 1
-+            if (mode == STACK_LAZYFP) {
+                  *           bit 2 : UNDEF if size == 2
-+                qemu_log_mask(CPU_LOG_INT,
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                              "...SecureFault with SFSR.LSPERR "
+                     {0, 0, 0, 7}, /* VADDW: handled by decodetree */
-+                              "during lazy stacking\n");
+                     {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
-+                env->v7m.sfsr |= R_V7M_SFSR_LSPERR_MASK;
+                     {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
-+            } else {
+-                    {0, 1, 1, 0}, /* VADDHN */
-+                qemu_log_mask(CPU_LOG_INT,
++                    {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
-+                              "...SecureFault with SFSR.AUVIOL "
+                     {0, 0, 0, 0}, /* VABAL */
-+                              "during stacking\n");
+-                    {0, 1, 1, 0}, /* VSUBHN */
-+                env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK;
++                    {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
-+            }
+                     {0, 0, 0, 0}, /* VABDL */
-+            env->v7m.sfsr |= R_V7M_SFSR_SFARVALID_MASK;
+                     {0, 0, 0, 0}, /* VMLAL */
-             env->v7m.sfar = addr;
+                     {0, 0, 0, 9}, /* VQDMLAL */
-             exc = ARMV7M_EXCP_SECURE;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             exc_secure = false;
+                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
-         } else {
+                 };
--            qemu_log_mask(CPU_LOG_INT, "...MemManageFault with CFSR.MSTKERR\n");
--            env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+-                src1_wide = neon_3reg_wide[op][1];
-+            if (mode == STACK_LAZYFP) {
+-                src2_wide = neon_3reg_wide[op][2];
-+                qemu_log_mask(CPU_LOG_INT,
+                 undefreq = neon_3reg_wide[op][3];
-+                              "...MemManageFault with CFSR.MLSPERR\n");
-+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MLSPERR_MASK;
+                 if ((undefreq & (1 << size)) ||
-+            } else {
+                     ((undefreq & 8) && u)) {
-+                qemu_log_mask(CPU_LOG_INT,
+                     return 1;
-+                              "...MemManageFault with CFSR.MSTKERR\n");
+                 }
-+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+-                if ((src1_wide && (rn & 1)) ||
-+            }
+-                    (src2_wide && (rm & 1)) ||
-             exc = ARMV7M_EXCP_MEM;
+-                    (!src2_wide && (rd & 1))) {
-             exc_secure = secure;
++                if (rd & 1) {
-         }
+                     return 1;
-@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
+                 }
-                          attrs, &txres);
-     if (txres != MEMTX_OK) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         /* BusFault trying to write the data */
+                 /* Avoid overlapping operands.  Wide source operands are
--        qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+                    always aligned so will never overlap with wide
--        env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+                    destinations in problematic ways.  */
-+        if (mode == STACK_LAZYFP) {
+-                if (rd == rm && !src2_wide) {
-+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.LSPERR\n");
++                if (rd == rm) {
-+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_LSPERR_MASK;
+                     tmp = neon_load_reg(rm, 1);
-+        } else {
+                     neon_store_scratch(2, tmp);
-+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+-                } else if (rd == rn && !src1_wide) {
-+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
++                } else if (rd == rn) {
-+        }
+                     tmp = neon_load_reg(rn, 1);
-         exc = ARMV7M_EXCP_BUS;
+                     neon_store_scratch(2, tmp);
-         exc_secure = false;
+                 }
-         goto pend_fault;
+                 tmp3 = NULL;
-@@ -XXX,XX +XXX,XX @@ pend_fault:
+                 for (pass = 0; pass < 2; pass++) {
-      * later if we have two derived exceptions.
+-                    if (src1_wide) {
-      * The only case when we must not pend the exception but instead
+-                        neon_load_reg64(cpu_V0, rn + pass);
-      * throw it away is if we are doing the push of the callee registers
+-                        tmp = NULL;
--     * and we've already generated a derived exception. Even in this
++                    if (pass == 1 && rd == rn) {
--     * case we will still update the fault status registers.
++                        tmp = neon_load_scratch(2);
-+     * and we've already generated a derived exception (this is indicated
+                     } else {
-+     * by the caller passing STACK_IGNFAULTS). Even in this case we will
+-                        if (pass == 1 && rd == rn) {
-+     * still update the fault status registers.
+-                            tmp = neon_load_scratch(2);
-      */
+-                        } else {
--    if (!ignfault) {
+-                            tmp = neon_load_reg(rn, pass);
-+    switch (mode) {
+-                        }
-+    case STACK_NORMAL:
++                        tmp = neon_load_reg(rn, pass);
          armv7m_nvic_set_pending_derived(env->nvic, exc, exc_secure);
 +        break;
 +    case STACK_LAZYFP:
 +        armv7m_nvic_set_pending_lazyfp(env->nvic, exc, exc_secure);
 +        break;
 +    case STACK_IGNFAULTS:
 +        break;
      }
      return false;
  }
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      uint32_t limit;
      bool want_psp;
      uint32_t sig;
 +    StackingMode smode = ignore_faults ? STACK_IGNFAULTS : STACK_NORMAL;
      if (dotailchain) {
          bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
       */
      sig = v7m_integrity_sig(env, lr);
      stacked_ok =
 -        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx,
 -                        ignore_faults) &&
 -        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx,
 -                        ignore_faults);
 +        v7m_stack_write(cpu, frameptr, sig, mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx, smode) &&
 +        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx, smode);
      /* Update SP regardless of whether any of the stack accesses failed. */
      *frame_sp_p = frameptr;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
       * if it has higher priority).
       */
      stacked_ok = stacked_ok &&
 -        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 12, env->regs[3], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 16, env->regs[12], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 20, env->regs[14], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
 -        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
 +        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 4, env->regs[1],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 8, env->regs[2],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 12, env->regs[3],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 16, env->regs[12],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 20, env->regs[14],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 24, env->regs[15],
 +                        mmu_idx, STACK_NORMAL) &&
 +        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, STACK_NORMAL);
      if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
          /* FPU is active, try to save its registers */
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
                          faddr += 8; /* skip the slot for the FPSCR */
                      }
-                     stacked_ok = stacked_ok &&
+-                    if (src2_wide) {
--                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
+-                        neon_load_reg64(cpu_V1, rm + pass);
--                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+-                        tmp2 = NULL;
-+                        v7m_stack_write(cpu, faddr, slo,
++                    if (pass == 1 && rd == rm) {
-+                                        mmu_idx, STACK_NORMAL) &&
++                        tmp2 = neon_load_scratch(2);
-+                        v7m_stack_write(cpu, faddr + 4, shi,
+                     } else {
-+                                        mmu_idx, STACK_NORMAL);
+-                        if (pass == 1 && rd == rm) {
-                 }
+-                            tmp2 = neon_load_scratch(2);
-                 stacked_ok = stacked_ok &&
+-                        } else {
-                     v7m_stack_write(cpu, frameptr + 0x60,
+-                            tmp2 = neon_load_reg(rm, pass);
--                                    vfp_get_fpscr(env), mmu_idx, false);
+-                        }
-+                                    vfp_get_fpscr(env), mmu_idx, STACK_NORMAL);
++                        tmp2 = neon_load_reg(rm, pass);
-                 if (cpacr_pass) {
+                     }
-                     for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+                     switch (op) {
-                         *aa32_vfp_dreg(env, i / 2) = 0;
+-                    case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
 -                        gen_neon_addl(size);
 -                        break;
 -                    case 2: case 3: case 6: /* VSUBL, VSUBW, VSUBHN, VRSUBHN */
 -                        gen_neon_subl(size);
 -                        break;
                      case 5: case 7: /* VABAL, VABDL */
                          switch ((size << 1) | u) {
                          case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              abort();
                          }
                          neon_store_reg64(cpu_V0, rd + pass);
 -                    } else if (op == 4 || op == 6) {
 -                        /* Narrowing operation.  */
 -                        tmp = tcg_temp_new_i32();
 -                        if (!u) {
 -                            switch (size) {
 -                            case 0:
 -                                gen_helper_neon_narrow_high_u8(tmp, cpu_V0);
 -                                break;
 -                            case 1:
 -                                gen_helper_neon_narrow_high_u16(tmp, cpu_V0);
 -                                break;
 -                            case 2:
 -                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
 -                                break;
 -                            default: abort();
 -                            }
 -                        } else {
 -                            switch (size) {
 -                            case 0:
 -                                gen_helper_neon_narrow_round_high_u8(tmp, cpu_V0);
 -                                break;
 -                            case 1:
 -                                gen_helper_neon_narrow_round_high_u16(tmp, cpu_V0);
 -                                break;
 -                            case 2:
 -                                tcg_gen_addi_i64(cpu_V0, cpu_V0, 1u << 31);
 -                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
 -                                break;
 -                            default: abort();
 -                            }
 -                        }
 -                        if (pass == 0) {
 -                            tmp3 = tmp;
 -                        } else {
 -                            neon_store_reg(rd, 0, tmp3);
 -                            neon_store_reg(rd, 1, tmp);
 -                        }
                      } else {
                          /* Write back the result.  */
                          neon_store_reg64(cpu_V0, rd + pass);
 --
 .20.1

-[Qemu-devel] [PULL 21/42] target/arm: Set FPCCR.S when executing M-profile floating point insns
+[PULL 04/23] target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
-The M-profile FPCCR.S bit indicates the security status of
+Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree.
-the floating point context. In the pseudocode ExecuteFPCheck()
+Like almost all the remaining insns in this group, these are
-function it is unconditionally set to match the current
+a combination of a two-input operation which returns a double-width
-security state whenever a floating point instruction is
+result and then a possible accumulation of that double-width
-executed.
+result into the destination.
 Implement this by adding a new TB flag which tracks whether
 FPCCR.S is different from the current security state, so
 that we only need to emit the code to update it in the
 less-common case when it is not already set correctly.
 Note that we will add the handling for the other work done
 by ExecuteFPCheck() in later commits.
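The idea of tracking a "wrong" bit so that the update is emitted only when needed can be sketched as below. This is a simplified model, not the translator code: `FPCCR_S_MASK` is given an illustrative bit position, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mask for the S bit; the bit position is an assumption here. */
#define FPCCR_S_MASK (UINT32_C(1) << 2)

/* True if the S bit does not match the current security state. */
bool fpccr_s_wrong(uint32_t fpccr, bool secure)
{
    bool s = (fpccr & FPCCR_S_MASK) != 0;
    return s != secure;
}

/* Set or clear the S bit so that it matches the current security state. */
uint32_t fpccr_fix_s(uint32_t fpccr, bool secure)
{
    if (secure) {
        return fpccr | FPCCR_S_MASK;
    }
    return fpccr & ~FPCCR_S_MASK;
}
```

In this model the "wrong" predicate is evaluated once, when the flags are computed, and the fix-up runs only on the path where it returned true.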
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-19-peter.maydell@linaro.org
 ---
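The "two-input operation with a double-width result, then optional accumulation" shape can be sketched per lane for 32-bit inputs as follows. The function names are hypothetical and do not correspond to the QEMU helpers; the sketch only illustrates the arithmetic.

```c
#include <stdint.h>

/* Unsigned absolute difference, widened to 64 bits. */
uint64_t abdl_u32(uint32_t a, uint32_t b)
{
    return a > b ? (uint64_t)a - b : (uint64_t)b - a;
}

/* Signed absolute difference, widened to 64 bits before subtracting. */
uint64_t abdl_s32(int32_t a, int32_t b)
{
    int64_t d = (int64_t)a - (int64_t)b;
    return d < 0 ? (uint64_t)(-d) : (uint64_t)d;
}

/* Accumulating form: add the widened difference into a 64-bit lane. */
uint64_t abal_lane_u32(uint64_t acc, uint32_t a, uint32_t b)
{
    return acc + abdl_u32(a, b);
}
```

Widening before the subtraction is what keeps the signed case from overflowing at the extremes of the 32-bit range.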
- target/arm/cpu.h       |  2 ++
+ target/arm/translate.h          |   1 +
- target/arm/translate.h |  1 +
+ target/arm/neon-dp.decode       |   6 ++
- target/arm/helper.c    |  5 +++++
+ target/arm/translate-neon.inc.c | 132 ++++++++++++++++++++++++++++++++
- target/arm/translate.c | 20 ++++++++++++++++++++
+ target/arm/translate.c          |  31 +-------
-files changed, 28 insertions(+)
+files changed, 142 insertions(+), 28 deletions(-)
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.h
 +++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
  FIELD(TBFLAG_A32, VFPEN, 7, 1)
  FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
  FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 +/* For M profile only, set if FPCCR.S does not match current security state */
 +FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
  /* For M profile only, Handler (ie not Thread) mode */
  FIELD(TBFLAG_A32, HANDLER, 21, 1)
  /* For M profile only, whether we should generate stack-limit checks */
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
-     bool v7m_handler_mode;
+ typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
-     bool v8m_secure; /* true if v8M and we're in Secure mode */
+ typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
-     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+ typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
-+    bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
++typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
-     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
+ typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
-      * so that top level loop can generate correct syndrome information.
+ typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-      */
+ typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-         flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
+     VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
-     }
+     VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
-+    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
++    VABAL_S_3d   1111 001 0 1 . .. .... .... 0101 . 0 . 0 .... @3diff
-+        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
++    VABAL_U_3d   1111 001 1 1 . .. .... .... 0101 . 0 . 0 .... @3diff
-+        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
++
-+    }
+     VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
-+
+     VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
-     *pflags = flags;
++
-     *cs_base = 0;
++    VABDL_S_3d   1111 001 0 1 . .. .... .... 0111 . 0 . 0 .... @3diff
 +    VABDL_U_3d   1111 001 1 1 . .. .... .... 0111 . 0 . 0 .... @3diff
    ]
  }
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-neon.inc.c
++++ b/target/arm/translate-neon.inc.c
+@@ -XXX,XX +XXX,XX @@ DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
+ DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
+ DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
+ DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
++
++static bool do_long_3d(DisasContext *s, arg_3diff *a,
++                       NeonGenTwoOpWidenFn *opfn,
++                       NeonGenTwo64OpFn *accfn)
++{
++    /*
++     * 3-regs different lengths, long operations.
++     * These perform an operation on two inputs that returns a double-width
++     * result, and then possibly perform an accumulation operation of
++     * that result into the double-width destination.
++     */
++    TCGv_i64 rd0, rd1, tmp;
++    TCGv_i32 rn, rm;
++
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++        return false;
++    }
++
++    /* UNDEF accesses to D16-D31 if they don't exist. */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
++        return false;
++    }
++
++    if (!opfn) {
++        /* size == 3 case, which is an entirely different insn group */
++        return false;
++    }
++
++    if (a->vd & 1) {
++        return false;
++    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    rd0 = tcg_temp_new_i64();
++    rd1 = tcg_temp_new_i64();
++
++    rn = neon_load_reg(a->vn, 0);
++    rm = neon_load_reg(a->vm, 0);
++    opfn(rd0, rn, rm);
++    tcg_temp_free_i32(rn);
++    tcg_temp_free_i32(rm);
++
++    rn = neon_load_reg(a->vn, 1);
++    rm = neon_load_reg(a->vm, 1);
++    opfn(rd1, rn, rm);
++    tcg_temp_free_i32(rn);
++    tcg_temp_free_i32(rm);
++
++    /* Don't store results until after all loads: they might overlap */
++    if (accfn) {
++        tmp = tcg_temp_new_i64();
++        neon_load_reg64(tmp, a->vd);
++        accfn(tmp, tmp, rd0);
++        neon_store_reg64(tmp, a->vd);
++        neon_load_reg64(tmp, a->vd + 1);
++        accfn(tmp, tmp, rd1);
++        neon_store_reg64(tmp, a->vd + 1);
++        tcg_temp_free_i64(tmp);
++    } else {
++        neon_store_reg64(rd0, a->vd);
++        neon_store_reg64(rd1, a->vd + 1);
++    }
++
++    tcg_temp_free_i64(rd0);
++    tcg_temp_free_i64(rd1);
++
++    return true;
++}
++
++static bool trans_VABDL_S_3d(DisasContext *s, arg_3diff *a)
++{
++    static NeonGenTwoOpWidenFn * const opfn[] = {
++        gen_helper_neon_abdl_s16,
++        gen_helper_neon_abdl_s32,
++        gen_helper_neon_abdl_s64,
++        NULL,
++    };
++
++    return do_long_3d(s, a, opfn[a->size], NULL);
++}
++
++static bool trans_VABDL_U_3d(DisasContext *s, arg_3diff *a)
++{
++    static NeonGenTwoOpWidenFn * const opfn[] = {
++        gen_helper_neon_abdl_u16,
++        gen_helper_neon_abdl_u32,
++        gen_helper_neon_abdl_u64,
++        NULL,
++    };
++
++    return do_long_3d(s, a, opfn[a->size], NULL);
++}
++
++static bool trans_VABAL_S_3d(DisasContext *s, arg_3diff *a)
++{
++    static NeonGenTwoOpWidenFn * const opfn[] = {
++        gen_helper_neon_abdl_s16,
++        gen_helper_neon_abdl_s32,
++        gen_helper_neon_abdl_s64,
++        NULL,
++    };
++    static NeonGenTwo64OpFn * const addfn[] = {
++        gen_helper_neon_addl_u16,
++        gen_helper_neon_addl_u32,
++        tcg_gen_add_i64,
++        NULL,
++    };
++
++    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
++}
++
++static bool trans_VABAL_U_3d(DisasContext *s, arg_3diff *a)
++{
++    static NeonGenTwoOpWidenFn * const opfn[] = {
++        gen_helper_neon_abdl_u16,
++        gen_helper_neon_abdl_u32,
++        gen_helper_neon_abdl_u64,
++        NULL,
++    };
++    static NeonGenTwo64OpFn * const addfn[] = {
++        gen_helper_neon_addl_u16,
++        gen_helper_neon_addl_u32,
++        tcg_gen_add_i64,
++        NULL,
++    };
++
++    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
++}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-         }
+                     {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
-     }
+                     {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
+                     {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
-+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
+-                    {0, 0, 0, 0}, /* VABAL */
-+        /* Handle M-profile lazy FP state mechanics */
++                    {0, 0, 0, 7}, /* VABAL */
-+
+                     {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
-+        /* Update ownership of FP context: set FPCCR.S to match current state */
+-                    {0, 0, 0, 0}, /* VABDL */
-+        if (s->v8m_fpccr_s_wrong) {
++                    {0, 0, 0, 7}, /* VABDL */
-+            TCGv_i32 tmp;
+                     {0, 0, 0, 0}, /* VMLAL */
-+
+                     {0, 0, 0, 9}, /* VQDMLAL */
-+            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
+                     {0, 0, 0, 0}, /* VMLSL */
-+            if (s->v8m_secure) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
+                         tmp2 = neon_load_reg(rm, pass);
-+            } else {
+                     }
-+                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
+                     switch (op) {
-+            }
+-                    case 5: case 7: /* VABAL, VABDL */
-+            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
+-                        switch ((size << 1) | u) {
-+            /* Don't need to do this for any further FP insns in this TB */
+-                        case 0:
-+            s->v8m_fpccr_s_wrong = false;
+-                            gen_helper_neon_abdl_s16(cpu_V0, tmp, tmp2);
-+        }
+-                            break;
-+    }
+-                        case 1:
-+
+-                            gen_helper_neon_abdl_u16(cpu_V0, tmp, tmp2);
-     if (extract32(insn, 28, 4) == 0xf) {
+-                            break;
-         /*
+-                        case 2:
-          * Encodings with T=1 (Thumb) or unconditional (ARM):
+-                            gen_helper_neon_abdl_s32(cpu_V0, tmp, tmp2);
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+-                            break;
-     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+-                        case 3:
-         regime_is_secure(env, dc->mmu_idx);
+-                            gen_helper_neon_abdl_u32(cpu_V0, tmp, tmp2);
-     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+-                            break;
-+    dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+-                        case 4:
-     dc->cp_regs = cpu->cp_regs;
+-                            gen_helper_neon_abdl_s64(cpu_V0, tmp, tmp2);
-     dc->features = env->features;
+-                            break;
+-                        case 5:
 -                            gen_helper_neon_abdl_u64(cpu_V0, tmp, tmp2);
 -                            break;
 -                        default: abort();
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                        tcg_temp_free_i32(tmp);
 -                        break;
                      case 8: case 9: case 10: case 11: case 12: case 13:
                          /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                          gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          case 10: /* VMLSL */
                              gen_neon_negl(cpu_V0, size);
                              /* Fall through */
 -                        case 5: case 8: /* VABAL, VMLAL */
 +                        case 8: /* VABAL, VMLAL */
                              gen_neon_addl(size);
                              break;
                          case 9: case 11: /* VQDMLAL, VQDMLSL */
 --
 .20.1

-[Qemu-devel] [PULL 22/42] target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
+[PULL 05/23] target/arm: Convert Neon 3-reg-diff long multiplies
-The M-profile FPCCR.ASPEN bit indicates that automatic floating-point
+Convert the Neon 3-reg-diff insns VMULL, VMLAL and VMLSL; these perform
-context preservation is enabled. Before executing any floating-point
+a 32x32->64 multiply with possible accumulate.
 instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits
 indicate that there is no active floating point context then we
 must create a new context (by initializing FPSCR and setting
 FPCA/SFPA to indicate that the context is now active). In the
 pseudocode this is handled by ExecuteFPCheck().
-Implement this with a new TB flag which tracks whether we
+Note that for VMLSL we do the accumulate directly with a subtraction
-need to create a new FP context.
+rather than doing a negate-then-add as the old code did.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-20-peter.maydell@linaro.org
 ---
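Note (not part of the commit): the arithmetic described above can be
sketched in plain C. This is only an illustration under stated
assumptions: the function names below are hypothetical and are not
QEMU helpers, and the code models a single 32-bit lane rather than the
TCG ops the patch emits. It shows the 32x32->64 widening multiply and
why accumulating with a direct subtraction gives the same 64-bit result
as the old negate-then-add sequence (both are arithmetic modulo 2^64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a signed widening multiply on one 32-bit lane. */
static int64_t mull_s32(int32_t n, int32_t m)
{
    return (int64_t)n * (int64_t)m;
}

/* Hypothetical model of an unsigned widening multiply on one 32-bit lane. */
static uint64_t mull_u32(uint32_t n, uint32_t m)
{
    return (uint64_t)n * (uint64_t)m;
}

/* Old approach: negate the product, then add it to the accumulator. */
static uint64_t mlsl_negate_then_add(uint64_t acc, uint64_t prod)
{
    return acc + (0 - prod);
}

/* New approach: subtract the product from the accumulator directly. */
static uint64_t mlsl_subtract(uint64_t acc, uint64_t prod)
{
    return acc - prod;
}

/* Small self-check tying the pieces together. */
static void check_mlsl_equivalence(void)
{
    uint64_t acc = 0x0123456789abcdefULL;
    uint64_t prod = (uint64_t)mull_s32(-7, 100000);

    assert(mlsl_negate_then_add(acc, prod) == mlsl_subtract(acc, prod));
    assert(mull_u32(0xffffffffu, 0xffffffffu) == 0xfffffffe00000001ULL);
}
```

Because unsigned 64-bit arithmetic wraps, the two accumulate forms agree
for every input, so dropping the separate negate step should not change
results for the 32-bit element size.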
- target/arm/cpu.h       |  2 ++
+ target/arm/neon-dp.decode       |  9 +++++
- target/arm/translate.h |  1 +
+ target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
- target/arm/helper.c    | 13 +++++++++++++
+ target/arm/translate.c          | 21 +++-------
- target/arm/translate.c | 29 +++++++++++++++++++++++++++++
+files changed, 86 insertions(+), 15 deletions(-)
 files changed, 45 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
+     VABDL_S_3d   1111 001 0 1 . .. .... .... 0111 . 0 . 0 .... @3diff
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+     VABDL_U_3d   1111 001 1 1 . .. .... .... 0111 . 0 . 0 .... @3diff
-+/* For M profile only, set if we must create a new FP context */
++
-+FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
++    VMLAL_S_3d   1111 001 0 1 . .. .... .... 1000 . 0 . 0 .... @3diff
- /* For M profile only, set if FPCCR.S does not match current security state */
++    VMLAL_U_3d   1111 001 1 1 . .. .... .... 1000 . 0 . 0 .... @3diff
- FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
++
- /* For M profile only, Handler (ie not Thread) mode */
++    VMLSL_S_3d   1111 001 0 1 . .. .... .... 1010 . 0 . 0 .... @3diff
-diff --git a/target/arm/translate.h b/target/arm/translate.h
++    VMLSL_U_3d   1111 001 1 1 . .. .... .... 1010 . 0 . 0 .... @3diff
 +
 +    VMULL_S_3d   1111 001 0 1 . .. .... .... 1100 . 0 . 0 .... @3diff
 +    VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate.h
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
+@@ -XXX,XX +XXX,XX @@ static bool trans_VABAL_U_3d(DisasContext *s, arg_3diff *a)
-     bool v8m_secure; /* true if v8M and we're in Secure mode */
-     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+     return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
-     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
+ }
-+    bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
++
-     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
++static void gen_mull_s32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
-      * so that top level loop can generate correct syndrome information.
++{
-      */
++    TCGv_i32 lo = tcg_temp_new_i32();
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++    TCGv_i32 hi = tcg_temp_new_i32();
-index XXXXXXX..XXXXXXX 100644
++
---- a/target/arm/helper.c
++    tcg_gen_muls2_i32(lo, hi, rn, rm);
-+++ b/target/arm/helper.c
++    tcg_gen_concat_i32_i64(rd, lo, hi);
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
++
-         flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
++    tcg_temp_free_i32(lo);
-     }
++    tcg_temp_free_i32(hi);
++}
-+    if (arm_feature(env, ARM_FEATURE_M) &&
++
-+        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
++static void gen_mull_u32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
-+        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
++{
-+         (env->v7m.secure &&
++    TCGv_i32 lo = tcg_temp_new_i32();
-+          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
++    TCGv_i32 hi = tcg_temp_new_i32();
-+        /*
++
-+         * ASPEN is set, but FPCA/SFPA indicate that there is no active
++    tcg_gen_mulu2_i32(lo, hi, rn, rm);
-+         * FP context; we must create a new FP context before executing
++    tcg_gen_concat_i32_i64(rd, lo, hi);
-+         * any FP insn.
++
-+         */
++    tcg_temp_free_i32(lo);
-+        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
++    tcg_temp_free_i32(hi);
 +}
 +
 +static bool trans_VMULL_S_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_mull_s8,
 +        gen_helper_neon_mull_s16,
 +        gen_mull_s32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VMULL_U_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_mull_u8,
 +        gen_helper_neon_mull_u16,
 +        gen_mull_u32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +#define DO_VMLAL(INSN,MULL,ACC)                                         \
 +    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
 +    {                                                                   \
 +        static NeonGenTwoOpWidenFn * const opfn[] = {                   \
 +            gen_helper_neon_##MULL##8,                                  \
 +            gen_helper_neon_##MULL##16,                                 \
 +            gen_##MULL##32,                                             \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenTwo64OpFn * const accfn[] = {                     \
 +            gen_helper_neon_##ACC##l_u16,                               \
 +            gen_helper_neon_##ACC##l_u32,                               \
 +            tcg_gen_##ACC##_i64,                                        \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_long_3d(s, a, opfn[a->size], accfn[a->size]);         \
 +    }
 +
-     *pflags = flags;
++DO_VMLAL(VMLAL_S,mull_s,add)
-     *cs_base = 0;
++DO_VMLAL(VMLAL_U,mull_u,add)
- }
++DO_VMLAL(VMLSL_S,mull_s,sub)
 +DO_VMLAL(VMLSL_U,mull_u,sub)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-             /* Don't need to do this for any further FP insns in this TB */
+                     {0, 0, 0, 7}, /* VABAL */
-             s->v8m_fpccr_s_wrong = false;
+                     {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
-         }
+                     {0, 0, 0, 7}, /* VABDL */
-+
+-                    {0, 0, 0, 0}, /* VMLAL */
-+        if (s->v7m_new_fp_ctxt_needed) {
++                    {0, 0, 0, 7}, /* VMLAL */
-+            /*
+                     {0, 0, 0, 9}, /* VQDMLAL */
-+             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
+-                    {0, 0, 0, 0}, /* VMLSL */
-+             * and the FPSCR.
++                    {0, 0, 0, 7}, /* VMLSL */
-+             */
+                     {0, 0, 0, 9}, /* VQDMLSL */
-+            TCGv_i32 control, fpscr;
+-                    {0, 0, 0, 0}, /* Integer VMULL */
-+            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
++                    {0, 0, 0, 7}, /* Integer VMULL */
-+
+                     {0, 0, 0, 9}, /* VQDMULL */
-+            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
+                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
-+            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
+                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
-+            tcg_temp_free_i32(fpscr);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            /*
+                         tmp2 = neon_load_reg(rm, pass);
-+             * We don't need to arrange to end the TB, because the only
+                     }
-+             * parts of FPSCR which we cache in the TB flags are the VECLEN
+                     switch (op) {
-+             * and VECSTRIDE, and those don't exist for M-profile.
+-                    case 8: case 9: case 10: case 11: case 12: case 13:
-+             */
+-                        /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
-+
++                    case 9: case 11: case 13:
-+            if (s->v8m_secure) {
++                        /* VQDMLAL, VQDMLSL, VQDMULL */
-+                bits |= R_V7M_CONTROL_SFPA_MASK;
+                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-+            }
+                         break;
-+            control = load_cpu_field(v7m.control[M_REG_S]);
+                     default: /* 15 is RESERVED: caught earlier  */
-+            tcg_gen_ori_i32(control, control, bits);
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+            store_cpu_field(control, v7m.control[M_REG_S]);
+                         /* VQDMULL */
-+            /* Don't need to do this for any further FP insns in this TB */
+                         gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
-+            s->v7m_new_fp_ctxt_needed = false;
+                         neon_store_reg64(cpu_V0, rd + pass);
-+        }
+-                    } else if (op == 5 || (op >= 8 && op <= 11)) {
-     }
++                    } else {
+                         /* Accumulate.  */
-     if (extract32(insn, 28, 4) == 0xf) {
+                         neon_load_reg64(cpu_V1, rd + pass);
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+                         switch (op) {
-         regime_is_secure(env, dc->mmu_idx);
+-                        case 10: /* VMLSL */
-     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+-                            gen_neon_negl(cpu_V0, size);
-     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+-                            /* Fall through */
-+    dc->v7m_new_fp_ctxt_needed =
+-                        case 8: /* VABAL, VMLAL */
-+        FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
+-                            gen_neon_addl(size);
-     dc->cp_regs = cpu->cp_regs;
+-                            break;
-     dc->features = env->features;
+                         case 9: case 11: /* VQDMLAL, VQDMLSL */
+                             gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
                              if (op == 11) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              abort();
                          }
                          neon_store_reg64(cpu_V0, rd + pass);
 -                    } else {
 -                        /* Write back the result.  */
 -                        neon_store_reg64(cpu_V0, rd + pass);
                      }
                  }
              } else {
 --
 .20.1

-[Qemu-devel] [PULL 07/42] target/arm: Disable most VFP sysregs for M-profile
+[PULL 06/23] target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
-The only "system register" that M-profile floating point exposes
+Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL:
-via the VMRS/VMSR instructions is FPSCR, and it does not have
+these are all saturating doubling long multiplies with a possible
-the odd special case for rd==15. Add a check to ensure we only
+accumulate step.
-expose FPSCR.
 These are the last insns in the group which use the pass-over-each
 elements loop, so we can delete that code.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-5-peter.maydell@linaro.org
 ---
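Note (not part of the commit): a minimal C sketch of what "saturating
doubling long multiply with a possible accumulate" means for one 16-bit
lane. The names are hypothetical and the `qc` flag is a stand-in for
the cumulative saturation flag; this is not QEMU's helper code. It
mirrors the structure used in the patch, where the doubling is done by
a saturating add of the widened product to itself, and the
subtract-accumulate form negates the doubled product before a
saturating add.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating signed 32-bit add; sets *qc when the result saturates. */
static int32_t sat_add_s32(int32_t a, int32_t b, int *qc)
{
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX) {
        *qc = 1;
        return INT32_MAX;
    }
    if (sum < INT32_MIN) {
        *qc = 1;
        return INT32_MIN;
    }
    return (int32_t)sum;
}

/* Doubling multiply: widen, multiply, then add the product to itself. */
static int32_t vqdmull_s16(int16_t n, int16_t m, int *qc)
{
    int32_t prod = (int32_t)n * (int32_t)m;

    return sat_add_s32(prod, prod, qc);
}

/* Accumulate form: saturating add of the doubled product. */
static int32_t vqdmlal_s16(int32_t acc, int16_t n, int16_t m, int *qc)
{
    return sat_add_s32(acc, vqdmull_s16(n, m, qc), qc);
}

/*
 * Subtract form: negate the doubled product, then saturating add.
 * The doubled product of two int16_t values is never INT32_MIN,
 * so the negation cannot overflow.
 */
static int32_t vqdmlsl_s16(int32_t acc, int16_t n, int16_t m, int *qc)
{
    int32_t doubled = vqdmull_s16(n, m, qc);

    return sat_add_s32(acc, -doubled, qc);
}
```

For 16-bit inputs the doubling step only saturates when both operands
are INT16_MIN; the accumulate step can saturate for many more inputs.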
- target/arm/translate.c | 19 +++++++++++++++++--
+ target/arm/neon-dp.decode       |  6 +++
-file changed, 17 insertions(+), 2 deletions(-)
+ target/arm/translate-neon.inc.c | 82 +++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 59 ++----------------------
 files changed, 92 insertions(+), 55 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
      VMLAL_S_3d   1111 001 0 1 . .. .... .... 1000 . 0 . 0 .... @3diff
      VMLAL_U_3d   1111 001 1 1 . .. .... .... 1000 . 0 . 0 .... @3diff
 +    VQDMLAL_3d   1111 001 0 1 . .. .... .... 1001 . 0 . 0 .... @3diff
 +
      VMLSL_S_3d   1111 001 0 1 . .. .... .... 1010 . 0 . 0 .... @3diff
      VMLSL_U_3d   1111 001 1 1 . .. .... .... 1010 . 0 . 0 .... @3diff
 +    VQDMLSL_3d   1111 001 0 1 . .. .... .... 1011 . 0 . 0 .... @3diff
 +
      VMULL_S_3d   1111 001 0 1 . .. .... .... 1100 . 0 . 0 .... @3diff
      VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
 +
 +    VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_VMLAL(VMLAL_S,mull_s,add)
  DO_VMLAL(VMLAL_U,mull_u,add)
  DO_VMLAL(VMLSL_S,mull_s,sub)
  DO_VMLAL(VMLSL_U,mull_u,sub)
 +
 +static void gen_VQDMULL_16(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
 +{
 +    gen_helper_neon_mull_s16(rd, rn, rm);
 +    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rd, rd);
 +}
 +
 +static void gen_VQDMULL_32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
 +{
 +    gen_mull_s32(rd, rn, rm);
 +    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rd, rd);
 +}
 +
 +static bool trans_VQDMULL_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +static void gen_VQDMLAL_acc_16(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rn, rm);
 +}
 +
 +static void gen_VQDMLAL_acc_32(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rn, rm);
 +}
 +
 +static bool trans_VQDMLAL_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const accfn[] = {
 +        NULL,
 +        gen_VQDMLAL_acc_16,
 +        gen_VQDMLAL_acc_32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
 +}
 +
 +static void gen_VQDMLSL_acc_16(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    gen_helper_neon_negl_u32(rm, rm);
 +    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rn, rm);
 +}
 +
 +static void gen_VQDMLSL_acc_32(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    tcg_gen_neg_i64(rm, rm);
 +    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rn, rm);
 +}
 +
 +static bool trans_VQDMLSL_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const accfn[] = {
 +        NULL,
 +        gen_VQDMLSL_acc_16,
 +        gen_VQDMLSL_acc_32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
                      {0, 0, 0, 7}, /* VABDL */
                      {0, 0, 0, 7}, /* VMLAL */
 -                    {0, 0, 0, 9}, /* VQDMLAL */
 +                    {0, 0, 0, 7}, /* VQDMLAL */
                      {0, 0, 0, 7}, /* VMLSL */
 -                    {0, 0, 0, 9}, /* VQDMLSL */
 +                    {0, 0, 0, 7}, /* VQDMLSL */
                      {0, 0, 0, 7}, /* Integer VMULL */
 -                    {0, 0, 0, 9}, /* VQDMULL */
 +                    {0, 0, 0, 7}, /* VQDMULL */
                      {0, 0, 0, 0xa}, /* Polynomial VMULL */
                      {0, 0, 0, 7}, /* Reserved: always UNDEF */
                  };
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      }
+                     return 0;
                  }
-             } else { /* !dp */
+-
-+                bool is_sysreg;
+-                /* Avoid overlapping operands.  Wide source operands are
-+
+-                   always aligned so will never overlap with wide
-                 if ((insn & 0x6f) != 0x00)
+-                   destinations in problematic ways.  */
-                     return 1;
+-                if (rd == rm) {
-                 rn = VFP_SREG_N(insn);
+-                    tmp = neon_load_reg(rm, 1);
-+
+-                    neon_store_scratch(2, tmp);
-+                is_sysreg = extract32(insn, 21, 1);
+-                } else if (rd == rn) {
-+
+-                    tmp = neon_load_reg(rn, 1);
-+                if (arm_dc_feature(s, ARM_FEATURE_M)) {
+-                    neon_store_scratch(2, tmp);
-+                    /*
+-                }
-+                     * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
+-                tmp3 = NULL;
-+                     * Writes to R15 are UNPREDICTABLE; we choose to undef.
+-                for (pass = 0; pass < 2; pass++) {
-+                     */
+-                    if (pass == 1 && rd == rn) {
-+                    if (is_sysreg && (rd == 15 || (rn >> 1) != ARM_VFP_FPSCR)) {
+-                        tmp = neon_load_scratch(2);
-+                        return 1;
+-                    } else {
-+                    }
+-                        tmp = neon_load_reg(rn, pass);
-+                }
+-                    }
-+
+-                    if (pass == 1 && rd == rm) {
-                 if (insn & ARM_CP_RW_BIT) {
+-                        tmp2 = neon_load_scratch(2);
-                     /* vfp->arm */
+-                    } else {
--                    if (insn & (1 << 21)) {
+-                        tmp2 = neon_load_reg(rm, pass);
-+                    if (is_sysreg) {
+-                    }
-                         /* system register */
+-                    switch (op) {
-                         rn >>= 1;
+-                    case 9: case 11: case 13:
+-                        /* VQDMLAL, VQDMLSL, VQDMULL */
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+-                        gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                     }
+-                        break;
-                 } else {
+-                    default: /* 15 is RESERVED: caught earlier  */
-                     /* arm->vfp */
+-                        abort();
--                    if (insn & (1 << 21)) {
+-                    }
-+                    if (is_sysreg) {
+-                    if (op == 13) {
-                         rn >>= 1;
+-                        /* VQDMULL */
-                         /* system register */
+-                        gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
-                         switch (rn) {
+-                        neon_store_reg64(cpu_V0, rd + pass);
 -                    } else {
 -                        /* Accumulate.  */
 -                        neon_load_reg64(cpu_V1, rd + pass);
 -                        switch (op) {
 -                        case 9: case 11: /* VQDMLAL, VQDMLSL */
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            if (op == 11) {
 -                                gen_neon_negl(cpu_V0, size);
 -                            }
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V1, size);
 -                            break;
 -                        default:
 -                            abort();
 -                        }
 -                        neon_store_reg64(cpu_V0, rd + pass);
 -                    }
 -                }
 +                abort(); /* all others handled by decodetree */
              } else {
                  /* Two registers and a scalar. NB that for ops of this form
                   * the ARM ARM labels bit 24 as Q, but it is in our variable
 --
 .20.1

-[Qemu-devel] [PULL 27/42] target/arm: Implement VLSTM for v7M CPUs with an FPU
+[PULL 07/23] target/arm: Convert Neon 3-reg-diff polynomial VMULL
-Implement the VLSTM instruction for v7M for the FPU present case.
+Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last
 insn in this group to be converted.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-25-peter.maydell@linaro.org
 ---
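Note (not part of the commit): "polynomial" multiply here means a
carry-less multiply, where partial products are combined with XOR
instead of addition. The sketch below is a hypothetical single-lane
model of the 8x8->16 case, written for illustration only; it is not
the gvec helper that the patch calls, and the wider 64x64->128 variant
(guarded by the pmull feature check in the patch) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-less (GF(2) polynomial) multiply of two 8-bit values. */
static uint16_t pmull_8(uint8_t n, uint8_t m)
{
    uint16_t result = 0;

    for (int bit = 0; bit < 8; bit++) {
        if (m & (1u << bit)) {
            /* XOR in the shifted multiplicand: addition without carries. */
            result ^= (uint16_t)((uint16_t)n << bit);
        }
    }
    return result;
}

/* Self-check: (x + 1)^2 = x^2 + 1 over GF(2), so 3 * 3 == 5. */
static void check_pmull_8(void)
{
    assert(pmull_8(3, 3) == 5);
    assert(pmull_8(0xff, 0xff) == 0x5555);
}
```

The second check relies on squaring over GF(2) spreading each set bit i
to bit 2i, which is a convenient sanity test for any carry-less
multiply implementation.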
- target/arm/cpu.h       |  2 +
+ target/arm/neon-dp.decode       |  2 ++
- target/arm/helper.h    |  2 +
+ target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++
- target/arm/helper.c    | 84 ++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate.c          | 60 ++-------------------------------
- target/arm/translate.c | 15 +++++++-
+files changed, 48 insertions(+), 57 deletions(-)
 files changed, 102 insertions(+), 1 deletion(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
- #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
+     VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
- #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
- #define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
+     VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
-+#define EXCP_LSERR          21   /* v8M LSERR SecureFault */
++
-+#define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
++    VMULL_P_3d   1111 001 0 1 . .. .... .... 1110 . 0 . 0 .... @3diff
- /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
+   ]
+ }
- #define ARMV7M_EXCP_RESET   1
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/helper.h
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMLSL_3d(DisasContext *s, arg_3diff *a)
- DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+     return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
+ }
 +DEF_HELPER_2(v7m_vlstm, void, env, i32)
 +
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
++static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a)
  DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
      g_assert_not_reached();
  }
 +void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
 +{
-+    /* translate.c should never generate calls here in user-only mode */
++    gen_helper_gvec_3 *fn_gvec;
 +    g_assert_not_reached();
 +}
 +
- uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- {
++        return false;
      /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
      }
  }
 +void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
 +{
 +    /* fptr is the value of Rn, the frame pointer we store the FP regs to */
 +    bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
 +    bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
 +
 +    assert(env->v7m.secure);
 +
 +    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
 +        return;
 +    }
 +
-+    /* Check access to the coprocessor is permitted */
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
++        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
-+    if (lspact) {
++    if (a->vd & 1) {
-+        /* LSPACT should not be active when there is active FP state */
++        return false;
 +        raise_exception_ra(env, EXCP_LSERR, 0, 1, GETPC());
 +    }
 +
-+    if (fptr & 7) {
++    switch (a->size) {
-+        raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
++    case 0:
 +        fn_gvec = gen_helper_neon_pmull_h;
 +        break;
 +    case 2:
 +        if (!dc_isar_feature(aa32_pmull, s)) {
 +            return false;
 +        }
 +        fn_gvec = gen_helper_gvec_pmull_q;
 +        break;
 +    default:
 +        return false;
 +    }
 +
-+    /*
++    if (!vfp_access_check(s)) {
-+     * Note that we do not use v7m_stack_write() here, because the
++        return true;
 +     * accesses should not set the FSR bits for stacking errors if they
 +     * fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
 +     * or AccType_LAZYFP). Faults in cpu_stl_data() will throw exceptions
 +     * and longjmp out.
 +     */
 +    if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
 +        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
 +        int i;
 +
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +            uint32_t faddr = fptr + 4 * i;
 +            uint32_t slo = extract64(dn, 0, 32);
 +            uint32_t shi = extract64(dn, 32, 32);
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +            cpu_stl_data(env, faddr, slo);
 +            cpu_stl_data(env, faddr + 4, shi);
 +        }
 +        cpu_stl_data(env, fptr + 0x40, vfp_get_fpscr(env));
 +
 +        /*
 +         * If TS is 0 then s0 to s15 and FPSCR are UNKNOWN; we choose to
 +         * leave them unchanged, matching our choice in v7m_preserve_fp_state.
 +         */
 +        if (ts) {
 +            for (i = 0; i < 32; i += 2) {
 +                *aa32_vfp_dreg(env, i / 2) = 0;
 +            }
 +            vfp_set_fpscr(env, 0);
 +        }
 +    } else {
 +        v7m_update_fpccr(env, fptr, false);
 +    }
 +
-+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
++    tcg_gen_gvec_3_ool(neon_reg_offset(a->vd, 0),
 +                       neon_reg_offset(a->vn, 0),
 +                       neon_reg_offset(a->vm, 0),
 +                       16, 16, 0, fn_gvec);
 +    return true;
 +}
-+
- static bool v7m_push_stack(ARMCPU *cpu)
- {
-     /* Do the "set up stack frame" part of exception entry,
-@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
-             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
-             [EXCP_STKOF] = "v8M STKOF UsageFault",
-             [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
-+            [EXCP_LSERR] = "v8M LSERR UsageFault",
-+            [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
-         };
-         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
-         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
-         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
-         break;
-+    case EXCP_LSERR:
-+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-+        env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
-+        break;
-+    case EXCP_UNALIGNED:
-+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
-+        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNALIGNED_MASK;
-+        break;
-     case EXCP_SWI:
-         /* The PC already points to the next instruction.  */
-         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                 if (!s->v8m_secure || (insn & 0x0040f0ff)) {
+ {
-                     goto illegal_op;
+     int op;
-                 }
+     int q;
--                /* Just NOP since FP support is not implemented */
+-    int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
-+
++    int rd, rn, rm, rd_ofs, rm_ofs;
-+                if (arm_dc_feature(s, ARM_FEATURE_VFP)) {
+     int size;
-+                    TCGv_i32 fptr = load_reg(s, rn);
+     int pass;
-+
+     int u;
-+                    if (extract32(insn, 20, 1)) {
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                        /* VLLDM */
+     size = (insn >> 20) & 3;
-+                    } else {
+     vec_size = q ? 16 : 8;
-+                        gen_helper_v7m_vlstm(cpu_env, fptr);
+     rd_ofs = neon_reg_offset(rd, 0);
-+                    }
+-    rn_ofs = neon_reg_offset(rn, 0);
-+                    tcg_temp_free_i32(fptr);
+     rm_ofs = neon_reg_offset(rm, 0);
-+
-+                    /* End the TB, because we have updated FP control bits */
+     if ((insn & (1 << 23)) == 0) {
-+                    s->base.is_jmp = DISAS_UPDATE;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+                }
+         if (size != 3) {
-                 break;
+             op = (insn >> 8) & 0xf;
-             }
+             if ((insn & (1 << 6)) == 0) {
-             if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+-                /* Three registers of different lengths.  */
 -                /* undefreq: bit 0 : UNDEF if size == 0
 -                 *           bit 1 : UNDEF if size == 1
 -                 *           bit 2 : UNDEF if size == 2
 -                 *           bit 3 : UNDEF if U == 1
 -                 * Note that [2:0] set implies 'always UNDEF'
 -                 */
 -                int undefreq;
 -                /* prewiden, src1_wide, src2_wide, undefreq */
 -                static const int neon_3reg_wide[16][4] = {
 -                    {0, 0, 0, 7}, /* VADDL: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VADDW: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VABAL */
 -                    {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VABDL */
 -                    {0, 0, 0, 7}, /* VMLAL */
 -                    {0, 0, 0, 7}, /* VQDMLAL */
 -                    {0, 0, 0, 7}, /* VMLSL */
 -                    {0, 0, 0, 7}, /* VQDMLSL */
 -                    {0, 0, 0, 7}, /* Integer VMULL */
 -                    {0, 0, 0, 7}, /* VQDMULL */
 -                    {0, 0, 0, 0xa}, /* Polynomial VMULL */
 -                    {0, 0, 0, 7}, /* Reserved: always UNDEF */
 -                };
 -
 -                undefreq = neon_3reg_wide[op][3];
 -
 -                if ((undefreq & (1 << size)) ||
 -                    ((undefreq & 8) && u)) {
 -                    return 1;
 -                }
 -                if (rd & 1) {
 -                    return 1;
 -                }
 -
 -                /* Handle polynomial VMULL in a single pass.  */
 -                if (op == 14) {
 -                    if (size == 0) {
 -                        /* VMULL.P8 */
 -                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
 -                                           0, gen_helper_neon_pmull_h);
 -                    } else {
 -                        /* VMULL.P64 */
 -                        if (!dc_isar_feature(aa32_pmull, s)) {
 -                            return 1;
 -                        }
 -                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
 -                                           0, gen_helper_gvec_pmull_q);
 -                    }
 -                    return 0;
 -                }
 -                abort(); /* all others handled by decodetree */
 +                /* Three registers of different lengths: handled by decodetree */
 +                return 1;
              } else {
                  /* Two registers and a scalar. NB that for ops of this form
                   * the ARM ARM labels bit 24 as Q, but it is in our variable
 --
 2.20.1

-[Qemu-devel] [PULL 04/42] target/arm: Make sure M-profile FPSCR RES0 bits are not settable
+[PULL 08/23] target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
-Enforce that for M-profile various FPSCR bits which are RES0 there
+Mark the arrays of function pointers in trans_VSHLL_S_2sh() and
-but have defined meanings on A-profile are never settable. This
+trans_VSHLL_U_2sh() as both 'static' and 'const'.
 ensures that M-profile code can't enable the A-profile behaviour
 (notably vector length/stride handling) by accident.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-2-peter.maydell@linaro.org
 ---
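For readers unfamiliar with the idiom applied to the VSHLL arrays, here is a minimal stand-alone sketch of a `static ... * const` table of function pointers indexed by element size. The names `WidenFn`, `widen_s8`, `widen_s16`, `widen_s32` and `widen_signed` are hypothetical stand-ins, not QEMU's helpers, and the widening they perform is deliberately simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "widen" callback type. */
typedef uint64_t WidenFn(uint32_t);

/* Simplified sign-extending helpers (not the QEMU lane-wise helpers). */
static uint64_t widen_s8(uint32_t x)  { return (uint64_t)(int64_t)(int8_t)x; }
static uint64_t widen_s16(uint32_t x) { return (uint64_t)(int64_t)(int16_t)x; }
static uint64_t widen_s32(uint32_t x) { return (uint64_t)(int64_t)(int32_t)x; }

/* Returns 0 on success, -1 if there is no handler for this size. */
static int widen_signed(int size, uint32_t in, uint64_t *out)
{
    /*
     * 'static' means the table is built once rather than being
     * re-initialised on the stack at every call; the 'const' after
     * the '*' makes the pointers themselves read-only.
     */
    static WidenFn * const widenfn[] = {
        widen_s8,
        widen_s16,
        widen_s32,
    };

    assert(out != NULL);
    if (size < 0 || (size_t)size >= sizeof(widenfn) / sizeof(widenfn[0])) {
        return -1;
    }
    *out = widenfn[size](in);
    return 0;
}
```

Without `static`, a compiler may emit code to fill the array each time the function runs; without `const`, nothing stops a later edit from accidentally reassigning an entry.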
- target/arm/vfp_helper.c | 8 ++++++++
+ target/arm/translate-neon.inc.c | 4 ++--
- 1 file changed, 8 insertions(+)
+ 1 file changed, 2 insertions(+), 2 deletions(-)
-diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/vfp_helper.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/vfp_helper.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
+@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
-         val &= ~FPCR_FZ16;
-     }
+ static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
+ {
-+    if (arm_feature(env, ARM_FEATURE_M)) {
+-    NeonGenWidenFn *widenfn[] = {
-+        /*
++    static NeonGenWidenFn * const widenfn[] = {
-+         * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
+         gen_helper_neon_widen_s8,
-+         * and also for the trapped-exception-handling bits IxE.
+         gen_helper_neon_widen_s16,
-+         */
+         tcg_gen_ext_i32_i64,
-+        val &= 0xf7c0009f;
+@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
-+    }
-+
+ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
-     /*
+ {
-      * We don't implement trapped exception handling, so the
+-    NeonGenWidenFn *widenfn[] = {
-      * trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
++    static NeonGenWidenFn * const widenfn[] = {
          gen_helper_neon_widen_u8,
          gen_helper_neon_widen_u16,
          tcg_gen_extu_i32_i64,
 --
 2.20.1

-[Qemu-devel] [PULL 10/42] target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
+[PULL 09/23] target/arm: Add missing TCG temp free in do_2shift_env_64()
-If the floating point extension is present, then the SG instruction
+In commit 37bfce81b10450071 we accidentally introduced a leak of a TCG
-must clear the CONTROL_S.SFPA bit. Implement this.
+temporary in do_2shift_env_64(); free it.
 (On a no-FPU system the bit will always be zero, so we don't need
 to make the clearing of the bit conditional on ARM_FEATURE_VFP.)
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-8-peter.maydell@linaro.org
 ---
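The kind of leak fixed here can be illustrated outside TCG. The sketch below is an analogy only: `temp_new`, `temp_free`, `live_temps` and `run_passes` are hypothetical names, and it assumes (as the placement of the added free inside the loop suggests) that one temporary is allocated per pass while a second one is shared across passes.

```c
#include <assert.h>
#include <stdlib.h>

static int live_temps; /* number of temporaries currently allocated */

static long *temp_new(void)
{
    long *t = malloc(sizeof(*t));
    assert(t != NULL);
    live_temps++;
    return t;
}

static void temp_free(long *t)
{
    free(t);
    live_temps--;
}

/* Sums pass * 3 over all passes, using one temporary per pass. */
static long run_passes(int npasses)
{
    long *constimm = temp_new(); /* shared by every pass */
    long sum = 0;

    *constimm = 3;
    for (int pass = 0; pass < npasses; pass++) {
        long *tmp = temp_new();  /* per-pass temporary */
        *tmp = pass * *constimm;
        sum += *tmp;
        temp_free(tmp);          /* omitting this leaks one temp per pass */
    }
    temp_free(constimm);
    return sum;
}
```

Tracking a live count like this is one simple way to catch such a leak in a test: the count should return to zero once the function finishes.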
- target/arm/helper.c | 1 +
+ target/arm/translate-neon.inc.c | 1 +
  1 file changed, 1 insertion(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
-     qemu_log_mask(CPU_LOG_INT, "...really an SG instruction at 0x%08" PRIx32
+         neon_load_reg64(tmp, a->vm + pass);
-                   ", executing it\n", env->regs[15]);
+         fn(tmp, cpu_env, tmp, constimm);
-     env->regs[14] &= ~1;
+         neon_store_reg64(tmp, a->vd + pass);
-+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
++        tcg_temp_free_i64(tmp);
-     switch_v7m_security_state(env, true);
+     }
-     xpsr_write(env, 0, XPSR_IT);
+     tcg_temp_free_i64(constimm);
-     env->regs[15] += 4;
+     return true;
 --
 2.20.1

-[Qemu-devel] [PULL 13/42] target/arm: Handle floating point registers in exception entry
+[PULL 10/23] target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
-Handle floating point registers in exception entry.
+Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
-This corresponds to the FP-specific parts of the pseudocode
+scalar" group to decodetree.  These are 32x32->32 operations where
-functions ActivateException() and PushStack().
+one of the inputs is the scalar, followed by a possible accumulate
+operation of the 32-bit result.
-We defer the code corresponding to UpdateFPCCR() to a later patch.
 The refactoring removes some of the oddities of the old decoder:
  * operands to the operation and accumulation were often
    reversed (taking advantage of the fact that most of these ops
    are commutative); the new code follows the pseudocode order
  * the Q bit in the insn was in a local variable 'u'; in the
    new code it is decoded into a->q
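The operand order described above can be modelled on plain 32-bit lanes. This is a sketch under simplifying assumptions, not the TCG code: `two_scalar`, `mul32`, `add32` and `sub32` are hypothetical names, and only the 32-bit element size is modelled. Each pass computes `op(Vn[pass], scalar)` and then, if an accumulate function is given, `acc(Vd[pass], result)` with the destination as the first operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TwoOpFn(uint32_t, uint32_t);

static uint32_t mul32(uint32_t a, uint32_t b) { return a * b; }
static uint32_t add32(uint32_t a, uint32_t b) { return a + b; }
static uint32_t sub32(uint32_t a, uint32_t b) { return a - b; }

/*
 * Model of a "two registers and a scalar" operation on 32-bit lanes.
 * accfn may be NULL for a plain multiply with no accumulate step.
 */
static void two_scalar(uint32_t *vd, const uint32_t *vn, uint32_t scalar,
                       int passes, TwoOpFn *opfn, TwoOpFn *accfn)
{
    for (int pass = 0; pass < passes; pass++) {
        uint32_t tmp = opfn(vn[pass], scalar);
        if (accfn) {
            /* Destination first: the order matters for subtraction. */
            tmp = accfn(vd[pass], tmp);
        }
        vd[pass] = tmp;
    }
}
```

With the destination always passed first, a multiply-subtract needs an ordinary subtract rather than a reverse-subtract helper, which is why the order only becomes visible for the non-commutative case.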
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-11-peter.maydell@linaro.org
 ---
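The fixed-bit part of the patterns added to neon-dp.decode in this patch can be read as a mask/value pair. The sketch below is an illustration only: it is not how the decodetree script is implemented, it ignores named fields, formats and overlapping groups, and `BitPattern`, `parse_pattern` and `pattern_matches` are hypothetical names. It accepts only '0', '1', '.' and spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t mask;  /* 1 where the pattern fixes a bit */
    uint32_t value; /* required value of the fixed bits */
} BitPattern;

/*
 * Parse a 32-bit pattern written most-significant bit first.
 * Returns false on an unexpected character or a wrong bit count.
 */
static bool parse_pattern(const char *text, BitPattern *out)
{
    uint32_t mask = 0, value = 0;
    int nbits = 0;

    for (const char *p = text; *p; p++) {
        if (*p == ' ') {
            continue;
        }
        if ((*p != '0' && *p != '1' && *p != '.') || nbits == 32) {
            return false;
        }
        mask <<= 1;
        value <<= 1;
        if (*p != '.') {
            mask |= 1;
            value |= (uint32_t)(*p - '0');
        }
        nbits++;
    }
    if (nbits != 32) {
        return false;
    }
    out->mask = mask;
    out->value = value;
    return true;
}

static bool pattern_matches(const BitPattern *pat, uint32_t insn)
{
    return (insn & pat->mask) == pat->value;
}
```

Feeding it the fixed bits of the VMLA_2sc line, `"1111 001 . 1 . .. .... .... 0000 . 1 . 0 ...."`, gives mask `0xfe800f50` and value `0xf2800040`; the VMLS_2sc and VMUL_2sc lines differ only in the four opcode bits at positions 11:8.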
- target/arm/helper.c | 98 +++++++++++++++++++++++++++++++++++++++++++--
+ target/arm/neon-dp.decode       |  15 ++++
- 1 file changed, 95 insertions(+), 3 deletions(-)
+ target/arm/translate-neon.inc.c | 133 ++++++++++++++++++++++++++++++++
+ target/arm/translate.c          |  77 ++----------------
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+ 3 files changed, 154 insertions(+), 71 deletions(-)
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-     switch_v7m_security_state(env, targets_secure);
+     VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
-     write_v7m_control_spsel(env, 0);
-     arm_clear_exclusive(env);
+     VMULL_P_3d   1111 001 0 1 . .. .... .... 1110 . 0 . 0 .... @3diff
-+    /* Clear SFPA and FPCA (has no effect if no FPU) */
++
-+    env->v7m.control[M_REG_S] &=
++    ##################################################################
-+        ~(R_V7M_CONTROL_FPCA_MASK | R_V7M_CONTROL_SFPA_MASK);
++    # 2-regs-plus-scalar grouping:
-     /* Clear IT bits */
++    # 1111 001 Q 1 D sz!=11 Vn:4 Vd:4 opc:4 N 1 M 0 Vm:4
-     env->condexec_bits = 0;
++    ##################################################################
-     env->regs[14] = lr;
++    &2scalar vm vn vd size q
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++
-     uint32_t xpsr = xpsr_read(env);
++    @2scalar     .... ... q:1 . . size:2 .... .... .... . . . . .... \
-     uint32_t frameptr = env->regs[13];
++                 &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
-     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
++
-+    uint32_t framesize;
++    VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
-+    bool nsacr_cp10 = extract32(env->v7m.nsacr, 10, 1);
++
-+
++    VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
-+    if ((env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) &&
++
-+        (env->v7m.secure || nsacr_cp10)) {
++    VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
-+        if (env->v7m.secure &&
+   ]
-+            env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK) {
+ }
-+            framesize = 0xa8;
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a)
                        16, 16, 0, fn_gvec);
      return true;
  }
 +
 +static void gen_neon_dup_low16(TCGv_i32 var)
 +{
 +    TCGv_i32 tmp = tcg_temp_new_i32();
 +    tcg_gen_ext16u_i32(var, var);
 +    tcg_gen_shli_i32(tmp, var, 16);
 +    tcg_gen_or_i32(var, var, tmp);
 +    tcg_temp_free_i32(tmp);
 +}
 +
 +static void gen_neon_dup_high16(TCGv_i32 var)
 +{
 +    TCGv_i32 tmp = tcg_temp_new_i32();
 +    tcg_gen_andi_i32(var, var, 0xffff0000);
 +    tcg_gen_shri_i32(tmp, var, 16);
 +    tcg_gen_or_i32(var, var, tmp);
 +    tcg_temp_free_i32(tmp);
 +}
 +
 +static inline TCGv_i32 neon_get_scalar(int size, int reg)
 +{
 +    TCGv_i32 tmp;
 +    if (size == 1) {
 +        tmp = neon_load_reg(reg & 7, reg >> 4);
 +        if (reg & 8) {
 +            gen_neon_dup_high16(tmp);
 +        } else {
-+            framesize = 0x68;
++            gen_neon_dup_low16(tmp);
 +        }
 +    } else {
-+        framesize = 0x20;
++        tmp = neon_load_reg(reg & 15, reg >> 4);
 +    }
++    return tmp;
-     /* Align stack pointer if the guest wants that */
++}
-     if ((frameptr & 4) &&
++
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++static bool do_2scalar(DisasContext *s, arg_2scalar *a,
-         xpsr |= XPSR_SPREALIGN;
++                       NeonGenTwoOpFn *opfn, NeonGenTwoOpFn *accfn)
-     }
++{
++    /*
--    frameptr -= 0x20;
++     * Two registers and a scalar: perform an operation between
-+    xpsr &= ~XPSR_SFPA;
++     * the input elements and the scalar, and then possibly
-+    if (env->v7m.secure &&
++     * perform an accumulation operation of that result into the
-+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
++     * destination.
-+        xpsr |= XPSR_SFPA;
++     */
-+    }
++    TCGv_i32 scalar;
-+
++    int pass;
-+    frameptr -= framesize;
++
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-     if (arm_feature(env, ARM_FEATURE_V8)) {
++        return false;
-         uint32_t limit = v7m_sp_limit(env);
++    }
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++
-         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-+    if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
++        return false;
-+        /* FPU is active, try to save its registers */
++    }
-+        bool fpccr_s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
++
-+        bool lspact = env->v7m.fpccr[fpccr_s] & R_V7M_FPCCR_LSPACT_MASK;
++    if (!opfn) {
-+
++        /* Bad size (including size == 3, which is a different insn group) */
-+        if (lspact && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
++        return false;
-+            qemu_log_mask(CPU_LOG_INT,
++    }
-+                          "...SecureFault because LSPACT and FPCA both set\n");
++
-+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
++    if (a->q && ((a->vd | a->vn) & 1)) {
-+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
++        return false;
-+        } else if (!env->v7m.secure && !nsacr_cp10) {
++    }
-+            qemu_log_mask(CPU_LOG_INT,
++
-+                          "...Secure UsageFault with CFSR.NOCP because "
++    if (!vfp_access_check(s)) {
-+                          "NSACR.CP10 prevents stacking FP regs\n");
++        return true;
-+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
++    }
-+            env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
++
-+        } else {
++    scalar = neon_get_scalar(a->size, a->vm);
-+            if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
++
-+                /* Lazy stacking disabled, save registers now */
++    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
-+                int i;
++        TCGv_i32 tmp = neon_load_reg(a->vn, pass);
-+                bool cpacr_pass = v7m_cpacr_pass(env, env->v7m.secure,
++        opfn(tmp, tmp, scalar);
-+                                                 arm_current_el(env) != 0);
++        if (accfn) {
-+
++            TCGv_i32 rd = neon_load_reg(a->vd, pass);
-+                if (stacked_ok && !cpacr_pass) {
++            accfn(tmp, rd, tmp);
-+                    /*
++            tcg_temp_free_i32(rd);
 +                     * Take UsageFault if CPACR forbids access. The pseudocode
 +                     * here does a full CheckCPEnabled() but we know the NSACR
 +                     * check can never fail as we have already handled that.
 +                     */
 +                    qemu_log_mask(CPU_LOG_INT,
 +                                  "...UsageFault with CFSR.NOCP because "
 +                                  "CPACR.CP10 prevents stacking FP regs\n");
 +                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
 +                                            env->v7m.secure);
 +                    env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
 +                    stacked_ok = false;
 +                }
 +
 +                for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
 +                    uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +                    uint32_t faddr = frameptr + 0x20 + 4 * i;
 +                    uint32_t slo = extract64(dn, 0, 32);
 +                    uint32_t shi = extract64(dn, 32, 32);
 +
 +                    if (i >= 16) {
 +                        faddr += 8; /* skip the slot for the FPSCR */
 +                    }
 +                    stacked_ok = stacked_ok &&
 +                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
 +                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
 +                }
 +                stacked_ok = stacked_ok &&
 +                    v7m_stack_write(cpu, frameptr + 0x60,
 +                                    vfp_get_fpscr(env), mmu_idx, false);
 +                if (cpacr_pass) {
 +                    for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
 +                        *aa32_vfp_dreg(env, i / 2) = 0;
 +                    }
 +                    vfp_set_fpscr(env, 0);
 +                }
 +            } else {
 +                /* Lazy stacking enabled, save necessary info to stack later */
 +                /* TODO : equivalent of UpdateFPCCR() pseudocode */
 +            }
 +        }
-+    }
++        neon_store_reg(a->vd, pass, tmp);
-+
++    }
-     /*
++    tcg_temp_free_i32(scalar);
-      * If we broke a stack limit then SP was already updated earlier;
++    return true;
-      * otherwise we update SP regardless of whether any of the stack
++}
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
++
++static bool trans_VMUL_2sc(DisasContext *s, arg_2scalar *a)
-     if (arm_feature(env, ARM_FEATURE_V8)) {
++{
-         lr = R_V7M_EXCRET_RES1_MASK |
++    static NeonGenTwoOpFn * const opfn[] = {
--            R_V7M_EXCRET_DCRS_MASK |
++        NULL,
--            R_V7M_EXCRET_FTYPE_MASK;
++        gen_helper_neon_mul_u16,
-+            R_V7M_EXCRET_DCRS_MASK;
++        tcg_gen_mul_i32,
-         /* The S bit indicates whether we should return to Secure
++        NULL,
-          * or NonSecure (ie our current state).
++    };
-          * The ES bit indicates whether we're taking this exception
++
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
++    return do_2scalar(s, a, opfn[a->size], NULL);
-         if (env->v7m.secure) {
++}
-             lr |= R_V7M_EXCRET_S_MASK;
++
-         }
++static bool trans_VMLA_2sc(DisasContext *s, arg_2scalar *a)
-+        if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
++{
-+            lr |= R_V7M_EXCRET_FTYPE_MASK;
++    static NeonGenTwoOpFn * const opfn[] = {
-+        }
++        NULL,
-     } else {
++        gen_helper_neon_mul_u16,
-         lr = R_V7M_EXCRET_RES1_MASK |
++        tcg_gen_mul_i32,
-             R_V7M_EXCRET_S_MASK |
++        NULL,
 +    };
 +    static NeonGenTwoOpFn * const accfn[] = {
 +        NULL,
 +        gen_helper_neon_add_u16,
 +        tcg_gen_add_i32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +}
 +
 +static bool trans_VMLS_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mul_u16,
 +        tcg_gen_mul_i32,
 +        NULL,
 +    };
 +    static NeonGenTwoOpFn * const accfn[] = {
 +        NULL,
 +        gen_helper_neon_sub_u16,
 +        tcg_gen_sub_i32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
  #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
  #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
 -static void gen_neon_dup_low16(TCGv_i32 var)
 -{
 -    TCGv_i32 tmp = tcg_temp_new_i32();
 -    tcg_gen_ext16u_i32(var, var);
 -    tcg_gen_shli_i32(tmp, var, 16);
 -    tcg_gen_or_i32(var, var, tmp);
 -    tcg_temp_free_i32(tmp);
 -}
 -
 -static void gen_neon_dup_high16(TCGv_i32 var)
 -{
 -    TCGv_i32 tmp = tcg_temp_new_i32();
 -    tcg_gen_andi_i32(var, var, 0xffff0000);
 -    tcg_gen_shri_i32(tmp, var, 16);
 -    tcg_gen_or_i32(var, var, tmp);
 -    tcg_temp_free_i32(tmp);
 -}
 -
  static inline bool use_goto_tb(DisasContext *s, target_ulong dest)
  {
  #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
  #define CPU_V001 cpu_V0, cpu_V0, cpu_V1
 -static inline void gen_neon_add(int size, TCGv_i32 t0, TCGv_i32 t1)
 -{
 -    switch (size) {
 -    case 0: gen_helper_neon_add_u8(t0, t0, t1); break;
 -    case 1: gen_helper_neon_add_u16(t0, t0, t1); break;
 -    case 2: tcg_gen_add_i32(t0, t0, t1); break;
 -    default: abort();
 -    }
 -}
 -
 -static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
 -{
 -    switch (size) {
 -    case 0: gen_helper_neon_sub_u8(t0, t1, t0); break;
 -    case 1: gen_helper_neon_sub_u16(t0, t1, t0); break;
 -    case 2: tcg_gen_sub_i32(t0, t1, t0); break;
 -    default: return;
 -    }
 -}
 -
  static TCGv_i32 neon_load_scratch(int scratch)
  {
      TCGv_i32 tmp = tcg_temp_new_i32();
@@ -XXX,XX +XXX,XX @@ static void neon_store_scratch(int scratch, TCGv_i32 var)
      tcg_temp_free_i32(var);
  }
 -static inline TCGv_i32 neon_get_scalar(int size, int reg)
 -{
 -    TCGv_i32 tmp;
 -    if (size == 1) {
 -        tmp = neon_load_reg(reg & 7, reg >> 4);
 -        if (reg & 8) {
 -            gen_neon_dup_high16(tmp);
 -        } else {
 -            gen_neon_dup_low16(tmp);
 -        }
 -    } else {
 -        tmp = neon_load_reg(reg & 15, reg >> 4);
 -    }
 -    return tmp;
 -}
 -
  static int gen_neon_unzip(int rd, int rm, int size, int q)
  {
      TCGv_ptr pd, pm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      return 1;
                  }
                  switch (op) {
 +                case 0: /* Integer VMLA scalar */
 +                case 4: /* Integer VMLS scalar */
 +                case 8: /* Integer VMUL scalar */
 +                    return 1; /* handled by decodetree */
 +
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          return 1;
                      }
                      /* fall through */
 -                case 0: /* Integer VMLA scalar */
 -                case 4: /* Integer VMLS scalar */
 -                case 8: /* Integer VMUL scalar */
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
                      if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              } else {
                                  gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else if (op & 1) {
 +                        } else {
                              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
                              gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
                              tcg_temp_free_ptr(fpstatus);
 -                        } else {
 -                            switch (size) {
 -                            case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
 -                            case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
 -                            case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
 -                            default: abort();
 -                            }
                          }
                          tcg_temp_free_i32(tmp2);
                          if (op < 8) {
                              /* Accumulate.  */
                              tmp2 = neon_load_reg(rd, pass);
                              switch (op) {
 -                            case 0:
 -                                gen_neon_add(size, tmp, tmp2);
 -                                break;
                              case 1:
                              {
                                  TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                  tcg_temp_free_ptr(fpstatus);
                                  break;
                              }
 -                            case 4:
 -                                gen_neon_rsb(size, tmp, tmp2);
 -                                break;
                              case 5:
                              {
                                  TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
 2.20.1

-[Qemu-devel] [PULL 14/42] target/arm: Implement v7m_update_fpccr()
+[PULL 11/23] target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
-Implement the code which updates the FPCCR register on an
+Convert the float versions of VMLA, VMLS and VMUL in the Neon
-exception entry where we are going to use lazy FP stacking.
+2-reg-scalar group to decodetree.
 We have to defer to the NVIC to determine whether the
 various exceptions are currently ready or not.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Message-id: 20190416125744.27770-12-peter.maydell@linaro.org
 ---
- target/arm/cpu.h      | 14 +++++++++
+As noted in the comment on the WRAP_FP_FN macro, we could have
- hw/intc/armv7m_nvic.c | 34 ++++++++++++++++++++++
+had a do_2scalar_fp() function, but for 3 insns it seemed
- target/arm/helper.c   | 67 ++++++++++++++++++++++++++++++++++++++++++-
+simpler to just do the wrapping to get hold of the fpstatus ptr.
- 3 files changed, 114 insertions(+), 1 deletion(-)
+(These are the only fp insns in the group.)
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
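The wrapping approach mentioned in the note above can be sketched in plain C. This is an analogy, not the QEMU macro (whose body is not fully shown here): `FpStatus`, `get_fp_status`, `mul_with_status`, `WRAP_STATUS_FN` and `apply` are all hypothetical names. The idea is that a macro generates a two-operand wrapper which fetches the extra status pointer itself, so a generic two-operand code path can call a function that needs three arguments.

```c
#include <assert.h>

/* Hypothetical floating-point status block. */
typedef struct {
    int ops_performed;
} FpStatus;

/* The two-operand signature the generic code path expects. */
typedef float TwoOpFn(float, float);

static FpStatus fp_status;

static FpStatus *get_fp_status(void)
{
    return &fp_status;
}

/* A three-argument operation that also needs the status pointer. */
static float mul_with_status(float a, float b, FpStatus *st)
{
    st->ops_performed++;
    return a * b;
}

/* Generate a two-operand wrapper that supplies the status pointer. */
#define WRAP_STATUS_FN(WRAPNAME, FUNC)          \
    static float WRAPNAME(float a, float b)     \
    {                                           \
        FpStatus *st = get_fp_status();         \
        return FUNC(a, b, st);                  \
    }

WRAP_STATUS_FN(mul_wrapped, mul_with_status)

/* Generic caller that only knows about two-operand functions. */
static float apply(TwoOpFn *fn, float a, float b)
{
    return fn(a, b);
}
```

The trade-off is the one the note describes: a wrapper per instruction costs a few lines each, which stays cheaper than a second generic helper while only a handful of instructions need it.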
  target/arm/neon-dp.decode       |  3 ++
  target/arm/translate-neon.inc.c | 65 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 37 ++-----------------
  3 files changed, 71 insertions(+), 34 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_acknowledge_irq(void *opaque);
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-  * (Ignoring -1, this is the same as the RETTOBASE value before completion.)
+                  &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
-  */
- int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure);
+     VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
-+/**
++    VMLA_F_2sc   1111 001 . 1 . .. .... .... 0001 . 1 . 0 .... @2scalar
-+ * armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
-+ * @opaque: the NVIC
+     VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
-+ * @irq: the exception number to mark pending
++    VMLS_F_2sc   1111 001 . 1 . .. .... .... 0101 . 1 . 0 .... @2scalar
-+ * @secure: false for non-banked exceptions or for the nonsecure
-+ * version of a banked exception, true for the secure version of a banked
+     VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
-+ * exception.
++    VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
-+ *
+   ]
-+ * Return whether an exception is "ready", i.e. whether the exception is
+ }
-+ * enabled and is configured at a priority which would allow it to
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+ * interrupt the current execution priority. This controls whether the
+index XXXXXXX..XXXXXXX 100644
-+ * RDY bit for it in the FPCCR is set.
+--- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_2sc(DisasContext *s, arg_2scalar *a)
      return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
  }
 +
 +/*
 + * Rather than have a float-specific version of do_2scalar just for
 + * three insns, we wrap a NeonGenTwoSingleOpFn to turn it into
 + * a NeonGenTwoOpFn.
 + */
-+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure);
++#define WRAP_FP_FN(WRAPNAME, FUNC)                              \
- /**
++    static void WRAPNAME(TCGv_i32 rd, TCGv_i32 rn, TCGv_i32 rm) \
-  * armv7m_nvic_raw_execution_priority: return the raw execution priority
++    {                                                           \
-  * @opaque: the NVIC
++        TCGv_ptr fpstatus = get_fpstatus_ptr(1);                \
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
++        FUNC(rd, rn, rm, fpstatus);                             \
-index XXXXXXX..XXXXXXX 100644
++        tcg_temp_free_ptr(fpstatus);                            \
 --- a/hw/intc/armv7m_nvic.c
 +++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
      return ret;
  }
 +bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
 +{
 +    /*
 +     * Return whether an exception is "ready", i.e. it is enabled and is
 +     * configured at a priority which would allow it to interrupt the
 +     * current execution priority.
 +     *
 +     * irq and secure have the same semantics as for armv7m_nvic_set_pending():
 +     * for non-banked exceptions secure is always false; for banked exceptions
 +     * it indicates which of the exceptions is required.
 +     */
 +    NVICState *s = (NVICState *)opaque;
 +    bool banked = exc_is_banked(irq);
 +    VecInfo *vec;
 +    int running = nvic_exec_prio(s);
 +
 +    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
 +    assert(!secure || banked);
 +
 +    /*
 +     * HardFault is an odd special case: we always check against -1,
 +     * even if we're secure and HardFault has priority -3; we never
 +     * need to check for enabled state.
 +     */
 +    if (irq == ARMV7M_EXCP_HARD) {
 +        return running > -1;
 +    }
 +
-+    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
++WRAP_FP_FN(gen_VMUL_F_mul, gen_helper_vfp_muls)
 +WRAP_FP_FN(gen_VMUL_F_add, gen_helper_vfp_adds)
 +WRAP_FP_FN(gen_VMUL_F_sub, gen_helper_vfp_subs)
 +
-+    return vec->enabled &&
++static bool trans_VMUL_F_2sc(DisasContext *s, arg_2scalar *a)
-+        exc_group_prio(s, vec->prio, secure) < running;
++{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        NULL, /* TODO: fp16 support */
 +        gen_VMUL_F_mul,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], NULL);
 +}
 +
- /* callback when external interrupt line is changed */
++static bool trans_VMLA_F_2sc(DisasContext *s, arg_2scalar *a)
  static void set_irq_level(void *opaque, int n, int level)
  {
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      env->thumb = addr & 1;
  }
 +static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
 +                             bool apply_splim)
 +{
-+    /*
++    static NeonGenTwoOpFn * const opfn[] = {
-+     * Like the pseudocode UpdateFPCCR: save state in FPCAR and FPCCR
++        NULL,
-+     * that we will need later in order to do lazy FP reg stacking.
++        NULL, /* TODO: fp16 support */
-+     */
++        gen_VMUL_F_mul,
-+    bool is_secure = env->v7m.secure;
++        NULL,
-+    void *nvic = env->nvic;
++    };
-+    /*
++    static NeonGenTwoOpFn * const accfn[] = {
-+     * Some bits are unbanked and live always in fpccr[M_REG_S]; some bits
++        NULL,
-+     * are banked and we want to update the bit in the bank for the
++        NULL, /* TODO: fp16 support */
-+     * current security state; and in one case we want to specifically
++        gen_VMUL_F_add,
-+     * update the NS banked version of a bit even if we are secure.
++        NULL,
-+     */
++    };
 +    uint32_t *fpccr_s = &env->v7m.fpccr[M_REG_S];
 +    uint32_t *fpccr_ns = &env->v7m.fpccr[M_REG_NS];
 +    uint32_t *fpccr = &env->v7m.fpccr[is_secure];
 +    bool hfrdy, bfrdy, mmrdy, ns_ufrdy, s_ufrdy, sfrdy, monrdy;
 +
-+    env->v7m.fpcar[is_secure] = frameptr & ~0x7;
++    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +
 +    if (apply_splim && arm_feature(env, ARM_FEATURE_V8)) {
 +        bool splimviol;
 +        uint32_t splim = v7m_sp_limit(env);
 +        bool ign = armv7m_nvic_neg_prio_requested(nvic, is_secure) &&
 +            (env->v7m.ccr[is_secure] & R_V7M_CCR_STKOFHFNMIGN_MASK);
 +
 +        splimviol = !ign && frameptr < splim;
 +        *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, SPLIMVIOL, splimviol);
 +    }
 +
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, LSPACT, 1);
 +
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, S, is_secure);
 +
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, USER, arm_current_el(env) == 0);
 +
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, THREAD,
 +                        !arm_v7m_is_handler_mode(env));
 +
 +    hfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_HARD, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
 +
 +    bfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_BUS, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
 +
 +    mmrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_MEM, is_secure);
 +    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, MMRDY, mmrdy);
 +
 +    ns_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, false);
 +    *fpccr_ns = FIELD_DP32(*fpccr_ns, V7M_FPCCR, UFRDY, ns_ufrdy);
 +
 +    monrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_DEBUG, false);
 +    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, MONRDY, monrdy);
 +
 +    if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
 +        s_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, true);
 +        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, UFRDY, s_ufrdy);
 +
 +        sfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_SECURE, false);
 +        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, SFRDY, sfrdy);
 +    }
 +}
 +
- static bool v7m_push_stack(ARMCPU *cpu)
++static bool trans_VMLS_F_2sc(DisasContext *s, arg_2scalar *a)
- {
++{
-     /* Do the "set up stack frame" part of exception entry,
++    static NeonGenTwoOpFn * const opfn[] = {
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++        NULL,
-                 }
++        NULL, /* TODO: fp16 support */
-             } else {
++        gen_VMUL_F_mul,
-                 /* Lazy stacking enabled, save necessary info to stack later */
++        NULL,
--                /* TODO : equivalent of UpdateFPCCR() pseudocode */
++    };
-+                v7m_update_fpccr(env, frameptr + 0x20, true);
++    static NeonGenTwoOpFn * const accfn[] = {
-             }
++        NULL,
-         }
++        NULL, /* TODO: fp16 support */
-     }
++        gen_VMUL_F_sub,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 0: /* Integer VMLA scalar */
                  case 4: /* Integer VMLS scalar */
                  case 8: /* Integer VMUL scalar */
 -                    return 1; /* handled by decodetree */
 -
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
 -                    if (size == 1) {
 -                        return 1;
 -                    }
 -                    /* fall through */
 +                    return 1; /* handled by decodetree */
 +
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
                      if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              } else {
                                  gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else if (op == 13) {
 +                        } else {
                              if (size == 1) {
                                  gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
                              } else {
                                  gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
                          }
                          tcg_temp_free_i32(tmp2);
 -                        if (op < 8) {
 -                            /* Accumulate.  */
 -                            tmp2 = neon_load_reg(rd, pass);
 -                            switch (op) {
 -                            case 1:
 -                            {
 -                                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
 -                                tcg_temp_free_ptr(fpstatus);
 -                                break;
 -                            }
 -                            case 5:
 -                            {
 -                                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                                gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
 -                                tcg_temp_free_ptr(fpstatus);
 -                                break;
 -                            }
 -                            default:
 -                                abort();
 -                            }
 -                            tcg_temp_free_i32(tmp2);
 -                        }
                          neon_store_reg(rd, pass, tmp);
                      }
                      break;
 --
 .20.1
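
The WRAP_FP_FN approach in the patch above (wrapping a four-argument helper so it fits a three-argument function-pointer slot) can be sketched outside QEMU as follows. This is only a minimal illustration under stated assumptions: FakeStatus, fake_status_get(), fake_mul(), fake_add(), WRAP_STATUS_FN and op_then_accumulate() are hypothetical stand-ins invented for this sketch, not QEMU's TCG types or helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fpstatus pointer; not a QEMU type. */
typedef struct { int exception_flags; } FakeStatus;

/* Three-operand signature expected by the generic driver. */
typedef void TwoOpFn(float *rd, const float *rn, const float *rm);

static int live_status; /* status objects obtained but not yet released */

static FakeStatus *fake_status_get(void)
{
    FakeStatus *st = calloc(1, sizeof(*st));
    if (!st) {
        abort();
    }
    live_status++;
    return st;
}

static void fake_status_free(FakeStatus *st)
{
    free(st);
    live_status--;
}

/* Four-operand operations that want the status pointer as an extra argument. */
static void fake_mul(float *rd, const float *rn, const float *rm, FakeStatus *st)
{
    (void)st;
    *rd = *rn * *rm;
}

static void fake_add(float *rd, const float *rn, const float *rm, FakeStatus *st)
{
    (void)st;
    *rd = *rn + *rm;
}

/* Define WRAPNAME with the three-operand signature; it fetches the status
 * object, calls FUNC with it, and releases it again. */
#define WRAP_STATUS_FN(WRAPNAME, FUNC)                                  \
    static void WRAPNAME(float *rd, const float *rn, const float *rm)  \
    {                                                                   \
        FakeStatus *st = fake_status_get();                             \
        FUNC(rd, rn, rm, st);                                           \
        fake_status_free(st);                                           \
    }

WRAP_STATUS_FN(wrapped_mul, fake_mul)
WRAP_STATUS_FN(wrapped_add, fake_add)

/* Generic driver: apply opfn to (rn, scalar), then optionally fold the
 * product into rd with accfn. It never sees the status pointer. */
static float op_then_accumulate(TwoOpFn *opfn, TwoOpFn *accfn,
                                float rd, float rn, float scalar)
{
    float tmp;

    assert(opfn);
    opfn(&tmp, &rn, &scalar);
    if (accfn) {
        float acc;
        accfn(&acc, &rd, &tmp);
        tmp = acc;
    }
    return tmp;
}
```

With these definitions, op_then_accumulate(wrapped_mul, wrapped_add, rd, rn, scalar) computes rd + rn * scalar, and passing NULL as accfn gives the plain multiply. The trade-off is the one the commit message names: a wrapper per helper is less code than a float-specific driver when only a few instructions need it.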

-[Qemu-devel] [PULL 08/42] target/arm: Honour M-profile FP enable bits
+[PULL 12/23] target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
-Like AArch64, M-profile floating point has no FPEXC enable
+Convert the VQDMULH and VQRDMULH insns in the 2-reg-scalar group
-bit to gate floating point; so always set the VFPEN TB flag.
+to decodetree.
 M-profile also has CPACR and NSACR similar to A-profile;
 they behave slightly differently:
  * the CPACR is banked between Secure and Non-Secure
  * if the NSACR forces a trap then this is taken to
    the Secure state, not the Non-Secure state
 Honour the CPACR and NSACR settings. The NSACR handling
 requires us to borrow the exception.target_el field
 (usually meaningless for M profile) to distinguish the
 NOCP UsageFault taken to Secure state from the more
 usual fault taken to the current security state.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-6-peter.maydell@linaro.org
 ---
- target/arm/helper.c    | 55 +++++++++++++++++++++++++++++++++++++++---
+ target/arm/neon-dp.decode       |  3 +++
- target/arm/translate.c | 10 ++++++--
+ target/arm/translate-neon.inc.c | 29 +++++++++++++++++++++++
-2 files changed, 60 insertions(+), 5 deletions(-)
+ target/arm/translate.c          | 42 ++-------------------------------
 3 files changed, 34 insertions(+), 40 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.c
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-     return target_el;
      VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
      VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
 +
 +    VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
 +    VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
    ]
  }
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-+/*
+index XXXXXXX..XXXXXXX 100644
-+ * Return true if the v7M CPACR permits access to the FPU for the specified
+--- a/target/arm/translate-neon.inc.c
-+ * security state and privilege level.
++++ b/target/arm/translate-neon.inc.c
-+ */
+@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_F_2sc(DisasContext *s, arg_2scalar *a)
-+static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
      return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
  }
 +
 +WRAP_ENV_FN(gen_VQDMULH_16, gen_helper_neon_qdmulh_s16)
 +WRAP_ENV_FN(gen_VQDMULH_32, gen_helper_neon_qdmulh_s32)
 +WRAP_ENV_FN(gen_VQRDMULH_16, gen_helper_neon_qrdmulh_s16)
 +WRAP_ENV_FN(gen_VQRDMULH_32, gen_helper_neon_qrdmulh_s32)
 +
 +static bool trans_VQDMULH_2sc(DisasContext *s, arg_2scalar *a)
 +{
-+    switch (extract32(env->v7m.cpacr[is_secure], 20, 2)) {
++    static NeonGenTwoOpFn * const opfn[] = {
-+    case 0:
++        NULL,
-+    case 2: /* UNPREDICTABLE: we treat like 0 */
++        gen_VQDMULH_16,
-+        return false;
++        gen_VQDMULH_32,
-+    case 1:
++        NULL,
-+        return is_priv;
++    };
-+    case 3:
++
-+        return true;
++    return do_2scalar(s, a, opfn[a->size], NULL);
 +    default:
 +        g_assert_not_reached();
 +    }
 +}
 +
- static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
++static bool trans_VQRDMULH_2sc(DisasContext *s, arg_2scalar *a)
-                             ARMMMUIdx mmu_idx, bool ignfault)
++{
- {
++    static NeonGenTwoOpFn * const opfn[] = {
-@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
++        NULL,
-         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNDEFINSTR_MASK;
++        gen_VQRDMULH_16,
-         break;
++        gen_VQRDMULH_32,
-     case EXCP_NOCP:
++        NULL,
--        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
++    };
 -        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
 +    {
 +        /*
 +         * NOCP might be directed to something other than the current
 +         * security state if this fault is because of NSACR; we indicate
 +         * the target security state using exception.target_el.
 +         */
 +        int target_secstate;
 +
-+        if (env->exception.target_el == 3) {
++    return do_2scalar(s, a, opfn[a->size], NULL);
-+            target_secstate = M_REG_S;
++}
 +        } else {
 +            target_secstate = env->v7m.secure;
 +        }
 +        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, target_secstate);
 +        env->v7m.cfsr[target_secstate] |= R_V7M_CFSR_NOCP_MASK;
          break;
 +    }
      case EXCP_INVSTATE:
          armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
          env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_INVSTATE_MASK;
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
          return 0;
      }
 +    if (arm_feature(env, ARM_FEATURE_M)) {
 +        /* CPACR can cause a NOCP UsageFault taken to current security state */
 +        if (!v7m_cpacr_pass(env, env->v7m.secure, cur_el != 0)) {
 +            return 1;
 +        }
 +
 +        if (arm_feature(env, ARM_FEATURE_M_SECURITY) && !env->v7m.secure) {
 +            if (!extract32(env->v7m.nsacr, 10, 1)) {
 +                /* FP insns cause a NOCP UsageFault taken to Secure */
 +                return 3;
 +            }
 +        }
 +
 +        return 0;
 +    }
 +
      /* The CPACR controls traps to EL1, or PL1 if we're 32 bit:
       * 0, 2 : trap EL0 and EL1/PL1 accesses
       * 1    : trap only EL0 accesses
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
          flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
          flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
          if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
 -            || arm_el_is_aa64(env, 1)) {
 +            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
              flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
          }
          flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
-      * for attempts to execute invalid vfp/neon encodings with FP disabled.
-      */
+ #define CPU_V001 cpu_V0, cpu_V0, cpu_V1
-     if (s->fp_excp_el) {
--        gen_exception_insn(s, 4, EXCP_UDEF,
+-static TCGv_i32 neon_load_scratch(int scratch)
--                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+-{
-+        if (arm_dc_feature(s, ARM_FEATURE_M)) {
+-    TCGv_i32 tmp = tcg_temp_new_i32();
-+            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+-    tcg_gen_ld_i32(tmp, cpu_env, offsetof(CPUARMState, vfp.scratch[scratch]));
-+                               s->fp_excp_el);
+-    return tmp;
-+        } else {
+-}
-+            gen_exception_insn(s, 4, EXCP_UDEF,
+-
-+                               syn_fp_access_trap(1, 0xe, false),
+-static void neon_store_scratch(int scratch, TCGv_i32 var)
-+                               s->fp_excp_el);
+-{
-+        }
+-    tcg_gen_st_i32(var, cpu_env, offsetof(CPUARMState, vfp.scratch[scratch]));
-         return 0;
+-    tcg_temp_free_i32(var);
-     }
+-}
+-
  static int gen_neon_unzip(int rd, int rm, int size, int q)
  {
      TCGv_ptr pd, pm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
 -                    return 1; /* handled by decodetree */
 -
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
 -                    if (u && ((rd | rn) & 1)) {
 -                        return 1;
 -                    }
 -                    tmp = neon_get_scalar(size, rm);
 -                    neon_store_scratch(0, tmp);
 -                    for (pass = 0; pass < (u ? 4 : 2); pass++) {
 -                        tmp = neon_load_scratch(0);
 -                        tmp2 = neon_load_reg(rn, pass);
 -                        if (op == 12) {
 -                            if (size == 1) {
 -                                gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                            } else {
 -                                gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                            }
 -                        } else {
 -                            if (size == 1) {
 -                                gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                            } else {
 -                                gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                            }
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                        neon_store_reg(rd, pass, tmp);
 -                    }
 -                    break;
 +                    return 1; /* handled by decodetree */
 +
                  case 3: /* VQDMLAL scalar */
                  case 7: /* VQDMLSL scalar */
                  case 11: /* VQDMULL scalar */
 --
 .20.1
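
The size-indexed function tables used by trans_VQDMULH_2sc and trans_VQRDMULH_2sc above (NULL entries for element sizes that have no valid encoding) can be illustrated with a standalone sketch. Everything named here (ElemOpFn, floor_shr, qdmulh16, qdmulh32, dispatch_by_size) is hypothetical and written for this example. The arithmetic is a plain-C model of a saturating doubling multiply returning the high half, not the QEMU helper code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t ElemOpFn(int32_t a, int32_t b);

/* floor(v / 2^n), written so that no negative value is right-shifted.
 * Callers must not pass INT64_MIN. */
static int64_t floor_shr(int64_t v, unsigned n)
{
    int64_t round_up = (INT64_C(1) << n) - 1;

    return v >= 0 ? v >> n : -((-v + round_up) >> n);
}

/* 16-bit elements: inputs are assumed to lie in the int16_t range. */
static int32_t qdmulh16(int32_t a, int32_t b)
{
    if (a == INT16_MIN && b == INT16_MIN) {
        return INT16_MAX; /* the only input pair that saturates */
    }
    return (int32_t)floor_shr(2 * (int64_t)a * b, 16);
}

/* 32-bit elements. */
static int32_t qdmulh32(int32_t a, int32_t b)
{
    if (a == INT32_MIN && b == INT32_MIN) {
        return INT32_MAX; /* doubling the product would overflow */
    }
    return (int32_t)floor_shr(2 * (int64_t)a * b, 32);
}

/* Index the table by the 2-bit size field. A NULL entry means "no such
 * instruction", reported by returning false. */
static bool dispatch_by_size(int size, int32_t a, int32_t b, int32_t *out)
{
    static ElemOpFn * const opfn[] = {
        NULL,      /* size 0: no 8-bit form */
        qdmulh16,  /* size 1 */
        qdmulh32,  /* size 2 */
        NULL,      /* size 3: not part of this group */
    };

    assert(size >= 0 && size < 4);
    if (!opfn[size]) {
        return false;
    }
    *out = opfn[size](a, b);
    return true;
}
```

Keeping the "bad size" decision in the table means each trans function only lists what it supports, and the shared driver makes the single NULL check.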

-[Qemu-devel] [PULL 24/42] target/arm: New function armv7m_nvic_set_pending_lazyfp()
+[PULL 13/23] target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
-In the v7M architecture, if an exception is generated in the process
+Convert the VQRDMLAH and VQRDMLSH insns in the 2-reg-scalar
-of doing the lazy stacking of FP registers, the handling of
+group to decodetree.
 possible escalation to HardFault is treated differently to the normal
 approach: it works based on the saved information about exception
 readiness that was stored in the FPCCR when the stack frame was
 created. Provide a new function armv7m_nvic_set_pending_lazyfp()
 which pends exceptions during lazy stacking, and implements
 this logic.
 This corresponds to the pseudocode TakePreserveFPException().
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-22-peter.maydell@linaro.org
 ---
- target/arm/cpu.h      | 12 ++++++
+ target/arm/neon-dp.decode       |  3 ++
- hw/intc/armv7m_nvic.c | 96 +++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-neon.inc.c | 74 +++++++++++++++++++++++++++++++++
-2 files changed, 108 insertions(+)
+ target/arm/translate.c          | 38 +----------------
 3 files changed, 79 insertions(+), 36 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending(void *opaque, int irq, bool secure);
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-  * a different exception).
-  */
+     VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
- void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure);
+     VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
-+/**
++
-+ * armv7m_nvic_set_pending_lazyfp: mark this lazy FP exception as pending
++    VQRDMLAH_2sc 1111 001 . 1 . .. .... .... 1110 . 1 . 0 .... @2scalar
-+ * @opaque: the NVIC
++    VQRDMLSH_2sc 1111 001 . 1 . .. .... .... 1111 . 1 . 0 .... @2scalar
-+ * @irq: the exception number to mark pending
+   ]
-+ * @secure: false for non-banked exceptions or for the nonsecure
+ }
-+ * version of a banked exception, true for the secure version of a banked
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 + * exception.
 + *
 + * Similar to armv7m_nvic_set_pending(), but specifically for exceptions
 + * generated in the course of lazy stacking of FP registers.
 + */
 +void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure);
  /**
   * armv7m_nvic_get_pending_irq_info: return highest priority pending
   *    exception, and whether it targets Secure state
 diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/hw/intc/armv7m_nvic.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VQRDMULH_2sc(DisasContext *s, arg_2scalar *a)
-     do_armv7m_nvic_set_pending(opaque, irq, secure, true);
      return do_2scalar(s, a, opfn[a->size], NULL);
  }
++
-+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure)
++static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a,
 +                            NeonGenThreeOpEnvFn *opfn)
 +{
 +    /*
-+     * Pend an exception during lazy FP stacking. This differs
++     * VQRDMLAH/VQRDMLSH: this is like do_2scalar, but the opfn
-+     * from the usual exception pending because the logic for
++     * performs a kind of fused op-then-accumulate using a helper
-+     * whether we should escalate depends on the saved context
++     * function that takes all of rd, rn and the scalar at once.
 +     * in the FPCCR register, not on the current state of the CPU/NVIC.
 +     */
-+    NVICState *s = (NVICState *)opaque;
++    TCGv_i32 scalar;
-+    bool banked = exc_is_banked(irq);
++    int pass;
 +    VecInfo *vec;
 +    bool targets_secure;
 +    bool escalate = false;
 +    /*
 +     * We will only look at bits in fpccr if this is a banked exception
 +     * (in which case 'secure' tells us whether it is the S or NS version).
 +     * All the bits for the non-banked exceptions are in fpccr_s.
 +     */
 +    uint32_t fpccr_s = s->cpu->env.v7m.fpccr[M_REG_S];
 +    uint32_t fpccr = s->cpu->env.v7m.fpccr[secure];
 +
-+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    assert(!secure || banked);
++        return false;
 +
 +    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
 +
 +    targets_secure = banked ? secure : exc_targets_secure(s, irq);
 +
 +    switch (irq) {
 +    case ARMV7M_EXCP_DEBUG:
 +        if (!(fpccr_s & R_V7M_FPCCR_MONRDY_MASK)) {
 +            /* Ignore DebugMonitor exception */
 +            return;
 +        }
 +        break;
 +    case ARMV7M_EXCP_MEM:
 +        escalate = !(fpccr & R_V7M_FPCCR_MMRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_USAGE:
 +        escalate = !(fpccr & R_V7M_FPCCR_UFRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_BUS:
 +        escalate = !(fpccr_s & R_V7M_FPCCR_BFRDY_MASK);
 +        break;
 +    case ARMV7M_EXCP_SECURE:
 +        escalate = !(fpccr_s & R_V7M_FPCCR_SFRDY_MASK);
 +        break;
 +    default:
 +        g_assert_not_reached();
 +    }
 +
-+    if (escalate) {
++    if (!dc_isar_feature(aa32_rdm, s)) {
-+        /*
++        return false;
 +         * Escalate to HardFault: faults that initially targeted Secure
 +         * continue to do so, even if HF normally targets NonSecure.
 +         */
 +        irq = ARMV7M_EXCP_HARD;
 +        if (arm_feature(&s->cpu->env, ARM_FEATURE_M_SECURITY) &&
 +            (targets_secure ||
 +             !(s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK))) {
 +            vec = &s->sec_vectors[irq];
 +        } else {
 +            vec = &s->vectors[irq];
 +        }
 +    }
 +
-+    if (!vec->enabled ||
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+        nvic_exec_prio(s) <= exc_group_prio(s, vec->prio, secure)) {
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        if (!(fpccr_s & R_V7M_FPCCR_HFRDY_MASK)) {
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-+            /*
++        return false;
 +             * We want to escalate to HardFault but the context the
 +             * FP state belongs to prevents the exception pre-empting.
 +             */
 +            cpu_abort(&s->cpu->parent_obj,
 +                      "Lockup: can't escalate to HardFault during "
 +                      "lazy FP register stacking\n");
 +        }
 +    }
 +
-+    if (escalate) {
++    if (!opfn) {
-+        s->cpu->env.v7m.hfsr |= R_V7M_HFSR_FORCED_MASK;
++        /* Bad size (including size == 3, which is a different insn group) */
 +        return false;
 +    }
-+    if (!vec->pending) {
++
-+        vec->pending = 1;
++    if (a->q && ((a->vd | a->vn) & 1)) {
-+        /*
++        return false;
 +         * We do not call nvic_irq_update(), because we know our caller
 +         * is going to handle causing us to take the exception by
 +         * raising EXCP_LAZYFP, so raising the IRQ line would be
 +         * pointless extra work. We just need to recompute the
 +         * priorities so that armv7m_nvic_can_take_pending_exception()
 +         * returns the right answer.
 +         */
 +        nvic_recompute_state(s);
 +    }
++
++    if (!vfp_access_check(s)) {
++        return true;
++    }
++
++    scalar = neon_get_scalar(a->size, a->vm);
++
++    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
++        TCGv_i32 rn = neon_load_reg(a->vn, pass);
++        TCGv_i32 rd = neon_load_reg(a->vd, pass);
++        opfn(rd, cpu_env, rn, scalar, rd);
++        tcg_temp_free_i32(rn);
++        neon_store_reg(a->vd, pass, rd);
++    }
++    tcg_temp_free_i32(scalar);
++
++    return true;
 +}
 +
- /* Make pending IRQ active.  */
++static bool trans_VQRDMLAH_2sc(DisasContext *s, arg_2scalar *a)
- void armv7m_nvic_acknowledge_irq(void *opaque)
++{
- {
++    static NeonGenThreeOpEnvFn *opfn[] = {
 +        NULL,
 +        gen_helper_neon_qrdmlah_s16,
 +        gen_helper_neon_qrdmlah_s32,
 +        NULL,
 +    };
 +    return do_vqrdmlah_2sc(s, a, opfn[a->size]);
 +}
 +
 +static bool trans_VQRDMLSH_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenThreeOpEnvFn *opfn[] = {
 +        NULL,
 +        gen_helper_neon_qrdmlsh_s16,
 +        gen_helper_neon_qrdmlsh_s32,
 +        NULL,
 +    };
 +    return do_vqrdmlah_2sc(s, a, opfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 9: /* Floating point VMUL scalar */
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
 +                case 14: /* VQRDMLAH scalar */
 +                case 15: /* VQRDMLSH scalar */
                      return 1; /* handled by decodetree */
                  case 3: /* VQDMLAL scalar */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          neon_store_reg64(cpu_V0, rd + pass);
                      }
                      break;
 -                case 14: /* VQRDMLAH scalar */
 -                case 15: /* VQRDMLSH scalar */
 -                    {
 -                        NeonGenThreeOpEnvFn *fn;
 -
 -                        if (!dc_isar_feature(aa32_rdm, s)) {
 -                            return 1;
 -                        }
 -                        if (u && ((rd | rn) & 1)) {
 -                            return 1;
 -                        }
 -                        if (op == 14) {
 -                            if (size == 1) {
 -                                fn = gen_helper_neon_qrdmlah_s16;
 -                            } else {
 -                                fn = gen_helper_neon_qrdmlah_s32;
 -                            }
 -                        } else {
 -                            if (size == 1) {
 -                                fn = gen_helper_neon_qrdmlsh_s16;
 -                            } else {
 -                                fn = gen_helper_neon_qrdmlsh_s32;
 -                            }
 -                        }
 -
 -                        tmp2 = neon_get_scalar(size, rm);
 -                        for (pass = 0; pass < (u ? 4 : 2); pass++) {
 -                            tmp = neon_load_reg(rn, pass);
 -                            tmp3 = neon_load_reg(rd, pass);
 -                            fn(tmp, cpu_env, tmp, tmp2, tmp3);
 -                            tcg_temp_free_i32(tmp3);
 -                            neon_store_reg(rd, pass, tmp);
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                    }
 -                    break;
                  default:
                      g_assert_not_reached();
                  }
 --
 .20.1

-[Qemu-devel] [PULL 23/42] target/arm: New helper function arm_v7m_mmu_idx_all()
+[PULL 14/23] target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
-Add a new helper function which returns the MMU index to use
+Convert the Neon 2-reg-scalar long multiplies to decodetree.
-for v7M, where the caller specifies all of the security
+These are the last instructions in the group.
 state, privilege level and whether the execution priority
 is negative, and reimplement the existing
 arm_v7m_mmu_idx_for_secstate_and_priv() in terms of it.
 We are going to need this for the lazy-FP-stacking code.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-21-peter.maydell@linaro.org
 ---
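Note for reviewers: a minimal C sketch of what a two-registers-and-a-scalar
"long" multiply-accumulate computes, assuming 16-bit signed elements (the
VMLAL.S16 case). The function name vmlal_s16_scalar and the array-based
register model are illustrative assumptions, not QEMU APIs. As the comment in
do_2scalar_long() describes, every double-width product is formed before the
destination is written, so an overlapping destination cannot corrupt the inputs.

```c
#include <stdint.h>

/*
 * Hypothetical model, not QEMU code: dn holds the four 16-bit input
 * elements of a D register, qd the four 32-bit elements of the Q-sized
 * destination, and scalar the single element selected from Vm.
 */
static void vmlal_s16_scalar(int32_t qd[4], const int16_t dn[4],
                             int16_t scalar)
{
    int32_t prod[4];
    int i;

    /* Step 1: widening multiply of every input element by the scalar. */
    for (i = 0; i < 4; i++) {
        prod[i] = (int32_t)dn[i] * (int32_t)scalar;
    }

    /*
     * Step 2: accumulate into the destination only after all inputs
     * have been consumed. The addition wraps modulo 2^32, so it is
     * done on unsigned values to avoid signed overflow in C.
     */
    for (i = 0; i < 4; i++) {
        qd[i] = (int32_t)((uint32_t)qd[i] + (uint32_t)prod[i]);
    }
}
```

Dropping step 2 and storing prod[] directly gives the plain VMULL form, which
is how the patch treats a NULL accfn.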
- target/arm/cpu.h    |  7 +++++++
+ target/arm/neon-dp.decode       |  18 ++++
- target/arm/helper.c | 14 +++++++++++---
+ target/arm/translate-neon.inc.c | 163 ++++++++++++++++++++++++++++
-files changed, 18 insertions(+), 3 deletions(-)
+ target/arm/translate.c          | 182 ++------------------------------
 files changed, 187 insertions(+), 176 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
      @2scalar     .... ... q:1 . . size:2 .... .... .... . . . . .... \
                   &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +    # For the 'long' ops the Q bit is part of insn decode
 +    @2scalar_q0  .... ... . . . size:2 .... .... .... . . . . .... \
 +                 &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
      VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
      VMLA_F_2sc   1111 001 . 1 . .. .... .... 0001 . 1 . 0 .... @2scalar
 +    VMLAL_S_2sc  1111 001 0 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
 +    VMLAL_U_2sc  1111 001 1 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
 +
 +    VQDMLAL_2sc  1111 001 0 1 . .. .... .... 0011 . 1 . 0 .... @2scalar_q0
 +
      VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
      VMLS_F_2sc   1111 001 . 1 . .. .... .... 0101 . 1 . 0 .... @2scalar
 +    VMLSL_S_2sc  1111 001 0 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
 +    VMLSL_U_2sc  1111 001 1 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
 +
 +    VQDMLSL_2sc  1111 001 0 1 . .. .... .... 0111 . 1 . 0 .... @2scalar_q0
 +
      VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
      VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
 +    VMULL_S_2sc  1111 001 0 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
 +    VMULL_U_2sc  1111 001 1 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
 +
 +    VQDMULL_2sc  1111 001 0 1 . .. .... .... 1011 . 1 . 0 .... @2scalar_q0
 +
      VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
      VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VQRDMLSH_2sc(DisasContext *s, arg_2scalar *a)
      };
      return do_vqrdmlah_2sc(s, a, opfn[a->size]);
  }
 +
 +static bool do_2scalar_long(DisasContext *s, arg_2scalar *a,
 +                            NeonGenTwoOpWidenFn *opfn,
 +                            NeonGenTwo64OpFn *accfn)
 +{
 +    /*
 +     * Two registers and a scalar, long operations: perform an
 +     * operation on the input elements and the scalar which produces
 +     * a double-width result, and then possibly perform an accumulation
 +     * operation of that result into the destination.
 +     */
 +    TCGv_i32 scalar, rn;
 +    TCGv_i64 rn0_64, rn1_64;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!opfn) {
 +        /* Bad size (including size == 3, which is a different insn group) */
 +        return false;
 +    }
 +
 +    if (a->vd & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    scalar = neon_get_scalar(a->size, a->vm);
 +
 +    /* Load all inputs before writing any outputs, in case of overlap */
 +    rn = neon_load_reg(a->vn, 0);
 +    rn0_64 = tcg_temp_new_i64();
 +    opfn(rn0_64, rn, scalar);
 +    tcg_temp_free_i32(rn);
 +
 +    rn = neon_load_reg(a->vn, 1);
 +    rn1_64 = tcg_temp_new_i64();
 +    opfn(rn1_64, rn, scalar);
 +    tcg_temp_free_i32(rn);
 +    tcg_temp_free_i32(scalar);
 +
 +    if (accfn) {
 +        TCGv_i64 t64 = tcg_temp_new_i64();
 +        neon_load_reg64(t64, a->vd);
 +        accfn(t64, t64, rn0_64);
 +        neon_store_reg64(t64, a->vd);
 +        neon_load_reg64(t64, a->vd + 1);
 +        accfn(t64, t64, rn1_64);
 +        neon_store_reg64(t64, a->vd + 1);
 +        tcg_temp_free_i64(t64);
 +    } else {
 +        neon_store_reg64(rn0_64, a->vd);
 +        neon_store_reg64(rn1_64, a->vd + 1);
 +    }
 +    tcg_temp_free_i64(rn0_64);
 +    tcg_temp_free_i64(rn1_64);
 +    return true;
 +}
 +
 +static bool trans_VMULL_S_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mull_s16,
 +        gen_mull_s32,
 +        NULL,
 +    };
 +
 +    return do_2scalar_long(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VMULL_U_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mull_u16,
 +        gen_mull_u32,
 +        NULL,
 +    };
 +
 +    return do_2scalar_long(s, a, opfn[a->size], NULL);
 +}
 +
 +#define DO_VMLAL_2SC(INSN, MULL, ACC)                                   \
 +    static bool trans_##INSN##_2sc(DisasContext *s, arg_2scalar *a)     \
 +    {                                                                   \
 +        static NeonGenTwoOpWidenFn * const opfn[] = {                   \
 +            NULL,                                                       \
 +            gen_helper_neon_##MULL##16,                                 \
 +            gen_##MULL##32,                                             \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenTwo64OpFn * const accfn[] = {                     \
 +            NULL,                                                       \
 +            gen_helper_neon_##ACC##l_u32,                               \
 +            tcg_gen_##ACC##_i64,                                        \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);    \
 +    }
 +
 +DO_VMLAL_2SC(VMLAL_S, mull_s, add)
 +DO_VMLAL_2SC(VMLAL_U, mull_u, add)
 +DO_VMLAL_2SC(VMLSL_S, mull_s, sub)
 +DO_VMLAL_2SC(VMLSL_U, mull_u, sub)
 +
 +static bool trans_VQDMULL_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +
 +    return do_2scalar_long(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VQDMLAL_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const accfn[] = {
 +        NULL,
 +        gen_VQDMLAL_acc_16,
 +        gen_VQDMLAL_acc_32,
 +        NULL,
 +    };
 +
 +    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
 +}
 +
 +static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const accfn[] = {
 +        NULL,
 +        gen_VQDMLSL_acc_16,
 +        gen_VQDMLSL_acc_32,
 +        NULL,
 +    };
 +
 +    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
      tcg_gen_ext16s_i32(dest, var);
  }
 -/* 32x32->64 multiply.  Marks inputs as dead.  */
 -static TCGv_i64 gen_mulu_i64_i32(TCGv_i32 a, TCGv_i32 b)
 -{
 -    TCGv_i32 lo = tcg_temp_new_i32();
 -    TCGv_i32 hi = tcg_temp_new_i32();
 -    TCGv_i64 ret;
 -
 -    tcg_gen_mulu2_i32(lo, hi, a, b);
 -    tcg_temp_free_i32(a);
 -    tcg_temp_free_i32(b);
 -
 -    ret = tcg_temp_new_i64();
 -    tcg_gen_concat_i32_i64(ret, lo, hi);
 -    tcg_temp_free_i32(lo);
 -    tcg_temp_free_i32(hi);
 -
 -    return ret;
 -}
 -
 -static TCGv_i64 gen_muls_i64_i32(TCGv_i32 a, TCGv_i32 b)
 -{
 -    TCGv_i32 lo = tcg_temp_new_i32();
 -    TCGv_i32 hi = tcg_temp_new_i32();
 -    TCGv_i64 ret;
 -
 -    tcg_gen_muls2_i32(lo, hi, a, b);
 -    tcg_temp_free_i32(a);
 -    tcg_temp_free_i32(b);
 -
 -    ret = tcg_temp_new_i64();
 -    tcg_gen_concat_i32_i64(ret, lo, hi);
 -    tcg_temp_free_i32(lo);
 -    tcg_temp_free_i32(hi);
 -
 -    return ret;
 -}
 -
  /* Swap low and high halfwords.  */
  static void gen_swap_half(TCGv_i32 var)
  {
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
      }
  }
-+/*
+-static inline void gen_neon_negl(TCGv_i64 var, int size)
-+ * Return the MMU index for a v7M CPU with all relevant information
+-{
-+ * manually specified.
+-    switch (size) {
-+ */
+-    case 0: gen_helper_neon_negl_u16(var, var); break;
-+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+-    case 1: gen_helper_neon_negl_u32(var, var); break;
-+                              bool secstate, bool priv, bool negpri);
+-    case 2:
-+
+-        tcg_gen_neg_i64(var, var);
- /* Return the MMU index for a v7M CPU in the specified security and
+-        break;
-  * privilege state.
+-    default: abort();
-  */
+-    }
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+-}
-index XXXXXXX..XXXXXXX 100644
+-
---- a/target/arm/helper.c
+-static inline void gen_neon_addl_saturate(TCGv_i64 op0, TCGv_i64 op1, int size)
-+++ b/target/arm/helper.c
+-{
-@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
+-    switch (size) {
-     return 0;
+-    case 1: gen_helper_neon_addl_saturate_s32(op0, cpu_env, op0, op1); break;
- }
+-    case 2: gen_helper_neon_addl_saturate_s64(op0, cpu_env, op0, op1); break;
+-    default: abort();
--ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+-    }
--                                                bool secstate, bool priv)
+-}
-+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+-
-+                              bool secstate, bool priv, bool negpri)
+-static inline void gen_neon_mull(TCGv_i64 dest, TCGv_i32 a, TCGv_i32 b,
 -                                 int size, int u)
 -{
 -    TCGv_i64 tmp;
 -
 -    switch ((size << 1) | u) {
 -    case 0: gen_helper_neon_mull_s8(dest, a, b); break;
 -    case 1: gen_helper_neon_mull_u8(dest, a, b); break;
 -    case 2: gen_helper_neon_mull_s16(dest, a, b); break;
 -    case 3: gen_helper_neon_mull_u16(dest, a, b); break;
 -    case 4:
 -        tmp = gen_muls_i64_i32(a, b);
 -        tcg_gen_mov_i64(dest, tmp);
 -        tcg_temp_free_i64(tmp);
 -        break;
 -    case 5:
 -        tmp = gen_mulu_i64_i32(a, b);
 -        tcg_gen_mov_i64(dest, tmp);
 -        tcg_temp_free_i64(tmp);
 -        break;
 -    default: abort();
 -    }
 -
 -    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
 -       Don't forget to clean them now.  */
 -    if (size < 2) {
 -        tcg_temp_free_i32(a);
 -        tcg_temp_free_i32(b);
 -    }
 -}
 -
  static void gen_neon_narrow_op(int op, int u, int size,
                                 TCGv_i32 dest, TCGv_i64 src)
  {
-     ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+     int u;
-@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+     int vec_size;
-         mmu_idx |= ARM_MMU_IDX_M_PRIV;
+     uint32_t imm;
-     }
+-    TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
++    TCGv_i32 tmp, tmp2, tmp3, tmp5;
--    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
+     TCGv_ptr ptr1;
-+    if (negpri) {
+     TCGv_i64 tmp64;
-         mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
-     }
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         return 1;
-@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+     } else { /* (insn & 0x00800010 == 0x00800000) */
-     return mmu_idx;
+         if (size != 3) {
- }
+-            op = (insn >> 8) & 0xf;
+-            if ((insn & (1 << 6)) == 0) {
-+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+-                /* Three registers of different lengths: handled by decodetree */
-+                                                bool secstate, bool priv)
+-                return 1;
-+{
+-            } else {
-+    bool negpri = armv7m_nvic_neg_prio_requested(env->nvic, secstate);
+-                /* Two registers and a scalar. NB that for ops of this form
-+
+-                 * the ARM ARM labels bit 24 as Q, but it is in our variable
-+    return arm_v7m_mmu_idx_all(env, secstate, priv, negpri);
+-                 * 'u', not 'q'.
-+}
+-                 */
-+
+-                if (size == 0) {
- /* Return the MMU index for a v7M CPU in the specified security state */
+-                    return 1;
- ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
+-                }
- {
+-                switch (op) {
 -                case 0: /* Integer VMLA scalar */
 -                case 4: /* Integer VMLS scalar */
 -                case 8: /* Integer VMUL scalar */
 -                case 1: /* Float VMLA scalar */
 -                case 5: /* Floating point VMLS scalar */
 -                case 9: /* Floating point VMUL scalar */
 -                case 12: /* VQDMULH scalar */
 -                case 13: /* VQRDMULH scalar */
 -                case 14: /* VQRDMLAH scalar */
 -                case 15: /* VQRDMLSH scalar */
 -                    return 1; /* handled by decodetree */
 -
 -                case 3: /* VQDMLAL scalar */
 -                case 7: /* VQDMLSL scalar */
 -                case 11: /* VQDMULL scalar */
 -                    if (u == 1) {
 -                        return 1;
 -                    }
 -                    /* fall through */
 -                case 2: /* VMLAL sclar */
 -                case 6: /* VMLSL scalar */
 -                case 10: /* VMULL scalar */
 -                    if (rd & 1) {
 -                        return 1;
 -                    }
 -                    tmp2 = neon_get_scalar(size, rm);
 -                    /* We need a copy of tmp2 because gen_neon_mull
 -                     * deletes it during pass 0.  */
 -                    tmp4 = tcg_temp_new_i32();
 -                    tcg_gen_mov_i32(tmp4, tmp2);
 -                    tmp3 = neon_load_reg(rn, 1);
 -
 -                    for (pass = 0; pass < 2; pass++) {
 -                        if (pass == 0) {
 -                            tmp = neon_load_reg(rn, 0);
 -                        } else {
 -                            tmp = tmp3;
 -                            tmp2 = tmp4;
 -                        }
 -                        gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
 -                        if (op != 11) {
 -                            neon_load_reg64(cpu_V1, rd + pass);
 -                        }
 -                        switch (op) {
 -                        case 6:
 -                            gen_neon_negl(cpu_V0, size);
 -                            /* Fall through */
 -                        case 2:
 -                            gen_neon_addl(size);
 -                            break;
 -                        case 3: case 7:
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            if (op == 7) {
 -                                gen_neon_negl(cpu_V0, size);
 -                            }
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V1, size);
 -                            break;
 -                        case 10:
 -                            /* no-op */
 -                            break;
 -                        case 11:
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            break;
 -                        default:
 -                            abort();
 -                        }
 -                        neon_store_reg64(cpu_V0, rd + pass);
 -                    }
 -                    break;
 -                default:
 -                    g_assert_not_reached();
 -                }
 -            }
 +            /*
 +             * Three registers of different lengths, or two registers and
 +             * a scalar: handled by decodetree
 +             */
 +            return 1;
          } else { /* size == 3 */
              if (!u) {
                  /* Extract.  */
 --
 .20.1

-[Qemu-devel] [PULL 20/42] target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
+[PULL 15/23] target/arm: Convert Neon VEXT to decodetree
-We are close to running out of TB flags for AArch32; we could
+Convert the Neon VEXT insn to decodetree. Rather than keeping the
-start using the cs_base word, but before we do that we can
+old implementation which used fixed temporaries cpu_V0 and cpu_V1
-economise on our usage by sharing the same bits for the VFP
+and did the extraction with by-hand shift and logic ops, we use
-VECSTRIDE field and the XScale XSCALE_CPAR field. This
+the TCG extract2 insn.
-works because no XScale CPU ever had VFP.
 We don't need to special case 0 or 8 immediates any more as the
 optimizer is smart enough to throw away the dead code.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-18-peter.maydell@linaro.org
 ---
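Note for reviewers: a rough C model of the extract2 operation as this patch
uses it, assuming it yields the low 64 bits of the 128-bit value hi:lo shifted
right by ofs bits. The names extract2_u64, vext_d and vext_d_ref are
hypothetical helpers for illustration, not TCG or QEMU functions. Plain C has
to special-case offsets of 0 and 64, because shifting a 64-bit value by 64 is
undefined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of ((hi:lo) >> ofs), for 0 <= ofs <= 64. */
static uint64_t extract2_u64(uint64_t lo, uint64_t hi, unsigned ofs)
{
    assert(ofs <= 64);
    if (ofs == 0) {
        return lo;
    }
    if (ofs == 64) {
        return hi;
    }
    return (lo >> ofs) | (hi << (64 - ofs));
}

/* 64-bit VEXT: bytes imm..imm+7 of the 16-byte value Vm:Vn (imm <= 7). */
static uint64_t vext_d(uint64_t vn, uint64_t vm, unsigned imm)
{
    return extract2_u64(vn, vm, imm * 8);
}

/* Byte-by-byte reference for the same operation, used as a cross-check. */
static uint64_t vext_d_ref(uint64_t vn, uint64_t vm, unsigned imm)
{
    uint64_t result = 0;
    unsigned i;

    for (i = 0; i < 8; i++) {
        unsigned idx = imm + i;
        uint64_t byte = idx < 8 ? (vn >> (8 * idx)) & 0xff
                                : (vm >> (8 * (idx - 8))) & 0xff;
        result |= byte << (8 * i);
    }
    return result;
}
```

The 128-bit case in trans_VEXT() chains two such extractions over three
adjacent 64-bit pieces, picking the pieces according to whether imm is below 8.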
- target/arm/cpu.h       | 10 ++++++----
+ target/arm/neon-dp.decode       |  8 +++-
- target/arm/cpu.c       |  7 +++++++
+ target/arm/translate-neon.inc.c | 76 +++++++++++++++++++++++++++++++++
- target/arm/helper.c    |  6 +++++-
+ target/arm/translate.c          | 58 +------------------------
- target/arm/translate.c |  9 +++++++--
+files changed, 85 insertions(+), 57 deletions(-)
-files changed, 25 insertions(+), 7 deletions(-)
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 diff --git a/target/arm/cpu.h b/target/arm/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
- FIELD(TBFLAG_A32, THUMB, 0, 1)
+ # return false for size==3.
- FIELD(TBFLAG_A32, VECLEN, 1, 3)
+ ######################################################################
- FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
+ {
-+/*
+-  # 0b11 subgroup will go here
-+ * We store the bottom two bits of the CPAR as TB flags and handle
++  [
-+ * checks on the other bits at runtime. This shares the same bits as
++    ##################################################################
-+ * VECSTRIDE, which is OK as no XScale CPU has VFP.
++    # Miscellaneous size=0b11 insns
-+ */
++    ##################################################################
-+FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
++    VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
- /*
++                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-  * Indicates whether cp register reads and writes by guest code should access
++  ]
-  * the secure or nonsecure bank of banked registers; note that this is not
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
+   # Subgroup for size != 0b11
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
+   [
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
  FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
 -/* We store the bottom two bits of the CPAR as TB flags and handle
 - * checks on the other bits at runtime
 - */
 -FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
  /* For M profile only, Handler (ie not Thread) mode */
  FIELD(TBFLAG_A32, HANDLER, 21, 1)
  /* For M profile only, whether we should generate stack-limit checks */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/cpu.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
-         set_feature(env, ARM_FEATURE_THUMB_DSP);
-     }
+     return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
+ }
-+    /*
++
-+     * We rely on no XScale CPU having VFP so we can use the same bits in the
++static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
-+     * TB flags field for VECSTRIDE and XSCALE_CPAR.
++{
-+     */
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
++        return false;
-+             arm_feature(env, ARM_FEATURE_XSCALE)));
++    }
 +
-     if (arm_feature(env, ARM_FEATURE_V7) &&
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-         !arm_feature(env, ARM_FEATURE_M) &&
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-         !arm_feature(env, ARM_FEATURE_PMSA)) {
++        ((a->vd | a->vn | a->vm) & 0x10)) {
-diff --git a/target/arm/helper.c b/target/arm/helper.c
++        return false;
-index XXXXXXX..XXXXXXX 100644
++    }
---- a/target/arm/helper.c
++
-+++ b/target/arm/helper.c
++    if ((a->vn | a->vm | a->vd) & a->q) {
-@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
++        return false;
-             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
++    }
-             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
++
-         }
++    if (a->imm > 7 && !a->q) {
--        flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
++        return false;
-+        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
++    }
-+        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
++
-+            flags = FIELD_DP32(flags, TBFLAG_A32,
++    if (!vfp_access_check(s)) {
-+                               XSCALE_CPAR, env->cp15.c15_cpar);
++        return true;
 +    }
 +
 +    if (!a->q) {
 +        /* Extract 64 bits from <Vm:Vn> */
 +        TCGv_i64 left, right, dest;
 +
 +        left = tcg_temp_new_i64();
 +        right = tcg_temp_new_i64();
 +        dest = tcg_temp_new_i64();
 +
 +        neon_load_reg64(right, a->vn);
 +        neon_load_reg64(left, a->vm);
 +        tcg_gen_extract2_i64(dest, right, left, a->imm * 8);
 +        neon_store_reg64(dest, a->vd);
 +
 +        tcg_temp_free_i64(left);
 +        tcg_temp_free_i64(right);
 +        tcg_temp_free_i64(dest);
 +    } else {
 +        /* Extract 128 bits from <Vm+1:Vm:Vn+1:Vn> */
 +        TCGv_i64 left, middle, right, destleft, destright;
 +
 +        left = tcg_temp_new_i64();
 +        middle = tcg_temp_new_i64();
 +        right = tcg_temp_new_i64();
 +        destleft = tcg_temp_new_i64();
 +        destright = tcg_temp_new_i64();
 +
 +        if (a->imm < 8) {
 +            neon_load_reg64(right, a->vn);
 +            neon_load_reg64(middle, a->vn + 1);
 +            tcg_gen_extract2_i64(destright, right, middle, a->imm * 8);
 +            neon_load_reg64(left, a->vm);
 +            tcg_gen_extract2_i64(destleft, middle, left, a->imm * 8);
 +        } else {
 +            neon_load_reg64(right, a->vn + 1);
 +            neon_load_reg64(middle, a->vm);
 +            tcg_gen_extract2_i64(destright, right, middle, (a->imm - 8) * 8);
 +            neon_load_reg64(left, a->vm + 1);
 +            tcg_gen_extract2_i64(destleft, middle, left, (a->imm - 8) * 8);
 +        }
-     }
++
++        neon_store_reg64(destright, a->vd);
-     flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, arm_to_core_mmu_idx(mmu_idx));
++        neon_store_reg64(destleft, a->vd + 1);
 +
 +        tcg_temp_free_i64(destright);
 +        tcg_temp_free_i64(destleft);
 +        tcg_temp_free_i64(right);
 +        tcg_temp_free_i64(middle);
 +        tcg_temp_free_i64(left);
 +    }
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     dc->fp_excp_el = FIELD_EX32(tb_flags, TBFLAG_ANY, FPEXC_EL);
+     int pass;
-     dc->vfp_enabled = FIELD_EX32(tb_flags, TBFLAG_A32, VFPEN);
+     int u;
-     dc->vec_len = FIELD_EX32(tb_flags, TBFLAG_A32, VECLEN);
+     int vec_size;
--    dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
+-    uint32_t imm;
--    dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+     TCGv_i32 tmp, tmp2, tmp3, tmp5;
-+    if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+     TCGv_ptr ptr1;
-+        dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+-    TCGv_i64 tmp64;
-+        dc->vec_stride = 0;
-+    } else {
+     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
+         return 1;
-+        dc->c15_cpar = 0;
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-+    }
+             return 1;
-     dc->v7m_handler_mode = FIELD_EX32(tb_flags, TBFLAG_A32, HANDLER);
+         } else { /* size == 3 */
-     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+             if (!u) {
-         regime_is_secure(env, dc->mmu_idx);
+-                /* Extract.  */
 -                imm = (insn >> 8) & 0xf;
 -
 -                if (imm > 7 && !q)
 -                    return 1;
 -
 -                if (q && ((rd | rn | rm) & 1)) {
 -                    return 1;
 -                }
 -
 -                if (imm == 0) {
 -                    neon_load_reg64(cpu_V0, rn);
 -                    if (q) {
 -                        neon_load_reg64(cpu_V1, rn + 1);
 -                    }
 -                } else if (imm == 8) {
 -                    neon_load_reg64(cpu_V0, rn + 1);
 -                    if (q) {
 -                        neon_load_reg64(cpu_V1, rm);
 -                    }
 -                } else if (q) {
 -                    tmp64 = tcg_temp_new_i64();
 -                    if (imm < 8) {
 -                        neon_load_reg64(cpu_V0, rn);
 -                        neon_load_reg64(tmp64, rn + 1);
 -                    } else {
 -                        neon_load_reg64(cpu_V0, rn + 1);
 -                        neon_load_reg64(tmp64, rm);
 -                    }
 -                    tcg_gen_shri_i64(cpu_V0, cpu_V0, (imm & 7) * 8);
 -                    tcg_gen_shli_i64(cpu_V1, tmp64, 64 - ((imm & 7) * 8));
 -                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
 -                    if (imm < 8) {
 -                        neon_load_reg64(cpu_V1, rm);
 -                    } else {
 -                        neon_load_reg64(cpu_V1, rm + 1);
 -                        imm -= 8;
 -                    }
 -                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
 -                    tcg_gen_shri_i64(tmp64, tmp64, imm * 8);
 -                    tcg_gen_or_i64(cpu_V1, cpu_V1, tmp64);
 -                    tcg_temp_free_i64(tmp64);
 -                } else {
 -                    /* BUGFIX */
 -                    neon_load_reg64(cpu_V0, rn);
 -                    tcg_gen_shri_i64(cpu_V0, cpu_V0, imm * 8);
 -                    neon_load_reg64(cpu_V1, rm);
 -                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
 -                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
 -                }
 -                neon_store_reg64(cpu_V0, rd);
 -                if (q) {
 -                    neon_store_reg64(cpu_V1, rd + 1);
 -                }
 +                /* Extract: handled by decodetree */
 +                return 1;
              } else if ((insn & (1 << 11)) == 0) {
                  /* Two register misc.  */
                  op = ((insn >> 12) & 0x30) | ((insn >> 7) & 0xf);
 --
 .20.1

-[Qemu-devel] [PULL 28/42] target/arm: Implement VLLDM for v7M CPUs with an FPU
+[PULL 16/23] target/arm: Convert Neon VTBL, VTBX to decodetree
-Implement the VLLDM instruction for v7M for the FPU present case.
+Convert the Neon VTBL, VTBX instructions to decodetree.  The actual
 implementation of the insn is copied across to the new trans function
 unchanged except for renaming 'tmp5' to 'tmp4'.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-26-peter.maydell@linaro.org
 ---
- target/arm/helper.h    |  1 +
+ target/arm/neon-dp.decode       |  3 ++
- target/arm/helper.c    | 54 ++++++++++++++++++++++++++++++++++++++++++
+ target/arm/translate-neon.inc.c | 56 +++++++++++++++++++++++++++++++++
- target/arm/translate.c |  2 +-
+ target/arm/translate.c          | 41 +++---------------------
-files changed, 56 insertions(+), 1 deletion(-)
+files changed, 63 insertions(+), 37 deletions(-)
-diff --git a/target/arm/helper.h b/target/arm/helper.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/helper.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
- DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+     ##################################################################
+     VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
- DEF_HELPER_2(v7m_vlstm, void, env, i32)
+                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+DEF_HELPER_2(v7m_vlldm, void, env, i32)
++
++    VTBL         1111 001 1 1 . 11 .... .... 10 len:2 . op:1 . 0 .... \
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
++                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+   ]
-diff --git a/target/arm/helper.c b/target/arm/helper.c
    # Subgroup for size != 0b11
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/helper.c
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
-     g_assert_not_reached();
+     }
      return true;
  }
++
-+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
++static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
 +{
-+    /* translate.c should never generate calls here in user-only mode */
++    int n;
-+    g_assert_not_reached();
++    TCGv_i32 tmp, tmp2, tmp3, tmp4;
-+}
++    TCGv_ptr ptr1;
 +
- uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- {
++        return false;
      /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
      env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
  }
 +void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
 +{
 +    /* fptr is the value of Rn, the frame pointer we load the FP regs from */
 +    assert(env->v7m.secure);
 +
 +    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
 +        return;
 +    }
 +
-+    /* Check access to the coprocessor is permitted */
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
++        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
-+    if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
++    if (!vfp_access_check(s)) {
-+        /* State in FP is still valid */
++        return true;
 +        env->v7m.fpccr[M_REG_S] &= ~R_V7M_FPCCR_LSPACT_MASK;
 +    } else {
 +        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
 +        int i;
 +        uint32_t fpscr;
 +
 +        if (fptr & 7) {
 +            raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
 +        }
 +
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint32_t slo, shi;
 +            uint64_t dn;
 +            uint32_t faddr = fptr + 4 * i;
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +
 +            slo = cpu_ldl_data(env, faddr);
 +            shi = cpu_ldl_data(env, faddr + 4);
 +
 +            dn = (uint64_t) shi << 32 | slo;
 +            *aa32_vfp_dreg(env, i / 2) = dn;
 +        }
 +        fpscr = cpu_ldl_data(env, fptr + 0x40);
 +        vfp_set_fpscr(env, fpscr);
 +    }
 +
-+    env->v7m.control[M_REG_S] |= R_V7M_CONTROL_FPCA_MASK;
++    n = a->len + 1;
 +    if ((a->vn + n) > 32) {
 +        /*
 +         * This is UNPREDICTABLE; we choose to UNDEF to avoid the
 +         * helper function running off the end of the register file.
 +         */
 +        return false;
 +    }
 +    n <<= 3;
 +    if (a->op) {
 +        tmp = neon_load_reg(a->vd, 0);
 +    } else {
 +        tmp = tcg_temp_new_i32();
 +        tcg_gen_movi_i32(tmp, 0);
 +    }
 +    tmp2 = neon_load_reg(a->vm, 0);
 +    ptr1 = vfp_reg_ptr(true, a->vn);
 +    tmp4 = tcg_const_i32(n);
 +    gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp4);
 +    tcg_temp_free_i32(tmp);
 +    if (a->op) {
 +        tmp = neon_load_reg(a->vd, 1);
 +    } else {
 +        tmp = tcg_temp_new_i32();
 +        tcg_gen_movi_i32(tmp, 0);
 +    }
 +    tmp3 = neon_load_reg(a->vm, 1);
 +    gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp4);
 +    tcg_temp_free_i32(tmp4);
 +    tcg_temp_free_ptr(ptr1);
 +    neon_store_reg(a->vd, 0, tmp2);
 +    neon_store_reg(a->vd, 1, tmp3);
 +    tcg_temp_free_i32(tmp);
 +    return true;
 +}
-+
- static bool v7m_push_stack(ARMCPU *cpu)
- {
-     /* Do the "set up stack frame" part of exception entry,
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-                     TCGv_i32 fptr = load_reg(s, rn);
+ {
+     int op;
-                     if (extract32(insn, 20, 1)) {
+     int q;
--                        /* VLLDM */
+-    int rd, rn, rm, rd_ofs, rm_ofs;
-+                        gen_helper_v7m_vlldm(cpu_env, fptr);
++    int rd, rm, rd_ofs, rm_ofs;
-                     } else {
+     int size;
-                         gen_helper_v7m_vlstm(cpu_env, fptr);
+     int pass;
-                     }
+     int u;
      int vec_size;
 -    TCGv_i32 tmp, tmp2, tmp3, tmp5;
 -    TCGv_ptr ptr1;
 +    TCGv_i32 tmp, tmp2, tmp3;
      if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      q = (insn & (1 << 6)) != 0;
      u = (insn >> 24) & 1;
      VFP_DREG_D(rd, insn);
 -    VFP_DREG_N(rn, insn);
      VFP_DREG_M(rm, insn);
      size = (insn >> 20) & 3;
      vec_size = q ? 16 : 8;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      break;
                  }
              } else if ((insn & (1 << 10)) == 0) {
 -                /* VTBL, VTBX.  */
 -                int n = ((insn >> 8) & 3) + 1;
 -                if ((rn + n) > 32) {
 -                    /* This is UNPREDICTABLE; we choose to UNDEF to avoid the
 -                     * helper function running off the end of the register file.
 -                     */
 -                    return 1;
 -                }
 -                n <<= 3;
 -                if (insn & (1 << 6)) {
 -                    tmp = neon_load_reg(rd, 0);
 -                } else {
 -                    tmp = tcg_temp_new_i32();
 -                    tcg_gen_movi_i32(tmp, 0);
 -                }
 -                tmp2 = neon_load_reg(rm, 0);
 -                ptr1 = vfp_reg_ptr(true, rn);
 -                tmp5 = tcg_const_i32(n);
 -                gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp5);
 -                tcg_temp_free_i32(tmp);
 -                if (insn & (1 << 6)) {
 -                    tmp = neon_load_reg(rd, 1);
 -                } else {
 -                    tmp = tcg_temp_new_i32();
 -                    tcg_gen_movi_i32(tmp, 0);
 -                }
 -                tmp3 = neon_load_reg(rm, 1);
 -                gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp5);
 -                tcg_temp_free_i32(tmp5);
 -                tcg_temp_free_ptr(ptr1);
 -                neon_store_reg(rd, 0, tmp2);
 -                neon_store_reg(rd, 1, tmp3);
 -                tcg_temp_free_i32(tmp);
 +                /* VTBL, VTBX: handled by decodetree */
 +                return 1;
              } else if ((insn & 0x380) == 0) {
                  /* VDUP */
                  int element;
 --
 .20.1

-[Qemu-devel] [PULL 26/42] target/arm: Implement M-profile lazy FP state preservation
+[PULL 17/23] target/arm: Convert Neon VDUP (scalar) to decodetree
-The M-profile architecture floating point system supports
+Convert the Neon VDUP (scalar) insn to decodetree.  (Note that we
-lazy FP state preservation, where FP registers are not
+can't call this just "VDUP" as we used that already in vfp.decode for
-pushed to the stack when an exception occurs but are instead
+the "VDUP (general purpose register)" insn.)
 only saved if and when the first FP instruction in the exception
 handler is executed. Implement this in QEMU, corresponding
 to the check of LSPACT in the pseudocode ExecuteFPCheck().
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-24-peter.maydell@linaro.org
 ---
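The three VDUP_scalar patterns split one 4-bit immediate field (insn bits
[19:16]) into an element size and an index. A minimal sketch of that split in
plain C follows; the struct and function names are hypothetical, and the
logic is taken from the open-coded decode that this patch removes from
translate.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; size is log2 of the element width in bytes. */
typedef struct {
    unsigned size;   /* 0 = 8-bit, 1 = 16-bit, 2 = 32-bit elements */
    unsigned index;  /* which element of Vm is duplicated */
} VdupScalarFields;

/* Returns false when no pattern matches (the insn UNDEFs). */
static bool decode_vdup_imm4(unsigned imm4, VdupScalarFields *out)
{
    imm4 &= 0xf;
    if (imm4 & 1) {            /* xxx1: bytes, index in bits [3:1] */
        out->size = 0;
        out->index = imm4 >> 1;
    } else if (imm4 & 2) {     /* xx10: halfwords, index in bits [3:2] */
        out->size = 1;
        out->index = imm4 >> 2;
    } else if (imm4 & 4) {     /* x100: words, index in bit [3] */
        out->size = 2;
        out->index = imm4 >> 3;
    } else {
        return false;          /* x000: low three bits all zero */
    }
    return true;
}
```

In the decodetree version the same split is expressed by the fixed low bits
of each pattern, so the trans function only sees a->size and a->index.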
- target/arm/cpu.h       |   3 ++
+ target/arm/neon-dp.decode       |  7 +++++++
- target/arm/helper.h    |   2 +
+ target/arm/translate-neon.inc.c | 26 ++++++++++++++++++++++++++
- target/arm/translate.h |   1 +
+ target/arm/translate.c          | 25 +------------------------
- target/arm/helper.c    | 112 +++++++++++++++++++++++++++++++++++++++++
+files changed, 34 insertions(+), 24 deletions(-)
  target/arm/translate.c |  22 ++++++++
 files changed, 140 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/target/arm/neon-dp.decode
-+++ b/target/arm/cpu.h
++++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
- #define EXCP_NOCP           17   /* v7M NOCP UsageFault */
- #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
+     VTBL         1111 001 1 1 . 11 .... .... 10 len:2 . op:1 . 0 .... \
- #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
+                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+#define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
++
- /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
++    VDUP_scalar  1111 001 1 1 . 11 index:3 1 .... 11 000 q:1 . 0 .... \
++                 vm=%vm_dp vd=%vd_dp size=0
- #define ARMV7M_EXCP_RESET   1
++    VDUP_scalar  1111 001 1 1 . 11 index:2 10 .... 11 000 q:1 . 0 .... \
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
++                 vm=%vm_dp vd=%vd_dp size=1
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
++    VDUP_scalar  1111 001 1 1 . 11 index:1 100 .... 11 000 q:1 . 0 .... \
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
++                 vm=%vm_dp vd=%vd_dp size=2
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+   ]
-+/* For M profile only, set if FPCCR.LSPACT is set */
-+FIELD(TBFLAG_A32, LSPACT, 18, 1)
+   # Subgroup for size != 0b11
- /* For M profile only, set if we must create a new FP context */
+diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
  FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
  /* For M profile only, set if FPCCR.S does not match current security state */
 diff --git a/target/arm/helper.h b/target/arm/helper.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.h
+--- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/helper.h
++++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(v7m_blxns, void, env, i32)
+@@ -XXX,XX +XXX,XX @@ static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
+     tcg_temp_free_i32(tmp);
- DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
+     return true;
+ }
 +DEF_HELPER_1(v7m_preserve_fp_state, void, env)
 +
- DEF_HELPER_2(v8m_stackcheck, void, env, i32)
++static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a)
  DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
 diff --git a/target/arm/translate.h b/target/arm/translate.h
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.h
 +++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
      bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
      bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
      bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
 +    bool v7m_lspact; /* FPCCR.LSPACT set */
      /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
       * so that top level loop can generate correct syndrome information.
       */
 diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/helper.c
 +++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
      g_assert_not_reached();
  }
 +void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
 +{
-+    /* translate.c should never generate calls here in user-only mode */
++    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+    g_assert_not_reached();
++        return false;
 +}
 +
  uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
  {
      /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ pend_fault:
      return false;
  }
 +void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
 +{
 +    /*
 +     * Preserve FP state (because LSPACT was set and we are about
 +     * to execute an FP instruction). This corresponds to the
 +     * PreserveFPState() pseudocode.
 +     * We may throw an exception if the stacking fails.
 +     */
 +    ARMCPU *cpu = arm_env_get_cpu(env);
 +    bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
 +    bool negpri = !(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_HFRDY_MASK);
 +    bool is_priv = !(env->v7m.fpccr[is_secure] & R_V7M_FPCCR_USER_MASK);
 +    bool splimviol = env->v7m.fpccr[is_secure] & R_V7M_FPCCR_SPLIMVIOL_MASK;
 +    uint32_t fpcar = env->v7m.fpcar[is_secure];
 +    bool stacked_ok = true;
 +    bool ts = is_secure && (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
 +    bool take_exception;
 +
 +    /* Take the iothread lock as we are going to touch the NVIC */
 +    qemu_mutex_lock_iothread();
 +
 +    /* Check the background context had access to the FPU */
 +    if (!v7m_cpacr_pass(env, is_secure, is_priv)) {
 +        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, is_secure);
 +        env->v7m.cfsr[is_secure] |= R_V7M_CFSR_NOCP_MASK;
 +        stacked_ok = false;
 +    } else if (!is_secure && !extract32(env->v7m.nsacr, 10, 1)) {
 +        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
 +        env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
 +        stacked_ok = false;
 +    }
 +
-+    if (!splimviol && stacked_ok) {
++    /* UNDEF accesses to D16-D31 if they don't exist. */
-+        /* We only stack if the stack limit wasn't violated */
++    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        int i;
++        ((a->vd | a->vm) & 0x10)) {
-+        ARMMMUIdx mmu_idx;
++        return false;
 +
 +        mmu_idx = arm_v7m_mmu_idx_all(env, is_secure, is_priv, negpri);
 +        for (i = 0; i < (ts ? 32 : 16); i += 2) {
 +            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
 +            uint32_t faddr = fpcar + 4 * i;
 +            uint32_t slo = extract64(dn, 0, 32);
 +            uint32_t shi = extract64(dn, 32, 32);
 +
 +            if (i >= 16) {
 +                faddr += 8; /* skip the slot for the FPSCR */
 +            }
 +            stacked_ok = stacked_ok &&
 +                v7m_stack_write(cpu, faddr, slo, mmu_idx, STACK_LAZYFP) &&
 +                v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, STACK_LAZYFP);
 +        }
 +
 +        stacked_ok = stacked_ok &&
 +            v7m_stack_write(cpu, fpcar + 0x40,
 +                            vfp_get_fpscr(env), mmu_idx, STACK_LAZYFP);
 +    }
 +
-+    /*
++    if (a->vd & a->q) {
-+     * We definitely pended an exception, but it's possible that it
++        return false;
 +     * might not be able to be taken now. If its priority permits us
 +     * to take it now, then we must not update the LSPACT or FP regs,
 +     * but instead jump out to take the exception immediately.
 +     * If it's just pending and won't be taken until the current
 +     * handler exits, then we do update LSPACT and the FP regs.
 +     */
 +    take_exception = !stacked_ok &&
 +        armv7m_nvic_can_take_pending_exception(env->nvic);
 +
 +    qemu_mutex_unlock_iothread();
 +
 +    if (take_exception) {
 +        raise_exception_ra(env, EXCP_LAZYFP, 0, 1, GETPC());
 +    }
 +
-+    env->v7m.fpccr[is_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
++    if (!vfp_access_check(s)) {
-+
++        return true;
 +    if (ts) {
 +        /* Clear s0 to s31 and the FPSCR */
 +        int i;
 +
 +        for (i = 0; i < 32; i += 2) {
 +            *aa32_vfp_dreg(env, i / 2) = 0;
 +        }
 +        vfp_set_fpscr(env, 0);
 +    }
 +    /*
 +     * Otherwise s0 to s15 and FPSCR are UNKNOWN; we choose to leave them
 +     * unchanged.
 +     */
 +}
 +
  /* Write to v7M CONTROL.SPSEL bit for the specified security bank.
   * This may change the current stack pointer between Main and Process
   * stack pointers if it is done for the CONTROL register for the current
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
              [EXCP_NOCP] = "v7M NOCP UsageFault",
              [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
              [EXCP_STKOF] = "v8M STKOF UsageFault",
 +            [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
          };
          if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
              return;
          }
          break;
 +    case EXCP_LAZYFP:
 +        /*
 +         * We already pended the specific exception in the NVIC in the
 +         * v7m_preserve_fp_state() helper function.
 +         */
 +        break;
      default:
          cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
          return; /* Never happens.  Keep compiler happy.  */
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
          flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
      }
 +    if (arm_feature(env, ARM_FEATURE_M)) {
 +        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
 +
 +        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
 +            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
 +        }
 +    }
 +
-     *pflags = flags;
++    tcg_gen_gvec_dup_mem(a->size, neon_reg_offset(a->vd, 0),
-     *cs_base = 0;
++                         neon_element_offset(a->vm, a->index, a->size),
- }
++                         a->q ? 16 : 8, a->q ? 16 : 8);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-     if (arm_dc_feature(s, ARM_FEATURE_M)) {
+                     }
-         /* Handle M-profile lazy FP state mechanics */
+                     break;
+                 }
-+        /* Trigger lazy-state preservation if necessary */
+-            } else if ((insn & (1 << 10)) == 0) {
-+        if (s->v7m_lspact) {
+-                /* VTBL, VTBX: handled by decodetree */
-+            /*
+-                return 1;
-+             * Lazy state saving affects external memory and also the NVIC,
+-            } else if ((insn & 0x380) == 0) {
-+             * so we must mark it as an IO operation for icount.
+-                /* VDUP */
-+             */
+-                int element;
-+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+-                MemOp size;
-+                gen_io_start();
+-
-+            }
+-                if ((insn & (7 << 16)) == 0 || (q && (rd & 1))) {
-+            gen_helper_v7m_preserve_fp_state(cpu_env);
+-                    return 1;
-+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+-                }
-+                gen_io_end();
+-                if (insn & (1 << 16)) {
-+            }
+-                    size = MO_8;
-+            /*
+-                    element = (insn >> 17) & 7;
-+             * If the preserve_fp_state helper doesn't throw an exception
+-                } else if (insn & (1 << 17)) {
-+             * then it will clear LSPACT; we don't need to repeat this for
+-                    size = MO_16;
-+             * any further FP insns in this TB.
+-                    element = (insn >> 18) & 3;
-+             */
+-                } else {
-+            s->v7m_lspact = false;
+-                    size = MO_32;
-+        }
+-                    element = (insn >> 19) & 1;
-+
+-                }
-         /* Update ownership of FP context: set FPCCR.S to match current state */
+-                tcg_gen_gvec_dup_mem(size, neon_reg_offset(rd, 0),
-         if (s->v8m_fpccr_s_wrong) {
+-                                     neon_element_offset(rm, element, size),
-             TCGv_i32 tmp;
+-                                     q ? 16 : 8, q ? 16 : 8);
-@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
+             } else {
-     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
++                /* VTBL, VTBX, VDUP: handled by decodetree */
-     dc->v7m_new_fp_ctxt_needed =
+                 return 1;
-         FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
+             }
-+    dc->v7m_lspact = FIELD_EX32(tb_flags, TBFLAG_A32, LSPACT);
+         }
      dc->cp_regs = cpu->cp_regs;
      dc->features = env->features;
 --
 .20.1

-[Qemu-devel] [PULL 33/42] hw/display/tc6393xb: Remove unused functions
+[PULL 18/23] hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Jean-Christophe Dubois <jcd@tribudubois.net>
-No code used the tc6393xb_gpio_in_get() and tc6393xb_gpio_out_set()
+Some bits of the CCM registers are non writable.
 functions since their introduction in commit 88d2c950b002. Time to
 remove them.
-Suggested-by: Markus Armbruster <armbru@redhat.com>
+This was left undone in the initial commit (all bits of registers were
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+writable).
-Message-id: 20190412165416.7977-4-philmd@redhat.com
 This patch adds the required code to protect the non writable bits.
 Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
 Message-id: 20200608133508.550046-1-jcd@tribudubois.net
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
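A minimal sketch of the masking scheme used in this patch, as standalone C.
In the mask tables a set bit marks a bit the guest cannot change. The
function names below are illustrative and do not appear in the patch; the
expressions follow the ones in imx6ul_ccm_write() and imx6ul_analog_write().

```c
#include <stdint.h>

/* Plain write: keep read-only bits from 'old', take the rest from 'value'. */
static uint32_t masked_write(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return (old & ro_mask) | (value & ~ro_mask);
}

/* _SET alias: set only the writable bits passed in 'value'. */
static uint32_t masked_set(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old | (value & ~ro_mask);
}

/* _CLR alias: clear only the writable bits passed in 'value'. */
static uint32_t masked_clr(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old & ~(value & ~ro_mask);
}

/* _TOG alias: toggle only the writable bits passed in 'value'. */
static uint32_t masked_tog(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old ^ (value & ~ro_mask);
}
```

With this convention a mask of 0x00000000 (as for CCM_CTOR) leaves the whole
register writable, and 0xffffffff (as for CCM_CSR) makes it read-only.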
- include/hw/devices.h  |  3 ---
+ hw/misc/imx6ul_ccm.c | 76 ++++++++++++++++++++++++++++++++++++--------
- hw/display/tc6393xb.c | 16 ----------------
+file changed, 63 insertions(+), 13 deletions(-)
 files changed, 19 deletions(-)
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/hw/misc/imx6ul_ccm.c b/hw/misc/imx6ul_ccm.c
 index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
+--- a/hw/misc/imx6ul_ccm.c
-+++ b/include/hw/devices.h
++++ b/hw/misc/imx6ul_ccm.c
-@@ -XXX,XX +XXX,XX @@ void retu_key_event(void *retu, int state);
+@@ -XXX,XX +XXX,XX @@
- typedef struct TC6393xbState TC6393xbState;
- TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
+ #include "trace.h"
-                              uint32_t base, qemu_irq irq);
--void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
++static const uint32_t ccm_mask[CCM_MAX] = {
--                    qemu_irq handler);
++    [CCM_CCR] = 0xf01fef80,
--qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s);
++    [CCM_CCDR] = 0xfffeffff,
- qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
++    [CCM_CSR] = 0xffffffff,
++    [CCM_CCSR] = 0xfffffef2,
- #endif
++    [CCM_CACRR] = 0xfffffff8,
-diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
++    [CCM_CBCDR] = 0xc1f8e000,
-index XXXXXXX..XXXXXXX 100644
++    [CCM_CBCMR] = 0xfc03cfff,
---- a/hw/display/tc6393xb.c
++    [CCM_CSCMR1] = 0x80700000,
-+++ b/hw/display/tc6393xb.c
++    [CCM_CSCMR2] = 0xe01ff003,
-@@ -XXX,XX +XXX,XX @@ struct TC6393xbState {
++    [CCM_CSCDR1] = 0xfe00c780,
-              blanked : 1;
++    [CCM_CS1CDR] = 0xfe00fe00,
- };
++    [CCM_CS2CDR] = 0xf8007000,
++    [CCM_CDCDR] = 0xf00fffff,
--qemu_irq *tc6393xb_gpio_in_get(TC6393xbState *s)
++    [CCM_CHSCCDR] = 0xfffc01ff,
--{
++    [CCM_CSCDR2] = 0xfe0001ff,
--    return s->gpio_in;
++    [CCM_CSCDR3] = 0xffffc1ff,
--}
++    [CCM_CDHIPR] = 0xffffffff,
--
++    [CCM_CTOR] = 0x00000000,
- static void tc6393xb_gpio_set(void *opaque, int line, int level)
++    [CCM_CLPCR] = 0xf39ff01c,
 +    [CCM_CISR] = 0xfb85ffbe,
 +    [CCM_CIMR] = 0xfb85ffbf,
 +    [CCM_CCOSR] = 0xfe00fe00,
 +    [CCM_CGPR] = 0xfffc3fea,
 +    [CCM_CCGR0] = 0x00000000,
 +    [CCM_CCGR1] = 0x00000000,
 +    [CCM_CCGR2] = 0x00000000,
 +    [CCM_CCGR3] = 0x00000000,
 +    [CCM_CCGR4] = 0x00000000,
 +    [CCM_CCGR5] = 0x00000000,
 +    [CCM_CCGR6] = 0x00000000,
 +    [CCM_CMEOR] = 0xafffff1f,
 +};
 +
 +static const uint32_t analog_mask[CCM_ANALOG_MAX] = {
 +    [CCM_ANALOG_PLL_ARM] = 0xfff60f80,
 +    [CCM_ANALOG_PLL_USB1] = 0xfffe0fbc,
 +    [CCM_ANALOG_PLL_USB2] = 0xfffe0fbc,
 +    [CCM_ANALOG_PLL_SYS] = 0xfffa0ffe,
 +    [CCM_ANALOG_PLL_SYS_SS] = 0x00000000,
 +    [CCM_ANALOG_PLL_SYS_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_SYS_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_AUDIO] = 0xffe20f80,
 +    [CCM_ANALOG_PLL_AUDIO_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_AUDIO_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_VIDEO] = 0xffe20f80,
 +    [CCM_ANALOG_PLL_VIDEO_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_VIDEO_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_ENET] = 0xffc20ff0,
 +    [CCM_ANALOG_PFD_480] = 0x40404040,
 +    [CCM_ANALOG_PFD_528] = 0x40404040,
 +    [PMU_MISC0] = 0x01fe8306,
 +    [PMU_MISC1] = 0x07fcede0,
 +    [PMU_MISC2] = 0x005f5f5f,
 +};
 +
  static const char *imx6ul_ccm_reg_name(uint32_t reg)
  {
- //    TC6393xbState *s = opaque;
+     static char unknown[20];
-@@ -XXX,XX +XXX,XX @@ static void tc6393xb_gpio_set(void *opaque, int line, int level)
+@@ -XXX,XX +XXX,XX @@ static void imx6ul_ccm_write(void *opaque, hwaddr offset, uint64_t value,
-     // FIXME: how does the chip reflect the GPIO input level change?
      trace_ccm_write_reg(imx6ul_ccm_reg_name(index), (uint32_t)value);
 -    /*
 -     * We will do a better implementation later. In particular some bits
 -     * cannot be written to.
 -     */
 -    s->ccm[index] = (uint32_t)value;
 +    s->ccm[index] = (s->ccm[index] & ccm_mask[index]) |
 +                           ((uint32_t)value & ~ccm_mask[index]);
  }
--void tc6393xb_gpio_out_set(TC6393xbState *s, int line,
+ static uint64_t imx6ul_analog_read(void *opaque, hwaddr offset, unsigned size)
--                    qemu_irq handler)
+@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
--{
+          * the REG_NAME register. So we change the value of the
--    if (line >= TC6393XB_GPIOS) {
+          * REG_NAME register, setting bits passed in the value.
--        fprintf(stderr, "TC6393xb: no GPIO pin %d\n", line);
+          */
--        return;
+-        s->analog[index - 1] |= value;
--    }
++        s->analog[index - 1] |= (value & ~analog_mask[index - 1]);
--
+         break;
--    s->handler[line] = handler;
+     case CCM_ANALOG_PLL_ARM_CLR:
--}
+     case CCM_ANALOG_PLL_USB1_CLR:
--
+@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
- static void tc6393xb_gpio_handler_update(TC6393xbState *s)
+          * the REG_NAME register. So we change the value of the
- {
+          * REG_NAME register, unsetting bits passed in the value.
-     uint32_t level, diff;
+          */
 -        s->analog[index - 2] &= ~value;
 +        s->analog[index - 2] &= ~(value & ~analog_mask[index - 2]);
          break;
      case CCM_ANALOG_PLL_ARM_TOG:
      case CCM_ANALOG_PLL_USB1_TOG:
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
           * the REG_NAME register. So we change the value of the
           * REG_NAME register, toggling bits passed in the value.
           */
 -        s->analog[index - 3] ^= value;
 +        s->analog[index - 3] ^= (value & ~analog_mask[index - 3]);
          break;
      default:
 -        /*
 -         * We will do a better implementation later. In particular some bits
 -         * cannot be written to.
 -         */
 -        s->analog[index] = value;
 +        s->analog[index] = (s->analog[index] & analog_mask[index]) |
 +                           (value & ~analog_mask[index]);
          break;
      }
  }
 --
 .20.1

-[Qemu-devel] [PULL 12/42] target/arm/helper: don't return early for STKOF faults during stacking
+[PULL 19/23] Implement configurable descriptor size in ftgmac100
-Currently the code in v7m_push_stack() which detects a violation
+From: Erik Smit <erik.lucas.smit@gmail.com>
 of the v8M stack limit simply returns early if it does so. This
 is OK for the current integer-only code, but won't work for the
 floating point handling we're about to add. We need to continue
 executing the rest of the function so that we check for other
 exceptions like not having permission to use the FPU and so
 that we correctly set the FPCCR state if we are doing lazy
 stacking. Refactor to avoid the early return.
+The hardware supports configurable descriptor sizes, configured in the DBLAC
+register.
+Most drivers use the default 4 word descriptor, which is currently hardcoded,
+but Aspeed SDK configures 8 words to store extra data.
+Signed-off-by: Erik Smit <erik.lucas.smit@gmail.com>
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+[PMM: removed unnecessary parens]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-10-peter.maydell@linaro.org
 ---
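A minimal sketch of how the descriptor size is derived from DBLAC and used to
step through a ring, as standalone C. The field extraction is the same as the
FTGMAC100_DBLAC_*DES_SIZE macros added below; the helper next_desc_addr and
its parameters are hypothetical, and the byte counts in the comments assume
32-bit descriptor words.

```c
#include <stdbool.h>
#include <stdint.h>

/* Descriptor sizes in bytes: a 4-bit field counted in units of 8 bytes. */
static uint32_t dblac_rxdes_size(uint32_t dblac)
{
    return ((dblac >> 12) & 0xf) * 8;
}

static uint32_t dblac_txdes_size(uint32_t dblac)
{
    return ((dblac >> 16) & 0xf) * 8;
}

/*
 * Hypothetical ring step: wrap to the ring base after the descriptor that
 * has the end-of-ring bit set, otherwise advance by the configured size
 * instead of a hardcoded sizeof(descriptor).
 */
static uint32_t next_desc_addr(uint32_t addr, uint32_t ring_base,
                               bool end_of_ring, uint32_t desc_size)
{
    return end_of_ring ? ring_base : addr + desc_size;
}
```

A field value of 2 gives 16 bytes (the 4 word descriptor) and a value of 4
gives 32 bytes (the 8 word layout mentioned above).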
- target/arm/helper.c | 23 ++++++++++++++++++-----
+ hw/net/ftgmac100.c | 26 ++++++++++++++++++++++++--
-file changed, 18 insertions(+), 5 deletions(-)
+file changed, 24 insertions(+), 2 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
+diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
+--- a/hw/net/ftgmac100.c
-+++ b/target/arm/helper.c
++++ b/hw/net/ftgmac100.c
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@
-      * should ignore further stack faults trying to process
+ #define FTGMAC100_APTC_TXPOLL_CNT(x)        (((x) >> 8) & 0xf)
-      * that derived exception.)
+ #define FTGMAC100_APTC_TXPOLL_TIME_SEL      (1 << 12)
-      */
--    bool stacked_ok;
++/*
-+    bool stacked_ok = true, limitviol = false;
++ * DMA burst length and arbitration control register
-     CPUARMState *env = &cpu->env;
++ */
-     uint32_t xpsr = xpsr_read(env);
++#define FTGMAC100_DBLAC_RXBURST_SIZE(x)     (((x) >> 8) & 0x3)
-     uint32_t frameptr = env->regs[13];
++#define FTGMAC100_DBLAC_TXBURST_SIZE(x)     (((x) >> 10) & 0x3)
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++#define FTGMAC100_DBLAC_RXDES_SIZE(x)       ((((x) >> 12) & 0xf) * 8)
-             armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
++#define FTGMAC100_DBLAC_TXDES_SIZE(x)       ((((x) >> 16) & 0xf) * 8)
-                                     env->v7m.secure);
++#define FTGMAC100_DBLAC_IFG_CNT(x)          (((x) >> 20) & 0x7)
-             env->regs[13] = limit;
++#define FTGMAC100_DBLAC_IFG_INC             (1 << 23)
--            return true;
++
-+            /*
+ /*
-+             * We won't try to perform any further memory accesses but
+  * PHY control register
-+             * we must continue through the following code to check for
+  */
-+             * permission faults during FPU state preservation, and we
+@@ -XXX,XX +XXX,XX @@ static void ftgmac100_do_tx(FTGMAC100State *s, uint32_t tx_ring,
-+             * must update FPCCR if lazy stacking is enabled.
+         if (bd.des0 & s->txdes0_edotr) {
-+             */
+             addr = tx_ring;
-+            limitviol = true;
+         } else {
-+            stacked_ok = false;
+-            addr += sizeof(FTGMAC100Desc);
 +            addr += FTGMAC100_DBLAC_TXDES_SIZE(s->dblac);
          }
      }
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
+@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
-      * (which may be taken in preference to the one we started with
+         s->phydata = value & 0xffff;
-      * if it has higher priority).
+         break;
-      */
+     case FTGMAC100_DBLAC: /* DMA Burst Length and Arbitration Control */
--    stacked_ok =
++        if (FTGMAC100_DBLAC_TXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
-+    stacked_ok = stacked_ok &&
++            qemu_log_mask(LOG_GUEST_ERROR,
-         v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
++                          "%s: transmit descriptor too small : %d bytes\n",
-         v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
++                          __func__, FTGMAC100_DBLAC_TXDES_SIZE(s->dblac));
-         v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
++            break;
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
++        }
-         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
++        if (FTGMAC100_DBLAC_RXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
-         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
++            qemu_log_mask(LOG_GUEST_ERROR,
++                          "%s: receive descriptor too small : %d bytes\n",
--    /* Update SP regardless of whether any of the stack accesses failed. */
++                          __func__, FTGMAC100_DBLAC_RXDES_SIZE(s->dblac));
--    env->regs[13] = frameptr;
++            break;
-+    /*
++        }
-+     * If we broke a stack limit then SP was already updated earlier;
+         s->dblac = value;
-+     * otherwise we update SP regardless of whether any of the stack
+         break;
-+     * accesses failed or we took some other kind of fault.
+     case FTGMAC100_REVR:  /* Feature Register */
-+     */
+@@ -XXX,XX +XXX,XX @@ static ssize_t ftgmac100_receive(NetClientState *nc, const uint8_t *buf,
-+    if (!limitviol) {
+         if (bd.des0 & s->rxdes0_edorr) {
-+        env->regs[13] = frameptr;
+             addr = s->rx_ring;
-+    }
+         } else {
+-            addr += sizeof(FTGMAC100Desc);
-     return !stacked_ok;
++            addr += FTGMAC100_DBLAC_RXDES_SIZE(s->dblac);
- }
+         }
      }
      s->rx_descriptor = addr;
 --
 .20.1
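
The descriptor-size decoding introduced by the ftgmac100 patch above can be sketched in isolation. This is a minimal illustration, not the device model: the two macros are copied from the patch, while ToyDesc (assumed to be four 32-bit words), toy_dblac_value_ok() and toy_next_desc() are hypothetical names invented here. The helper validates the value being written, which is one possible reading of the intent; the patch as quoted checks the stored s->dblac.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field decoders copied from the patch: a 4-bit count of 8-byte units. */
#define FTGMAC100_DBLAC_RXDES_SIZE(x)       ((((x) >> 12) & 0xf) * 8)
#define FTGMAC100_DBLAC_TXDES_SIZE(x)       ((((x) >> 16) & 0xf) * 8)

/* Assumption: a descriptor is four 32-bit words (16 bytes). */
typedef struct {
    uint32_t des0, des1, des2, des3;
} ToyDesc;

/* Hypothetical helper: accept a DBLAC value only if both descriptor
 * sizes it encodes are at least as large as the descriptor struct. */
bool toy_dblac_value_ok(uint32_t value)
{
    return FTGMAC100_DBLAC_TXDES_SIZE(value) >= sizeof(ToyDesc) &&
           FTGMAC100_DBLAC_RXDES_SIZE(value) >= sizeof(ToyDesc);
}

/* Hypothetical helper mirroring the ring walk in the patch: wrap to the
 * ring base at the end-of-ring descriptor, else step by the guest-chosen
 * descriptor size instead of sizeof(ToyDesc). */
uint32_t toy_next_desc(uint32_t addr, uint32_t ring_base,
                       bool end_of_ring, uint32_t des_size)
{
    return end_of_ring ? ring_base : addr + des_size;
}
```

With an illustrative value of 0x00022f00 both fields decode to 16 bytes, so the value is accepted; 0x00012f00 encodes an 8-byte transmit descriptor and is rejected.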

-[Qemu-devel] [PULL 29/42] target/arm: Enable FPU for Cortex-M4 and Cortex-M33
+[PULL 20/23] target/arm/cpu: adjust virtual time for all KVM arm cpus
-Enable the FPU by default for the Cortex-M4 and Cortex-M33.
+From: fangying <fangying1@huawei.com>
+Virtual time adjustment was implemented for virt-5.0 machine type,
+but the cpu property was enabled only for host-passthrough and max
+cpu model.  Let's add it for any KVM arm cpu which has the generic
+timer feature enabled.
+Signed-off-by: Ying Fang <fangying1@huawei.com>
+Reviewed-by: Andrew Jones <drjones@redhat.com>
+Message-id: 20200608121243.2076-1-fangying1@huawei.com
+[PMM: minor commit message tweak, removed inaccurate
+ suggested-by tag]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-27-peter.maydell@linaro.org
 ---
- target/arm/cpu.c | 8 ++++++++
+ target/arm/cpu.c   |  6 ++++--
- 1 file changed, 8 insertions(+)
+ target/arm/cpu64.c |  1 -
  target/arm/kvm.c   | 21 +++++++++++----------
  3 files changed, 15 insertions(+), 13 deletions(-)
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
-@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
-     set_feature(&cpu->env, ARM_FEATURE_M);
+     if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
-     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
+         qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
-     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+     }
-+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
++
-     cpu->midr = 0x410fc240; /* r0p0 */
++    if (kvm_enabled()) {
-     cpu->pmsav7_dregion = 8;
++        kvm_arm_add_vcpu_properties(obj);
-+    cpu->isar.mvfr0 = 0x10110021;
++    }
-+    cpu->isar.mvfr1 = 0x11000011;
+ }
-+    cpu->isar.mvfr2 = 0x00000000;
-     cpu->id_pfr0 = 0x00000030;
+ static void arm_cpu_finalizefn(Object *obj)
-     cpu->id_pfr1 = 0x00000200;
+@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
-     cpu->id_dfr0 = 0x00100000;
-@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
+     if (kvm_enabled()) {
-     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
+         kvm_arm_set_cpu_features_from_host(cpu);
-     set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
+-        kvm_arm_add_vcpu_properties(obj);
-     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+     } else {
-+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
+         cortex_a15_initfn(obj);
-     cpu->midr = 0x410fd213; /* r0p3 */
-     cpu->pmsav7_dregion = 16;
+@@ -XXX,XX +XXX,XX @@ static void arm_host_initfn(Object *obj)
-     cpu->sau_sregion = 8;
+     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
-+    cpu->isar.mvfr0 = 0x10110021;
+         aarch64_add_sve_properties(obj);
-+    cpu->isar.mvfr1 = 0x11000011;
+     }
-+    cpu->isar.mvfr2 = 0x00000040;
+-    kvm_arm_add_vcpu_properties(obj);
-     cpu->id_pfr0 = 0x00000030;
+     arm_cpu_post_init(obj);
-     cpu->id_pfr1 = 0x00000210;
+ }
-     cpu->id_dfr0 = 0x00200000;
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
      if (kvm_enabled()) {
          kvm_arm_set_cpu_features_from_host(cpu);
 -        kvm_arm_add_vcpu_properties(obj);
      } else {
          uint64_t t;
          uint32_t u;
 diff --git a/target/arm/kvm.c b/target/arm/kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm.c
 +++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ static void kvm_no_adjvtime_set(Object *obj, bool value, Error **errp)
  /* KVM VCPU properties should be prefixed with "kvm-". */
  void kvm_arm_add_vcpu_properties(Object *obj)
  {
 -    if (!kvm_enabled()) {
 -        return;
 -    }
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    CPUARMState *env = &cpu->env;
 -    ARM_CPU(obj)->kvm_adjvtime = true;
 -    object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
 -                             kvm_no_adjvtime_set);
 -    object_property_set_description(obj, "kvm-no-adjvtime",
 -                                    "Set on to disable the adjustment of "
 -                                    "the virtual counter. VM stopped time "
 -                                    "will be counted.");
 +    if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
 +        cpu->kvm_adjvtime = true;
 +        object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
 +                                 kvm_no_adjvtime_set);
 +        object_property_set_description(obj, "kvm-no-adjvtime",
 +                                        "Set on to disable the adjustment of "
 +                                        "the virtual counter. VM stopped time "
 +                                        "will be counted.");
 +    }
  }
  bool kvm_arm_pmu_supported(CPUState *cpu)
 --
 .20.1
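
The control flow that the kvm-no-adjvtime patch above moves around can be modelled without any QEMU types. This is a toy sketch under stated assumptions: ToyCPU, TOY_FEATURE_GENERIC_TIMER, toy_add_vcpu_properties() and toy_cpu_post_init() are hypothetical stand-ins, and the boolean has_no_adjvtime_prop merely represents "the property was registered".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit standing in for ARM_FEATURE_GENERIC_TIMER. */
enum { TOY_FEATURE_GENERIC_TIMER = 1u << 0 };

/* Hypothetical CPU object: only the state this sketch needs. */
typedef struct {
    uint32_t features;
    bool kvm_adjvtime;          /* virtual time adjustment enabled */
    bool has_no_adjvtime_prop;  /* "kvm-no-adjvtime" was registered */
} ToyCPU;

/* After the patch the feature test lives inside the property helper,
 * so every caller gets the same gating. */
void toy_add_vcpu_properties(ToyCPU *cpu)
{
    if (cpu->features & TOY_FEATURE_GENERIC_TIMER) {
        cpu->kvm_adjvtime = true;
        cpu->has_no_adjvtime_prop = true;
    }
}

/* After the patch the KVM test lives in the common post-init path
 * rather than in the per-model init functions. */
void toy_cpu_post_init(ToyCPU *cpu, bool kvm_enabled)
{
    if (kvm_enabled) {
        toy_add_vcpu_properties(cpu);
    }
}
```

In this model any CPU with the generic timer feature gets the property when KVM is enabled, regardless of which CPU model created it, which is the behaviour the commit message describes.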

-[Qemu-devel] [PULL 42/42] hw/devices: Move SMSC 91C111 declaration into a new header
+[PULL 21/23] hw/net/imx_fec: Convert debug fprintf() to trace events
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Jean-Christophe Dubois <jcd@tribudubois.net>
-This commit finally deletes "hw/devices.h".
+Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
+Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+[PMD: Fixed 32-bit format string using PRIx32/PRIx64]
-Message-id: 20190412165416.7977-13-philmd@redhat.com
+Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- include/hw/devices.h       | 11 -----------
+ hw/net/imx_fec.c    | 106 +++++++++++++++++++-------------------------
- include/hw/net/smc91c111.h | 19 +++++++++++++++++++
+ hw/net/trace-events |  18 ++++++++
- hw/arm/gumstix.c           |  2 +-
+ 2 files changed, 63 insertions(+), 61 deletions(-)
  hw/arm/integratorcp.c      |  2 +-
  hw/arm/mainstone.c         |  2 +-
  hw/arm/realview.c          |  2 +-
  hw/arm/versatilepb.c       |  2 +-
  hw/net/smc91c111.c         |  2 +-
  8 files changed, 25 insertions(+), 17 deletions(-)
  delete mode 100644 include/hw/devices.h
  create mode 100644 include/hw/net/smc91c111.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
+diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
-deleted file mode 100644
+index XXXXXXX..XXXXXXX 100644
-index XXXXXXX..XXXXXXX
+--- a/hw/net/imx_fec.c
---- a/include/hw/devices.h
++++ b/hw/net/imx_fec.c
 +++ /dev/null
 @@ -XXX,XX +XXX,XX @@
--#ifndef QEMU_DEVICES_H
+ #include "qemu/module.h"
--#define QEMU_DEVICES_H
+ #include "net/checksum.h"
--
+ #include "net/eth.h"
--/* Devices that have nowhere better to go.  */
++#include "trace.h"
--
 -#include "hw/hw.h"
 -
 -/* smc91c111.c */
 -void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
 -
 -#endif
 diff --git a/include/hw/net/smc91c111.h b/include/hw/net/smc91c111.h
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/include/hw/net/smc91c111.h
@@ -XXX,XX +XXX,XX @@
 +/*
 + * SMSC 91C111 Ethernet interface emulation
 + *
 + * Copyright (c) 2005 CodeSourcery, LLC.
 + * Written by Paul Brook
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or later.
 + * See the COPYING file in the top-level directory.
 + */
 +
 +#ifndef HW_NET_SMC91C111_H
 +#define HW_NET_SMC91C111_H
 +
 +#include "hw/irq.h"
 +#include "net/net.h"
 +
 +void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
 +
 +#endif
 diff --git a/hw/arm/gumstix.c b/hw/arm/gumstix.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/gumstix.c
 +++ b/hw/arm/gumstix.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/arm/pxa.h"
  #include "net/net.h"
  #include "hw/block/flash.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "hw/boards.h"
  #include "exec/address-spaces.h"
  #include "sysemu/qtest.h"
 diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/integratorcp.c
 +++ b/hw/arm/integratorcp.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu-common.h"
  #include "cpu.h"
  #include "hw/sysbus.h"
 -#include "hw/devices.h"
  #include "hw/boards.h"
  #include "hw/arm/arm.h"
  #include "hw/misc/arm_integrator_debug.h"
 +#include "hw/net/smc91c111.h"
  #include "net/net.h"
  #include "exec/address-spaces.h"
  #include "sysemu/sysemu.h"
 diff --git a/hw/arm/mainstone.c b/hw/arm/mainstone.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/mainstone.c
 +++ b/hw/arm/mainstone.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/arm/pxa.h"
  #include "hw/arm/arm.h"
  #include "net/net.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "hw/boards.h"
  #include "hw/block/flash.h"
  #include "hw/sysbus.h"
 diff --git a/hw/arm/realview.c b/hw/arm/realview.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/realview.c
 +++ b/hw/arm/realview.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/sysbus.h"
  #include "hw/arm/arm.h"
  #include "hw/arm/primecell.h"
 -#include "hw/devices.h"
  #include "hw/net/lan9118.h"
 +#include "hw/net/smc91c111.h"
  #include "hw/pci/pci.h"
  #include "net/net.h"
  #include "sysemu/sysemu.h"
 diff --git a/hw/arm/versatilepb.c b/hw/arm/versatilepb.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/versatilepb.c
 +++ b/hw/arm/versatilepb.c
@@ -XXX,XX +XXX,XX @@
  #include "cpu.h"
  #include "hw/sysbus.h"
  #include "hw/arm/arm.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "net/net.h"
  #include "sysemu/sysemu.h"
  #include "hw/pci/pci.h"
 diff --git a/hw/net/smc91c111.c b/hw/net/smc91c111.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/net/smc91c111.c
 +++ b/hw/net/smc91c111.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/osdep.h"
  #include "hw/sysbus.h"
  #include "net/net.h"
 -#include "hw/devices.h"
 +#include "hw/net/smc91c111.h"
  #include "qemu/log.h"
  /* For crc32 */
  #include <zlib.h>
+-#ifndef DEBUG_IMX_FEC
+-#define DEBUG_IMX_FEC 0
+-#endif
+-
+-#define FEC_PRINTF(fmt, args...) \
+-    do { \
+-        if (DEBUG_IMX_FEC) { \
+-            fprintf(stderr, "[%s]%s: " fmt , TYPE_IMX_FEC, \
+-                                             __func__, ##args); \
+-        } \
+-    } while (0)
+-
+-#ifndef DEBUG_IMX_PHY
+-#define DEBUG_IMX_PHY 0
+-#endif
+-
+-#define PHY_PRINTF(fmt, args...) \
+-    do { \
+-        if (DEBUG_IMX_PHY) { \
+-            fprintf(stderr, "[%s.phy]%s: " fmt , TYPE_IMX_FEC, \
+-                                                 __func__, ##args); \
+-        } \
+-    } while (0)
+-
+ #define IMX_MAX_DESC    1024
+ static const char *imx_default_reg_name(IMXFECState *s, uint32_t index)
+@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
+  * For now we don't handle any GPIO/interrupt line, so the OS will
+  * have to poll for the PHY status.
+  */
+-static void phy_update_irq(IMXFECState *s)
++static void imx_phy_update_irq(IMXFECState *s)
+ {
+     imx_eth_update(s);
+ }
+-static void phy_update_link(IMXFECState *s)
++static void imx_phy_update_link(IMXFECState *s)
+ {
+     /* Autonegotiation status mirrors link status.  */
+     if (qemu_get_queue(s->nic)->link_down) {
+-        PHY_PRINTF("link is down\n");
++        trace_imx_phy_update_link("down");
+         s->phy_status &= ~0x0024;
+         s->phy_int |= PHY_INT_DOWN;
+     } else {
+-        PHY_PRINTF("link is up\n");
++        trace_imx_phy_update_link("up");
+         s->phy_status |= 0x0024;
+         s->phy_int |= PHY_INT_ENERGYON;
+         s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
+     }
+-    phy_update_irq(s);
++    imx_phy_update_irq(s);
+ }
+ static void imx_eth_set_link(NetClientState *nc)
+ {
+-    phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
++    imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
+ }
+-static void phy_reset(IMXFECState *s)
++static void imx_phy_reset(IMXFECState *s)
+ {
++    trace_imx_phy_reset();
++
+     s->phy_status = 0x7809;
+     s->phy_control = 0x3000;
+     s->phy_advertise = 0x01e1;
+     s->phy_int_mask = 0;
+     s->phy_int = 0;
+-    phy_update_link(s);
++    imx_phy_update_link(s);
+ }
+-static uint32_t do_phy_read(IMXFECState *s, int reg)
++static uint32_t imx_phy_read(IMXFECState *s, int reg)
+ {
+     uint32_t val;
+@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
+     case 29:    /* Interrupt source.  */
+         val = s->phy_int;
+         s->phy_int = 0;
+-        phy_update_irq(s);
++        imx_phy_update_irq(s);
+         break;
+     case 30:    /* Interrupt mask */
+         val = s->phy_int_mask;
+@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
+         break;
+     }
+-    PHY_PRINTF("read 0x%04x @ %d\n", val, reg);
++    trace_imx_phy_read(val, reg);
+     return val;
+ }
+-static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
++static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
+ {
+-    PHY_PRINTF("write 0x%04x @ %d\n", val, reg);
++    trace_imx_phy_write(val, reg);
+     if (reg > 31) {
+         /* we only advertise one phy */
+@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
+     switch (reg) {
+     case 0:     /* Basic Control */
+         if (val & 0x8000) {
+-            phy_reset(s);
++            imx_phy_reset(s);
+         } else {
+             s->phy_control = val & 0x7980;
+             /* Complete autonegotiation immediately.  */
+@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
+         break;
+     case 30:    /* Interrupt mask */
+         s->phy_int_mask = val & 0xff;
+-        phy_update_irq(s);
++        imx_phy_update_irq(s);
+         break;
+     case 17:
+     case 18:
+@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
+ static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
+ {
+     dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
++
++    trace_imx_fec_read_bd(addr, bd->flags, bd->length, bd->data);
+ }
+ static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
+@@ -XXX,XX +XXX,XX @@ static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
+ static void imx_enet_read_bd(IMXENETBufDesc *bd, dma_addr_t addr)
+ {
+     dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
++
++    trace_imx_enet_read_bd(addr, bd->flags, bd->length, bd->data,
++                   bd->option, bd->status);
+ }
+ static void imx_enet_write_bd(IMXENETBufDesc *bd, dma_addr_t addr)
+@@ -XXX,XX +XXX,XX @@ static void imx_fec_do_tx(IMXFECState *s)
+         int len;
+         imx_fec_read_bd(&bd, addr);
+-        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x\n",
+-                   addr, bd.flags, bd.length, bd.data);
+         if ((bd.flags & ENET_BD_R) == 0) {
++
+             /* Run out of descriptors to transmit.  */
+-            FEC_PRINTF("tx_bd ran out of descriptors to transmit\n");
++            trace_imx_eth_tx_bd_busy();
++
+             break;
+         }
+         len = bd.length;
+@@ -XXX,XX +XXX,XX @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
+         int len;
+         imx_enet_read_bd(&bd, addr);
+-        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x option %04x "
+-                   "status %04x\n", addr, bd.flags, bd.length, bd.data,
+-                   bd.option, bd.status);
+         if ((bd.flags & ENET_BD_R) == 0) {
+             /* Run out of descriptors to transmit.  */
++
++            trace_imx_eth_tx_bd_busy();
++
+             break;
+         }
+         len = bd.length;
+@@ -XXX,XX +XXX,XX @@ static void imx_eth_enable_rx(IMXFECState *s, bool flush)
+     s->regs[ENET_RDAR] = (bd.flags & ENET_BD_E) ? ENET_RDAR_RDAR : 0;
+     if (!s->regs[ENET_RDAR]) {
+-        FEC_PRINTF("RX buffer full\n");
++        trace_imx_eth_rx_bd_full();
+     } else if (flush) {
+         qemu_flush_queued_packets(qemu_get_queue(s->nic));
+     }
+@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
+     memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
+     /* We also reset the PHY */
+-    phy_reset(s);
++    imx_phy_reset(s);
+ }
+ static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
+@@ -XXX,XX +XXX,XX @@ static uint64_t imx_eth_read(void *opaque, hwaddr offset, unsigned size)
+         break;
+     }
+-    FEC_PRINTF("reg[%s] => 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
+-                                              value);
++    trace_imx_eth_read(index, imx_eth_reg_name(s, index), value);
+     return value;
+ }
+@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
+     const bool single_tx_ring = !imx_eth_is_multi_tx_ring(s);
+     uint32_t index = offset >> 2;
+-    FEC_PRINTF("reg[%s] <= 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
+-                (uint32_t)value);
++    trace_imx_eth_write(index, imx_eth_reg_name(s, index), value);
+     switch (index) {
+     case ENET_EIR:
+@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
+         if (extract32(value, 29, 1)) {
+             /* This is a read operation */
+             s->regs[ENET_MMFR] = deposit32(s->regs[ENET_MMFR], 0, 16,
+-                                           do_phy_read(s,
++                                           imx_phy_read(s,
+                                                        extract32(value,
+                                                                 18, 10)));
+         } else {
+             /* This a write operation */
+-            do_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
++            imx_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
+         }
+         /* raise the interrupt as the PHY operation is done */
+         s->regs[ENET_EIR] |= ENET_INT_MII;
+@@ -XXX,XX +XXX,XX @@ static bool imx_eth_can_receive(NetClientState *nc)
+ {
+     IMXFECState *s = IMX_FEC(qemu_get_nic_opaque(nc));
+-    FEC_PRINTF("\n");
+-
+     return !!s->regs[ENET_RDAR];
+ }
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
+     unsigned int buf_len;
+     size_t size = len;
+-    FEC_PRINTF("len %d\n", (int)size);
++    trace_imx_fec_receive(size);
+     if (!s->regs[ENET_RDAR]) {
+         qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
+         bd.length = buf_len;
+         size -= buf_len;
+-        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
++        trace_imx_fec_receive_len(addr, bd.length);
+         /* The last 4 bytes are the CRC.  */
+         if (size < 4) {
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
+         if (size == 0) {
+             /* Last buffer in frame.  */
+             bd.flags |= flags | ENET_BD_L;
+-            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
++
++            trace_imx_fec_receive_last(bd.flags);
++
+             s->regs[ENET_EIR] |= ENET_INT_RXF;
+         } else {
+             s->regs[ENET_EIR] |= ENET_INT_RXB;
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
+     size_t size = len;
+     bool shift16 = s->regs[ENET_RACC] & ENET_RACC_SHIFT16;
+-    FEC_PRINTF("len %d\n", (int)size);
++    trace_imx_enet_receive(size);
+     if (!s->regs[ENET_RDAR]) {
+         qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
+         bd.length = buf_len;
+         size -= buf_len;
+-        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
++        trace_imx_enet_receive_len(addr, bd.length);
+         /* The last 4 bytes are the CRC.  */
+         if (size < 4) {
+@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
+         if (size == 0) {
+             /* Last buffer in frame.  */
+             bd.flags |= flags | ENET_BD_L;
+-            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
++
++            trace_imx_enet_receive_last(bd.flags);
++
+             /* Indicate that we've updated the last buffer descriptor. */
+             bd.last_buffer = ENET_BD_BDU;
+             if (bd.option & ENET_BD_RX_INT) {
+diff --git a/hw/net/trace-events b/hw/net/trace-events
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/net/trace-events
++++ b/hw/net/trace-events
+@@ -XXX,XX +XXX,XX @@ i82596_receive_packet(size_t sz) "len=%zu"
+ i82596_new_mac(const char *id_with_mac) "New MAC for: %s"
+ i82596_set_multicast(uint16_t count) "Added %d multicast entries"
+ i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
++
++# imx_fec.c
++imx_phy_read(uint32_t val, int reg) "0x%04"PRIx32" <= reg[%d]"
++imx_phy_write(uint32_t val, int reg) "0x%04"PRIx32" => reg[%d]"
++imx_phy_update_link(const char *s) "%s"
++imx_phy_reset(void) ""
++imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
++imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
++imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
++imx_eth_rx_bd_full(void) "RX buffer is full"
++imx_eth_read(int reg, const char *reg_name, uint32_t value) "reg[%d:%s] => 0x%08"PRIx32
++imx_eth_write(int reg, const char *reg_name, uint64_t value) "reg[%d:%s] <= 0x%08"PRIx64
++imx_fec_receive(size_t size) "len %zu"
++imx_fec_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
++imx_fec_receive_last(int last) "rx frame flags 0x%04x"
++imx_enet_receive(size_t size) "len %zu"
++imx_enet_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
++imx_enet_receive_last(int last) "rx frame flags 0x%04x"
 --
 .20.1
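
The "[PMD: Fixed 32-bit format string using PRIx32/PRIx64]" note in the imx_fec patch above refers to the fixed-width format macros from <inttypes.h>. The sketch below applies two of the format strings from the trace-events hunk with plain snprintf(); it is only an illustration of the format strings, not of the trace infrastructure, and toy_fmt_phy_read() and toy_fmt_eth_write() are hypothetical helper names.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format string taken from the imx_phy_read trace event: a uint32_t
 * must be printed with PRIx32 rather than a bare "%x". */
int toy_fmt_phy_read(char *buf, size_t len, uint32_t val, int reg)
{
    return snprintf(buf, len, "0x%04" PRIx32 " <= reg[%d]", val, reg);
}

/* Format string taken from the imx_eth_write trace event: the value
 * is a uint64_t there, so PRIx64 is the matching conversion. */
int toy_fmt_eth_write(char *buf, size_t len, int reg,
                      const char *reg_name, uint64_t value)
{
    return snprintf(buf, len, "reg[%d:%s] <= 0x%08" PRIx64,
                    reg, reg_name, value);
}
```

Using the macros keeps the conversion specifier matched to the argument width on both 32-bit and 64-bit hosts, which is why the read event (uint32_t) and the write event (uint64_t) use different macros.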

-[Qemu-devel] [PULL 06/42] target/arm: Implement dummy versions of M-profile FP-related registers
+[PULL 22/23] sd: sdhci: Implement basic vendor specific register support
-The M-profile floating point support has three associated config
+From: Guenter Roeck <linux@roeck-us.net>
 registers: FPCAR, FPCCR and FPDSCR. It also makes the registers
 CPACR and NSACR have behaviour other than reads-as-zero.
 Add support for all of these as simple reads-as-written registers.
 We will hook up actual functionality later.
-The main complexity here is handling the FPCCR register, which
+The Linux kernel's IMX code now uses vendor specific commands.
-has a mix of banked and unbanked bits.
+This results in endless warnings when booting the Linux kernel.
-Note that we don't share storage with the A-profile
+sdhci-esdhc-imx 2194000.usdhc: esdhc_wait_for_card_clock_gate_off:
-cpu->cp15.nsacr and cpu->cp15.cpacr_el1, though the behaviour
+    card clock still not gate off in 100us!.
 is quite similar, for two reasons:
  * the M profile CPACR is banked between security states
  * it preserves the invariant that M profile uses no state
    inside the cp15 substruct
+Implement support for the vendor specific command implemented in IMX hardware
+to be able to avoid this warning.
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Message-id: 20200603145258.195920-2-linux@roeck-us.net
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-4-peter.maydell@linaro.org
 ---
- target/arm/cpu.h      |  34 ++++++++++++
+ hw/sd/sdhci-internal.h |  5 +++++
- hw/intc/armv7m_nvic.c | 125 ++++++++++++++++++++++++++++++++++++++++++
+ include/hw/sd/sdhci.h  |  5 +++++
- target/arm/cpu.c      |   5 ++
+ hw/sd/sdhci.c          | 18 +++++++++++++++++-
- target/arm/machine.c  |  16 ++++++
+ 3 files changed, 27 insertions(+), 1 deletion(-)
  4 files changed, 180 insertions(+)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
+--- a/hw/sd/sdhci-internal.h
-+++ b/target/arm/cpu.h
++++ b/hw/sd/sdhci-internal.h
-@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
+@@ -XXX,XX +XXX,XX @@
-         uint32_t scr[M_REG_NUM_BANKS];
+ #define SDHC_CMD_INHIBIT               0x00000001
-         uint32_t msplim[M_REG_NUM_BANKS];
+ #define SDHC_DATA_INHIBIT              0x00000002
-         uint32_t psplim[M_REG_NUM_BANKS];
+ #define SDHC_DAT_LINE_ACTIVE           0x00000004
-+        uint32_t fpcar[M_REG_NUM_BANKS];
++#define SDHC_IMX_CLOCK_GATE_OFF        0x00000080
-+        uint32_t fpccr[M_REG_NUM_BANKS];
+ #define SDHC_DOING_WRITE               0x00000100
-+        uint32_t fpdscr[M_REG_NUM_BANKS];
+ #define SDHC_DOING_READ                0x00000200
-+        uint32_t cpacr[M_REG_NUM_BANKS];
+ #define SDHC_SPACE_AVAILABLE           0x00000400
-+        uint32_t nsacr;
+@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
-     } v7m;
-     /* Information associated with an exception about to be taken:
+ #define ESDHC_MIX_CTRL                  0x48
-@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CSSELR, LEVEL, 1, 3)
++
-  */
+ #define ESDHC_VENDOR_SPEC               0xc0
- FIELD(V7M_CSSELR, INDEX, 0, 4)
++#define ESDHC_IMX_FRC_SDCLK_ON          (1 << 8)
++
-+/* v7M FPCCR bits */
+ #define ESDHC_DLL_CTRL                  0x60
-+FIELD(V7M_FPCCR, LSPACT, 0, 1)
-+FIELD(V7M_FPCCR, USER, 1, 1)
+ #define ESDHC_TUNING_CTRL               0xcc
-+FIELD(V7M_FPCCR, S, 2, 1)
+@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
-+FIELD(V7M_FPCCR, THREAD, 3, 1)
+ #define DEFINE_SDHCI_COMMON_PROPERTIES(_state) \
-+FIELD(V7M_FPCCR, HFRDY, 4, 1)
+     DEFINE_PROP_UINT8("sd-spec-version", _state, sd_spec_version, 2), \
-+FIELD(V7M_FPCCR, MMRDY, 5, 1)
+     DEFINE_PROP_UINT8("uhs", _state, uhs_mode, UHS_NOT_SUPPORTED), \
-+FIELD(V7M_FPCCR, BFRDY, 6, 1)
++    DEFINE_PROP_UINT8("vendor", _state, vendor, SDHCI_VENDOR_NONE), \
-+FIELD(V7M_FPCCR, SFRDY, 7, 1)
+     \
-+FIELD(V7M_FPCCR, MONRDY, 8, 1)
+     /* Capabilities registers provide information on supported
-+FIELD(V7M_FPCCR, SPLIMVIOL, 9, 1)
+      * features of this specific host controller implementation */ \
-+FIELD(V7M_FPCCR, UFRDY, 10, 1)
+diff --git a/include/hw/sd/sdhci.h b/include/hw/sd/sdhci.h
-+FIELD(V7M_FPCCR, RES0, 11, 15)
+index XXXXXXX..XXXXXXX 100644
-+FIELD(V7M_FPCCR, TS, 26, 1)
+--- a/include/hw/sd/sdhci.h
-+FIELD(V7M_FPCCR, CLRONRETS, 27, 1)
++++ b/include/hw/sd/sdhci.h
-+FIELD(V7M_FPCCR, CLRONRET, 28, 1)
+@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
-+FIELD(V7M_FPCCR, LSPENS, 29, 1)
+     uint16_t acmd12errsts; /* Auto CMD12 error status register */
-+FIELD(V7M_FPCCR, LSPEN, 30, 1)
+     uint16_t hostctl2;     /* Host Control 2 */
-+FIELD(V7M_FPCCR, ASPEN, 31, 1)
+     uint64_t admasysaddr;  /* ADMA System Address Register */
-+/* These bits are banked. Others are non-banked and live in the M_REG_S bank */
++    uint16_t vendor_spec;  /* Vendor specific register */
-+#define R_V7M_FPCCR_BANKED_MASK                 \
-+    (R_V7M_FPCCR_LSPACT_MASK |                  \
+     /* Read-only registers */
-+     R_V7M_FPCCR_USER_MASK |                    \
+     uint64_t capareg;      /* Capabilities Register */
-+     R_V7M_FPCCR_THREAD_MASK |                  \
+@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
-+     R_V7M_FPCCR_MMRDY_MASK |                   \
+     uint32_t quirks;
-+     R_V7M_FPCCR_SPLIMVIOL_MASK |               \
+     uint8_t sd_spec_version;
-+     R_V7M_FPCCR_UFRDY_MASK |                   \
+     uint8_t uhs_mode;
-+     R_V7M_FPCCR_ASPEN_MASK)
++    uint8_t vendor;        /* For vendor specific functionality */
  } SDHCIState;
 +#define SDHCI_VENDOR_NONE       0
 +#define SDHCI_VENDOR_IMX        1
 +
  /*
-  * System register ID fields.
+  * Controller does not provide transfer-complete interrupt when not
-  */
+  * busy.
-diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
+diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/armv7m_nvic.c
+--- a/hw/sd/sdhci.c
-+++ b/hw/intc/armv7m_nvic.c
++++ b/hw/sd/sdhci.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
+@@ -XXX,XX +XXX,XX @@ static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
      }
      case 0xd84: /* CSSELR */
          return cpu->env.v7m.csselr[attrs.secure];
 +    case 0xd88: /* CPACR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.cpacr[attrs.secure];
 +    case 0xd8c: /* NSACR */
 +        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.nsacr;
      /* TODO: Implement debug registers.  */
      case 0xd90: /* MPU_TYPE */
          /* Unified MPU; if the MPU is not present this value is zero */
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
              return 0;
          }
          return cpu->env.v7m.sfar;
 +    case 0xf34: /* FPCCR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        if (attrs.secure) {
 +            return cpu->env.v7m.fpccr[M_REG_S];
 +        } else {
 +            /*
 +             * NS can read LSPEN, CLRONRET and MONRDY. It can read
 +             * BFRDY and HFRDY if AIRCR.BFHFNMINS != 0;
 +             * other non-banked bits RAZ.
 +             * TODO: MONRDY should RAZ/WI if DEMCR.SDME is set.
 +             */
 +            uint32_t value = cpu->env.v7m.fpccr[M_REG_S];
 +            uint32_t mask = R_V7M_FPCCR_LSPEN_MASK |
 +                R_V7M_FPCCR_CLRONRET_MASK |
 +                R_V7M_FPCCR_MONRDY_MASK;
 +
 +            if (s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK) {
 +                mask |= R_V7M_FPCCR_BFRDY_MASK | R_V7M_FPCCR_HFRDY_MASK;
 +            }
 +
 +            value &= mask;
 +
 +            value |= cpu->env.v7m.fpccr[M_REG_NS];
 +            return value;
 +        }
 +    case 0xf38: /* FPCAR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.fpcar[attrs.secure];
 +    case 0xf3c: /* FPDSCR */
 +        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            return 0;
 +        }
 +        return cpu->env.v7m.fpdscr[attrs.secure];
      case 0xf40: /* MVFR0 */
          return cpu->isar.mvfr0;
      case 0xf44: /* MVFR1 */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
              cpu->env.v7m.csselr[attrs.secure] = value & R_V7M_CSSELR_INDEX_MASK;
          }
          break;
-+    case 0xd88: /* CPACR */
-+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
++    case ESDHC_VENDOR_SPEC:
-+            /* We implement only the Floating Point extension's CP10/CP11 */
++        ret = s->vendor_spec;
-+            cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
++        break;
      case ESDHC_DLL_CTRL:
      case ESDHC_TUNE_CTRL_STATUS:
      case ESDHC_UNDOCUMENTED_REG27:
      case ESDHC_TUNING_CTRL:
 -    case ESDHC_VENDOR_SPEC:
      case ESDHC_MIX_CTRL:
      case ESDHC_WTMK_LVL:
          ret = 0;
@@ -XXX,XX +XXX,XX @@ usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
      case ESDHC_UNDOCUMENTED_REG27:
      case ESDHC_TUNING_CTRL:
      case ESDHC_WTMK_LVL:
 +        break;
 +
      case ESDHC_VENDOR_SPEC:
 +        s->vendor_spec = value;
 +        switch (s->vendor) {
 +        case SDHCI_VENDOR_IMX:
 +            if (value & ESDHC_IMX_FRC_SDCLK_ON) {
 +                s->prnsts &= ~SDHC_IMX_CLOCK_GATE_OFF;
 +            } else {
 +                s->prnsts |= SDHC_IMX_CLOCK_GATE_OFF;
 +            }
 +            break;
 +        default:
 +            break;
 +        }
-+        break;
-+    case 0xd8c: /* NSACR */
-+        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
-+            /* We implement only the Floating Point extension's CP10/CP11 */
-+            cpu->env.v7m.nsacr = value & (3 << 10);
-+        }
-+        break;
-     case 0xd90: /* MPU_TYPE */
-         return; /* RO */
-     case 0xd94: /* MPU_CTRL */
-@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
-         }
          break;
-     }
-+    case 0xf34: /* FPCCR */
+     case SDHC_HOSTCTL:
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            /* Not all bits here are banked. */
 +            uint32_t fpccr_s;
 +
 +            if (!arm_feature(&cpu->env, ARM_FEATURE_V8)) {
 +                /* Don't allow setting of bits not present in v7M */
 +                value &= (R_V7M_FPCCR_LSPACT_MASK |
 +                          R_V7M_FPCCR_USER_MASK |
 +                          R_V7M_FPCCR_THREAD_MASK |
 +                          R_V7M_FPCCR_HFRDY_MASK |
 +                          R_V7M_FPCCR_MMRDY_MASK |
 +                          R_V7M_FPCCR_BFRDY_MASK |
 +                          R_V7M_FPCCR_MONRDY_MASK |
 +                          R_V7M_FPCCR_LSPEN_MASK |
 +                          R_V7M_FPCCR_ASPEN_MASK);
 +            }
 +            value &= ~R_V7M_FPCCR_RES0_MASK;
 +
 +            if (!attrs.secure) {
 +                /* Some non-banked bits are configurably writable by NS */
 +                fpccr_s = cpu->env.v7m.fpccr[M_REG_S];
 +                if (!(fpccr_s & R_V7M_FPCCR_LSPENS_MASK)) {
 +                    uint32_t lspen = FIELD_EX32(value, V7M_FPCCR, LSPEN);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, LSPEN, lspen);
 +                }
 +                if (!(fpccr_s & R_V7M_FPCCR_CLRONRETS_MASK)) {
 +                    uint32_t cor = FIELD_EX32(value, V7M_FPCCR, CLRONRET);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, CLRONRET, cor);
 +                }
 +                if ((s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK)) {
 +                    uint32_t hfrdy = FIELD_EX32(value, V7M_FPCCR, HFRDY);
 +                    uint32_t bfrdy = FIELD_EX32(value, V7M_FPCCR, BFRDY);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
 +                }
 +                /* TODO MONRDY should RAZ/WI if DEMCR.SDME is set */
 +                {
 +                    uint32_t monrdy = FIELD_EX32(value, V7M_FPCCR, MONRDY);
 +                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, MONRDY, monrdy);
 +                }
 +
 +                /*
 +                 * All other non-banked bits are RAZ/WI from NS; write
 +                 * just the banked bits to fpccr[M_REG_NS].
 +                 */
 +                value &= R_V7M_FPCCR_BANKED_MASK;
 +                cpu->env.v7m.fpccr[M_REG_NS] = value;
 +            } else {
 +                fpccr_s = value;
 +            }
 +            cpu->env.v7m.fpccr[M_REG_S] = fpccr_s;
 +        }
 +        break;
 +    case 0xf38: /* FPCAR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            value &= ~7;
 +            cpu->env.v7m.fpcar[attrs.secure] = value;
 +        }
 +        break;
 +    case 0xf3c: /* FPDSCR */
 +        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
 +            value &= 0x07c00000;
 +            cpu->env.v7m.fpdscr[attrs.secure] = value;
 +        }
 +        break;
      case 0xf50: /* ICIALLU */
      case 0xf58: /* ICIMVAU */
      case 0xf5c: /* DCIMVAC */
 diff --git a/target/arm/cpu.c b/target/arm/cpu.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu.c
 +++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
              env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
          }
 +        if (arm_feature(env, ARM_FEATURE_VFP)) {
 +            env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
 +            env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
 +                R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
 +        }
          /* Unlike A/R profile, M profile defines the reset LR value */
          env->regs[14] = 0xffffffff;
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/machine.c
 +++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m_v8m = {
      }
  };
 +static const VMStateDescription vmstate_m_fp = {
 +    .name = "cpu/m/fp",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .needed = vfp_needed,
 +    .fields = (VMStateField[]) {
 +        VMSTATE_UINT32_ARRAY(env.v7m.fpcar, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32_ARRAY(env.v7m.fpccr, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32_ARRAY(env.v7m.fpdscr, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32_ARRAY(env.v7m.cpacr, ARMCPU, M_REG_NUM_BANKS),
 +        VMSTATE_UINT32(env.v7m.nsacr, ARMCPU),
 +        VMSTATE_END_OF_LIST()
 +    }
 +};
 +
  static const VMStateDescription vmstate_m = {
      .name = "cpu/m",
      .version_id = 4,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
          &vmstate_m_scr,
          &vmstate_m_other_sp,
          &vmstate_m_v8m,
 +        &vmstate_m_fp,
          NULL
      }
  };
 --
 .20.1
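
The Non-secure read path for FPCCR in the nvic_readl() hunk above can be sketched in isolation as follows. This is a minimal sketch, not the QEMU code: fpccr_ns_read_view() is a hypothetical helper name, and the bit positions in the macros are assumptions made for illustration (the patch only uses the R_V7M_FPCCR_*_MASK names).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define FPCCR_HFRDY    (1u << 4)
#define FPCCR_BFRDY    (1u << 6)
#define FPCCR_MONRDY   (1u << 8)
#define FPCCR_CLRONRET (1u << 28)
#define FPCCR_LSPEN    (1u << 30)

/*
 * Hypothetical helper: compose the value a Non-secure read of FPCCR sees.
 * Selected non-banked bits come from the Secure copy; BFRDY and HFRDY are
 * only visible when AIRCR.BFHFNMINS is set. Everything else in the Secure
 * copy reads as zero, and the NS banked copy is ORed in.
 */
static uint32_t fpccr_ns_read_view(uint32_t fpccr_s, uint32_t fpccr_ns,
                                   bool bfhfnmins)
{
    uint32_t mask = FPCCR_LSPEN | FPCCR_CLRONRET | FPCCR_MONRDY;

    if (bfhfnmins) {
        mask |= FPCCR_BFRDY | FPCCR_HFRDY;
    }
    return (fpccr_s & mask) | fpccr_ns;
}
```

Keeping the mask computation separate from the register storage makes it easy to check the RAZ behaviour for each bit without a full CPU model.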

-[Qemu-devel] [PULL 11/42] target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
+Deleted patch
-The M-profile CONTROL register has two bits -- SFPA and FPCA --
-which relate to floating-point support, and should be RES0 otherwise.
-Handle them correctly in the MSR/MRS register access code.
-Neither is banked between security states, so they are stored
-in v7m.control[M_REG_S] regardless of current security state.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-9-peter.maydell@linaro.org
----
- target/arm/helper.c | 57 ++++++++++++++++++++++++++++++++++++++-------
-file changed, 49 insertions(+), 8 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
-         return xpsr_read(env) & mask;
-         break;
-     case 20: /* CONTROL */
--        return env->v7m.control[env->v7m.secure];
-+    {
-+        uint32_t value = env->v7m.control[env->v7m.secure];
-+        if (!env->v7m.secure) {
-+            /* SFPA is RAZ/WI from NS; FPCA is stored in the M_REG_S bank */
-+            value |= env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK;
-+        }
-+        return value;
-+    }
-     case 0x94: /* CONTROL_NS */
-         /* We have to handle this here because unprivileged Secure code
-          * can read the NS CONTROL register.
-@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
-         if (!env->v7m.secure) {
-             return 0;
-         }
--        return env->v7m.control[M_REG_NS];
-+        return env->v7m.control[M_REG_NS] |
-+            (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK);
-     }
-     if (el == 0) {
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
-      */
-     uint32_t mask = extract32(maskreg, 8, 4);
-     uint32_t reg = extract32(maskreg, 0, 8);
-+    int cur_el = arm_current_el(env);
--    if (arm_current_el(env) == 0 && reg > 7) {
--        /* only xPSR sub-fields may be written by unprivileged */
-+    if (cur_el == 0 && reg > 7 && reg != 20) {
-+        /*
-+         * only xPSR sub-fields and CONTROL.SFPA may be written by
-+         * unprivileged code
-+         */
-         return;
-     }
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
-                 env->v7m.control[M_REG_NS] &= ~R_V7M_CONTROL_NPRIV_MASK;
-                 env->v7m.control[M_REG_NS] |= val & R_V7M_CONTROL_NPRIV_MASK;
-             }
-+            /*
-+             * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
-+             * RES0 if the FPU is not present, and is stored in the S bank
-+             */
-+            if (arm_feature(env, ARM_FEATURE_VFP) &&
-+                extract32(env->v7m.nsacr, 10, 1)) {
-+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
-+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
-+            }
-             return;
-         case 0x98: /* SP_NS */
-         {
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
-         env->v7m.faultmask[env->v7m.secure] = val & 1;
-         break;
-     case 20: /* CONTROL */
--        /* Writing to the SPSEL bit only has an effect if we are in
-+        /*
-+         * Writing to the SPSEL bit only has an effect if we are in
-          * thread mode; other bits can be updated by any privileged code.
-          * write_v7m_control_spsel() deals with updating the SPSEL bit in
-          * env->v7m.control, so we only need update the others.
-          * For v7M, we must just ignore explicit writes to SPSEL in handler
-          * mode; for v8M the write is permitted but will have no effect.
-+         * All these bits are writes-ignored from non-privileged code,
-+         * except for SFPA.
-          */
--        if (arm_feature(env, ARM_FEATURE_V8) ||
--            !arm_v7m_is_handler_mode(env)) {
-+        if (cur_el > 0 && (arm_feature(env, ARM_FEATURE_V8) ||
-+                           !arm_v7m_is_handler_mode(env))) {
-             write_v7m_control_spsel(env, (val & R_V7M_CONTROL_SPSEL_MASK) != 0);
-         }
--        if (arm_feature(env, ARM_FEATURE_M_MAIN)) {
-+        if (cur_el > 0 && arm_feature(env, ARM_FEATURE_M_MAIN)) {
-             env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
-             env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
-         }
-+        if (arm_feature(env, ARM_FEATURE_VFP)) {
-+            /*
-+             * SFPA is RAZ/WI from NS or if no FPU.
-+             * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
-+             * Both are stored in the S bank.
-+             */
-+            if (env->v7m.secure) {
-+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
-+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_SFPA_MASK;
-+            }
-+            if (cur_el > 0 &&
-+                (env->v7m.secure || !arm_feature(env, ARM_FEATURE_M_SECURITY) ||
-+                 extract32(env->v7m.nsacr, 10, 1))) {
-+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
-+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
-+            }
-+        }
-         break;
-     default:
-     bad_reg:
---
-.20.1
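
The write rules for SFPA and FPCA in the "case 20: /* CONTROL */" hunk above can be sketched as a pure function. This is a sketch under stated assumptions: struct ctl_ctx and control_s_after_msr() are hypothetical names, the two bit positions are assumed for illustration, and the SPSEL/nPRIV handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define CONTROL_FPCA (1u << 2)
#define CONTROL_SFPA (1u << 3)

/* Hypothetical snapshot of the state the MSR helper consults. */
struct ctl_ctx {
    bool has_fpu;       /* ARM_FEATURE_VFP */
    bool has_security;  /* ARM_FEATURE_M_SECURITY */
    bool secure;        /* current security state */
    int cur_el;         /* 0 = unprivileged */
    bool nsacr_cp10;    /* NSACR.CP10 */
};

/* Return the new Secure-bank CONTROL value after a write of val. */
static uint32_t control_s_after_msr(uint32_t control_s, uint32_t val,
                                    const struct ctl_ctx *c)
{
    if (!c->has_fpu) {
        return control_s; /* both bits are RES0 without an FPU */
    }
    if (c->secure) {
        /* SFPA is writable from Secure state, even when unprivileged */
        control_s = (control_s & ~CONTROL_SFPA) | (val & CONTROL_SFPA);
    }
    if (c->cur_el > 0 &&
        (c->secure || !c->has_security || c->nsacr_cp10)) {
        /* FPCA needs privilege, and NSACR.CP10 when written from NS */
        control_s = (control_s & ~CONTROL_FPCA) | (val & CONTROL_FPCA);
    }
    return control_s;
}
```

Both bits live in the Secure bank regardless of the current security state, which is why the sketch takes and returns only control[M_REG_S].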

-[Qemu-devel] [PULL 15/42] target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
+Deleted patch
-For v8M floating point support, transitions from Secure
-to Non-secure state via BXNS and BLXNS must clear the
-CONTROL.SFPA bit. (This corresponds to the pseudocode
-BranchToNS() function.)
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-13-peter.maydell@linaro.org
----
- target/arm/helper.c | 4 ++++
-file changed, 4 insertions(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
-     /* translate.c should have made BXNS UNDEF unless we're secure */
-     assert(env->v7m.secure);
-+    if (!(dest & 1)) {
-+        env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
-+    }
-     switch_v7m_security_state(env, dest & 1);
-     env->thumb = 1;
-     env->regs[15] = dest & ~1;
-@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
-          */
-         write_v7m_exception(env, 1);
-     }
-+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
-     switch_v7m_security_state(env, 0);
-     env->thumb = 1;
-     env->regs[15] = dest;
---
-.20.1
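
The difference between the two helpers above is that BXNS only clears SFPA when the branch actually leaves Secure state, while BLXNS always does. A minimal sketch, with hypothetical function names and an assumed bit position for SFPA:

```c
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define CONTROL_SFPA (1u << 3)

/* BXNS: bit 0 of dest selects the target state (1 = stay Secure). */
static uint32_t control_s_after_bxns(uint32_t control_s, uint32_t dest)
{
    if (!(dest & 1)) {
        control_s &= ~CONTROL_SFPA;
    }
    return control_s;
}

/* BLXNS: the hunk above clears SFPA unconditionally. */
static uint32_t control_s_after_blxns(uint32_t control_s)
{
    return control_s & ~CONTROL_SFPA;
}
```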

-[Qemu-devel] [PULL 16/42] target/arm: Clean excReturn bits when tail chaining
+Deleted patch
-The TailChain() pseudocode specifies that a tail chaining
-exception should sanitize the excReturn all-ones bits and
-(if there is no FPU) the excReturn FType bits; we weren't
-doing this.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-14-peter.maydell@linaro.org
----
- target/arm/helper.c | 8 ++++++++
-file changed, 8 insertions(+)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
-     qemu_log_mask(CPU_LOG_INT, "...taking pending %s exception %d\n",
-                   targets_secure ? "secure" : "nonsecure", exc);
-+    if (dotailchain) {
-+        /* Sanitize LR FType and PREFIX bits */
-+        if (!arm_feature(env, ARM_FEATURE_VFP)) {
-+            lr |= R_V7M_EXCRET_FTYPE_MASK;
-+        }
-+        lr = deposit32(lr, 24, 8, 0xff);
-+    }
-+
-     if (arm_feature(env, ARM_FEATURE_V8)) {
-         if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
-             (lr & R_V7M_EXCRET_S_MASK)) {
---
-.20.1
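
The sanitizing step added above can be sketched outside QEMU like this. deposit32_sketch() is a local stand-in written for this example rather than QEMU's deposit32(), sanitize_excret() is a hypothetical name, and the FType bit position is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of EXC_RETURN.FType, for illustration only. */
#define EXCRET_FTYPE (1u << 4)

/* Local stand-in: write 'field' into bits [start, start+length) of value. */
static uint32_t deposit32_sketch(uint32_t value, int start, int length,
                                 uint32_t field)
{
    uint32_t mask = (length == 32 ? ~0u : ((1u << length) - 1)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Force the top byte to all-ones and, without an FPU, force FType to 1. */
static uint32_t sanitize_excret(uint32_t lr, bool has_fpu)
{
    if (!has_fpu) {
        lr |= EXCRET_FTYPE;
    }
    return deposit32_sketch(lr, 24, 8, 0xff);
}
```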

-[Qemu-devel] [PULL 17/42] target/arm: Allow for floating point in callee stack integrity check
+Deleted patch
-The magic value pushed onto the callee stack as an integrity
-check is different if floating point is present.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-15-peter.maydell@linaro.org
----
- target/arm/helper.c | 22 +++++++++++++++++++---
-file changed, 19 insertions(+), 3 deletions(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ load_fail:
-     return false;
- }
-+static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
-+{
-+    /*
-+     * Return the integrity signature value for the callee-saves
-+     * stack frame section. @lr is the exception return payload/LR value
-+     * whose FType bit forms bit 0 of the signature if FP is present.
-+     */
-+    uint32_t sig = 0xfefa125a;
-+
-+    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
-+        sig |= 1;
-+    }
-+    return sig;
-+}
-+
- static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
-                                   bool ignore_faults)
- {
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
-     bool stacked_ok;
-     uint32_t limit;
-     bool want_psp;
-+    uint32_t sig;
-     if (dotailchain) {
-         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
-@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
-     /* Write as much of the stack frame as we can. A write failure may
-      * cause us to pend a derived exception.
-      */
-+    sig = v7m_integrity_sig(env, lr);
-     stacked_ok =
--        v7m_stack_write(cpu, frameptr, 0xfefa125b, mmu_idx, ignore_faults) &&
-+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
-         v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
-                         ignore_faults) &&
-         v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-         if (return_to_secure &&
-             ((excret & R_V7M_EXCRET_ES_MASK) == 0 ||
-              (excret & R_V7M_EXCRET_DCRS_MASK) == 0)) {
--            uint32_t expected_sig = 0xfefa125b;
-             uint32_t actual_sig;
-             pop_ok = v7m_stack_read(cpu, &actual_sig, frameptr, mmu_idx);
--            if (pop_ok && expected_sig != actual_sig) {
-+            if (pop_ok && v7m_integrity_sig(env, excret) != actual_sig) {
-                 /* Take a SecureFault on the current stack */
-                 env->v7m.sfsr |= R_V7M_SFSR_INVIS_MASK;
-                 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
---
-.20.1
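
A standalone version of the signature computation above, useful for checking the three cases by hand. The function takes the FPU presence as a plain bool instead of CPUARMState; integrity_sig() is a hypothetical name and the FType bit position is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of EXC_RETURN.FType, for illustration only. */
#define EXCRET_FTYPE (1u << 4)

/*
 * Integrity signature for the callee-saves stack frame: bit 0 follows
 * the FType bit of lr when an FPU is present, and is always 1 otherwise.
 */
static uint32_t integrity_sig(bool has_fpu, uint32_t lr)
{
    uint32_t sig = 0xfefa125a;

    if (!has_fpu || (lr & EXCRET_FTYPE)) {
        sig |= 1;
    }
    return sig;
}
```

Without an FPU the result is always 0xfefa125b, which matches the constant the patch replaces, so behaviour for FPU-less CPUs is unchanged.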

-[Qemu-devel] [PULL 18/42] target/arm: Handle floating point registers in exception return
+Deleted patch
-Handle floating point registers in exception return.
-This corresponds to pseudocode functions ValidateExceptionReturn(),
-ExceptionReturn(), PopStack() and ConsumeExcStackFrame().
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-16-peter.maydell@linaro.org
----
- target/arm/helper.c | 142 +++++++++++++++++++++++++++++++++++++++++++-
-file changed, 141 insertions(+), 1 deletion(-)
-diff --git a/target/arm/helper.c b/target/arm/helper.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/helper.c
-+++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-     bool rettobase = false;
-     bool exc_secure = false;
-     bool return_to_secure;
-+    bool ftype;
-+    bool restore_s16_s31;
-     /* If we're not in Handler mode then jumps to magic exception-exit
-      * addresses don't have magic behaviour. However for the v8M
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-                       excret);
-     }
-+    ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
-+
-+    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
-+        qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
-+                      "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
-+                      "if FPU not present\n",
-+                      excret);
-+        ftype = true;
-+    }
-+
-     if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
-         /* EXC_RETURN.ES validation check (R_SMFL). We must do this before
-          * we pick which FAULTMASK to clear.
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-      */
-     write_v7m_control_spsel_for_secstate(env, return_to_sp_process, exc_secure);
-+    /*
-+     * Clear scratch FP values left in caller saved registers; this
-+     * must happen before any kind of tail chaining.
-+     */
-+    if ((env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_CLRONRET_MASK) &&
-+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
-+        if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
-+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
-+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-+            qemu_log_mask(CPU_LOG_INT, "...taking SecureFault on existing "
-+                          "stackframe: error during lazy state deactivation\n");
-+            v7m_exception_taken(cpu, excret, true, false);
-+            return;
-+        } else {
-+            /* Clear s0..s15 and FPSCR */
-+            int i;
-+
-+            for (i = 0; i < 16; i += 2) {
-+                *aa32_vfp_dreg(env, i / 2) = 0;
-+            }
-+            vfp_set_fpscr(env, 0);
-+        }
-+    }
-+
-     if (sfault) {
-         env->v7m.sfsr |= R_V7M_SFSR_INVER_MASK;
-         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-             }
-         }
-+        if (!ftype) {
-+            /* FP present and we need to handle it */
-+            if (!return_to_secure &&
-+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK)) {
-+                armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-+                env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
-+                qemu_log_mask(CPU_LOG_INT,
-+                              "...taking SecureFault on existing stackframe: "
-+                              "Secure LSPACT set but exception return is "
-+                              "not to secure state\n");
-+                v7m_exception_taken(cpu, excret, true, false);
-+                return;
-+            }
-+
-+            restore_s16_s31 = return_to_secure &&
-+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
-+
-+            if (env->v7m.fpccr[return_to_secure] & R_V7M_FPCCR_LSPACT_MASK) {
-+                /* State in FPU is still valid, just clear LSPACT */
-+                env->v7m.fpccr[return_to_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
-+            } else {
-+                int i;
-+                uint32_t fpscr;
-+                bool cpacr_pass, nsacr_pass;
-+
-+                cpacr_pass = v7m_cpacr_pass(env, return_to_secure,
-+                                            return_to_priv);
-+                nsacr_pass = return_to_secure ||
-+                    extract32(env->v7m.nsacr, 10, 1);
-+
-+                if (!cpacr_pass) {
-+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
-+                                            return_to_secure);
-+                    env->v7m.cfsr[return_to_secure] |= R_V7M_CFSR_NOCP_MASK;
-+                    qemu_log_mask(CPU_LOG_INT,
-+                                  "...taking UsageFault on existing "
-+                                  "stackframe: CPACR.CP10 prevents unstacking "
-+                                  "FP regs\n");
-+                    v7m_exception_taken(cpu, excret, true, false);
-+                    return;
-+                } else if (!nsacr_pass) {
-+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, true);
-+                    env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_INVPC_MASK;
-+                    qemu_log_mask(CPU_LOG_INT,
-+                                  "...taking Secure UsageFault on existing "
-+                                  "stackframe: NSACR.CP10 prevents unstacking "
-+                                  "FP regs\n");
-+                    v7m_exception_taken(cpu, excret, true, false);
-+                    return;
-+                }
-+
-+                for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
-+                    uint32_t slo, shi;
-+                    uint64_t dn;
-+                    uint32_t faddr = frameptr + 0x20 + 4 * i;
-+
-+                    if (i >= 16) {
-+                        faddr += 8; /* Skip the slot for the FPSCR */
-+                    }
-+
-+                    pop_ok = pop_ok &&
-+                        v7m_stack_read(cpu, &slo, faddr, mmu_idx) &&
-+                        v7m_stack_read(cpu, &shi, faddr + 4, mmu_idx);
-+
-+                    if (!pop_ok) {
-+                        break;
-+                    }
-+
-+                    dn = (uint64_t)shi << 32 | slo;
-+                    *aa32_vfp_dreg(env, i / 2) = dn;
-+                }
-+                pop_ok = pop_ok &&
-+                    v7m_stack_read(cpu, &fpscr, frameptr + 0x60, mmu_idx);
-+                if (pop_ok) {
-+                    vfp_set_fpscr(env, fpscr);
-+                }
-+                if (!pop_ok) {
-+                    /*
-+                     * These regs are 0 if security extension present;
-+                     * otherwise merely UNKNOWN. We zero always.
-+                     */
-+                    for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
-+                        *aa32_vfp_dreg(env, i / 2) = 0;
-+                    }
-+                    vfp_set_fpscr(env, 0);
-+                }
-+            }
-+        }
-+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
-+                                               V7M_CONTROL, FPCA, !ftype);
-+
-         /* Commit to consuming the stack frame */
-         frameptr += 0x20;
-+        if (!ftype) {
-+            frameptr += 0x48;
-+            if (restore_s16_s31) {
-+                frameptr += 0x40;
-+            }
-+        }
-         /* Undo stack alignment (the SPREALIGN bit indicates that the original
-          * pre-exception SP was not 8-aligned and we added a padding word to
-          * align it, so we undo this by ORing in the bit that increases it
-@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
-         *frame_sp_p = frameptr;
-     }
-     /* This xpsr_write() will invalidate frame_sp_p as it may switch stack */
--    xpsr_write(env, xpsr, ~XPSR_SPREALIGN);
-+    xpsr_write(env, xpsr, ~(XPSR_SPREALIGN | XPSR_SFPA));
-+
-+    if (env->v7m.secure) {
-+        bool sfpa = xpsr & XPSR_SFPA;
-+
-+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
-+                                               V7M_CONTROL, SFPA, sfpa);
-+    }
-     /* The restored xPSR exception field will be zero if we're
-      * resuming in Thread mode. If that doesn't match what the
---
-.20.1
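
The frame layout arithmetic in the hunks above (the offsets used when unstacking FP registers and the amount by which frameptr advances) can be sketched as two small functions. The names are hypothetical; the constants are the ones that appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bytes consumed from the stack on exception return: 0x20 for the basic
 * frame, 0x48 more for s0..s15 plus FPSCR when FType is 0, and a further
 * 0x40 when s16..s31 are restored as well.
 */
static uint32_t exc_frame_size(bool ftype, bool restore_s16_s31)
{
    uint32_t size = 0x20;

    if (!ftype) {
        size += 0x48;
        if (restore_s16_s31) {
            size += 0x40;
        }
    }
    return size;
}

/*
 * Offset from frameptr of the slot holding S register i (i even, as in
 * the loop above). Registers from s16 up sit past the FPSCR slot.
 */
static uint32_t fp_slot_offset(int i)
{
    uint32_t off = 0x20 + 4 * (uint32_t)i;

    if (i >= 16) {
        off += 8; /* skip the slot for the FPSCR */
    }
    return off;
}
```

With these, the last extended slot (s30/s31) starts at 0xa0 and ends at 0xa8, which equals the full extended frame size.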

-[Qemu-devel] [PULL 19/42] target/arm: Move NS TBFLAG from bit 19 to bit 6
+Deleted patch
-Move the NS TBFLAG down from bit 19 to bit 6, which has not
-been used since commit c1e3781090b9d36c60 in 2015, when we
-started passing the entire MMU index in the TB flags rather
-than just a 'privilege level' bit.
-This rearrangement is not strictly necessary, but means that
-we can put M-profile-only bits next to each other rather
-than scattered across the flag word.
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Message-id: 20190416125744.27770-17-peter.maydell@linaro.org
----
- target/arm/cpu.h | 11 ++++++-----
-file changed, 6 insertions(+), 5 deletions(-)
-diff --git a/target/arm/cpu.h b/target/arm/cpu.h
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.h
-+++ b/target/arm/cpu.h
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
- FIELD(TBFLAG_A32, THUMB, 0, 1)
- FIELD(TBFLAG_A32, VECLEN, 1, 3)
- FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
-+/*
-+ * Indicates whether cp register reads and writes by guest code should access
-+ * the secure or nonsecure bank of banked registers; note that this is not
-+ * the same thing as the current security state of the processor!
-+ */
-+FIELD(TBFLAG_A32, NS, 6, 1)
- FIELD(TBFLAG_A32, VFPEN, 7, 1)
- FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
- FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-  * checks on the other bits at runtime
-  */
- FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
--/* Indicates whether cp register reads and writes by guest code should access
-- * the secure or nonsecure bank of banked registers; note that this is not
-- * the same thing as the current security state of the processor!
-- */
--FIELD(TBFLAG_A32, NS, 19, 1)
- /* For M profile only, Handler (ie not Thread) mode */
- FIELD(TBFLAG_A32, HANDLER, 21, 1)
- /* For M profile only, whether we should generate stack-limit checks */
---
-.20.1
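
To illustrate what moving the NS flag to bit 6 means for the flag word, here is a minimal sketch of single-field extract and deposit. field_ex32() and field_dp32() are local stand-ins written for this example, not QEMU's FIELD_EX32/FIELD_DP32 macros; the shift values are the ones given in the FIELD() lines above.

```c
#include <stdint.h>

/* Positions taken from the FIELD() declarations in the hunk above. */
#define TBFLAG_NS_SHIFT     6
#define TBFLAG_VFPEN_SHIFT  7

/* Extract a field of 'length' bits (length < 32) starting at 'shift'. */
static uint32_t field_ex32(uint32_t flags, unsigned shift, unsigned length)
{
    return (flags >> shift) & ((1u << length) - 1);
}

/* Deposit 'val' into a field of 'length' bits (length < 32) at 'shift'. */
static uint32_t field_dp32(uint32_t flags, unsigned shift, unsigned length,
                           uint32_t val)
{
    uint32_t mask = ((1u << length) - 1) << shift;

    return (flags & ~mask) | ((val << shift) & mask);
}
```

Setting NS at bit 6 leaves the neighbouring VFPEN bit at bit 7 and the VECSTRIDE field at bits 4..5 untouched, which is the property the rearrangement relies on.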

-[Qemu-devel] [PULL 30/42] hw/dma: Compile the bcm2835_dma device as common object
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-This device is used by both ARM (BCM2836, for raspi2) and AArch64
-(BCM2837, for raspi3) targets, and is not CPU-specific.
-Move it to common object, so we build it once for all targets.
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190427133028.12874-1-philmd@redhat.com
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/dma/Makefile.objs | 2 +-
-file changed, 1 insertion(+), 1 deletion(-)
-diff --git a/hw/dma/Makefile.objs b/hw/dma/Makefile.objs
-index XXXXXXX..XXXXXXX 100644
---- a/hw/dma/Makefile.objs
-+++ b/hw/dma/Makefile.objs
-@@ -XXX,XX +XXX,XX @@ common-obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx-zdma.o
- obj-$(CONFIG_OMAP) += omap_dma.o soc_dma.o
- obj-$(CONFIG_PXA2XX) += pxa2xx_dma.o
--obj-$(CONFIG_RASPI) += bcm2835_dma.o
-+common-obj-$(CONFIG_RASPI) += bcm2835_dma.o
---
-.20.1

-[Qemu-devel] [PULL 31/42] hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
-Reviewed-by: Cédric Le Goater <clg@kaod.org>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-2-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- hw/arm/aspeed.c | 13 +++++++++----
-file changed, 9 insertions(+), 4 deletions(-)
-diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/aspeed.c
-+++ b/hw/arm/aspeed.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/aspeed_soc.h"
- #include "hw/boards.h"
- #include "hw/i2c/smbus_eeprom.h"
-+#include "hw/misc/pca9552.h"
-+#include "hw/misc/tmp105.h"
- #include "qemu/log.h"
- #include "sysemu/block-backend.h"
- #include "hw/loader.h"
-@@ -XXX,XX +XXX,XX @@ static void ast2500_evb_i2c_init(AspeedBoardState *bmc)
-                           eeprom_buf);
-     /* The AST2500 EVB expects a LM75 but a TMP105 is compatible */
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7), "tmp105", 0x4d);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7),
-+                     TYPE_TMP105, 0x4d);
-     /* The AST2500 EVB does not have an RTC. Let's pretend that one is
-      * plugged on the I2C bus header */
-@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
-     AspeedSoCState *soc = &bmc->soc;
-     uint8_t *eeprom_buf = g_malloc0(8 * 1024);
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), "pca9552", 0x60);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), TYPE_PCA9552,
-+                     0x60);
-     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 4), "tmp423", 0x4c);
-     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 5), "tmp423", 0x4c);
-     /* The Witherspoon expects a TMP275 but a TMP105 is compatible */
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), "tmp105", 0x4a);
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), TYPE_TMP105,
-+                     0x4a);
-     /* The witherspoon board expects Epson RX8900 I2C RTC but a ds1338 is
-      * good enough */
-@@ -XXX,XX +XXX,XX @@ static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
-     smbus_eeprom_init_one(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), 0x51,
-                           eeprom_buf);
--    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), "pca9552",
-+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), TYPE_PCA9552,
-                      0x60);
- }
---
-.20.1

-[Qemu-devel] [PULL 32/42] hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
+[PULL 23/23] hw: arm: Set vendor property for IMX SDHCI emulations
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
+From: Guenter Roeck <linux@roeck-us.net>
-Suggested-by: Markus Armbruster <armbru@redhat.com>
+Set vendor property to IMX to enable IMX specific functionality
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+in sdhci code.
-Message-id: 20190412165416.7977-3-philmd@redhat.com
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Signed-off-by: Guenter Roeck <linux@roeck-us.net>
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Message-id: 20200603145258.195920-3-linux@roeck-us.net
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/nseries.c | 3 ++-
+ hw/arm/fsl-imx25.c  | 6 ++++++
- 1 file changed, 2 insertions(+), 1 deletion(-)
+ hw/arm/fsl-imx6.c   | 6 ++++++
  hw/arm/fsl-imx6ul.c | 2 ++
  hw/arm/fsl-imx7.c   | 2 ++
  4 files changed, 16 insertions(+)
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
+diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
+--- a/hw/arm/fsl-imx25.c
-+++ b/hw/arm/nseries.c
++++ b/hw/arm/fsl-imx25.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
- #include "hw/boards.h"
+                                  &err);
- #include "hw/i2c/i2c.h"
+         object_property_set_uint(OBJECT(&s->esdhc[i]), IMX25_ESDHC_CAPABILITIES,
- #include "hw/devices.h"
+                                  "capareg", &err);
-+#include "hw/misc/tmp105.h"
++        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
- #include "hw/block/flash.h"
++                                 "vendor", &err);
- #include "hw/hw.h"
++        if (err) {
- #include "hw/bt.h"
++            error_propagate(errp, err);
-@@ -XXX,XX +XXX,XX @@ static void n8x0_i2c_setup(struct n800_s *s)
++            return;
-     qemu_register_powerdown_notifier(&n8x0_system_powerdown_notifier);
++        }
+         object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
-     /* Attach a TMP105 PM chip (A0 wired to ground) */
+         if (err) {
--    dev = i2c_create_slave(i2c, "tmp105", N8X0_TMP105_ADDR);
+             error_propagate(errp, err);
-+    dev = i2c_create_slave(i2c, TYPE_TMP105, N8X0_TMP105_ADDR);
+diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
-     qdev_connect_gpio_out(dev, 0, tmp_irq);
+index XXXXXXX..XXXXXXX 100644
- }
+--- a/hw/arm/fsl-imx6.c
 +++ b/hw/arm/fsl-imx6.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
                                   &err);
          object_property_set_uint(OBJECT(&s->esdhc[i]), IMX6_ESDHC_CAPABILITIES,
                                   "capareg", &err);
 +        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
 +                                 "vendor", &err);
 +        if (err) {
 +            error_propagate(errp, err);
 +            return;
 +        }
          object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
          if (err) {
              error_propagate(errp, err);
 diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/fsl-imx6ul.c
 +++ b/hw/arm/fsl-imx6ul.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
              FSL_IMX6UL_USDHC2_IRQ,
          };
 +        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
 +                                        "vendor", &error_abort);
          object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                   &error_abort);
 diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/fsl-imx7.c
 +++ b/hw/arm/fsl-imx7.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
              FSL_IMX7_USDHC3_IRQ,
          };
 +        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
 +                                 "vendor", &error_abort);
          object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                   &error_abort);
 --
 .20.1
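
The fsl-imx25 and fsl-imx6 hunks above set the "vendor" property, check the error, and only then realize the device. A minimal, self-contained sketch of that ordering follows. Every name in it (MockSdhci, mock_set_uint, mock_realize, mock_soc_init_sdhci, the MOCK_VENDOR_* values) is a hypothetical stand-in and not part of QEMU, and the rule that the mock rejects property changes after realize is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not QEMU's object model. */
enum { MOCK_VENDOR_NONE = 0, MOCK_VENDOR_IMX = 1 };

typedef struct MockSdhci {
    uint8_t vendor;   /* selects vendor-specific register behaviour */
    bool realized;    /* set once the device has been finalized */
} MockSdhci;

/* Set a named integer property; returns NULL on success or an error string. */
static const char *mock_set_uint(MockSdhci *dev, const char *name,
                                 uint64_t value)
{
    if (dev->realized) {
        return "cannot set property after realize";   /* assumed rule */
    }
    if (strcmp(name, "vendor") != 0) {
        return "no such property";
    }
    if (value > MOCK_VENDOR_IMX) {
        return "unknown vendor";
    }
    dev->vendor = (uint8_t)value;
    return NULL;
}

/* Finalize the device; always succeeds in this mock. */
static const char *mock_realize(MockSdhci *dev)
{
    dev->realized = true;
    return NULL;
}

/* Same ordering as the fsl-imx25/fsl-imx6 hunks: set, check, then realize. */
static const char *mock_soc_init_sdhci(MockSdhci *dev, uint64_t vendor)
{
    const char *err = mock_set_uint(dev, "vendor", vendor);
    if (err) {
        return err;   /* stop before realize, like error_propagate() + return */
    }
    return mock_realize(dev);
}
```

Calling mock_soc_init_sdhci() with MOCK_VENDOR_IMX leaves the device realized with the vendor set; an out-of-range vendor returns an error and the device stays unrealized. The fsl-imx6ul and fsl-imx7 hunks pass &error_abort instead, so there a failure would abort rather than be returned to the caller.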

-[Qemu-devel] [PULL 34/42] hw/devices: Move TC6393XB declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-5-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/devices.h          |  6 ------
- include/hw/display/tc6393xb.h | 24 ++++++++++++++++++++++++
- hw/arm/tosa.c                 |  2 +-
- hw/display/tc6393xb.c         |  2 +-
- MAINTAINERS                   |  1 +
- 5 files changed, 27 insertions(+), 8 deletions(-)
- create mode 100644 include/hw/display/tc6393xb.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@ void *tahvo_init(qemu_irq irq, int betty);
- void retu_key_event(void *retu, int state);
--/* tc6393xb.c */
--typedef struct TC6393xbState TC6393xbState;
--TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
--                             uint32_t base, qemu_irq irq);
--qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
--
- #endif
-diff --git a/include/hw/display/tc6393xb.h b/include/hw/display/tc6393xb.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/display/tc6393xb.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * Toshiba TC6393XB I/O Controller.
-+ * Found in Sharp Zaurus SL-6000 (tosa) or some
-+ * Toshiba e-Series PDAs.
-+ *
-+ * Copyright (c) 2007 Hervé Poussineau
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_DISPLAY_TC6393XB_H
-+#define HW_DISPLAY_TC6393XB_H
-+
-+#include "exec/memory.h"
-+#include "hw/irq.h"
-+
-+typedef struct TC6393xbState TC6393xbState;
-+
-+TC6393xbState *tc6393xb_init(struct MemoryRegion *sysmem,
-+                             uint32_t base, qemu_irq irq);
-+qemu_irq tc6393xb_l3v_get(TC6393xbState *s);
-+
-+#endif
-diff --git a/hw/arm/tosa.c b/hw/arm/tosa.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/tosa.c
-+++ b/hw/arm/tosa.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/hw.h"
- #include "hw/arm/pxa.h"
- #include "hw/arm/arm.h"
--#include "hw/devices.h"
- #include "hw/arm/sharpsl.h"
- #include "hw/pcmcia.h"
- #include "hw/boards.h"
-+#include "hw/display/tc6393xb.h"
- #include "hw/i2c/i2c.h"
- #include "hw/ssi/ssi.h"
- #include "hw/sysbus.h"
-diff --git a/hw/display/tc6393xb.c b/hw/display/tc6393xb.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/display/tc6393xb.c
-+++ b/hw/display/tc6393xb.c
-@@ -XXX,XX +XXX,XX @@
- #include "qapi/error.h"
- #include "qemu/host-utils.h"
- #include "hw/hw.h"
--#include "hw/devices.h"
-+#include "hw/display/tc6393xb.h"
- #include "hw/block/flash.h"
- #include "ui/console.h"
- #include "ui/pixel_ops.h"
-diff --git a/MAINTAINERS b/MAINTAINERS
-index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
-+++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ F: hw/misc/mst_fpga.c
- F: hw/misc/max111x.c
- F: include/hw/arm/pxa.h
- F: include/hw/arm/sharpsl.h
-+F: include/hw/display/tc6393xb.h
- SABRELITE / i.MX6
- M: Peter Maydell <peter.maydell@linaro.org>
---
-.20.1

-[Qemu-devel] [PULL 35/42] hw/devices: Move Blizzard declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Add entries for the Blizzard device in MAINTAINERS.
-Reviewed-by: Thomas Huth <thuth@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-6-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/devices.h          |  7 -------
- include/hw/display/blizzard.h | 22 ++++++++++++++++++++++
- hw/arm/nseries.c              |  1 +
- hw/display/blizzard.c         |  2 +-
- MAINTAINERS                   |  2 ++
- 5 files changed, 26 insertions(+), 8 deletions(-)
- create mode 100644 include/hw/display/blizzard.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
- /* stellaris_input.c */
- void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
--/* blizzard.c */
--void *s1d13745_init(qemu_irq gpio_int);
--void s1d13745_write(void *opaque, int dc, uint16_t value);
--void s1d13745_write_block(void *opaque, int dc,
--                void *buf, size_t len, int pitch);
--uint16_t s1d13745_read(void *opaque, int dc);
--
- /* cbus.c */
- typedef struct {
-     qemu_irq clk;
-diff --git a/include/hw/display/blizzard.h b/include/hw/display/blizzard.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/display/blizzard.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * Epson S1D13744/S1D13745 (Blizzard/Hailstorm/Tornado) LCD/TV controller.
-+ *
-+ * Copyright (C) 2008 Nokia Corporation
-+ * Written by Andrzej Zaborowski
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_DISPLAY_BLIZZARD_H
-+#define HW_DISPLAY_BLIZZARD_H
-+
-+#include "hw/irq.h"
-+
-+void *s1d13745_init(qemu_irq gpio_int);
-+void s1d13745_write(void *opaque, int dc, uint16_t value);
-+void s1d13745_write_block(void *opaque, int dc,
-+                          void *buf, size_t len, int pitch);
-+uint16_t s1d13745_read(void *opaque, int dc);
-+
-+#endif
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
-+++ b/hw/arm/nseries.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/boards.h"
- #include "hw/i2c/i2c.h"
- #include "hw/devices.h"
-+#include "hw/display/blizzard.h"
- #include "hw/misc/tmp105.h"
- #include "hw/block/flash.h"
- #include "hw/hw.h"
-diff --git a/hw/display/blizzard.c b/hw/display/blizzard.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/display/blizzard.c
-+++ b/hw/display/blizzard.c
-@@ -XXX,XX +XXX,XX @@
- #include "qemu/osdep.h"
- #include "qemu-common.h"
- #include "ui/console.h"
--#include "hw/devices.h"
-+#include "hw/display/blizzard.h"
- #include "ui/pixel_ops.h"
- typedef void (*blizzard_fn_t)(uint8_t *, const uint8_t *, unsigned int);
-diff --git a/MAINTAINERS b/MAINTAINERS
-index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
-+++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
- L: qemu-arm@nongnu.org
- S: Odd Fixes
- F: hw/arm/nseries.c
-+F: hw/display/blizzard.c
- F: hw/input/lm832x.c
- F: hw/input/tsc2005.c
- F: hw/misc/cbus.c
- F: hw/timer/twl92230.c
-+F: include/hw/display/blizzard.h
- Palm
- M: Andrzej Zaborowski <balrogg@gmail.com>
---
-.20.1

-[Qemu-devel] [PULL 36/42] hw/devices: Move CBus declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-7-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/devices.h   | 14 --------------
- include/hw/misc/cbus.h | 32 ++++++++++++++++++++++++++++++++
- hw/arm/nseries.c       |  1 +
- hw/misc/cbus.c         |  2 +-
- MAINTAINERS            |  1 +
- 5 files changed, 35 insertions(+), 15 deletions(-)
- create mode 100644 include/hw/misc/cbus.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@ void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
- /* stellaris_input.c */
- void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
--/* cbus.c */
--typedef struct {
--    qemu_irq clk;
--    qemu_irq dat;
--    qemu_irq sel;
--} CBus;
--CBus *cbus_init(qemu_irq dat_out);
--void cbus_attach(CBus *bus, void *slave_opaque);
--
--void *retu_init(qemu_irq irq, int vilma);
--void *tahvo_init(qemu_irq irq, int betty);
--
--void retu_key_event(void *retu, int state);
--
- #endif
-diff --git a/include/hw/misc/cbus.h b/include/hw/misc/cbus.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/misc/cbus.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * CBUS three-pin bus and the Retu / Betty / Tahvo / Vilma / Avilma /
-+ * Hinku / Vinku / Ahne / Pihi chips used in various Nokia platforms.
-+ * Based on reverse-engineering of a linux driver.
-+ *
-+ * Copyright (C) 2008 Nokia Corporation
-+ * Written by Andrzej Zaborowski
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_MISC_CBUS_H
-+#define HW_MISC_CBUS_H
-+
-+#include "hw/irq.h"
-+
-+typedef struct {
-+    qemu_irq clk;
-+    qemu_irq dat;
-+    qemu_irq sel;
-+} CBus;
-+
-+CBus *cbus_init(qemu_irq dat_out);
-+void cbus_attach(CBus *bus, void *slave_opaque);
-+
-+void *retu_init(qemu_irq irq, int vilma);
-+void *tahvo_init(qemu_irq irq, int betty);
-+
-+void retu_key_event(void *retu, int state);
-+
-+#endif
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
-+++ b/hw/arm/nseries.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/i2c/i2c.h"
- #include "hw/devices.h"
- #include "hw/display/blizzard.h"
-+#include "hw/misc/cbus.h"
- #include "hw/misc/tmp105.h"
- #include "hw/block/flash.h"
- #include "hw/hw.h"
-diff --git a/hw/misc/cbus.c b/hw/misc/cbus.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/cbus.c
-+++ b/hw/misc/cbus.c
-@@ -XXX,XX +XXX,XX @@
- #include "qemu/osdep.h"
- #include "hw/hw.h"
- #include "hw/irq.h"
--#include "hw/devices.h"
-+#include "hw/misc/cbus.h"
- #include "sysemu/sysemu.h"
- //#define DEBUG
-diff --git a/MAINTAINERS b/MAINTAINERS
-index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
-+++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
- F: hw/misc/cbus.c
- F: hw/timer/twl92230.c
- F: include/hw/display/blizzard.h
-+F: include/hw/misc/cbus.h
- Palm
- M: Andrzej Zaborowski <balrogg@gmail.com>
---
-.20.1

-[Qemu-devel] [PULL 37/42] hw/devices: Move Gamepad declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-8-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/devices.h       |  3 ---
- include/hw/input/gamepad.h | 19 +++++++++++++++++++
- hw/arm/stellaris.c         |  2 +-
- hw/input/stellaris_input.c |  2 +-
- MAINTAINERS                |  1 +
- 5 files changed, 22 insertions(+), 5 deletions(-)
- create mode 100644 include/hw/input/gamepad.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@ void *tsc2005_init(qemu_irq pintdav);
- uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
- void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
--/* stellaris_input.c */
--void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
--
- #endif
-diff --git a/include/hw/input/gamepad.h b/include/hw/input/gamepad.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/input/gamepad.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * Gamepad style buttons connected to IRQ/GPIO lines
-+ *
-+ * Copyright (c) 2007 CodeSourcery.
-+ * Written by Paul Brook
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_INPUT_GAMEPAD_H
-+#define HW_INPUT_GAMEPAD_H
-+
-+#include "hw/irq.h"
-+
-+/* stellaris_input.c */
-+void stellaris_gamepad_init(int n, qemu_irq *irq, const int *keycode);
-+
-+#endif
-diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/stellaris.c
-+++ b/hw/arm/stellaris.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
- #include "hw/ssi/ssi.h"
- #include "hw/arm/arm.h"
--#include "hw/devices.h"
- #include "qemu/timer.h"
- #include "hw/i2c/i2c.h"
- #include "net/net.h"
-@@ -XXX,XX +XXX,XX @@
- #include "sysemu/sysemu.h"
- #include "hw/arm/armv7m.h"
- #include "hw/char/pl011.h"
-+#include "hw/input/gamepad.h"
- #include "hw/watchdog/cmsdk-apb-watchdog.h"
- #include "hw/misc/unimp.h"
- #include "cpu.h"
-diff --git a/hw/input/stellaris_input.c b/hw/input/stellaris_input.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/input/stellaris_input.c
-+++ b/hw/input/stellaris_input.c
-@@ -XXX,XX +XXX,XX @@
-  */
- #include "qemu/osdep.h"
- #include "hw/hw.h"
--#include "hw/devices.h"
-+#include "hw/input/gamepad.h"
- #include "ui/console.h"
- typedef struct {
-diff --git a/MAINTAINERS b/MAINTAINERS
-index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
-+++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ M: Peter Maydell <peter.maydell@linaro.org>
- L: qemu-arm@nongnu.org
- S: Maintained
- F: hw/*/stellaris*
-+F: include/hw/input/gamepad.h
- Versatile Express
- M: Peter Maydell <peter.maydell@linaro.org>
---
-.20.1

-[Qemu-devel] [PULL 38/42] hw/devices: Move TI touchscreen declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Since uWireSlave is only used in this new header, there is no
-need to expose it via "qemu/typedefs.h".
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-9-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/arm/omap.h      |  6 +-----
- include/hw/devices.h       | 15 ---------------
- include/hw/input/tsc2xxx.h | 36 ++++++++++++++++++++++++++++++++++++
- include/qemu/typedefs.h    |  1 -
- hw/arm/nseries.c           |  2 +-
- hw/arm/palm.c              |  2 +-
- hw/input/tsc2005.c         |  2 +-
- hw/input/tsc210x.c         |  4 ++--
- MAINTAINERS                |  2 ++
- 9 files changed, 44 insertions(+), 26 deletions(-)
- create mode 100644 include/hw/input/tsc2xxx.h
-diff --git a/include/hw/arm/omap.h b/include/hw/arm/omap.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/arm/omap.h
-+++ b/include/hw/arm/omap.h
-@@ -XXX,XX +XXX,XX @@
- #include "exec/memory.h"
- # define hw_omap_h        "omap.h"
- #include "hw/irq.h"
-+#include "hw/input/tsc2xxx.h"
- #include "target/arm/cpu-qom.h"
- #include "qemu/log.h"
-@@ -XXX,XX +XXX,XX @@ qemu_irq *omap_mpuio_in_get(struct omap_mpuio_s *s);
- void omap_mpuio_out_set(struct omap_mpuio_s *s, int line, qemu_irq handler);
- void omap_mpuio_key(struct omap_mpuio_s *s, int row, int col, int down);
--struct uWireSlave {
--    uint16_t (*receive)(void *opaque);
--    void (*send)(void *opaque, uint16_t data);
--    void *opaque;
--};
- struct omap_uwire_s;
- void omap_uwire_attach(struct omap_uwire_s *s,
-                 uWireSlave *slave, int chipselect);
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@
- /* Devices that have nowhere better to go.  */
- #include "hw/hw.h"
--#include "ui/console.h"
- /* smc91c111.c */
- void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
-@@ -XXX,XX +XXX,XX @@ void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
- /* lan9118.c */
- void lan9118_init(NICInfo *, uint32_t, qemu_irq);
--/* tsc210x.c */
--uWireSlave *tsc2102_init(qemu_irq pint);
--uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
--I2SCodec *tsc210x_codec(uWireSlave *chip);
--uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
--void tsc210x_set_transform(uWireSlave *chip,
--                MouseTransformInfo *info);
--void tsc210x_key_event(uWireSlave *chip, int key, int down);
--
--/* tsc2005.c */
--void *tsc2005_init(qemu_irq pintdav);
--uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
--void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
--
- #endif
-diff --git a/include/hw/input/tsc2xxx.h b/include/hw/input/tsc2xxx.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/input/tsc2xxx.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * TI touchscreen controller
-+ *
-+ * Copyright (c) 2006 Andrzej Zaborowski
-+ * Copyright (C) 2008 Nokia Corporation
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_INPUT_TSC2XXX_H
-+#define HW_INPUT_TSC2XXX_H
-+
-+#include "hw/irq.h"
-+#include "ui/console.h"
-+
-+typedef struct uWireSlave {
-+    uint16_t (*receive)(void *opaque);
-+    void (*send)(void *opaque, uint16_t data);
-+    void *opaque;
-+} uWireSlave;
-+
-+/* tsc210x.c */
-+uWireSlave *tsc2102_init(qemu_irq pint);
-+uWireSlave *tsc2301_init(qemu_irq penirq, qemu_irq kbirq, qemu_irq dav);
-+I2SCodec *tsc210x_codec(uWireSlave *chip);
-+uint32_t tsc210x_txrx(void *opaque, uint32_t value, int len);
-+void tsc210x_set_transform(uWireSlave *chip, MouseTransformInfo *info);
-+void tsc210x_key_event(uWireSlave *chip, int key, int down);
-+
-+/* tsc2005.c */
-+void *tsc2005_init(qemu_irq pintdav);
-+uint32_t tsc2005_txrx(void *opaque, uint32_t value, int len);
-+void tsc2005_set_transform(void *opaque, MouseTransformInfo *info);
-+
-+#endif
-diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/qemu/typedefs.h
-+++ b/include/qemu/typedefs.h
-@@ -XXX,XX +XXX,XX @@ typedef struct RAMBlock RAMBlock;
- typedef struct Range Range;
- typedef struct SHPCDevice SHPCDevice;
- typedef struct SSIBus SSIBus;
--typedef struct uWireSlave uWireSlave;
- typedef struct VirtIODevice VirtIODevice;
- typedef struct Visitor Visitor;
- typedef void SaveStateHandler(QEMUFile *f, void *opaque);
-diff --git a/hw/arm/nseries.c b/hw/arm/nseries.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/nseries.c
-+++ b/hw/arm/nseries.c
-@@ -XXX,XX +XXX,XX @@
- #include "ui/console.h"
- #include "hw/boards.h"
- #include "hw/i2c/i2c.h"
--#include "hw/devices.h"
- #include "hw/display/blizzard.h"
-+#include "hw/input/tsc2xxx.h"
- #include "hw/misc/cbus.h"
- #include "hw/misc/tmp105.h"
- #include "hw/block/flash.h"
-diff --git a/hw/arm/palm.c b/hw/arm/palm.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/palm.c
-+++ b/hw/arm/palm.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/omap.h"
- #include "hw/boards.h"
- #include "hw/arm/arm.h"
--#include "hw/devices.h"
-+#include "hw/input/tsc2xxx.h"
- #include "hw/loader.h"
- #include "exec/address-spaces.h"
- #include "cpu.h"
-diff --git a/hw/input/tsc2005.c b/hw/input/tsc2005.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/input/tsc2005.c
-+++ b/hw/input/tsc2005.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/hw.h"
- #include "qemu/timer.h"
- #include "ui/console.h"
--#include "hw/devices.h"
-+#include "hw/input/tsc2xxx.h"
- #include "trace.h"
- #define TSC_CUT_RESOLUTION(value, p)    ((value) >> (16 - (p ? 12 : 10)))
-diff --git a/hw/input/tsc210x.c b/hw/input/tsc210x.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/input/tsc210x.c
-+++ b/hw/input/tsc210x.c
-@@ -XXX,XX +XXX,XX @@
- #include "audio/audio.h"
- #include "qemu/timer.h"
- #include "ui/console.h"
--#include "hw/arm/omap.h"    /* For I2SCodec and uWireSlave */
--#include "hw/devices.h"
-+#include "hw/arm/omap.h"            /* For I2SCodec */
-+#include "hw/input/tsc2xxx.h"
- #define TSC_DATA_REGISTERS_PAGE        0x0
- #define TSC_CONTROL_REGISTERS_PAGE    0x1
-diff --git a/MAINTAINERS b/MAINTAINERS
-index XXXXXXX..XXXXXXX 100644
---- a/MAINTAINERS
-+++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ F: hw/input/tsc2005.c
- F: hw/misc/cbus.c
- F: hw/timer/twl92230.c
- F: include/hw/display/blizzard.h
-+F: include/hw/input/tsc2xxx.h
- F: include/hw/misc/cbus.h
- Palm
-@@ -XXX,XX +XXX,XX @@ L: qemu-arm@nongnu.org
- S: Odd Fixes
- F: hw/arm/palm.c
- F: hw/input/tsc210x.c
-+F: include/hw/input/tsc2xxx.h
- Raspberry Pi
- M: Peter Maydell <peter.maydell@linaro.org>
---
-.20.1
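
The uWireSlave struct moved by the patch above is a small callback table: two function pointers plus an opaque pointer handed back to them. A rough sketch of how such a table can be filled in and driven is below. ExampleWireSlave copies the three fields from the patch, while LoopbackChip and the loopback_* helpers are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Callback table with the same three fields as uWireSlave in the patch. */
typedef struct ExampleWireSlave {
    uint16_t (*receive)(void *opaque);
    void (*send)(void *opaque, uint16_t data);
    void *opaque;
} ExampleWireSlave;

/* Hypothetical loopback chip: remembers the last word sent to it. */
typedef struct LoopbackChip {
    uint16_t last_word;
} LoopbackChip;

/* Return the most recently stored word. */
static uint16_t loopback_receive(void *opaque)
{
    LoopbackChip *chip = opaque;
    return chip->last_word;
}

/* Store the word so a later receive can return it. */
static void loopback_send(void *opaque, uint16_t data)
{
    LoopbackChip *chip = opaque;
    chip->last_word = data;
}

/* Build a slave whose callbacks operate on the given chip. */
static ExampleWireSlave loopback_slave(LoopbackChip *chip)
{
    ExampleWireSlave slave = {
        .receive = loopback_receive,
        .send = loopback_send,
        .opaque = chip,
    };
    return slave;
}
```

Defining the struct and its typedef together in one header, as the patch does in tsc2xxx.h, means users of that header need no separate forward typedef.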

-[Qemu-devel] [PULL 39/42] hw/devices: Move LAN9118 declarations into a new header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-10-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/devices.h     |  3 ---
- include/hw/net/lan9118.h | 19 +++++++++++++++++++
- hw/arm/kzm.c             |  2 +-
- hw/arm/mps2.c            |  2 +-
- hw/arm/realview.c        |  1 +
- hw/arm/vexpress.c        |  2 +-
- hw/net/lan9118.c         |  2 +-
- 7 files changed, 24 insertions(+), 7 deletions(-)
- create mode 100644 include/hw/net/lan9118.h
-diff --git a/include/hw/devices.h b/include/hw/devices.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/devices.h
-+++ b/include/hw/devices.h
-@@ -XXX,XX +XXX,XX @@
- /* smc91c111.c */
- void smc91c111_init(NICInfo *, uint32_t, qemu_irq);
--/* lan9118.c */
--void lan9118_init(NICInfo *, uint32_t, qemu_irq);
--
- #endif
-diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
-new file mode 100644
-index XXXXXXX..XXXXXXX
---- /dev/null
-+++ b/include/hw/net/lan9118.h
-@@ -XXX,XX +XXX,XX @@
-+/*
-+ * SMSC LAN9118 Ethernet interface emulation
-+ *
-+ * Copyright (c) 2009 CodeSourcery, LLC.
-+ * Written by Paul Brook
-+ *
-+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
-+ * See the COPYING file in the top-level directory.
-+ */
-+
-+#ifndef HW_NET_LAN9118_H
-+#define HW_NET_LAN9118_H
-+
-+#include "hw/irq.h"
-+#include "net/net.h"
-+
-+void lan9118_init(NICInfo *, uint32_t, qemu_irq);
-+
-+#endif
-diff --git a/hw/arm/kzm.c b/hw/arm/kzm.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/kzm.c
-+++ b/hw/arm/kzm.c
-@@ -XXX,XX +XXX,XX @@
- #include "qemu/error-report.h"
- #include "exec/address-spaces.h"
- #include "net/net.h"
--#include "hw/devices.h"
-+#include "hw/net/lan9118.h"
- #include "hw/char/serial.h"
- #include "sysemu/qtest.h"
-diff --git a/hw/arm/mps2.c b/hw/arm/mps2.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/mps2.c
-+++ b/hw/arm/mps2.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/timer/cmsdk-apb-timer.h"
- #include "hw/timer/cmsdk-apb-dualtimer.h"
- #include "hw/misc/mps2-scc.h"
--#include "hw/devices.h"
-+#include "hw/net/lan9118.h"
- #include "net/net.h"
- typedef enum MPS2FPGAType {
-diff --git a/hw/arm/realview.c b/hw/arm/realview.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/realview.c
-+++ b/hw/arm/realview.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/arm.h"
- #include "hw/arm/primecell.h"
- #include "hw/devices.h"
-+#include "hw/net/lan9118.h"
- #include "hw/pci/pci.h"
- #include "net/net.h"
- #include "sysemu/sysemu.h"
-diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/vexpress.c
-+++ b/hw/arm/vexpress.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
- #include "hw/arm/arm.h"
- #include "hw/arm/primecell.h"
--#include "hw/devices.h"
-+#include "hw/net/lan9118.h"
- #include "hw/i2c/i2c.h"
- #include "net/net.h"
- #include "sysemu/sysemu.h"
-diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/net/lan9118.c
-+++ b/hw/net/lan9118.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/sysbus.h"
- #include "net/net.h"
- #include "net/eth.h"
--#include "hw/devices.h"
-+#include "hw/net/lan9118.h"
- #include "sysemu/sysemu.h"
- #include "hw/ptimer.h"
- #include "qemu/log.h"
---
-.20.1

-[Qemu-devel] [PULL 40/42] hw/net/ne2000-isa: Add guards to the header
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Thomas Huth <thuth@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-11-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/net/ne2000-isa.h | 6 ++++++
- 1 file changed, 6 insertions(+)
-diff --git a/include/hw/net/ne2000-isa.h b/include/hw/net/ne2000-isa.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/net/ne2000-isa.h
-+++ b/include/hw/net/ne2000-isa.h
-@@ -XXX,XX +XXX,XX @@
-  * This work is licensed under the terms of the GNU GPL, version 2 or later.
-  * See the COPYING file in the top-level directory.
-  */
-+
-+#ifndef HW_NET_NE2K_ISA_H
-+#define HW_NET_NE2K_ISA_H
-+
- #include "hw/hw.h"
- #include "hw/qdev.h"
- #include "hw/isa/isa.h"
-@@ -XXX,XX +XXX,XX @@ static inline ISADevice *isa_ne2000_init(ISABus *bus, int base, int irq,
-     }
-     return d;
- }
-+
-+#endif
---
-.20.1
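
The guards matter for this header because it defines a static inline function (isa_ne2000_init), so including it twice in one translation unit would otherwise be a redefinition error. The sketch below imitates a double inclusion inside a single file. The guard macro, ExampleNic and example_nic_init are hypothetical names chosen for the illustration and are not taken from ne2000-isa.h.

```c
/* First "inclusion" of a hypothetical header body. */
#ifndef HW_NET_EXAMPLE_NIC_H
#define HW_NET_EXAMPLE_NIC_H

typedef struct ExampleNic {
    int base;
    int irq;
} ExampleNic;

static inline ExampleNic example_nic_init(int base, int irq)
{
    ExampleNic nic = { .base = base, .irq = irq };
    return nic;
}

#endif

/*
 * Second "inclusion" of the same text. The guard macro is already defined,
 * so the preprocessor skips everything up to the #endif. Without the guard
 * the struct and the function would each be defined twice.
 */
#ifndef HW_NET_EXAMPLE_NIC_H
#define HW_NET_EXAMPLE_NIC_H

typedef struct ExampleNic {
    int base;
    int irq;
} ExampleNic;

static inline ExampleNic example_nic_init(int base, int irq)
{
    ExampleNic nic = { .base = base, .irq = irq };
    return nic;
}

#endif
```

Removing the #ifndef/#define/#endif lines from both copies should make the compiler reject the file, which is the failure the patch prevents.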

-[Qemu-devel] [PULL 41/42] hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
+Deleted patch
-From: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Markus Armbruster <armbru@redhat.com>
-Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Message-id: 20190412165416.7977-12-philmd@redhat.com
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
----
- include/hw/net/lan9118.h | 2 ++
- hw/arm/exynos4_boards.c  | 3 ++-
- hw/arm/mps2-tz.c         | 3 ++-
- hw/net/lan9118.c         | 1 -
- 4 files changed, 6 insertions(+), 3 deletions(-)
-diff --git a/include/hw/net/lan9118.h b/include/hw/net/lan9118.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/hw/net/lan9118.h
-+++ b/include/hw/net/lan9118.h
-@@ -XXX,XX +XXX,XX @@
- #include "hw/irq.h"
- #include "net/net.h"
-+#define TYPE_LAN9118 "lan9118"
-+
- void lan9118_init(NICInfo *, uint32_t, qemu_irq);
- #endif
-diff --git a/hw/arm/exynos4_boards.c b/hw/arm/exynos4_boards.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/exynos4_boards.c
-+++ b/hw/arm/exynos4_boards.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/arm.h"
- #include "exec/address-spaces.h"
- #include "hw/arm/exynos4210.h"
-+#include "hw/net/lan9118.h"
- #include "hw/boards.h"
- #undef DEBUG
-@@ -XXX,XX +XXX,XX @@ static void lan9215_init(uint32_t base, qemu_irq irq)
-     /* This should be a 9215 but the 9118 is close enough */
-     if (nd_table[0].used) {
-         qemu_check_nic_model(&nd_table[0], "lan9118");
--        dev = qdev_create(NULL, "lan9118");
-+        dev = qdev_create(NULL, TYPE_LAN9118);
-         qdev_set_nic_properties(dev, &nd_table[0]);
-         qdev_prop_set_uint32(dev, "mode_16bit", 1);
-         qdev_init_nofail(dev);
-diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/mps2-tz.c
-+++ b/hw/arm/mps2-tz.c
-@@ -XXX,XX +XXX,XX @@
- #include "hw/arm/armsse.h"
- #include "hw/dma/pl080.h"
- #include "hw/ssi/pl022.h"
-+#include "hw/net/lan9118.h"
- #include "net/net.h"
- #include "hw/core/split-irq.h"
-@@ -XXX,XX +XXX,XX @@ static MemoryRegion *make_eth_dev(MPS2TZMachineState *mms, void *opaque,
-      * except that it doesn't support the checksum-offload feature.
-      */
-     qemu_check_nic_model(nd, "lan9118");
--    mms->lan9118 = qdev_create(NULL, "lan9118");
-+    mms->lan9118 = qdev_create(NULL, TYPE_LAN9118);
-     qdev_set_nic_properties(mms->lan9118, nd);
-     qdev_init_nofail(mms->lan9118);
-diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c
-index XXXXXXX..XXXXXXX 100644
---- a/hw/net/lan9118.c
-+++ b/hw/net/lan9118.c
-@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_packet = {
-     }
- };
--#define TYPE_LAN9118 "lan9118"
- #define LAN9118(obj) OBJECT_CHECK(lan9118_state, (obj), TYPE_LAN9118)
- typedef struct {
---
-.20.1

First pullreq for arm of the 4.1 series, since I'm back from
holiday now. This is mostly my M-profile FPU series and Philippe's
devices.h cleanup. I have a pile of other patchsets to work through
in my to-review folder, but 42 patches is definitely quite
big enough to send now...

thanks
-- PMM

The following changes since commit 413a99a92c13ec408dcf2adaa87918dc81e890c8:

Add Nios II semihosting support. (2019-04-29 16:09:51 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20190429

for you to fetch changes up to 437cc27ddfded3bbab6afd5ac1761e0e195edba7:

hw/devices: Move SMSC 91C111 declaration into a new header (2019-04-29 17:57:21 +0100)

----------------------------------------------------------------
target-arm queue:
 * remove "bag of random stuff" hw/devices.h header
 * implement FPU for Cortex-M and enable it for Cortex-M4 and -M33
 * hw/dma: Compile the bcm2835_dma device as common object
 * configure: Remove --source-path option
 * hw/ssi/xilinx_spips: Avoid variable length array
 * hw/arm/smmuv3: Remove SMMUNotifierNode

----------------------------------------------------------------
Eric Auger (1):
      hw/arm/smmuv3: Remove SMMUNotifierNode

Peter Maydell (28):
      hw/ssi/xilinx_spips: Avoid variable length array
      configure: Remove --source-path option
      target/arm: Make sure M-profile FPSCR RES0 bits are not settable
      hw/intc/armv7m_nvic: Allow reading of M-profile MVFR* registers
      target/arm: Implement dummy versions of M-profile FP-related registers
      target/arm: Disable most VFP sysregs for M-profile
      target/arm: Honour M-profile FP enable bits
      target/arm: Decode FP instructions for M profile
      target/arm: Clear CONTROL_S.SFPA in SG insn if FPU present
      target/arm: Handle SFPA and FPCA bits in reads and writes of CONTROL
      target/arm/helper: don't return early for STKOF faults during stacking
      target/arm: Handle floating point registers in exception entry
      target/arm: Implement v7m_update_fpccr()
      target/arm: Clear CONTROL.SFPA in BXNS and BLXNS
      target/arm: Clean excReturn bits when tail chaining
      target/arm: Allow for floating point in callee stack integrity check
      target/arm: Handle floating point registers in exception return
      target/arm: Move NS TBFLAG from bit 19 to bit 6
      target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags
      target/arm: Set FPCCR.S when executing M-profile floating point insns
      target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set
      target/arm: New helper function arm_v7m_mmu_idx_all()
      target/arm: New function armv7m_nvic_set_pending_lazyfp()
      target/arm: Add lazy-FP-stacking support to v7m_stack_write()
      target/arm: Implement M-profile lazy FP state preservation
      target/arm: Implement VLSTM for v7M CPUs with an FPU
      target/arm: Implement VLLDM for v7M CPUs with an FPU
      target/arm: Enable FPU for Cortex-M4 and Cortex-M33

Philippe Mathieu-Daudé (13):
      hw/dma: Compile the bcm2835_dma device as common object
      hw/arm/aspeed: Use TYPE_TMP105/TYPE_PCA9552 instead of hardcoded string
      hw/arm/nseries: Use TYPE_TMP105 instead of hardcoded string
      hw/display/tc6393xb: Remove unused functions
      hw/devices: Move TC6393XB declarations into a new header
      hw/devices: Move Blizzard declarations into a new header
      hw/devices: Move CBus declarations into a new header
      hw/devices: Move Gamepad declarations into a new header
      hw/devices: Move TI touchscreen declarations into a new header
      hw/devices: Move LAN9118 declarations into a new header
      hw/net/ne2000-isa: Add guards to the header
      hw/net/lan9118: Export TYPE_LAN9118 and use it instead of hardcoded string
      hw/devices: Move SMSC 91C111 declaration into a new header

From: Eric Auger <eric.auger@redhat.com>

The SMMUNotifierNode struct is not necessary and brings extra
complexity so let's remove it. We now directly track the SMMUDevices
which have registered IOMMU MR notifiers.

This is inspired by the same transformation on intel-iommu
done in commit b4a4ba0d68f50f218ee3957b6638dbee32a5eeef
("intel-iommu: remove IntelIOMMUNotifierNode")
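
As a rough illustration of the idea (not the QEMU code itself), the
sketch below embeds the list linkage in the tracked struct instead of
allocating a wrapper node. It uses the BSD <sys/queue.h> LIST macros,
which QEMU's QLIST macros resemble; "struct device", "struct state"
and the function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for SMMUDevice: the linkage lives in the struct. */
struct device {
    int id;
    LIST_ENTRY(device) next;
};

/* Hypothetical stand-in for SMMUState. */
struct state {
    LIST_HEAD(, device) devices_with_notifiers;
};

static void state_init(struct state *s)
{
    LIST_INIT(&s->devices_with_notifiers);
}

/* Registering needs no allocation: link the device itself. */
static void notifier_add(struct state *s, struct device *d)
{
    assert(d != NULL);
    LIST_INSERT_HEAD(&s->devices_with_notifiers, d, next);
}

/* Unregistering needs no search: the device carries its own linkage. */
static void notifier_del(struct device *d)
{
    assert(d != NULL);
    LIST_REMOVE(d, next);
}

/* Walk the list directly over the devices. */
static int count_devices(struct state *s)
{
    struct device *d;
    int n = 0;

    LIST_FOREACH(d, &s->devices_with_notifiers, next) {
        n++;
    }
    return n;
}
```

With the linkage embedded, add and remove are both constant-time and
there is no separate node to allocate or free.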

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 20190409160219.19026-1-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/hw/arm/smmu-common.h |  8 ++------
 hw/arm/smmu-common.c         |  6 +++---
 hw/arm/smmuv3.c              | 28 +++++++---------------------
 3 files changed, 12 insertions(+), 30 deletions(-)

diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -XXX,XX +XXX,XX @@ typedef struct SMMUDevice {
     AddressSpace       as;
     uint32_t           cfg_cache_hits;
     uint32_t           cfg_cache_misses;
+    QLIST_ENTRY(SMMUDevice) next;
 } SMMUDevice;
 
-typedef struct SMMUNotifierNode {
-    SMMUDevice *sdev;
-    QLIST_ENTRY(SMMUNotifierNode) next;
-} SMMUNotifierNode;
-
 typedef struct SMMUPciBus {
     PCIBus       *bus;
     SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
@@ -XXX,XX +XXX,XX @@ typedef struct SMMUState {
     GHashTable *iotlb;
     SMMUPciBus *smmu_pcibus_by_bus_num[SMMU_PCI_BUS_MAX];
     PCIBus *pci_bus;
-    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+    QLIST_HEAD(, SMMUDevice) devices_with_notifiers;
     uint8_t bus_num;
     PCIBus *primary_bus;
 } SMMUState;
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -XXX,XX +XXX,XX @@ inline void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 /* Unmap all notifiers of all mr's */
 void smmu_inv_notifiers_all(SMMUState *s)
 {
-    SMMUNotifierNode *node;
+    SMMUDevice *sdev;
 
-    QLIST_FOREACH(node, &s->notifiers_list, next) {
-        smmu_inv_notifiers_mr(&node->sdev->iommu);
+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+        smmu_inv_notifiers_mr(&sdev->iommu);
     }
 }
 
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
 /* invalidate an asid/iova tuple in all mr's */
 static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova)
 {
-    SMMUNotifierNode *node;
+    SMMUDevice *sdev;
 
-    QLIST_FOREACH(node, &s->notifiers_list, next) {
-        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+    QLIST_FOREACH(sdev, &s->devices_with_notifiers, next) {
+        IOMMUMemoryRegion *mr = &sdev->iommu;
         IOMMUNotifier *n;
 
         trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
     SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
     SMMUv3State *s3 = sdev->smmu;
     SMMUState *s = &(s3->smmu_state);
-    SMMUNotifierNode *node = NULL;
-    SMMUNotifierNode *next_node = NULL;
 
     if (new & IOMMU_NOTIFIER_MAP) {
         int bus_num = pci_bus_num(sdev->bus);
@@ -XXX,XX +XXX,XX @@ static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
 
     if (old == IOMMU_NOTIFIER_NONE) {
         trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
-        node = g_malloc0(sizeof(*node));
-        node->sdev = sdev;
-        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
-        return;
-    }
-
-    /* update notifier node with new flags */
-    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
-        if (node->sdev == sdev) {
-            if (new == IOMMU_NOTIFIER_NONE) {
-                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
-                QLIST_REMOVE(node, next);
-                g_free(node);
-            }
-            return;
-        }
+        QLIST_INSERT_HEAD(&s->devices_with_notifiers, sdev, next);
+    } else if (new == IOMMU_NOTIFIER_NONE) {
+        trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+        QLIST_REMOVE(sdev, next);
     }
 }
 
-- 
2.20.1

In the stripe8() function we use a variable length array; however
we know that the maximum length required is MAX_NUM_BUSSES. Use
a fixed-length array and an assert instead.
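
The general pattern, sketched on a simpler function than stripe8():
size the scratch buffer by a compile-time maximum and assert that the
runtime length fits. MAX_ITEMS and reverse_bytes() are hypothetical;
MAX_ITEMS only stands in for MAX_NUM_BUSSES and its value here is
arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound; stands in for MAX_NUM_BUSSES in the patch. */
#define MAX_ITEMS 4

/* Reverse the first num bytes of x in place via a bounded scratch buffer. */
static void reverse_bytes(uint8_t *x, int num)
{
    uint8_t r[MAX_ITEMS];   /* fixed size: stack usage known at compile time */
    int i;

    /* Catch callers that exceed the bound instead of overflowing r[]. */
    assert(num >= 0 && num <= MAX_ITEMS);

    for (i = 0; i < num; i++) {
        r[i] = x[num - 1 - i];
    }
    memcpy(x, r, (size_t)num);
}
```

The assert has to come before the first access to the buffer, which
is why the patch also moves the memset() below it.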

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190328152635.2794-1-peter.maydell@linaro.org
---
 hw/ssi/xilinx_spips.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/ssi/xilinx_spips.c
+++ b/hw/ssi/xilinx_spips.c
@@ -XXX,XX +XXX,XX @@ static void xlnx_zynqmp_qspips_reset(DeviceState *d)
 
 static inline void stripe8(uint8_t *x, int num, bool dir)
 {
-    uint8_t r[num];
-    memset(r, 0, sizeof(uint8_t) * num);
+    uint8_t r[MAX_NUM_BUSSES];
     int idx[2] = {0, 0};
     int bit[2] = {0, 7};
     int d = dir;
 
+    assert(num <= MAX_NUM_BUSSES);
+    memset(r, 0, sizeof(uint8_t) * num);
+
     for (idx[0] = 0; idx[0] < num; ++idx[0]) {
         for (bit[0] = 7; bit[0] >= 0; bit[0]--) {
             r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0;
-- 
2.20.1

Normally configure identifies the source path by looking
at the location where the configure script itself exists.
We also provide a --source-path option which lets the user
manually override this.

There isn't really an obvious use case for the --source-path
option, and in commit 927128222b0a91f56c13a in 2017 we
accidentally added some logic that looks at $source_path
before the command line option that overrides it has been
processed.

The fact that nobody complained suggests that there isn't
any use of this option and we aren't testing it either;
remove it. This allows us to move the "make $source_path
absolute" logic up so that there is no window in the script
where $source_path is set but not yet absolute.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20190318134019.23729-1-peter.maydell@linaro.org
---
 configure | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/configure b/configure
index XXXXXXX..XXXXXXX 100755
--- a/configure
+++ b/configure
@@ -XXX,XX +XXX,XX @@ ld_has() {
 
 # default parameters
 source_path=$(dirname "$0")
+# make source path absolute
+source_path=$(cd "$source_path"; pwd)
 cpu=""
 iasl="iasl"
 interp_prefix="/usr/gnemul/qemu-%M"
@@ -XXX,XX +XXX,XX @@ for opt do
   ;;
   --cxx=*) CXX="$optarg"
   ;;
-  --source-path=*) source_path="$optarg"
-  ;;
   --cpu=*) cpu="$optarg"
   ;;
   --extra-cflags=*) QEMU_CFLAGS="$QEMU_CFLAGS $optarg"
@@ -XXX,XX +XXX,XX @@ if test "$debug_info" = "yes"; then
     LDFLAGS="-g $LDFLAGS"
 fi
 
-# make source path absolute
-source_path=$(cd "$source_path"; pwd)
-
 # running configure in the source tree?
 # we know that's the case if configure is there.
 if test -f "./configure"; then
@@ -XXX,XX +XXX,XX @@ for opt do
   ;;
   --interp-prefix=*) interp_prefix="$optarg"
   ;;
-  --source-path=*)
-  ;;
   --cross-prefix=*)
   ;;
   --cc=*)
@@ -XXX,XX +XXX,XX @@ $(echo Available targets: $default_target_list | \
   --target-list-exclude=LIST exclude a set of targets from the default target-list
 
 Advanced options (experts only):
-  --source-path=PATH       path of source code [$source_path]
   --cross-prefix=PREFIX    use PREFIX for compile tools [$cross_prefix]
   --cc=CC                  use C compiler CC [$cc]
   --iasl=IASL              use ACPI compiler IASL [$iasl]
-- 
2.20.1

For M-profile, enforce that the FPSCR bits which are RES0 there but
have defined meanings on A-profile are never settable. This
ensures that M-profile code can't enable the A-profile behaviour
(notably vector length/stride handling) by accident.
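
For reference, here is one way to read the 0xf7c0009f mask used in
the diff below, built up from named fields. The macro and function
names are hypothetical, and the field breakdown is my reading of the
architectural FPSCR layout rather than something taken from the
patch.

```c
#include <assert.h>   /* only needed for runtime checks by the caller */
#include <stdint.h>

/* FPSCR fields that stay writable on M-profile (names are hypothetical). */
#define FPSCR_NZCV   0xf0000000u   /* bits [31:28]: condition flags */
#define FPSCR_AHP    (1u << 26)    /* alternative half-precision */
#define FPSCR_DN     (1u << 25)    /* default NaN */
#define FPSCR_FZ     (1u << 24)    /* flush to zero */
#define FPSCR_RMODE  (3u << 22)    /* rounding mode */
#define FPSCR_IDC    (1u << 7)     /* input denormal cumulative flag */
#define FPSCR_CUMUL  0x1fu         /* IXC, UFC, OFC, DZC, IOC */

#define M_FPSCR_WRITABLE                                        \
    (FPSCR_NZCV | FPSCR_AHP | FPSCR_DN | FPSCR_FZ |             \
     FPSCR_RMODE | FPSCR_IDC | FPSCR_CUMUL)

/* The named fields add up to the literal mask used in the patch. */
_Static_assert(M_FPSCR_WRITABLE == 0xf7c0009fu, "mask mismatch");

/* Drop the bits that are RES0 on M-profile (QC, STRIDE, FZ16, LEN, IxE). */
static uint32_t m_profile_fpscr_write(uint32_t val)
{
    return val & M_FPSCR_WRITABLE;
}
```

Writing all-ones through m_profile_fpscr_write() yields 0xf7c0009f,
and a write of only the QC bit (bit 27) yields 0.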

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-2-peter.maydell@linaro.org
---
 target/arm/vfp_helper.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val)
         val &= ~FPCR_FZ16;
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /*
+         * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits
+         * and also for the trapped-exception-handling bits IxE.
+         */
+        val &= 0xf7c0009f;
+    }
+
     /*
      * We don't implement trapped exception handling, so the
      * trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
-- 
2.20.1

For M-profile the MVFR* ID registers are memory mapped, in the
range we implement via the NVIC. Allow them to be read.
(If the CPU has no FPU, these registers are defined to be RAZ.)
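
A minimal sketch of the read dispatch, outside of QEMU: the offsets
are the ones used in the diff below, while "struct id_regs" and
read_mvfr() are hypothetical. A CPU without an FPU would simply carry
zeroes in these fields, which gives the RAZ behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three ID register values. */
struct id_regs {
    uint32_t mvfr0;
    uint32_t mvfr1;
    uint32_t mvfr2;
};

/*
 * Look up an MVFR register by its memory-mapped offset.
 * Returns false if the offset is not one of the MVFR registers.
 */
static bool read_mvfr(const struct id_regs *id, uint32_t offset,
                      uint32_t *out)
{
    assert(id != NULL && out != NULL);

    switch (offset) {
    case 0xf40: /* MVFR0 */
        *out = id->mvfr0;
        return true;
    case 0xf44: /* MVFR1 */
        *out = id->mvfr1;
        return true;
    case 0xf48: /* MVFR2 */
        *out = id->mvfr2;
        return true;
    default:
        return false;
    }
}
```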

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-3-peter.maydell@linaro.org
---
 hw/intc/armv7m_nvic.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
             return 0;
         }
         return cpu->env.v7m.sfar;
+    case 0xf40: /* MVFR0 */
+        return cpu->isar.mvfr0;
+    case 0xf44: /* MVFR1 */
+        return cpu->isar.mvfr1;
+    case 0xf48: /* MVFR2 */
+        return cpu->isar.mvfr2;
     default:
     bad_offset:
         qemu_log_mask(LOG_GUEST_ERROR, "NVIC: Bad read offset 0x%x\n", offset);
-- 
2.20.1

The M-profile floating point support has three associated config
registers: FPCAR, FPCCR and FPDSCR. It also makes the registers
CPACR and NSACR have behaviour other than reads-as-zero.
Add support for all of these as simple reads-as-written registers.
We will hook up actual functionality later.

The main complexity here is handling the FPCCR register, which
has a mix of banked and unbanked bits.

Note that we don't share storage with the A-profile
cpu->cp15.nsacr and cpu->cp15.cpacr_el1, though the behaviour
is quite similar, for two reasons:
 * the M profile CPACR is banked between security states
 * it preserves the invariant that M profile uses no state
   inside the cp15 substruct
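
To make the banked/unbanked split concrete, here is a standalone
sketch of the read side only. The bit positions follow the FIELD()
definitions added in the diff below; fpccr_read(), the BANK_*
constants and the "bfhfnmins" parameter (standing in for the
AIRCR.BFHFNMINS check) are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { BANK_NS = 0, BANK_S = 1 };

#define FPCCR_HFRDY    (1u << 4)
#define FPCCR_BFRDY    (1u << 6)
#define FPCCR_MONRDY   (1u << 8)
#define FPCCR_CLRONRET (1u << 28)
#define FPCCR_LSPEN    (1u << 30)

/*
 * Compute the value a read of FPCCR returns.
 * Non-banked bits are stored in the Secure bank; the Non-secure view
 * sees only a permitted subset of them, plus its own banked bits.
 */
static uint32_t fpccr_read(const uint32_t fpccr[2], bool secure,
                           bool bfhfnmins)
{
    uint32_t mask;

    assert(fpccr != NULL);

    if (secure) {
        return fpccr[BANK_S];
    }

    /* Non-banked bits that Non-secure may always read. */
    mask = FPCCR_LSPEN | FPCCR_CLRONRET | FPCCR_MONRDY;
    if (bfhfnmins) {
        mask |= FPCCR_BFRDY | FPCCR_HFRDY;
    }

    return (fpccr[BANK_S] & mask) | fpccr[BANK_NS];
}
```

With the reset values from this patch (NS bank: ASPEN; S bank: ASPEN,
LSPEN and S), a Secure read returns 0xc0000004 and a Non-secure read
returns 0xc0000000, because the S bit is hidden from Non-secure.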

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-4-peter.maydell@linaro.org
---
 target/arm/cpu.h      |  34 ++++++++++++
 hw/intc/armv7m_nvic.c | 125 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c      |   5 ++
 target/arm/machine.c  |  16 ++++++
 4 files changed, 180 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ typedef struct CPUARMState {
         uint32_t scr[M_REG_NUM_BANKS];
         uint32_t msplim[M_REG_NUM_BANKS];
         uint32_t psplim[M_REG_NUM_BANKS];
+        uint32_t fpcar[M_REG_NUM_BANKS];
+        uint32_t fpccr[M_REG_NUM_BANKS];
+        uint32_t fpdscr[M_REG_NUM_BANKS];
+        uint32_t cpacr[M_REG_NUM_BANKS];
+        uint32_t nsacr;
     } v7m;
 
     /* Information associated with an exception about to be taken:
@@ -XXX,XX +XXX,XX @@ FIELD(V7M_CSSELR, LEVEL, 1, 3)
  */
 FIELD(V7M_CSSELR, INDEX, 0, 4)
 
+/* v7M FPCCR bits */
+FIELD(V7M_FPCCR, LSPACT, 0, 1)
+FIELD(V7M_FPCCR, USER, 1, 1)
+FIELD(V7M_FPCCR, S, 2, 1)
+FIELD(V7M_FPCCR, THREAD, 3, 1)
+FIELD(V7M_FPCCR, HFRDY, 4, 1)
+FIELD(V7M_FPCCR, MMRDY, 5, 1)
+FIELD(V7M_FPCCR, BFRDY, 6, 1)
+FIELD(V7M_FPCCR, SFRDY, 7, 1)
+FIELD(V7M_FPCCR, MONRDY, 8, 1)
+FIELD(V7M_FPCCR, SPLIMVIOL, 9, 1)
+FIELD(V7M_FPCCR, UFRDY, 10, 1)
+FIELD(V7M_FPCCR, RES0, 11, 15)
+FIELD(V7M_FPCCR, TS, 26, 1)
+FIELD(V7M_FPCCR, CLRONRETS, 27, 1)
+FIELD(V7M_FPCCR, CLRONRET, 28, 1)
+FIELD(V7M_FPCCR, LSPENS, 29, 1)
+FIELD(V7M_FPCCR, LSPEN, 30, 1)
+FIELD(V7M_FPCCR, ASPEN, 31, 1)
+/* These bits are banked. Others are non-banked and live in the M_REG_S bank */
+#define R_V7M_FPCCR_BANKED_MASK                 \
+    (R_V7M_FPCCR_LSPACT_MASK |                  \
+     R_V7M_FPCCR_USER_MASK |                    \
+     R_V7M_FPCCR_THREAD_MASK |                  \
+     R_V7M_FPCCR_MMRDY_MASK |                   \
+     R_V7M_FPCCR_SPLIMVIOL_MASK |               \
+     R_V7M_FPCCR_UFRDY_MASK |                   \
+     R_V7M_FPCCR_ASPEN_MASK)
+
 /*
  * System register ID fields.
  */
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
     }
     case 0xd84: /* CSSELR */
         return cpu->env.v7m.csselr[attrs.secure];
+    case 0xd88: /* CPACR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.cpacr[attrs.secure];
+    case 0xd8c: /* NSACR */
+        if (!attrs.secure || !arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.nsacr;
     /* TODO: Implement debug registers.  */
     case 0xd90: /* MPU_TYPE */
         /* Unified MPU; if the MPU is not present this value is zero */
@@ -XXX,XX +XXX,XX @@ static uint32_t nvic_readl(NVICState *s, uint32_t offset, MemTxAttrs attrs)
             return 0;
         }
         return cpu->env.v7m.sfar;
+    case 0xf34: /* FPCCR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        if (attrs.secure) {
+            return cpu->env.v7m.fpccr[M_REG_S];
+        } else {
+            /*
+             * NS can read LSPEN, CLRONRET and MONRDY. It can read
+             * BFRDY and HFRDY if AIRCR.BFHFNMINS != 0;
+             * other non-banked bits RAZ.
+             * TODO: MONRDY should RAZ/WI if DEMCR.SDME is set.
+             */
+            uint32_t value = cpu->env.v7m.fpccr[M_REG_S];
+            uint32_t mask = R_V7M_FPCCR_LSPEN_MASK |
+                R_V7M_FPCCR_CLRONRET_MASK |
+                R_V7M_FPCCR_MONRDY_MASK;
+
+            if (s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK) {
+                mask |= R_V7M_FPCCR_BFRDY_MASK | R_V7M_FPCCR_HFRDY_MASK;
+            }
+
+            value &= mask;
+
+            value |= cpu->env.v7m.fpccr[M_REG_NS];
+            return value;
+        }
+    case 0xf38: /* FPCAR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.fpcar[attrs.secure];
+    case 0xf3c: /* FPDSCR */
+        if (!arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            return 0;
+        }
+        return cpu->env.v7m.fpdscr[attrs.secure];
     case 0xf40: /* MVFR0 */
         return cpu->isar.mvfr0;
     case 0xf44: /* MVFR1 */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
             cpu->env.v7m.csselr[attrs.secure] = value & R_V7M_CSSELR_INDEX_MASK;
         }
         break;
+    case 0xd88: /* CPACR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* We implement only the Floating Point extension's CP10/CP11 */
+            cpu->env.v7m.cpacr[attrs.secure] = value & (0xf << 20);
+        }
+        break;
+    case 0xd8c: /* NSACR */
+        if (attrs.secure && arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* We implement only the Floating Point extension's CP10/CP11 */
+            cpu->env.v7m.nsacr = value & (3 << 10);
+        }
+        break;
     case 0xd90: /* MPU_TYPE */
         return; /* RO */
     case 0xd94: /* MPU_CTRL */
@@ -XXX,XX +XXX,XX @@ static void nvic_writel(NVICState *s, uint32_t offset, uint32_t value,
         }
         break;
     }
+    case 0xf34: /* FPCCR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            /* Not all bits here are banked. */
+            uint32_t fpccr_s;
+
+            if (!arm_feature(&cpu->env, ARM_FEATURE_V8)) {
+                /* Don't allow setting of bits not present in v7M */
+                value &= (R_V7M_FPCCR_LSPACT_MASK |
+                          R_V7M_FPCCR_USER_MASK |
+                          R_V7M_FPCCR_THREAD_MASK |
+                          R_V7M_FPCCR_HFRDY_MASK |
+                          R_V7M_FPCCR_MMRDY_MASK |
+                          R_V7M_FPCCR_BFRDY_MASK |
+                          R_V7M_FPCCR_MONRDY_MASK |
+                          R_V7M_FPCCR_LSPEN_MASK |
+                          R_V7M_FPCCR_ASPEN_MASK);
+            }
+            value &= ~R_V7M_FPCCR_RES0_MASK;
+
+            if (!attrs.secure) {
+                /* Some non-banked bits are configurably writable by NS */
+                fpccr_s = cpu->env.v7m.fpccr[M_REG_S];
+                if (!(fpccr_s & R_V7M_FPCCR_LSPENS_MASK)) {
+                    uint32_t lspen = FIELD_EX32(value, V7M_FPCCR, LSPEN);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, LSPEN, lspen);
+                }
+                if (!(fpccr_s & R_V7M_FPCCR_CLRONRETS_MASK)) {
+                    uint32_t cor = FIELD_EX32(value, V7M_FPCCR, CLRONRET);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, CLRONRET, cor);
+                }
+                if ((s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK)) {
+                    uint32_t hfrdy = FIELD_EX32(value, V7M_FPCCR, HFRDY);
+                    uint32_t bfrdy = FIELD_EX32(value, V7M_FPCCR, BFRDY);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
+                }
+                /* TODO MONRDY should RAZ/WI if DEMCR.SDME is set */
+                {
+                    uint32_t monrdy = FIELD_EX32(value, V7M_FPCCR, MONRDY);
+                    fpccr_s = FIELD_DP32(fpccr_s, V7M_FPCCR, MONRDY, monrdy);
+                }
+
+                /*
+                 * All other non-banked bits are RAZ/WI from NS; write
+                 * just the banked bits to fpccr[M_REG_NS].
+                 */
+                value &= R_V7M_FPCCR_BANKED_MASK;
+                cpu->env.v7m.fpccr[M_REG_NS] = value;
+            } else {
+                fpccr_s = value;
+            }
+            cpu->env.v7m.fpccr[M_REG_S] = fpccr_s;
+        }
+        break;
+    case 0xf38: /* FPCAR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            value &= ~7;
+            cpu->env.v7m.fpcar[attrs.secure] = value;
+        }
+        break;
+    case 0xf3c: /* FPDSCR */
+        if (arm_feature(&cpu->env, ARM_FEATURE_VFP)) {
+            value &= 0x07c00000;
+            cpu->env.v7m.fpdscr[attrs.secure] = value;
+        }
+        break;
     case 0xf50: /* ICIALLU */
     case 0xf58: /* ICIMVAU */
     case 0xf5c: /* DCIMVAC */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_reset(CPUState *s)
             env->v7m.ccr[M_REG_S] |= R_V7M_CCR_UNALIGN_TRP_MASK;
         }
 
+        if (arm_feature(env, ARM_FEATURE_VFP)) {
+            env->v7m.fpccr[M_REG_NS] = R_V7M_FPCCR_ASPEN_MASK;
+            env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
+                R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
+        }
         /* Unlike A/R profile, M profile defines the reset LR value */
         env->regs[14] = 0xffffffff;
 
diff --git a/target/arm/machine.c b/target/arm/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m_v8m = {
     }
 };
 
+static const VMStateDescription vmstate_m_fp = {
+    .name = "cpu/m/fp",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vfp_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(env.v7m.fpcar, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.fpccr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.fpdscr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32_ARRAY(env.v7m.cpacr, ARMCPU, M_REG_NUM_BANKS),
+        VMSTATE_UINT32(env.v7m.nsacr, ARMCPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_m = {
     .name = "cpu/m",
     .version_id = 4,
@@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_m = {
         &vmstate_m_scr,
         &vmstate_m_other_sp,
         &vmstate_m_v8m,
+        &vmstate_m_fp,
         NULL
     }
 };
-- 
2.20.1

The only "system register" that M-profile floating point exposes
via the VMRS/VMSR instructions is FPSCR, and it does not have
the odd special case for rd==15. Add a check to ensure we only
expose FPSCR.
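
The check itself can be sketched on already-decoded fields. This
mirrors the condition in the diff below rather than being derived
independently from the architecture manual; the function name is
hypothetical, and VFP_REG_FPSCR is assumed to have the same value as
QEMU's ARM_VFP_FPSCR.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match QEMU's ARM_VFP_FPSCR register number. */
#define VFP_REG_FPSCR 1

/*
 * Return true if a VMRS/VMSR-style transfer should UNDEF on M-profile.
 *   is_sysreg: the encoding selects a system register (insn bit 21)
 *   rd:        the general-purpose register field, 0..15
 *   reg:       the VFP system register number, 0..15
 */
static bool m_profile_sysreg_undef(bool is_sysreg, unsigned rd,
                                   unsigned reg)
{
    assert(rd < 16 && reg < 16);

    return is_sysreg && (rd == 15 || reg != VFP_REG_FPSCR);
}
```

Ordinary single-register moves (is_sysreg false) are never affected
by this check.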

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-5-peter.maydell@linaro.org
---
 target/arm/translate.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                     }
                 }
             } else { /* !dp */
+                bool is_sysreg;
+
                 if ((insn & 0x6f) != 0x00)
                     return 1;
                 rn = VFP_SREG_N(insn);
+
+                is_sysreg = extract32(insn, 21, 1);
+
+                if (arm_dc_feature(s, ARM_FEATURE_M)) {
+                    /*
+                     * The only M-profile VFP vmrs/vmsr sysreg is FPSCR.
+                     * Writes to R15 are UNPREDICTABLE; we choose to undef.
+                     */
+                    if (is_sysreg && (rd == 15 || (rn >> 1) != ARM_VFP_FPSCR)) {
+                        return 1;
+                    }
+                }
+
                 if (insn & ARM_CP_RW_BIT) {
                     /* vfp->arm */
-                    if (insn & (1 << 21)) {
+                    if (is_sysreg) {
                         /* system register */
                         rn >>= 1;
 
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
                     }
                 } else {
                     /* arm->vfp */
-                    if (insn & (1 << 21)) {
+                    if (is_sysreg) {
                         rn >>= 1;
                         /* system register */
                         switch (rn) {
-- 
2.20.1

Like AArch64, M-profile floating point has no FPEXC enable
bit to gate floating point; so always set the VFPEN TB flag.

M-profile also has CPACR and NSACR similar to A-profile;
they behave slightly differently:
 * the CPACR is banked between Secure and Non-Secure
 * if the NSACR forces a trap then this is taken to
   the Secure state, not the Non-Secure state

Honour the CPACR and NSACR settings. The NSACR handling
requires us to borrow the exception.target_el field
(usually meaningless for M profile) to distinguish the
NOCP UsageFault taken to Secure state from the more
usual fault taken to the current security state.
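
The resulting decision can be sketched as a pure function over the
register values. The return convention (0 = no trap, 1 = NOCP to the
current security state, 3 = NOCP to Secure) follows the diff below;
the function names and the flattened parameter list are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Does the CPACR CP10 field permit FP access at this privilege level? */
static bool cpacr_pass(uint32_t cpacr, bool is_priv)
{
    switch ((cpacr >> 20) & 3) {
    case 0:
    case 2: /* UNPREDICTABLE: treated like 0, as in the patch */
        return false;
    case 1:
        return is_priv;
    case 3:
        return true;
    default:
        assert(0); /* a two-bit field cannot take any other value */
        return false;
    }
}

/*
 * Return 0 if FP access is allowed, 1 if it faults to the current
 * security state, 3 if it faults to Secure state.
 * cpacr is the CPACR value banked for the current security state.
 */
static int m_fp_trap_target(uint32_t cpacr, uint32_t nsacr,
                            bool has_security, bool secure, bool is_priv)
{
    if (!cpacr_pass(cpacr, is_priv)) {
        return 1;
    }
    /* NSACR.CP10 clear: Non-secure FP use traps to Secure state. */
    if (has_security && !secure && !((nsacr >> 10) & 1)) {
        return 3;
    }
    return 0;
}
```

The CPACR check comes first, so a Non-secure access denied by both
registers reports the fault to its own security state.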

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-6-peter.maydell@linaro.org
---
 target/arm/helper.c    | 55 +++++++++++++++++++++++++++++++++++++++---
 target/arm/translate.c | 10 ++++++--
 2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
     return target_el;
 }
 
+/*
+ * Return true if the v7M CPACR permits access to the FPU for the specified
+ * security state and privilege level.
+ */
+static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
+{
+    switch (extract32(env->v7m.cpacr[is_secure], 20, 2)) {
+    case 0:
+    case 2: /* UNPREDICTABLE: we treat like 0 */
+        return false;
+    case 1:
+        return is_priv;
+    case 3:
+        return true;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                             ARMMMUIdx mmu_idx, bool ignfault)
 {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNDEFINSTR_MASK;
         break;
     case EXCP_NOCP:
-        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
-        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
+    {
+        /*
+         * NOCP might be directed to something other than the current
+         * security state if this fault is because of NSACR; we indicate
+         * the target security state using exception.target_el.
+         */
+        int target_secstate;
+
+        if (env->exception.target_el == 3) {
+            target_secstate = M_REG_S;
+        } else {
+            target_secstate = env->v7m.secure;
+        }
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, target_secstate);
+        env->v7m.cfsr[target_secstate] |= R_V7M_CFSR_NOCP_MASK;
         break;
+    }
     case EXCP_INVSTATE:
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_INVSTATE_MASK;
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
         return 0;
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        /* CPACR can cause a NOCP UsageFault taken to current security state */
+        if (!v7m_cpacr_pass(env, env->v7m.secure, cur_el != 0)) {
+            return 1;
+        }
+
+        if (arm_feature(env, ARM_FEATURE_M_SECURITY) && !env->v7m.secure) {
+            if (!extract32(env->v7m.nsacr, 10, 1)) {
+                /* FP insns cause a NOCP UsageFault taken to Secure */
+                return 3;
+            }
+        }
+
+        return 0;
+    }
+
     /* The CPACR controls traps to EL1, or PL1 if we're 32 bit:
      * 0, 2 : trap EL0 and EL1/PL1 accesses
      * 1    : trap only EL0 accesses
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, SCTLR_B, arm_sctlr_b(env));
         flags = FIELD_DP32(flags, TBFLAG_A32, NS, !access_secure_reg(env));
         if (env->vfp.xregs[ARM_VFP_FPEXC] & (1 << 30)
-            || arm_el_is_aa64(env, 1)) {
+            || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
         flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
      * for attempts to execute invalid vfp/neon encodings with FP disabled.
      */
     if (s->fp_excp_el) {
-        gen_exception_insn(s, 4, EXCP_UDEF,
-                           syn_fp_access_trap(1, 0xe, false), s->fp_excp_el);
+        if (arm_dc_feature(s, ARM_FEATURE_M)) {
+            gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
+                               s->fp_excp_el);
+        } else {
+            gen_exception_insn(s, 4, EXCP_UDEF,
+                               syn_fp_access_trap(1, 0xe, false),
+                               s->fp_excp_el);
+        }
         return 0;
     }
 
-- 
2.20.1

Correct the decode of the M-profile "coprocessor and
floating-point instructions" space:
 * op0 == 0b11 is always unallocated
 * if the CPU has an FPU then all insns with op1 == 0b101
   are floating point and go to disas_vfp_insn()

For the moment we leave VLLDM and VLSTM as NOPs; in
a later commit we will fill in the proper implementation
for the case where an FPU is present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-7-peter.maydell@linaro.org
---
 target/arm/translate.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)
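
Note for readers: the decode order described in the commit message can
be sketched as a standalone classifier. This is only an illustration,
not the QEMU code: the enum, bits32() and classify_m_coproc() are
hypothetical names, and the sketch ignores the Secure/Nonsecure
distinction that the real VLLDM/VLSTM handling makes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not QEMU names. */
enum m_cp_decode {
    M_CP_UNALLOCATED,   /* op0 == 0b11: UNDEF */
    M_CP_VLLDM_VLSTM,   /* decoded before the generic FP path */
    M_CP_FP,            /* coprocessor 10/11 and an FPU is present */
    M_CP_NOCP,          /* everything else: NOCP UsageFault */
};

/* Extract 'length' bits starting at bit 'start' (1 <= length <= 32). */
static uint32_t bits32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

static enum m_cp_decode classify_m_coproc(uint32_t insn, bool is_v8,
                                          bool has_fpu)
{
    if (bits32(insn, 24, 2) == 3) {
        return M_CP_UNALLOCATED;
    }
    if (is_v8 && (insn & 0xffa00f00) == 0xec200a00) {
        return M_CP_VLLDM_VLSTM;
    }
    if (has_fpu && ((insn >> 8) & 0xe) == 10) {
        return M_CP_FP;
    }
    return M_CP_NOCP;
}
```

The VLLDM/VLSTM test comes before the FP test because those encodings
also match the coprocessor 10/11 pattern.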

diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
     case 6: case 7: case 14: case 15:
         /* Coprocessor.  */
         if (arm_dc_feature(s, ARM_FEATURE_M)) {
-            /* We don't currently implement M profile FP support,
-             * so this entire space should give a NOCP fault, with
-             * the exception of the v8M VLLDM and VLSTM insns, which
-             * must be NOPs in Secure state and UNDEF in Nonsecure state.
+            /* 0b111x_11xx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx */
+            if (extract32(insn, 24, 2) == 3) {
+                goto illegal_op; /* op0 = 0b11 : unallocated */
+            }
+
+            /*
+             * Decode VLLDM and VLSTM first: these are nonstandard because:
+             *  * if there is no FPU then these insns must NOP in
+             *    Secure state and UNDEF in Nonsecure state
+             *  * if there is an FPU then these insns do not have
+             *    the usual behaviour that disas_vfp_insn() provides of
+             *    being controlled by CPACR/NSACR enable bits or the
+             *    lazy-stacking logic.
              */
             if (arm_dc_feature(s, ARM_FEATURE_V8) &&
                 (insn & 0xffa00f00) == 0xec200a00) {
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 /* Just NOP since FP support is not implemented */
                 break;
             }
+            if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
+                ((insn >> 8) & 0xe) == 10) {
+                /* FP, and the CPU supports it */
+                if (disas_vfp_insn(s, insn)) {
+                    goto illegal_op;
+                }
+                break;
+            }
+
             /* All other insns: NOCP */
             gen_exception_insn(s, 4, EXCP_NOCP, syn_uncategorized(),
                                default_exception_el(s));
-- 
2.20.1

If the floating point extension is present, then the SG instruction
must clear the CONTROL_S.SFPA bit. Implement this.

(On a no-FPU system the bit will always be zero, so we don't need
to make the clearing of the bit conditional on ARM_FEATURE_VFP.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-8-peter.maydell@linaro.org
---
 target/arm/helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_handle_execute_nsc(ARMCPU *cpu)
     qemu_log_mask(CPU_LOG_INT, "...really an SG instruction at 0x%08" PRIx32
                   ", executing it\n", env->regs[15]);
     env->regs[14] &= ~1;
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     switch_v7m_security_state(env, true);
     xpsr_write(env, 0, XPSR_IT);
     env->regs[15] += 4;
-- 
2.20.1

The M-profile CONTROL register has two bits -- SFPA and FPCA --
which relate to floating-point support, and should be RES0 otherwise.
Handle them correctly in the MSR/MRS register access code.
Neither is banked between security states, so they are stored
in v7m.control[M_REG_S] regardless of current security state.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-9-peter.maydell@linaro.org
---
 target/arm/helper.c | 57 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 8 deletions(-)
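
Note for readers: a minimal model of the read side, assuming the
architectural bit positions CONTROL.FPCA = bit 2 and CONTROL.SFPA =
bit 3. struct m_control and read_control() are hypothetical names for
this sketch only; the write side and the NSACR check are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_FPCA (1u << 2)  /* CONTROL.FPCA */
#define CONTROL_SFPA (1u << 3)  /* CONTROL.SFPA */

enum { BANK_NS = 0, BANK_S = 1 };   /* mirrors M_REG_NS / M_REG_S */

/* Hypothetical cut-down CPU state: only the two CONTROL banks. */
struct m_control {
    uint32_t control[2];
};

/* Value that an MRS of CONTROL returns in the given security state. */
static uint32_t read_control(const struct m_control *c, bool secure)
{
    uint32_t value = c->control[secure ? BANK_S : BANK_NS];

    if (!secure) {
        /* FPCA lives in the S bank; SFPA stays hidden from Non-secure. */
        value |= c->control[BANK_S] & CONTROL_FPCA;
    }
    return value;
}
```

Because neither bit is banked, the Non-secure view has to merge FPCA in
from the S bank on every read.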

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
         return xpsr_read(env) & mask;
         break;
     case 20: /* CONTROL */
-        return env->v7m.control[env->v7m.secure];
+    {
+        uint32_t value = env->v7m.control[env->v7m.secure];
+        if (!env->v7m.secure) {
+            /* SFPA is RAZ/WI from NS; FPCA is stored in the M_REG_S bank */
+            value |= env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK;
+        }
+        return value;
+    }
     case 0x94: /* CONTROL_NS */
         /* We have to handle this here because unprivileged Secure code
          * can read the NS CONTROL register.
@@ -XXX,XX +XXX,XX @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
         if (!env->v7m.secure) {
             return 0;
         }
-        return env->v7m.control[M_REG_NS];
+        return env->v7m.control[M_REG_NS] |
+            (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK);
     }
 
     if (el == 0) {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
      */
     uint32_t mask = extract32(maskreg, 8, 4);
     uint32_t reg = extract32(maskreg, 0, 8);
+    int cur_el = arm_current_el(env);
 
-    if (arm_current_el(env) == 0 && reg > 7) {
-        /* only xPSR sub-fields may be written by unprivileged */
+    if (cur_el == 0 && reg > 7 && reg != 20) {
+        /*
+         * only xPSR sub-fields and CONTROL.SFPA may be written by
+         * unprivileged code
+         */
         return;
     }
 
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
                 env->v7m.control[M_REG_NS] &= ~R_V7M_CONTROL_NPRIV_MASK;
                 env->v7m.control[M_REG_NS] |= val & R_V7M_CONTROL_NPRIV_MASK;
             }
+            /*
+             * SFPA is RAZ/WI from NS. FPCA is RO if NSACR.CP10 == 0,
+             * RES0 if the FPU is not present, and is stored in the S bank
+             */
+            if (arm_feature(env, ARM_FEATURE_VFP) &&
+                extract32(env->v7m.nsacr, 10, 1)) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
+            }
             return;
         case 0x98: /* SP_NS */
         {
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t maskreg, uint32_t val)
         env->v7m.faultmask[env->v7m.secure] = val & 1;
         break;
     case 20: /* CONTROL */
-        /* Writing to the SPSEL bit only has an effect if we are in
+        /*
+         * Writing to the SPSEL bit only has an effect if we are in
          * thread mode; other bits can be updated by any privileged code.
          * write_v7m_control_spsel() deals with updating the SPSEL bit in
          * env->v7m.control, so we only need update the others.
          * For v7M, we must just ignore explicit writes to SPSEL in handler
          * mode; for v8M the write is permitted but will have no effect.
+         * All these bits are writes-ignored from non-privileged code,
+         * except for SFPA.
          */
-        if (arm_feature(env, ARM_FEATURE_V8) ||
-            !arm_v7m_is_handler_mode(env)) {
+        if (cur_el > 0 && (arm_feature(env, ARM_FEATURE_V8) ||
+                           !arm_v7m_is_handler_mode(env))) {
             write_v7m_control_spsel(env, (val & R_V7M_CONTROL_SPSEL_MASK) != 0);
         }
-        if (arm_feature(env, ARM_FEATURE_M_MAIN)) {
+        if (cur_el > 0 && arm_feature(env, ARM_FEATURE_M_MAIN)) {
             env->v7m.control[env->v7m.secure] &= ~R_V7M_CONTROL_NPRIV_MASK;
             env->v7m.control[env->v7m.secure] |= val & R_V7M_CONTROL_NPRIV_MASK;
         }
+        if (arm_feature(env, ARM_FEATURE_VFP)) {
+            /*
+             * SFPA is RAZ/WI from NS or if no FPU.
+             * FPCA is RO if NSACR.CP10 == 0, RES0 if the FPU is not present.
+             * Both are stored in the S bank.
+             */
+            if (env->v7m.secure) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_SFPA_MASK;
+            }
+            if (cur_el > 0 &&
+                (env->v7m.secure || !arm_feature(env, ARM_FEATURE_M_SECURITY) ||
+                 extract32(env->v7m.nsacr, 10, 1))) {
+                env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+                env->v7m.control[M_REG_S] |= val & R_V7M_CONTROL_FPCA_MASK;
+            }
+        }
         break;
     default:
     bad_reg:
-- 
2.20.1

Currently the code in v7m_push_stack() which detects a violation
of the v8M stack limit simply returns early if it does so. This
is OK for the current integer-only code, but won't work for the
floating point handling we're about to add. We need to continue
executing the rest of the function so that we check for other
exceptions like not having permission to use the FPU and so
that we correctly set the FPCCR state if we are doing lazy
stacking. Refactor to avoid the early return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-10-peter.maydell@linaro.org
---
 target/arm/helper.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)
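
Note for readers: the shape of the refactoring, reduced to a toy model.
struct push_state and push_frame() are hypothetical; the counter stands
in for the FP permission checks and FPCCR update that later patches
add after the integer stacking.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state touched by the sketch. */
struct push_state {
    uint32_t sp;
    uint32_t limit;
    int later_checks_run;   /* how often the post-stacking step ran */
};

/* Returns true if stacking failed, like v7m_push_stack(). */
static bool push_frame(struct push_state *st, uint32_t framesize)
{
    bool stacked_ok = true, limitviol = false;
    uint32_t frameptr = st->sp - framesize;

    if (frameptr < st->limit) {
        st->sp = st->limit;     /* SP is clamped to the limit right away */
        limitviol = true;
        stacked_ok = false;     /* suppresses the memory writes below */
    }

    if (stacked_ok) {
        /* ... the real code writes the integer frame here ... */
    }

    st->later_checks_run++;     /* reached even after a limit violation */

    if (!limitviol) {
        st->sp = frameptr;
    }
    return !stacked_ok;
}
```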

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * should ignore further stack faults trying to process
      * that derived exception.)
      */
-    bool stacked_ok;
+    bool stacked_ok = true, limitviol = false;
     CPUARMState *env = &cpu->env;
     uint32_t xpsr = xpsr_read(env);
     uint32_t frameptr = env->regs[13];
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
             armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
                                     env->v7m.secure);
             env->regs[13] = limit;
-            return true;
+            /*
+             * We won't try to perform any further memory accesses but
+             * we must continue through the following code to check for
+             * permission faults during FPU state preservation, and we
+             * must update FPCCR if lazy stacking is enabled.
+             */
+            limitviol = true;
+            stacked_ok = false;
         }
     }
 
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * (which may be taken in preference to the one we started with
      * if it has higher priority).
      */
-    stacked_ok =
+    stacked_ok = stacked_ok &&
         v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
 
-    /* Update SP regardless of whether any of the stack accesses failed. */
-    env->regs[13] = frameptr;
+    /*
+     * If we broke a stack limit then SP was already updated earlier;
+     * otherwise we update SP regardless of whether any of the stack
+     * accesses failed or we took some other kind of fault.
+     */
+    if (!limitviol) {
+        env->regs[13] = frameptr;
+    }
 
     return !stacked_ok;
 }
-- 
2.20.1

Handle floating point registers in exception entry.
This corresponds to the FP-specific parts of the pseudocode
functions ActivateException() and PushStack().

We defer the code corresponding to UpdateFPCCR() to a later patch.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-11-peter.maydell@linaro.org
---
 target/arm/helper.c | 98 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 95 insertions(+), 3 deletions(-)
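
Note for readers: the frame layout used by this patch can be summarised
with two small helpers. frame_size() and sreg_offset() are hypothetical
names; the constants are taken from the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of the exception frame, as chosen in v7m_push_stack(). */
static uint32_t frame_size(bool fpca, bool secure, bool nsacr_cp10,
                           bool fpccr_ts)
{
    if (fpca && (secure || nsacr_cp10)) {
        /* Extended frame; S16..S31 are included only for Secure with TS */
        return (secure && fpccr_ts) ? 0xa8 : 0x68;
    }
    return 0x20;    /* integer-only frame */
}

/* Offset from the frame pointer of single-precision register S<i>. */
static uint32_t sreg_offset(unsigned i)
{
    uint32_t off = 0x20 + 4 * i;

    if (i >= 16) {
        off += 8;   /* skip the FPSCR slot at 0x60 and the word after it */
    }
    return off;
}
```

So S0..S15 occupy 0x20..0x5c, FPSCR sits at 0x60, and S16..S31 (when
stacked) occupy 0x68..0xa4.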

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     switch_v7m_security_state(env, targets_secure);
     write_v7m_control_spsel(env, 0);
     arm_clear_exclusive(env);
+    /* Clear SFPA and FPCA (has no effect if no FPU) */
+    env->v7m.control[M_REG_S] &=
+        ~(R_V7M_CONTROL_FPCA_MASK | R_V7M_CONTROL_SFPA_MASK);
     /* Clear IT bits */
     env->condexec_bits = 0;
     env->regs[14] = lr;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
     uint32_t xpsr = xpsr_read(env);
     uint32_t frameptr = env->regs[13];
     ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+    uint32_t framesize;
+    bool nsacr_cp10 = extract32(env->v7m.nsacr, 10, 1);
+
+    if ((env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) &&
+        (env->v7m.secure || nsacr_cp10)) {
+        if (env->v7m.secure &&
+            env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK) {
+            framesize = 0xa8;
+        } else {
+            framesize = 0x68;
+        }
+    } else {
+        framesize = 0x20;
+    }
 
     /* Align stack pointer if the guest wants that */
     if ((frameptr & 4) &&
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         xpsr |= XPSR_SPREALIGN;
     }
 
-    frameptr -= 0x20;
+    xpsr &= ~XPSR_SFPA;
+    if (env->v7m.secure &&
+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        xpsr |= XPSR_SFPA;
+    }
+
+    frameptr -= framesize;
 
     if (arm_feature(env, ARM_FEATURE_V8)) {
         uint32_t limit = v7m_sp_limit(env);
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
         v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
         v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
 
+    if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
+        /* FPU is active, try to save its registers */
+        bool fpccr_s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+        bool lspact = env->v7m.fpccr[fpccr_s] & R_V7M_FPCCR_LSPACT_MASK;
+
+        if (lspact && arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+            qemu_log_mask(CPU_LOG_INT,
+                          "...SecureFault because LSPACT and FPCA both set\n");
+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+        } else if (!env->v7m.secure && !nsacr_cp10) {
+            qemu_log_mask(CPU_LOG_INT,
+                          "...Secure UsageFault with CFSR.NOCP because "
+                          "NSACR.CP10 prevents stacking FP regs\n");
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
+            env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
+        } else {
+            if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
+                /* Lazy stacking disabled, save registers now */
+                int i;
+                bool cpacr_pass = v7m_cpacr_pass(env, env->v7m.secure,
+                                                 arm_current_el(env) != 0);
+
+                if (stacked_ok && !cpacr_pass) {
+                    /*
+                     * Take UsageFault if CPACR forbids access. The pseudocode
+                     * here does a full CheckCPEnabled() but we know the NSACR
+                     * check can never fail as we have already handled that.
+                     */
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...UsageFault with CFSR.NOCP because "
+                                  "CPACR.CP10 prevents stacking FP regs\n");
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                            env->v7m.secure);
+                    env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_NOCP_MASK;
+                    stacked_ok = false;
+                }
+
+                for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+                    uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+                    uint32_t faddr = frameptr + 0x20 + 4 * i;
+                    uint32_t slo = extract64(dn, 0, 32);
+                    uint32_t shi = extract64(dn, 32, 32);
+
+                    if (i >= 16) {
+                        faddr += 8; /* skip the slot for the FPSCR */
+                    }
+                    stacked_ok = stacked_ok &&
+                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
+                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+                }
+                stacked_ok = stacked_ok &&
+                    v7m_stack_write(cpu, frameptr + 0x60,
+                                    vfp_get_fpscr(env), mmu_idx, false);
+                if (cpacr_pass) {
+                    for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
+                        *aa32_vfp_dreg(env, i / 2) = 0;
+                    }
+                    vfp_set_fpscr(env, 0);
+                }
+            } else {
+                /* Lazy stacking enabled, save necessary info to stack later */
+                /* TODO : equivalent of UpdateFPCCR() pseudocode */
+            }
+        }
+    }
+
     /*
      * If we broke a stack limit then SP was already updated earlier;
      * otherwise we update SP regardless of whether any of the stack
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 
     if (arm_feature(env, ARM_FEATURE_V8)) {
         lr = R_V7M_EXCRET_RES1_MASK |
-            R_V7M_EXCRET_DCRS_MASK |
-            R_V7M_EXCRET_FTYPE_MASK;
+            R_V7M_EXCRET_DCRS_MASK;
         /* The S bit indicates whether we should return to Secure
          * or NonSecure (ie our current state).
          * The ES bit indicates whether we're taking this exception
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         if (env->v7m.secure) {
             lr |= R_V7M_EXCRET_S_MASK;
         }
+        if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+        }
     } else {
         lr = R_V7M_EXCRET_RES1_MASK |
             R_V7M_EXCRET_S_MASK |
-- 
2.20.1

Implement the code which updates the FPCCR register on an
exception entry where we are going to use lazy FP stacking.
We have to defer to the NVIC to determine whether the
various exceptions are currently ready or not.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190416125744.27770-12-peter.maydell@linaro.org
---
 target/arm/cpu.h      | 14 +++++++++
 hw/intc/armv7m_nvic.c | 34 ++++++++++++++++++++++
 target/arm/helper.c   | 67 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 114 insertions(+), 1 deletion(-)
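
Note for readers: the "ready" test that the NVIC performs boils down to
the predicate below. This is a simplified sketch: struct vec_info and
exception_ready() are hypothetical, and the priority grouping done by
exc_group_prio() in the real code is deliberately ignored.

```c
#include <stdbool.h>

#define EXCP_HARD 3   /* HardFault exception number */

/* Hypothetical cut-down vector entry. */
struct vec_info {
    bool enabled;
    int prio;
};

/* Lower numeric priority values are more urgent. */
static bool exception_ready(int irq, const struct vec_info *vec,
                            int running_prio)
{
    if (irq == EXCP_HARD) {
        /* HardFault: always compared against -1, never disabled */
        return running_prio > -1;
    }
    return vec->enabled && vec->prio < running_prio;
}
```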

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_acknowledge_irq(void *opaque);
  * (Ignoring -1, this is the same as the RETTOBASE value before completion.)
  */
 int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure);
+/**
+ * armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
+ * @opaque: the NVIC
+ * @irq: the exception number to query
+ * @secure: false for non-banked exceptions or for the nonsecure
+ * version of a banked exception, true for the secure version of a banked
+ * exception.
+ *
+ * Return whether an exception is "ready", i.e. whether the exception is
+ * enabled and is configured at a priority which would allow it to
+ * interrupt the current execution priority. This controls whether the
+ * RDY bit for it in the FPCCR is set.
+ */
+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure);
 /**
  * armv7m_nvic_raw_execution_priority: return the raw execution priority
  * @opaque: the NVIC
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ int armv7m_nvic_complete_irq(void *opaque, int irq, bool secure)
     return ret;
 }
 
+bool armv7m_nvic_get_ready_status(void *opaque, int irq, bool secure)
+{
+    /*
+     * Return whether an exception is "ready", i.e. it is enabled and is
+     * configured at a priority which would allow it to interrupt the
+     * current execution priority.
+     *
+     * irq and secure have the same semantics as for armv7m_nvic_set_pending():
+     * for non-banked exceptions secure is always false; for banked exceptions
+     * it indicates which of the exceptions is required.
+     */
+    NVICState *s = (NVICState *)opaque;
+    bool banked = exc_is_banked(irq);
+    VecInfo *vec;
+    int running = nvic_exec_prio(s);
+
+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
+    assert(!secure || banked);
+
+    /*
+     * HardFault is an odd special case: we always check against -1,
+     * even if we're secure and HardFault has priority -3; we never
+     * need to check for enabled state.
+     */
+    if (irq == ARMV7M_EXCP_HARD) {
+        return running > -1;
+    }
+
+    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
+
+    return vec->enabled &&
+        exc_group_prio(s, vec->prio, secure) < running;
+}
+
 /* callback when external interrupt line is changed */
 static void set_irq_level(void *opaque, int n, int level)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     env->thumb = addr & 1;
 }
 
+static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
+                             bool apply_splim)
+{
+    /*
+     * Like the pseudocode UpdateFPCCR: save state in FPCAR and FPCCR
+     * that we will need later in order to do lazy FP reg stacking.
+     */
+    bool is_secure = env->v7m.secure;
+    void *nvic = env->nvic;
+    /*
+     * Some bits are unbanked and live always in fpccr[M_REG_S]; some bits
+     * are banked and we want to update the bit in the bank for the
+     * current security state; and in one case we want to specifically
+     * update the NS banked version of a bit even if we are secure.
+     */
+    uint32_t *fpccr_s = &env->v7m.fpccr[M_REG_S];
+    uint32_t *fpccr_ns = &env->v7m.fpccr[M_REG_NS];
+    uint32_t *fpccr = &env->v7m.fpccr[is_secure];
+    bool hfrdy, bfrdy, mmrdy, ns_ufrdy, s_ufrdy, sfrdy, monrdy;
+
+    env->v7m.fpcar[is_secure] = frameptr & ~0x7;
+
+    if (apply_splim && arm_feature(env, ARM_FEATURE_V8)) {
+        bool splimviol;
+        uint32_t splim = v7m_sp_limit(env);
+        bool ign = armv7m_nvic_neg_prio_requested(nvic, is_secure) &&
+            (env->v7m.ccr[is_secure] & R_V7M_CCR_STKOFHFNMIGN_MASK);
+
+        splimviol = !ign && frameptr < splim;
+        *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, SPLIMVIOL, splimviol);
+    }
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, LSPACT, 1);
+
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, S, is_secure);
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, USER, arm_current_el(env) == 0);
+
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, THREAD,
+                        !arm_v7m_is_handler_mode(env));
+
+    hfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_HARD, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, HFRDY, hfrdy);
+
+    bfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_BUS, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, BFRDY, bfrdy);
+
+    mmrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_MEM, is_secure);
+    *fpccr = FIELD_DP32(*fpccr, V7M_FPCCR, MMRDY, mmrdy);
+
+    ns_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, false);
+    *fpccr_ns = FIELD_DP32(*fpccr_ns, V7M_FPCCR, UFRDY, ns_ufrdy);
+
+    monrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_DEBUG, false);
+    *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, MONRDY, monrdy);
+
+    if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+        s_ufrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_USAGE, true);
+        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, UFRDY, s_ufrdy);
+
+        sfrdy = armv7m_nvic_get_ready_status(nvic, ARMV7M_EXCP_SECURE, false);
+        *fpccr_s = FIELD_DP32(*fpccr_s, V7M_FPCCR, SFRDY, sfrdy);
+    }
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
                 }
             } else {
                 /* Lazy stacking enabled, save necessary info to stack later */
-                /* TODO : equivalent of UpdateFPCCR() pseudocode */
+                v7m_update_fpccr(env, frameptr + 0x20, true);
             }
         }
     }
-- 
2.20.1

For v8M floating point support, transitions from Secure
to Non-secure state via BXNS and BLXNS must clear the
CONTROL.SFPA bit. (This corresponds to the pseudocode
BranchToNS() function.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-13-peter.maydell@linaro.org
---
 target/arm/helper.c | 4 ++++
 1 file changed, 4 insertions(+)
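
Note for readers: the effect on the Secure CONTROL value, as a sketch.
The function names are hypothetical and CONTROL.SFPA is assumed to be
bit 3. For BXNS, bit 0 of the destination selects the target state, so
SFPA is cleared only when that bit is clear. The BLXNS helper models
only the path that actually transitions to Non-secure.

```c
#include <stdint.h>

#define CONTROL_SFPA (1u << 3)  /* CONTROL.SFPA */

/* Secure CONTROL value after BXNS to 'dest'. */
static uint32_t control_s_after_bxns(uint32_t control_s, uint32_t dest)
{
    if (!(dest & 1)) {
        control_s &= ~CONTROL_SFPA;     /* going to Non-secure */
    }
    return control_s;
}

/* Secure CONTROL value after a BLXNS that switches to Non-secure. */
static uint32_t control_s_after_blxns(uint32_t control_s)
{
    return control_s & ~CONTROL_SFPA;
}
```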

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_bxns)(CPUARMState *env, uint32_t dest)
     /* translate.c should have made BXNS UNDEF unless we're secure */
     assert(env->v7m.secure);
 
+    if (!(dest & 1)) {
+        env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
+    }
     switch_v7m_security_state(env, dest & 1);
     env->thumb = 1;
     env->regs[15] = dest & ~1;
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
          */
         write_v7m_exception(env, 1);
     }
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_SFPA_MASK;
     switch_v7m_security_state(env, 0);
     env->thumb = 1;
     env->regs[15] = dest;
-- 
2.20.1

The TailChain() pseudocode specifies that a tail chaining
exception should sanitize the excReturn all-ones bits and
(if there is no FPU) the excReturn FType bits; we weren't
doing this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-14-peter.maydell@linaro.org
---
 target/arm/helper.c | 8 ++++++++
 1 file changed, 8 insertions(+)
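
Note for readers: the sanitizing step in isolation, assuming the
architectural position EXC_RETURN.FType = bit 4.
sanitize_tailchain_lr() is a hypothetical name for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXCRET_FTYPE (1u << 4)  /* EXC_RETURN.FType */

static uint32_t sanitize_tailchain_lr(uint32_t lr, bool has_fpu)
{
    if (!has_fpu) {
        lr |= EXCRET_FTYPE;             /* no FPU: FType must read as 1 */
    }
    /* Force the top byte (the PREFIX field) to all-ones */
    return (lr & 0x00ffffffu) | 0xff000000u;
}
```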

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void v7m_exception_taken(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     qemu_log_mask(CPU_LOG_INT, "...taking pending %s exception %d\n",
                   targets_secure ? "secure" : "nonsecure", exc);
 
+    if (dotailchain) {
+        /* Sanitize LR FType and PREFIX bits */
+        if (!arm_feature(env, ARM_FEATURE_VFP)) {
+            lr |= R_V7M_EXCRET_FTYPE_MASK;
+        }
+        lr = deposit32(lr, 24, 8, 0xff);
+    }
+
     if (arm_feature(env, ARM_FEATURE_V8)) {
         if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
             (lr & R_V7M_EXCRET_S_MASK)) {
-- 
2.20.1

The magic value pushed onto the callee stack as an integrity
check is different if floating point is present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-15-peter.maydell@linaro.org
---
 target/arm/helper.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)
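
Note for readers: the signature computation as a standalone function,
assuming EXC_RETURN.FType = bit 4. integrity_sig() is a hypothetical
name; has_fpu stands in for the ARM_FEATURE_VFP check.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXCRET_FTYPE (1u << 4)  /* EXC_RETURN.FType */

static uint32_t integrity_sig(bool has_fpu, uint32_t lr)
{
    uint32_t sig = 0xfefa125au;

    /* Bit 0 follows FType, and is always 1 when there is no FPU */
    if (!has_fpu || (lr & EXCRET_FTYPE)) {
        sig |= 1;
    }
    return sig;
}
```

With no FPU this always yields 0xfefa125b, the constant the code used
before this patch.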

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ load_fail:
     return false;
 }
 
+static uint32_t v7m_integrity_sig(CPUARMState *env, uint32_t lr)
+{
+    /*
+     * Return the integrity signature value for the callee-saves
+     * stack frame section. @lr is the exception return payload/LR value
+     * whose FType bit forms bit 0 of the signature if FP is present.
+     */
+    uint32_t sig = 0xfefa125a;
+
+    if (!arm_feature(env, ARM_FEATURE_VFP) || (lr & R_V7M_EXCRET_FTYPE_MASK)) {
+        sig |= 1;
+    }
+    return sig;
+}
+
 static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
                                   bool ignore_faults)
 {
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     bool stacked_ok;
     uint32_t limit;
     bool want_psp;
+    uint32_t sig;
 
     if (dotailchain) {
         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     /* Write as much of the stack frame as we can. A write failure may
      * cause us to pend a derived exception.
      */
+    sig = v7m_integrity_sig(env, lr);
     stacked_ok =
-        v7m_stack_write(cpu, frameptr, 0xfefa125b, mmu_idx, ignore_faults) &&
+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
         v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
                         ignore_faults) &&
         v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
         if (return_to_secure &&
             ((excret & R_V7M_EXCRET_ES_MASK) == 0 ||
              (excret & R_V7M_EXCRET_DCRS_MASK) == 0)) {
-            uint32_t expected_sig = 0xfefa125b;
             uint32_t actual_sig;
 
             pop_ok = v7m_stack_read(cpu, &actual_sig, frameptr, mmu_idx);
 
-            if (pop_ok && expected_sig != actual_sig) {
+            if (pop_ok && v7m_integrity_sig(env, excret) != actual_sig) {
                 /* Take a SecureFault on the current stack */
                 env->v7m.sfsr |= R_V7M_SFSR_INVIS_MASK;
                 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
-- 
2.20.1

Handle floating point registers in exception return.
This corresponds to pseudocode functions ValidateExceptionReturn(),
ExceptionReturn(), PopStack() and ConsumeExcStackFrame().
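
As a rough illustration of the frame arithmetic in this patch, the
sketch below computes how many bytes of stack frame are consumed on
exception return and where each S register slot sits. The function
names are hypothetical and not part of QEMU; the constants (0x20,
0x48, 0x40 and the 8-byte skip) are taken from the diff that follows.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: bytes of stack frame consumed on exception return.
 * 'ftype' is the EXC_RETURN.FType bit; true means no FP frame is present.
 */
static uint32_t frame_bytes_consumed(bool ftype, bool restore_s16_s31)
{
    uint32_t bytes = 0x20;      /* basic integer frame */

    if (!ftype) {
        bytes += 0x48;          /* s0..s15 plus the 8 bytes holding FPSCR */
        if (restore_s16_s31) {
            bytes += 0x40;      /* s16..s31 */
        }
    }
    return bytes;
}

/* Offset from the frame base of the slot holding S<i>, for 0 <= i < 32. */
static uint32_t fp_slot_offset(unsigned i)
{
    uint32_t off = 0x20 + 4 * i;

    if (i >= 16) {
        off += 8;               /* skip the 8 bytes used for the FPSCR slot */
    }
    return off;
}
```

With these definitions S15 lands at offset 0x5c, the FPSCR at 0x60
and S16 at 0x68, which matches the addresses the patch reads from.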

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-16-peter.maydell@linaro.org
---
 target/arm/helper.c | 142 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 141 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
     bool rettobase = false;
     bool exc_secure = false;
     bool return_to_secure;
+    bool ftype;
+    bool restore_s16_s31;
 
     /* If we're not in Handler mode then jumps to magic exception-exit
      * addresses don't have magic behaviour. However for the v8M
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
                       excret);
     }
 
+    ftype = excret & R_V7M_EXCRET_FTYPE_MASK;
+
+    if (!arm_feature(env, ARM_FEATURE_VFP) && !ftype) {
+        qemu_log_mask(LOG_GUEST_ERROR, "M profile: zero FTYPE in exception "
+                      "exit PC value 0x%" PRIx32 " is UNPREDICTABLE "
+                      "if FPU not present\n",
+                      excret);
+        ftype = true;
+    }
+
     if (arm_feature(env, ARM_FEATURE_M_SECURITY)) {
         /* EXC_RETURN.ES validation check (R_SMFL). We must do this before
          * we pick which FAULTMASK to clear.
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
      */
     write_v7m_control_spsel_for_secstate(env, return_to_sp_process, exc_secure);
 
+    /*
+     * Clear scratch FP values left in caller saved registers; this
+     * must happen before any kind of tail chaining.
+     */
+    if ((env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_CLRONRET_MASK) &&
+        (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK)) {
+        if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
+            env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+            armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+            qemu_log_mask(CPU_LOG_INT, "...taking SecureFault on existing "
+                          "stackframe: error during lazy state deactivation\n");
+            v7m_exception_taken(cpu, excret, true, false);
+            return;
+        } else {
+            /* Clear s0..s15 and FPSCR */
+            int i;
+
+            for (i = 0; i < 16; i += 2) {
+                *aa32_vfp_dreg(env, i / 2) = 0;
+            }
+            vfp_set_fpscr(env, 0);
+        }
+    }
+
     if (sfault) {
         env->v7m.sfsr |= R_V7M_SFSR_INVER_MASK;
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
             }
         }
 
+        if (!ftype) {
+            /* FP present and we need to handle it */
+            if (!return_to_secure &&
+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK)) {
+                armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+                env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+                qemu_log_mask(CPU_LOG_INT,
+                              "...taking SecureFault on existing stackframe: "
+                              "Secure LSPACT set but exception return is "
+                              "not to secure state\n");
+                v7m_exception_taken(cpu, excret, true, false);
+                return;
+            }
+
+            restore_s16_s31 = return_to_secure &&
+                (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
+
+            if (env->v7m.fpccr[return_to_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+                /* State in FPU is still valid, just clear LSPACT */
+                env->v7m.fpccr[return_to_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
+            } else {
+                int i;
+                uint32_t fpscr;
+                bool cpacr_pass, nsacr_pass;
+
+                cpacr_pass = v7m_cpacr_pass(env, return_to_secure,
+                                            return_to_priv);
+                nsacr_pass = return_to_secure ||
+                    extract32(env->v7m.nsacr, 10, 1);
+
+                if (!cpacr_pass) {
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE,
+                                            return_to_secure);
+                    env->v7m.cfsr[return_to_secure] |= R_V7M_CFSR_NOCP_MASK;
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...taking UsageFault on existing "
+                                  "stackframe: CPACR.CP10 prevents unstacking "
+                                  "FP regs\n");
+                    v7m_exception_taken(cpu, excret, true, false);
+                    return;
+                } else if (!nsacr_pass) {
+                    armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, true);
+                    env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_INVPC_MASK;
+                    qemu_log_mask(CPU_LOG_INT,
+                                  "...taking Secure UsageFault on existing "
+                                  "stackframe: NSACR.CP10 prevents unstacking "
+                                  "FP regs\n");
+                    v7m_exception_taken(cpu, excret, true, false);
+                    return;
+                }
+
+                for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
+                    uint32_t slo, shi;
+                    uint64_t dn;
+                    uint32_t faddr = frameptr + 0x20 + 4 * i;
+
+                    if (i >= 16) {
+                        faddr += 8; /* Skip the slot for the FPSCR */
+                    }
+
+                    pop_ok = pop_ok &&
+                        v7m_stack_read(cpu, &slo, faddr, mmu_idx) &&
+                        v7m_stack_read(cpu, &shi, faddr + 4, mmu_idx);
+
+                    if (!pop_ok) {
+                        break;
+                    }
+
+                    dn = (uint64_t)shi << 32 | slo;
+                    *aa32_vfp_dreg(env, i / 2) = dn;
+                }
+                pop_ok = pop_ok &&
+                    v7m_stack_read(cpu, &fpscr, frameptr + 0x60, mmu_idx);
+                if (pop_ok) {
+                    vfp_set_fpscr(env, fpscr);
+                }
+                if (!pop_ok) {
+                    /*
+                     * These regs are 0 if security extension present;
+                     * otherwise merely UNKNOWN. We zero always.
+                     */
+                    for (i = 0; i < (restore_s16_s31 ? 32 : 16); i += 2) {
+                        *aa32_vfp_dreg(env, i / 2) = 0;
+                    }
+                    vfp_set_fpscr(env, 0);
+                }
+            }
+        }
+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
+                                               V7M_CONTROL, FPCA, !ftype);
+
         /* Commit to consuming the stack frame */
         frameptr += 0x20;
+        if (!ftype) {
+            frameptr += 0x48;
+            if (restore_s16_s31) {
+                frameptr += 0x40;
+            }
+        }
         /* Undo stack alignment (the SPREALIGN bit indicates that the original
          * pre-exception SP was not 8-aligned and we added a padding word to
          * align it, so we undo this by ORing in the bit that increases it
@@ -XXX,XX +XXX,XX @@ static void do_v7m_exception_exit(ARMCPU *cpu)
         *frame_sp_p = frameptr;
     }
     /* This xpsr_write() will invalidate frame_sp_p as it may switch stack */
-    xpsr_write(env, xpsr, ~XPSR_SPREALIGN);
+    xpsr_write(env, xpsr, ~(XPSR_SPREALIGN | XPSR_SFPA));
+
+    if (env->v7m.secure) {
+        bool sfpa = xpsr & XPSR_SFPA;
+
+        env->v7m.control[M_REG_S] = FIELD_DP32(env->v7m.control[M_REG_S],
+                                               V7M_CONTROL, SFPA, sfpa);
+    }
 
     /* The restored xPSR exception field will be zero if we're
      * resuming in Thread mode. If that doesn't match what the
-- 
2.20.1

Move the NS TBFLAG down from bit 19 to bit 6, which has not
been used since commit c1e3781090b9d36c60 in 2015, when we
started passing the entire MMU index in the TB flags rather
than just a 'privilege level' bit.

This rearrangement is not strictly necessary, but means that
we can put M-profile-only bits next to each other rather
than scattered across the flag word.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-17-peter.maydell@linaro.org
---
 target/arm/cpu.h | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

We are close to running out of TB flags for AArch32; we could
start using the cs_base word, but before we do that we can
economise on our usage by sharing the same bits for the VFP
VECSTRIDE field and the XScale XSCALE_CPAR field. This
works because no XScale CPU ever had VFP.
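
The bit sharing can be sketched in isolation as below. This is a
standalone model, not QEMU code: field_deposit() and field_extract()
are hypothetical stand-ins for the FIELD_DP32/FIELD_EX32 macros, and
pack_flags()/unpack_flags() are invented names. Only the field
position (bits 4..5) and the "XScale or VFP, never both" rule come
from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's field deposit/extract helpers. */
static uint32_t field_deposit(uint32_t word, unsigned shift, unsigned len,
                              uint32_t val)
{
    uint32_t mask = ((1u << len) - 1) << shift;

    return (word & ~mask) | ((val << shift) & mask);
}

static uint32_t field_extract(uint32_t word, unsigned shift, unsigned len)
{
    return (word >> shift) & ((1u << len) - 1);
}

/* VECSTRIDE and XSCALE_CPAR both live in bits 4..5 of the flag word. */
enum { SHARED_SHIFT = 4, SHARED_LEN = 2 };

struct decoded {
    uint32_t vec_stride;
    uint32_t c15_cpar;
};

/* Producer side: only one of the two values is ever stored. */
static uint32_t pack_flags(bool is_xscale, uint32_t vec_stride, uint32_t cpar)
{
    return field_deposit(0, SHARED_SHIFT, SHARED_LEN,
                         is_xscale ? cpar : vec_stride);
}

/* Consumer side: the CPU feature decides how to interpret the bits. */
static struct decoded unpack_flags(bool is_xscale, uint32_t flags)
{
    struct decoded d = { 0, 0 };
    uint32_t v = field_extract(flags, SHARED_SHIFT, SHARED_LEN);

    if (is_xscale) {
        d.c15_cpar = v;
    } else {
        d.vec_stride = v;
    }
    return d;
}
```

Both sides must agree on the feature test, which is why the patch
also asserts at realize time that no CPU has both features.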

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-18-peter.maydell@linaro.org
---
 target/arm/cpu.h       | 10 ++++++----
 target/arm/cpu.c       |  7 +++++++
 target/arm/helper.c    |  6 +++++-
 target/arm/translate.c |  9 +++++++--
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
 FIELD(TBFLAG_A32, THUMB, 0, 1)
 FIELD(TBFLAG_A32, VECLEN, 1, 3)
 FIELD(TBFLAG_A32, VECSTRIDE, 4, 2)
+/*
+ * We store the bottom two bits of the CPAR as TB flags and handle
+ * checks on the other bits at runtime. This shares the same bits as
+ * VECSTRIDE, which is OK as no XScale CPU has VFP.
+ */
+FIELD(TBFLAG_A32, XSCALE_CPAR, 4, 2)
 /*
  * Indicates whether cp register reads and writes by guest code should access
  * the secure or nonsecure bank of banked registers; note that this is not
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
-/* We store the bottom two bits of the CPAR as TB flags and handle
- * checks on the other bits at runtime
- */
-FIELD(TBFLAG_A32, XSCALE_CPAR, 17, 2)
 /* For M profile only, Handler (ie not Thread) mode */
 FIELD(TBFLAG_A32, HANDLER, 21, 1)
 /* For M profile only, whether we should generate stack-limit checks */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, ARM_FEATURE_THUMB_DSP);
     }
 
+    /*
+     * We rely on no XScale CPU having VFP so we can use the same bits in the
+     * TB flags field for VECSTRIDE and XSCALE_CPAR.
+     */
+    assert(!(arm_feature(env, ARM_FEATURE_VFP) &&
+             arm_feature(env, ARM_FEATURE_XSCALE)));
+
     if (arm_feature(env, ARM_FEATURE_V7) &&
         !arm_feature(env, ARM_FEATURE_M) &&
         !arm_feature(env, ARM_FEATURE_PMSA)) {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
             || arm_el_is_aa64(env, 1) || arm_feature(env, ARM_FEATURE_M)) {
             flags = FIELD_DP32(flags, TBFLAG_A32, VFPEN, 1);
         }
-        flags = FIELD_DP32(flags, TBFLAG_A32, XSCALE_CPAR, env->cp15.c15_cpar);
+        /* Note that XSCALE_CPAR shares bits with VECSTRIDE */
+        if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+            flags = FIELD_DP32(flags, TBFLAG_A32,
+                               XSCALE_CPAR, env->cp15.c15_cpar);
+        }
     }
 
     flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, arm_to_core_mmu_idx(mmu_idx));
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->fp_excp_el = FIELD_EX32(tb_flags, TBFLAG_ANY, FPEXC_EL);
     dc->vfp_enabled = FIELD_EX32(tb_flags, TBFLAG_A32, VFPEN);
     dc->vec_len = FIELD_EX32(tb_flags, TBFLAG_A32, VECLEN);
-    dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
-    dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+    if (arm_feature(env, ARM_FEATURE_XSCALE)) {
+        dc->c15_cpar = FIELD_EX32(tb_flags, TBFLAG_A32, XSCALE_CPAR);
+        dc->vec_stride = 0;
+    } else {
+        dc->vec_stride = FIELD_EX32(tb_flags, TBFLAG_A32, VECSTRIDE);
+        dc->c15_cpar = 0;
+    }
     dc->v7m_handler_mode = FIELD_EX32(tb_flags, TBFLAG_A32, HANDLER);
     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         regime_is_secure(env, dc->mmu_idx);
-- 
2.20.1

The M-profile FPCCR.S bit indicates the security status of
the floating point context. In the pseudocode ExecuteFPCheck()
function it is unconditionally set to match the current
security state whenever a floating point instruction is
executed.

Implement this by adding a new TB flag which tracks whether
FPCCR.S is different from the current security state, so
that we only need to emit the code to update it in the
less-common case when it is not already set correctly.

Note that we will add the handling for the other work done
by ExecuteFPCheck() in later commits.
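
A minimal standalone model of the idea, assuming a hypothetical
FPCCR_S_MASK constant and invented helper names (the real code uses
R_V7M_FPCCR_S_MASK and TCG ops): the flag is computed once per TB
lookup, and the fix-up only has to run when the flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mask for the FPCCR.S bit; the position is an assumption. */
#define FPCCR_S_MASK (1u << 2)

/* What the TB flag records: does FPCCR.S disagree with the current state? */
static bool fpccr_s_wrong(uint32_t fpccr, bool secure)
{
    return ((fpccr & FPCCR_S_MASK) != 0) != secure;
}

/* The fix-up emitted before the first FP insn when the flag is set. */
static uint32_t fpccr_claim(uint32_t fpccr, bool secure)
{
    return secure ? (fpccr | FPCCR_S_MASK) : (fpccr & ~FPCCR_S_MASK);
}
```

After fpccr_claim() the predicate is false for the same security
state, which is why later FP insns in the same TB need no update.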

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-19-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 ++
 target/arm/translate.h |  1 +
 target/arm/helper.c    |  5 +++++
 target/arm/translate.c | 20 ++++++++++++++++++++
 4 files changed, 28 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if FPCCR.S does not match current security state */
+FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
 /* For M profile only, Handler (ie not Thread) mode */
 FIELD(TBFLAG_A32, HANDLER, 21, 1)
 /* For M profile only, whether we should generate stack-limit checks */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v7m_handler_mode;
     bool v8m_secure; /* true if v8M and we're in Secure mode */
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
+    bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, STACKCHECK, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M_SECURITY) &&
+        FIELD_EX32(env->v7m.fpccr[M_REG_S], V7M_FPCCR, S) != env->v7m.secure) {
+        flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
         }
     }
 
+    if (arm_dc_feature(s, ARM_FEATURE_M)) {
+        /* Handle M-profile lazy FP state mechanics */
+
+        /* Update ownership of FP context: set FPCCR.S to match current state */
+        if (s->v8m_fpccr_s_wrong) {
+            TCGv_i32 tmp;
+
+            tmp = load_cpu_field(v7m.fpccr[M_REG_S]);
+            if (s->v8m_secure) {
+                tcg_gen_ori_i32(tmp, tmp, R_V7M_FPCCR_S_MASK);
+            } else {
+                tcg_gen_andi_i32(tmp, tmp, ~R_V7M_FPCCR_S_MASK);
+            }
+            store_cpu_field(tmp, v7m.fpccr[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v8m_fpccr_s_wrong = false;
+        }
+    }
+
     if (extract32(insn, 28, 4) == 0xf) {
         /*
          * Encodings with T=1 (Thumb) or unconditional (ARM):
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->v8m_secure = arm_feature(env, ARM_FEATURE_M_SECURITY) &&
         regime_is_secure(env, dc->mmu_idx);
     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
+    dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1

The M-profile FPCCR.ASPEN bit indicates that automatic floating-point
context preservation is enabled. Before executing any floating-point
instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits
indicate that there is no active floating point context then we
must create a new context (by initializing FPSCR and setting
FPCA/SFPA to indicate that the context is now active). In the
pseudocode this is handled by ExecuteFPCheck().

Implement this with a new TB flag which tracks whether we
need to create a new FP context.
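
The condition that sets the new flag can be written as a small pure
function. This is only a sketch with an invented name; the boolean
arguments stand for the banked FPCCR.ASPEN bit, CONTROL.FPCA,
CONTROL.SFPA and the current security state, as tested in the
helper.c hunk below.

```c
#include <stdbool.h>

/*
 * Illustrative predicate: a new FP context is needed when ASPEN is set
 * and either FPCA is clear, or we are Secure and SFPA is clear.
 */
static bool new_fp_ctxt_needed(bool aspen, bool fpca, bool sfpa, bool secure)
{
    return aspen && (!fpca || (secure && !sfpa));
}
```

Note that SFPA only matters in Secure state; in Non-secure state an
active FPCA is enough to say a context already exists.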

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-20-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 ++
 target/arm/translate.h |  1 +
 target/arm/helper.c    | 13 +++++++++++++
 target/arm/translate.c | 29 +++++++++++++++++++++++++++++
 4 files changed, 45 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if we must create a new FP context */
+FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
 /* For M profile only, set if FPCCR.S does not match current security state */
 FIELD(TBFLAG_A32, FPCCR_S_WRONG, 20, 1)
 /* For M profile only, Handler (ie not Thread) mode */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v8m_secure; /* true if v8M and we're in Secure mode */
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
+    bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, FPCCR_S_WRONG, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M) &&
+        (env->v7m.fpccr[env->v7m.secure] & R_V7M_FPCCR_ASPEN_MASK) &&
+        (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) ||
+         (env->v7m.secure &&
+          !(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)))) {
+        /*
+         * ASPEN is set, but FPCA/SFPA indicate that there is no active
+         * FP context; we must create a new FP context before executing
+         * any FP insn.
+         */
+        flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
             /* Don't need to do this for any further FP insns in this TB */
             s->v8m_fpccr_s_wrong = false;
         }
+
+        if (s->v7m_new_fp_ctxt_needed) {
+            /*
+             * Create new FP context by updating CONTROL.FPCA, CONTROL.SFPA
+             * and the FPSCR.
+             */
+            TCGv_i32 control, fpscr;
+            uint32_t bits = R_V7M_CONTROL_FPCA_MASK;
+
+            fpscr = load_cpu_field(v7m.fpdscr[s->v8m_secure]);
+            gen_helper_vfp_set_fpscr(cpu_env, fpscr);
+            tcg_temp_free_i32(fpscr);
+            /*
+             * We don't need to arrange to end the TB, because the only
+             * parts of FPSCR which we cache in the TB flags are the VECLEN
+             * and VECSTRIDE, and those don't exist for M-profile.
+             */
+
+            if (s->v8m_secure) {
+                bits |= R_V7M_CONTROL_SFPA_MASK;
+            }
+            control = load_cpu_field(v7m.control[M_REG_S]);
+            tcg_gen_ori_i32(control, control, bits);
+            store_cpu_field(control, v7m.control[M_REG_S]);
+            /* Don't need to do this for any further FP insns in this TB */
+            s->v7m_new_fp_ctxt_needed = false;
+        }
     }
 
     if (extract32(insn, 28, 4) == 0xf) {
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
         regime_is_secure(env, dc->mmu_idx);
     dc->v8m_stackcheck = FIELD_EX32(tb_flags, TBFLAG_A32, STACKCHECK);
     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
+    dc->v7m_new_fp_ctxt_needed =
+        FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1

Add a new helper function which returns the MMU index to use
for v7M, where the caller specifies all of the security
state, privilege level and whether the execution priority
is negative, and reimplement the existing
arm_v7m_mmu_idx_for_secstate_and_priv() in terms of it.

We are going to need this for the lazy-FP-stacking code.
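
The shape of the refactoring, reduced to a standalone sketch: one
function takes every input explicitly, and the old entry point
becomes a thin wrapper that looks up the missing input. The index
bit values and struct cpu_model here are hypothetical and are not
the real ARM_MMU_IDX_M_* encodings or NVIC state.

```c
#include <stdbool.h>

/* Hypothetical index bits, for illustration only. */
enum { IDX_PRIV = 1 << 0, IDX_NEGPRI = 1 << 1, IDX_SECURE = 1 << 2 };

/* Hypothetical stand-in for the state the NVIC would be asked about. */
struct cpu_model {
    bool neg_prio_ns;
    bool neg_prio_s;
};

/* All inputs supplied by the caller; no hidden state is consulted. */
static unsigned mmu_idx_all(bool secstate, bool priv, bool negpri)
{
    unsigned idx = 0;

    if (priv) {
        idx |= IDX_PRIV;
    }
    if (negpri) {
        idx |= IDX_NEGPRI;
    }
    if (secstate) {
        idx |= IDX_SECURE;
    }
    return idx;
}

/* Existing interface, now implemented in terms of the explicit one. */
static unsigned mmu_idx_for_secstate_and_priv(const struct cpu_model *cpu,
                                              bool secstate, bool priv)
{
    bool negpri = secstate ? cpu->neg_prio_s : cpu->neg_prio_ns;

    return mmu_idx_all(secstate, priv, negpri);
}
```

A caller that already holds a saved "priority was negative" value
can call the explicit function directly instead of asking for the
current state.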

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-21-peter.maydell@linaro.org
---
 target/arm/cpu.h    |  7 +++++++
 target/arm/helper.c | 14 +++++++++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ static inline int arm_mmu_idx_to_el(ARMMMUIdx mmu_idx)
     }
 }
 
+/*
+ * Return the MMU index for a v7M CPU with all relevant information
+ * manually specified.
+ */
+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+                              bool secstate, bool priv, bool negpri);
+
 /* Return the MMU index for a v7M CPU in the specified security and
  * privilege state.
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ int fp_exception_el(CPUARMState *env, int cur_el)
     return 0;
 }
 
-ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
-                                                bool secstate, bool priv)
+ARMMMUIdx arm_v7m_mmu_idx_all(CPUARMState *env,
+                              bool secstate, bool priv, bool negpri)
 {
     ARMMMUIdx mmu_idx = ARM_MMU_IDX_M;
 
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
         mmu_idx |= ARM_MMU_IDX_M_PRIV;
     }
 
-    if (armv7m_nvic_neg_prio_requested(env->nvic, secstate)) {
+    if (negpri) {
         mmu_idx |= ARM_MMU_IDX_M_NEGPRI;
     }
 
@@ -XXX,XX +XXX,XX @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
     return mmu_idx;
 }
 
+ARMMMUIdx arm_v7m_mmu_idx_for_secstate_and_priv(CPUARMState *env,
+                                                bool secstate, bool priv)
+{
+    bool negpri = armv7m_nvic_neg_prio_requested(env->nvic, secstate);
+
+    return arm_v7m_mmu_idx_all(env, secstate, priv, negpri);
+}
+
 /* Return the MMU index for a v7M CPU in the specified security state */
 ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState *env, bool secstate)
 {
-- 
2.20.1

In the v7M architecture, if an exception is generated in the process
of doing the lazy stacking of FP registers, possible escalation
to HardFault is handled differently from the normal
approach: it works based on the saved information about exception
readiness that was stored in the FPCCR when the stack frame was
created. Provide a new function armv7m_nvic_set_pending_lazyfp()
which pends exceptions during lazy stacking, and implements
this logic.

This corresponds to the pseudocode TakePreserveFPException().
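
The escalation decision can be sketched on its own as below. The
enum, the mask values and the function name are illustrative
assumptions rather than QEMU definitions; what comes from the patch
is which saved "ready" bit is consulted for each fault, and that
MemManage and UsageFault use the banked FPCCR while BusFault and
SecureFault use the Secure copy.

```c
#include <stdbool.h>
#include <stdint.h>

enum lazyfp_exc { EXC_MEM, EXC_BUS, EXC_USAGE, EXC_SECURE };

/* Illustrative masks for the saved "ready" bits; positions are assumed. */
#define FPCCR_MMRDY (1u << 5)
#define FPCCR_BFRDY (1u << 6)
#define FPCCR_SFRDY (1u << 7)
#define FPCCR_UFRDY (1u << 10)

/*
 * Return true if a fault raised during lazy FP stacking must escalate
 * to HardFault. The answer depends only on the FPCCR snapshot taken when
 * the frame was created, not on the current CPU or NVIC state.
 */
static bool lazyfp_escalates(enum lazyfp_exc exc, uint32_t fpccr_banked,
                             uint32_t fpccr_s)
{
    switch (exc) {
    case EXC_MEM:
        return !(fpccr_banked & FPCCR_MMRDY);
    case EXC_USAGE:
        return !(fpccr_banked & FPCCR_UFRDY);
    case EXC_BUS:
        return !(fpccr_s & FPCCR_BFRDY);
    case EXC_SECURE:
        return !(fpccr_s & FPCCR_SFRDY);
    }
    return true; /* not reached for valid enum values */
}
```

DebugMonitor is left out of the sketch because the patch drops it
entirely when MONRDY is clear rather than escalating it.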

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-22-peter.maydell@linaro.org
---
 target/arm/cpu.h      | 12 ++++++
 hw/intc/armv7m_nvic.c | 96 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending(void *opaque, int irq, bool secure);
  * a different exception).
  */
 void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure);
+/**
+ * armv7m_nvic_set_pending_lazyfp: mark this lazy FP exception as pending
+ * @opaque: the NVIC
+ * @irq: the exception number to mark pending
+ * @secure: false for non-banked exceptions or for the nonsecure
+ * version of a banked exception, true for the secure version of a banked
+ * exception.
+ *
+ * Similar to armv7m_nvic_set_pending(), but specifically for exceptions
+ * generated in the course of lazy stacking of FP registers.
+ */
+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure);
 /**
  * armv7m_nvic_get_pending_irq_info: return highest priority pending
  *    exception, and whether it targets Secure state
diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -XXX,XX +XXX,XX @@ void armv7m_nvic_set_pending_derived(void *opaque, int irq, bool secure)
     do_armv7m_nvic_set_pending(opaque, irq, secure, true);
 }
 
+void armv7m_nvic_set_pending_lazyfp(void *opaque, int irq, bool secure)
+{
+    /*
+     * Pend an exception during lazy FP stacking. This differs
+     * from the usual exception pending because the logic for
+     * whether we should escalate depends on the saved context
+     * in the FPCCR register, not on the current state of the CPU/NVIC.
+     */
+    NVICState *s = (NVICState *)opaque;
+    bool banked = exc_is_banked(irq);
+    VecInfo *vec;
+    bool targets_secure;
+    bool escalate = false;
+    /*
+     * We will only look at bits in fpccr if this is a banked exception
+     * (in which case 'secure' tells us whether it is the S or NS version).
+     * All the bits for the non-banked exceptions are in fpccr_s.
+     */
+    uint32_t fpccr_s = s->cpu->env.v7m.fpccr[M_REG_S];
+    uint32_t fpccr = s->cpu->env.v7m.fpccr[secure];
+
+    assert(irq > ARMV7M_EXCP_RESET && irq < s->num_irq);
+    assert(!secure || banked);
+
+    vec = (banked && secure) ? &s->sec_vectors[irq] : &s->vectors[irq];
+
+    targets_secure = banked ? secure : exc_targets_secure(s, irq);
+
+    switch (irq) {
+    case ARMV7M_EXCP_DEBUG:
+        if (!(fpccr_s & R_V7M_FPCCR_MONRDY_MASK)) {
+            /* Ignore DebugMonitor exception */
+            return;
+        }
+        break;
+    case ARMV7M_EXCP_MEM:
+        escalate = !(fpccr & R_V7M_FPCCR_MMRDY_MASK);
+        break;
+    case ARMV7M_EXCP_USAGE:
+        escalate = !(fpccr & R_V7M_FPCCR_UFRDY_MASK);
+        break;
+    case ARMV7M_EXCP_BUS:
+        escalate = !(fpccr_s & R_V7M_FPCCR_BFRDY_MASK);
+        break;
+    case ARMV7M_EXCP_SECURE:
+        escalate = !(fpccr_s & R_V7M_FPCCR_SFRDY_MASK);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    if (escalate) {
+        /*
+         * Escalate to HardFault: faults that initially targeted Secure
+         * continue to do so, even if HF normally targets NonSecure.
+         */
+        irq = ARMV7M_EXCP_HARD;
+        if (arm_feature(&s->cpu->env, ARM_FEATURE_M_SECURITY) &&
+            (targets_secure ||
+             !(s->cpu->env.v7m.aircr & R_V7M_AIRCR_BFHFNMINS_MASK))) {
+            vec = &s->sec_vectors[irq];
+        } else {
+            vec = &s->vectors[irq];
+        }
+    }
+
+    if (!vec->enabled ||
+        nvic_exec_prio(s) <= exc_group_prio(s, vec->prio, secure)) {
+        if (!(fpccr_s & R_V7M_FPCCR_HFRDY_MASK)) {
+            /*
+             * We want to escalate to HardFault but the context the
+             * FP state belongs to prevents the exception pre-empting.
+             */
+            cpu_abort(&s->cpu->parent_obj,
+                      "Lockup: can't escalate to HardFault during "
+                      "lazy FP register stacking\n");
+        }
+    }
+
+    if (escalate) {
+        s->cpu->env.v7m.hfsr |= R_V7M_HFSR_FORCED_MASK;
+    }
+    if (!vec->pending) {
+        vec->pending = 1;
+        /*
+         * We do not call nvic_irq_update(), because we know our caller
+         * is going to handle causing us to take the exception by
+         * raising EXCP_LAZYFP, so raising the IRQ line would be
+         * pointless extra work. We just need to recompute the
+         * priorities so that armv7m_nvic_can_take_pending_exception()
+         * returns the right answer.
+         */
+        nvic_recompute_state(s);
+    }
+}
+
 /* Make pending IRQ active.  */
 void armv7m_nvic_acknowledge_irq(void *opaque)
 {
-- 
2.20.1

Pushing registers to the stack for v7M needs to handle three cases:
 * the "normal" case where we pend exceptions
 * an "ignore faults" case where we set FSR bits but
   do not pend exceptions (this is used when we are
   handling some kinds of derived exception on exception entry)
 * a "lazy FP stacking" case, where different FSR bits
   are set and the exception is pended differently

Implement this by changing the existing flag argument that
tells us whether to ignore faults or not into an enum that
specifies which of the 3 modes we should handle.
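
As a sketch of what the mode selects, the helpers below return the
name of the fault status bit used for a failed MPU/SAU lookup and
whether an exception is pended at all. The helper names are invented
for illustration; the enum and the bit names are those that appear
in the diff, and only the lookup-failure path shown there is modelled.

```c
#include <stdbool.h>

typedef enum StackingMode {
    STACK_NORMAL,
    STACK_IGNFAULTS,
    STACK_LAZYFP,
} StackingMode;

/* Which status bit is set when the MPU/SAU lookup for the write fails. */
static const char *stack_fault_bit(StackingMode mode, bool secure_fault)
{
    if (secure_fault) {
        return mode == STACK_LAZYFP ? "SFSR.LSPERR" : "SFSR.AUVIOL";
    }
    return mode == STACK_LAZYFP ? "CFSR.MLSPERR" : "CFSR.MSTKERR";
}

/* Only the "ignore faults" mode records the status bit without pending. */
static bool stack_fault_pends_exception(StackingMode mode)
{
    return mode != STACK_IGNFAULTS;
}
```

An enum makes the three cases explicit at each call site, where a
second bool argument would have been easy to pass in the wrong order.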

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-23-peter.maydell@linaro.org
---
 target/arm/helper.c | 118 +++++++++++++++++++++++++++++---------------
 1 file changed, 79 insertions(+), 39 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ static bool v7m_cpacr_pass(CPUARMState *env, bool is_secure, bool is_priv)
     }
 }
 
+/*
+ * What kind of stack write are we doing? This affects how exceptions
+ * generated during the stacking are treated.
+ */
+typedef enum StackingMode {
+    STACK_NORMAL,
+    STACK_IGNFAULTS,
+    STACK_LAZYFP,
+} StackingMode;
+
 static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
-                            ARMMMUIdx mmu_idx, bool ignfault)
+                            ARMMMUIdx mmu_idx, StackingMode mode)
 {
     CPUState *cs = CPU(cpu);
     CPUARMState *env = &cpu->env;
@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                       &attrs, &prot, &page_size, &fi, NULL)) {
         /* MPU/SAU lookup failed */
         if (fi.type == ARMFault_QEMU_SFault) {
-            qemu_log_mask(CPU_LOG_INT,
-                          "...SecureFault with SFSR.AUVIOL during stacking\n");
-            env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK | R_V7M_SFSR_SFARVALID_MASK;
+            if (mode == STACK_LAZYFP) {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...SecureFault with SFSR.LSPERR "
+                              "during lazy stacking\n");
+                env->v7m.sfsr |= R_V7M_SFSR_LSPERR_MASK;
+            } else {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...SecureFault with SFSR.AUVIOL "
+                              "during stacking\n");
+                env->v7m.sfsr |= R_V7M_SFSR_AUVIOL_MASK;
+            }
+            env->v7m.sfsr |= R_V7M_SFSR_SFARVALID_MASK;
             env->v7m.sfar = addr;
             exc = ARMV7M_EXCP_SECURE;
             exc_secure = false;
         } else {
-            qemu_log_mask(CPU_LOG_INT, "...MemManageFault with CFSR.MSTKERR\n");
-            env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+            if (mode == STACK_LAZYFP) {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...MemManageFault with CFSR.MLSPERR\n");
+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MLSPERR_MASK;
+            } else {
+                qemu_log_mask(CPU_LOG_INT,
+                              "...MemManageFault with CFSR.MSTKERR\n");
+                env->v7m.cfsr[secure] |= R_V7M_CFSR_MSTKERR_MASK;
+            }
             exc = ARMV7M_EXCP_MEM;
             exc_secure = secure;
         }
@@ -XXX,XX +XXX,XX @@ static bool v7m_stack_write(ARMCPU *cpu, uint32_t addr, uint32_t value,
                          attrs, &txres);
     if (txres != MEMTX_OK) {
         /* BusFault trying to write the data */
-        qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
-        env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+        if (mode == STACK_LAZYFP) {
+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.LSPERR\n");
+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_LSPERR_MASK;
+        } else {
+            qemu_log_mask(CPU_LOG_INT, "...BusFault with BFSR.STKERR\n");
+            env->v7m.cfsr[M_REG_NS] |= R_V7M_CFSR_STKERR_MASK;
+        }
         exc = ARMV7M_EXCP_BUS;
         exc_secure = false;
         goto pend_fault;
@@ -XXX,XX +XXX,XX @@ pend_fault:
      * later if we have two derived exceptions.
      * The only case when we must not pend the exception but instead
      * throw it away is if we are doing the push of the callee registers
-     * and we've already generated a derived exception. Even in this
-     * case we will still update the fault status registers.
+     * and we've already generated a derived exception (this is indicated
+     * by the caller passing STACK_IGNFAULTS). Even in this case we will
+     * still update the fault status registers.
      */
-    if (!ignfault) {
+    switch (mode) {
+    case STACK_NORMAL:
         armv7m_nvic_set_pending_derived(env->nvic, exc, exc_secure);
+        break;
+    case STACK_LAZYFP:
+        armv7m_nvic_set_pending_lazyfp(env->nvic, exc, exc_secure);
+        break;
+    case STACK_IGNFAULTS:
+        break;
     }
     return false;
 }
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
     uint32_t limit;
     bool want_psp;
     uint32_t sig;
+    StackingMode smode = ignore_faults ? STACK_IGNFAULTS : STACK_NORMAL;
 
     if (dotailchain) {
         bool mode = lr & R_V7M_EXCRET_MODE_MASK;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_callee_stack(ARMCPU *cpu, uint32_t lr, bool dotailchain,
      */
     sig = v7m_integrity_sig(env, lr);
     stacked_ok =
-        v7m_stack_write(cpu, frameptr, sig, mmu_idx, ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx,
-                        ignore_faults) &&
-        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx,
-                        ignore_faults);
+        v7m_stack_write(cpu, frameptr, sig, mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x8, env->regs[4], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0xc, env->regs[5], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x10, env->regs[6], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x14, env->regs[7], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x18, env->regs[8], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x1c, env->regs[9], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x20, env->regs[10], mmu_idx, smode) &&
+        v7m_stack_write(cpu, frameptr + 0x24, env->regs[11], mmu_idx, smode);
 
     /* Update SP regardless of whether any of the stack accesses failed. */
     *frame_sp_p = frameptr;
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
      * if it has higher priority).
      */
     stacked_ok = stacked_ok &&
-        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 4, env->regs[1], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 8, env->regs[2], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 12, env->regs[3], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 16, env->regs[12], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 20, env->regs[14], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 24, env->regs[15], mmu_idx, false) &&
-        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, false);
+        v7m_stack_write(cpu, frameptr, env->regs[0], mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 4, env->regs[1],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 8, env->regs[2],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 12, env->regs[3],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 16, env->regs[12],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 20, env->regs[14],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 24, env->regs[15],
+                        mmu_idx, STACK_NORMAL) &&
+        v7m_stack_write(cpu, frameptr + 28, xpsr, mmu_idx, STACK_NORMAL);
 
     if (env->v7m.control[M_REG_S] & R_V7M_CONTROL_FPCA_MASK) {
         /* FPU is active, try to save its registers */
@@ -XXX,XX +XXX,XX @@ static bool v7m_push_stack(ARMCPU *cpu)
                         faddr += 8; /* skip the slot for the FPSCR */
                     }
                     stacked_ok = stacked_ok &&
-                        v7m_stack_write(cpu, faddr, slo, mmu_idx, false) &&
-                        v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, false);
+                        v7m_stack_write(cpu, faddr, slo,
+                                        mmu_idx, STACK_NORMAL) &&
+                        v7m_stack_write(cpu, faddr + 4, shi,
+                                        mmu_idx, STACK_NORMAL);
                 }
                 stacked_ok = stacked_ok &&
                     v7m_stack_write(cpu, frameptr + 0x60,
-                                    vfp_get_fpscr(env), mmu_idx, false);
+                                    vfp_get_fpscr(env), mmu_idx, STACK_NORMAL);
                 if (cpacr_pass) {
                     for (i = 0; i < ((framesize == 0xa8) ? 32 : 16); i += 2) {
                         *aa32_vfp_dreg(env, i / 2) = 0;
-- 
2.20.1

The M-profile architecture floating point system supports
lazy FP state preservation, where FP registers are not
pushed to the stack when an exception occurs but are instead
only saved if and when the first FP instruction in the exception
handler is executed. Implement this in QEMU, corresponding
to the check of LSPACT in the pseudocode ExecuteFPCheck().
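
The deferred save described above can be modelled with a toy state
machine. This is only a sketch under simplifying assumptions (16
registers, no faults, no security banking); ToyFpu, toy_exception_entry()
and toy_fp_check() are hypothetical names and not part of QEMU.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_SREGS 16

typedef struct ToyFpu {
    uint32_t sregs[NUM_SREGS]; /* live FP registers */
    bool lspact;               /* models FPCCR.LSPACT */
    uint32_t *frame;           /* models FPCAR: the reserved stack space */
    unsigned saves;            /* how many times registers were stacked */
} ToyFpu;

/* Exception entry: reserve the space, but do not copy the registers yet. */
void toy_exception_entry(ToyFpu *fpu, uint32_t *reserved_frame)
{
    fpu->frame = reserved_frame;
    fpu->lspact = true;
}

/* Run before each FP instruction, like the LSPACT check in ExecuteFPCheck(). */
void toy_fp_check(ToyFpu *fpu)
{
    if (!fpu->lspact) {
        return; /* nothing outstanding, the common fast path */
    }
    memcpy(fpu->frame, fpu->sregs, sizeof(fpu->sregs));
    fpu->saves++;
    fpu->lspact = false; /* later FP instructions skip the save */
}
```

In this model a handler that never executes an FP instruction never pays
for the copy, which is the point of lazy preservation.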

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-24-peter.maydell@linaro.org
---
 target/arm/cpu.h       |   3 ++
 target/arm/helper.h    |   2 +
 target/arm/translate.h |   1 +
 target/arm/helper.c    | 112 +++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c |  22 ++++++++
 5 files changed, 140 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_NOCP           17   /* v7M NOCP UsageFault */
 #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
 #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
+#define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
@@ -XXX,XX +XXX,XX @@ FIELD(TBFLAG_A32, NS, 6, 1)
 FIELD(TBFLAG_A32, VFPEN, 7, 1)
 FIELD(TBFLAG_A32, CONDEXEC, 8, 8)
 FIELD(TBFLAG_A32, SCTLR_B, 16, 1)
+/* For M profile only, set if FPCCR.LSPACT is set */
+FIELD(TBFLAG_A32, LSPACT, 18, 1)
 /* For M profile only, set if we must create a new FP context */
 FIELD(TBFLAG_A32, NEW_FP_CTXT_NEEDED, 19, 1)
 /* For M profile only, set if FPCCR.S does not match current security state */
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(v7m_blxns, void, env, i32)
 
 DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 
+DEF_HELPER_1(v7m_preserve_fp_state, void, env)
+
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
 DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef struct DisasContext {
     bool v8m_stackcheck; /* true if we need to perform v8M stack limit checks */
     bool v8m_fpccr_s_wrong; /* true if v8M FPCCR.S != v8m_secure */
     bool v7m_new_fp_ctxt_needed; /* ASPEN set but no active FP context */
+    bool v7m_lspact; /* FPCCR.LSPACT set */
     /* Immediate value in AArch32 SVC insn; must be set if is_jmp == DISAS_SWI
      * so that top level loop can generate correct syndrome information.
      */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ pend_fault:
     return false;
 }
 
+void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
+{
+    /*
+     * Preserve FP state (because LSPACT was set and we are about
+     * to execute an FP instruction). This corresponds to the
+     * PreserveFPState() pseudocode.
+     * We may throw an exception if the stacking fails.
+     */
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+    bool negpri = !(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_HFRDY_MASK);
+    bool is_priv = !(env->v7m.fpccr[is_secure] & R_V7M_FPCCR_USER_MASK);
+    bool splimviol = env->v7m.fpccr[is_secure] & R_V7M_FPCCR_SPLIMVIOL_MASK;
+    uint32_t fpcar = env->v7m.fpcar[is_secure];
+    bool stacked_ok = true;
+    bool ts = is_secure && (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK);
+    bool take_exception;
+
+    /* Take the iothread lock as we are going to touch the NVIC */
+    qemu_mutex_lock_iothread();
+
+    /* Check the background context had access to the FPU */
+    if (!v7m_cpacr_pass(env, is_secure, is_priv)) {
+        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, is_secure);
+        env->v7m.cfsr[is_secure] |= R_V7M_CFSR_NOCP_MASK;
+        stacked_ok = false;
+    } else if (!is_secure && !extract32(env->v7m.nsacr, 10, 1)) {
+        armv7m_nvic_set_pending_lazyfp(env->nvic, ARMV7M_EXCP_USAGE, M_REG_S);
+        env->v7m.cfsr[M_REG_S] |= R_V7M_CFSR_NOCP_MASK;
+        stacked_ok = false;
+    }
+
+    if (!splimviol && stacked_ok) {
+        /* We only stack if the stack limit wasn't violated */
+        int i;
+        ARMMMUIdx mmu_idx;
+
+        mmu_idx = arm_v7m_mmu_idx_all(env, is_secure, is_priv, negpri);
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+            uint32_t faddr = fpcar + 4 * i;
+            uint32_t slo = extract64(dn, 0, 32);
+            uint32_t shi = extract64(dn, 32, 32);
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+            stacked_ok = stacked_ok &&
+                v7m_stack_write(cpu, faddr, slo, mmu_idx, STACK_LAZYFP) &&
+                v7m_stack_write(cpu, faddr + 4, shi, mmu_idx, STACK_LAZYFP);
+        }
+
+        stacked_ok = stacked_ok &&
+            v7m_stack_write(cpu, fpcar + 0x40,
+                            vfp_get_fpscr(env), mmu_idx, STACK_LAZYFP);
+    }
+
+    /*
+     * If stacking failed we have pended an exception, but it's possible
+     * that it cannot be taken now. If its priority permits us
+     * to take it now, then we must not update the LSPACT or FP regs,
+     * but instead jump out to take the exception immediately.
+     * If it's just pending and won't be taken until the current
+     * handler exits, then we do update LSPACT and the FP regs.
+     */
+    take_exception = !stacked_ok &&
+        armv7m_nvic_can_take_pending_exception(env->nvic);
+
+    qemu_mutex_unlock_iothread();
+
+    if (take_exception) {
+        raise_exception_ra(env, EXCP_LAZYFP, 0, 1, GETPC());
+    }
+
+    env->v7m.fpccr[is_secure] &= ~R_V7M_FPCCR_LSPACT_MASK;
+
+    if (ts) {
+        /* Clear s0 to s31 and the FPSCR */
+        int i;
+
+        for (i = 0; i < 32; i += 2) {
+            *aa32_vfp_dreg(env, i / 2) = 0;
+        }
+        vfp_set_fpscr(env, 0);
+    }
+    /*
+     * Otherwise s0 to s15 and FPSCR are UNKNOWN; we choose to leave them
+     * unchanged.
+     */
+}
+
 /* Write to v7M CONTROL.SPSEL bit for the specified security bank.
  * This may change the current stack pointer between Main and Process
  * stack pointers if it is done for the CONTROL register for the current
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
             [EXCP_NOCP] = "v7M NOCP UsageFault",
             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
             [EXCP_STKOF] = "v8M STKOF UsageFault",
+            [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
             return;
         }
         break;
+    case EXCP_LAZYFP:
+        /*
+         * We already pended the specific exception in the NVIC in the
+         * v7m_preserve_fp_state() helper function.
+         */
+        break;
     default:
         cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
         return; /* Never happens.  Keep compiler happy.  */
@@ -XXX,XX +XXX,XX @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         flags = FIELD_DP32(flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED, 1);
     }
 
+    if (arm_feature(env, ARM_FEATURE_M)) {
+        bool is_secure = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+
+        if (env->v7m.fpccr[is_secure] & R_V7M_FPCCR_LSPACT_MASK) {
+            flags = FIELD_DP32(flags, TBFLAG_A32, LSPACT, 1);
+        }
+    }
+
     *pflags = flags;
     *cs_base = 0;
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_vfp_insn(DisasContext *s, uint32_t insn)
     if (arm_dc_feature(s, ARM_FEATURE_M)) {
         /* Handle M-profile lazy FP state mechanics */
 
+        /* Trigger lazy-state preservation if necessary */
+        if (s->v7m_lspact) {
+            /*
+             * Lazy state saving affects external memory and also the NVIC,
+             * so we must mark it as an IO operation for icount.
+             */
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_start();
+            }
+            gen_helper_v7m_preserve_fp_state(cpu_env);
+            if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+                gen_io_end();
+            }
+            /*
+             * If the preserve_fp_state helper doesn't throw an exception
+             * then it will clear LSPACT; we don't need to repeat this for
+             * any further FP insns in this TB.
+             */
+            s->v7m_lspact = false;
+        }
+
         /* Update ownership of FP context: set FPCCR.S to match current state */
         if (s->v8m_fpccr_s_wrong) {
             TCGv_i32 tmp;
@@ -XXX,XX +XXX,XX @@ static void arm_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     dc->v8m_fpccr_s_wrong = FIELD_EX32(tb_flags, TBFLAG_A32, FPCCR_S_WRONG);
     dc->v7m_new_fp_ctxt_needed =
         FIELD_EX32(tb_flags, TBFLAG_A32, NEW_FP_CTXT_NEEDED);
+    dc->v7m_lspact = FIELD_EX32(tb_flags, TBFLAG_A32, LSPACT);
     dc->cp_regs = cpu->cp_regs;
     dc->features = env->features;
 
-- 
2.20.1

Implement the VLSTM instruction for v7M for the FPU present case.
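
The helper added here stores each S register at an offset from the frame
pointer and skips over the FPSCR slot. A minimal sketch of that address
arithmetic, assuming the layout used by the loop in this patch;
fp_frame_sreg_offset() and FP_FRAME_FPSCR_OFFSET are hypothetical names
introduced only for illustration:

```c
#include <stdint.h>

/* Offset of the FPSCR slot within the FP part of the frame. */
#define FP_FRAME_FPSCR_OFFSET 0x40u

/*
 * Byte offset of S register 'sreg' (0..31) from the frame pointer.
 * s0..s15 are contiguous; s16..s31 sit 8 bytes further on, after
 * the FPSCR slot and the word that follows it.
 */
uint32_t fp_frame_sreg_offset(unsigned sreg)
{
    uint32_t off = 4u * sreg;

    if (sreg >= 16) {
        off += 8; /* skip the slot for the FPSCR */
    }
    return off;
}
```

The patch iterates over D registers and writes two words per step, which
yields the same offsets as computing them per S register.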

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-25-peter.maydell@linaro.org
---
 target/arm/cpu.h       |  2 +
 target/arm/helper.h    |  2 +
 target/arm/helper.c    | 84 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c | 15 +++++++-
 4 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@
 #define EXCP_INVSTATE       18   /* v7M INVSTATE UsageFault */
 #define EXCP_STKOF          19   /* v8M STKOF UsageFault */
 #define EXCP_LAZYFP         20   /* v7M fault during lazy FP stacking */
+#define EXCP_LSERR          21   /* v8M LSERR SecureFault */
+#define EXCP_UNALIGNED      22   /* v7M UNALIGNED UsageFault */
 /* NB: add new EXCP_ defines to the array in arm_log_exception() too */
 
 #define ARMV7M_EXCP_RESET   1
diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 
 DEF_HELPER_1(v7m_preserve_fp_state, void, env)
 
+DEF_HELPER_2(v7m_vlstm, void, env, i32)
+
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
 DEF_HELPER_4(access_check_cp_reg, void, env, ptr, i32, i32)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_preserve_fp_state)(CPUARMState *env)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ static void v7m_update_fpccr(CPUARMState *env, uint32_t frameptr,
     }
 }
 
+void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
+{
+    /* fptr is the value of Rn, the frame pointer we store the FP regs to */
+    bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
+    bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
+
+    assert(env->v7m.secure);
+
+    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        return;
+    }
+
+    /* Check access to the coprocessor is permitted */
+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
+    }
+
+    if (lspact) {
+        /* LSPACT should not be active when there is active FP state */
+        raise_exception_ra(env, EXCP_LSERR, 0, 1, GETPC());
+    }
+
+    if (fptr & 7) {
+        raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
+    }
+
+    /*
+     * Note that we do not use v7m_stack_write() here, because the
+     * accesses should not set the FSR bits for stacking errors if they
+     * fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
+     * or AccType_LAZYFP). Faults in cpu_stl_data() will throw exceptions
+     * and longjmp out.
+     */
+    if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
+        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
+        int i;
+
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint64_t dn = *aa32_vfp_dreg(env, i / 2);
+            uint32_t faddr = fptr + 4 * i;
+            uint32_t slo = extract64(dn, 0, 32);
+            uint32_t shi = extract64(dn, 32, 32);
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+            cpu_stl_data(env, faddr, slo);
+            cpu_stl_data(env, faddr + 4, shi);
+        }
+        cpu_stl_data(env, fptr + 0x40, vfp_get_fpscr(env));
+
+        /*
+         * If TS is 0 then s0 to s15 and FPSCR are UNKNOWN; we choose to
+         * leave them unchanged, matching our choice in v7m_preserve_fp_state.
+         */
+        if (ts) {
+            for (i = 0; i < 32; i += 2) {
+                *aa32_vfp_dreg(env, i / 2) = 0;
+            }
+            vfp_set_fpscr(env, 0);
+        }
+    } else {
+        v7m_update_fpccr(env, fptr, false);
+    }
+
+    env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
@@ -XXX,XX +XXX,XX @@ static void arm_log_exception(int idx)
             [EXCP_INVSTATE] = "v7M INVSTATE UsageFault",
             [EXCP_STKOF] = "v8M STKOF UsageFault",
             [EXCP_LAZYFP] = "v7M exception during lazy FP stacking",
+            [EXCP_LSERR] = "v8M LSERR SecureFault",
+            [EXCP_UNALIGNED] = "v7M UNALIGNED UsageFault",
         };
 
         if (idx >= 0 && idx < ARRAY_SIZE(excnames)) {
@@ -XXX,XX +XXX,XX @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
         env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_STKOF_MASK;
         break;
+    case EXCP_LSERR:
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SECURE, false);
+        env->v7m.sfsr |= R_V7M_SFSR_LSERR_MASK;
+        break;
+    case EXCP_UNALIGNED:
+        armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE, env->v7m.secure);
+        env->v7m.cfsr[env->v7m.secure] |= R_V7M_CFSR_UNALIGNED_MASK;
+        break;
     case EXCP_SWI:
         /* The PC already points to the next instruction.  */
         armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC, env->v7m.secure);
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                 if (!s->v8m_secure || (insn & 0x0040f0ff)) {
                     goto illegal_op;
                 }
-                /* Just NOP since FP support is not implemented */
+
+                if (arm_dc_feature(s, ARM_FEATURE_VFP)) {
+                    TCGv_i32 fptr = load_reg(s, rn);
+
+                    if (extract32(insn, 20, 1)) {
+                        /* VLLDM */
+                    } else {
+                        gen_helper_v7m_vlstm(cpu_env, fptr);
+                    }
+                    tcg_temp_free_i32(fptr);
+
+                    /* End the TB, because we have updated FP control bits */
+                    s->base.is_jmp = DISAS_UPDATE;
+                }
                 break;
             }
             if (arm_dc_feature(s, ARM_FEATURE_VFP) &&
-- 
2.20.1

Implement the VLLDM instruction for v7M for the FPU present case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-26-peter.maydell@linaro.org
---
 target/arm/helper.h    |  1 +
 target/arm/helper.c    | 54 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate.c |  2 +-
 3 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(v7m_tt, i32, env, i32, i32)
 DEF_HELPER_1(v7m_preserve_fp_state, void, env)
 
 DEF_HELPER_2(v7m_vlstm, void, env, i32)
+DEF_HELPER_2(v7m_vlldm, void, env, i32)
 
 DEF_HELPER_2(v8m_stackcheck, void, env, i32)
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
     g_assert_not_reached();
 }
 
+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
+{
+    /* translate.c should never generate calls here in user-only mode */
+    g_assert_not_reached();
+}
+
 uint32_t HELPER(v7m_tt)(CPUARMState *env, uint32_t addr, uint32_t op)
 {
     /* The TT instructions can be used by unprivileged code, but in
@@ -XXX,XX +XXX,XX @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
     env->v7m.control[M_REG_S] &= ~R_V7M_CONTROL_FPCA_MASK;
 }
 
+void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
+{
+    /* fptr is the value of Rn, the frame pointer we load the FP regs from */
+    assert(env->v7m.secure);
+
+    if (!(env->v7m.control[M_REG_S] & R_V7M_CONTROL_SFPA_MASK)) {
+        return;
+    }
+
+    /* Check access to the coprocessor is permitted */
+    if (!v7m_cpacr_pass(env, true, arm_current_el(env) != 0)) {
+        raise_exception_ra(env, EXCP_NOCP, 0, 1, GETPC());
+    }
+
+    if (env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPACT_MASK) {
+        /* State in FP is still valid */
+        env->v7m.fpccr[M_REG_S] &= ~R_V7M_FPCCR_LSPACT_MASK;
+    } else {
+        bool ts = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_TS_MASK;
+        int i;
+        uint32_t fpscr;
+
+        if (fptr & 7) {
+            raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
+        }
+
+        for (i = 0; i < (ts ? 32 : 16); i += 2) {
+            uint32_t slo, shi;
+            uint64_t dn;
+            uint32_t faddr = fptr + 4 * i;
+
+            if (i >= 16) {
+                faddr += 8; /* skip the slot for the FPSCR */
+            }
+
+            slo = cpu_ldl_data(env, faddr);
+            shi = cpu_ldl_data(env, faddr + 4);
+
+            dn = (uint64_t) shi << 32 | slo;
+            *aa32_vfp_dreg(env, i / 2) = dn;
+        }
+        fpscr = cpu_ldl_data(env, fptr + 0x40);
+        vfp_set_fpscr(env, fpscr);
+    }
+
+    env->v7m.control[M_REG_S] |= R_V7M_CONTROL_FPCA_MASK;
+}
+
 static bool v7m_push_stack(ARMCPU *cpu)
 {
     /* Do the "set up stack frame" part of exception entry,
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
                     TCGv_i32 fptr = load_reg(s, rn);
 
                     if (extract32(insn, 20, 1)) {
-                        /* VLLDM */
+                        gen_helper_v7m_vlldm(cpu_env, fptr);
                     } else {
                         gen_helper_v7m_vlstm(cpu_env, fptr);
                     }
-- 
2.20.1

Enable the FPU by default for the Cortex-M4 and Cortex-M33.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190416125744.27770-27-peter.maydell@linaro.org
---
 target/arm/cpu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ static void cortex_m4_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_M);
     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
     cpu->midr = 0x410fc240; /* r0p0 */
     cpu->pmsav7_dregion = 8;
+    cpu->isar.mvfr0 = 0x10110021;
+    cpu->isar.mvfr1 = 0x11000011;
+    cpu->isar.mvfr2 = 0x00000000;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000200;
     cpu->id_dfr0 = 0x00100000;
@@ -XXX,XX +XXX,XX @@ static void cortex_m33_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_M_MAIN);
     set_feature(&cpu->env, ARM_FEATURE_M_SECURITY);
     set_feature(&cpu->env, ARM_FEATURE_THUMB_DSP);
+    set_feature(&cpu->env, ARM_FEATURE_VFP4);
     cpu->midr = 0x410fd213; /* r0p3 */
     cpu->pmsav7_dregion = 16;
     cpu->sau_sregion = 8;
+    cpu->isar.mvfr0 = 0x10110021;
+    cpu->isar.mvfr1 = 0x11000011;
+    cpu->isar.mvfr2 = 0x00000040;
     cpu->id_pfr0 = 0x00000030;
     cpu->id_pfr1 = 0x00000210;
     cpu->id_dfr0 = 0x00200000;
-- 
2.20.1