Series comparison

-[PULL 00/23] target-arm queue
+[PULL 00/33] target-arm queue
-Mostly my decodetree stuff, but also some patches for various
+Hi; here's the first target-arm pullreq for the 7.0 cycle.
 smaller bugs/features from others.
 thanks
 -- PMM
-The following changes since commit 53550e81e2cafe7c03a39526b95cd21b5194d9b1:
+The following changes since commit 76b56fdfc9fa43ec6e5986aee33f108c6c6a511e:
-  Merge remote-tracking branch 'remotes/berrange/tags/qcrypto-next-pull-request' into staging (2020-06-15 16:36:34 +0100)
+  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2021-12-14 12:46:18 -0800)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200616
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20211215
-for you to fetch changes up to 64b397417a26509bcdff44ab94356a35c7901c79:
+for you to fetch changes up to aed176558806674d030a8305d989d4e6a5073359:
-  hw: arm: Set vendor property for IMX SDHCI emulations (2020-06-16 10:32:29 +0100)
+  tests/acpi: add expected blob for VIOT test on virt machine (2021-12-15 10:35:26 +0000)
 ----------------------------------------------------------------
- * hw: arm: Set vendor property for IMX SDHCI emulations
+target-arm queue:
- * sd: sdhci: Implement basic vendor specific register support
+ * ITS: error reporting cleanup
- * hw/net/imx_fec: Convert debug fprintf() to trace events
+ * aspeed: improve documentation
- * target/arm/cpu: adjust virtual time for all KVM arm cpus
+ * Fix STM32F2XX USART data register readout
- * Implement configurable descriptor size in ftgmac100
+ * allow emulated GICv3 to be disabled in non-TCG builds
- * hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
+ * fix exception priority for singlestep, misaligned PC, bp, etc
- * target/arm: More Neon decodetree conversion work
+ * Correct calculation of tlb range invalidate length
  * npcm7xx_emc: fix missing queue_flush
  * virt: Add VIOT ACPI table for virtio-iommu
  * target/i386: Use assert() to sanity-check b1 in SSE decode
  * Don't include qemu-common unnecessarily
 ----------------------------------------------------------------
-Erik Smit (1):
+Alex Bennée (1):
-      Implement configurable descriptor size in ftgmac100
+      hw/intc: clean-up error reporting for failed ITS cmd
-Guenter Roeck (2):
+Jean-Philippe Brucker (8):
-      sd: sdhci: Implement basic vendor specific register support
+      hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu
-      hw: arm: Set vendor property for IMX SDHCI emulations
+      hw/arm/virt: Remove device tree restriction for virtio-iommu
       hw/arm/virt: Reject instantiation of multiple IOMMUs
       hw/arm/virt: Use object_property_set instead of qdev_prop_set
       tests/acpi: allow updates of VIOT expected data files
       tests/acpi: add test case for VIOT
       tests/acpi: add expected blobs for VIOT test on q35 machine
       tests/acpi: add expected blob for VIOT test on virt machine
-Jean-Christophe Dubois (2):
+Joel Stanley (4):
-      hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
+      docs: aspeed: Add new boards
-      hw/net/imx_fec: Convert debug fprintf() to trace events
+      docs: aspeed: Update OpenBMC image URL
       docs: aspeed: Give an example of booting a kernel
       docs: aspeed: ADC is now modelled
-Peter Maydell (17):
+Olivier Hériveaux (1):
-      target/arm: Fix missing temp frees in do_vshll_2sh
+      Fix STM32F2XX USART data register readout
       target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
       target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
       target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
       target/arm: Convert Neon 3-reg-diff long multiplies
       target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
       target/arm: Convert Neon 3-reg-diff polynomial VMULL
       target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
       target/arm: Add missing TCG temp free in do_2shift_env_64()
       target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
       target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
       target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
       target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
       target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
       target/arm: Convert Neon VEXT to decodetree
       target/arm: Convert Neon VTBL, VTBX to decodetree
       target/arm: Convert Neon VDUP (scalar) to decodetree
-fangying (1):
+Patrick Venture (1):
-      target/arm/cpu: adjust virtual time for all KVM arm cpus
+      hw/net: npcm7xx_emc fix missing queue_flush
- hw/sd/sdhci-internal.h          |    5 +
+Peter Maydell (6):
- include/hw/sd/sdhci.h           |    5 +
+      target/i386: Use assert() to sanity-check b1 in SSE decode
- target/arm/translate.h          |    1 +
+      include/hw/i386: Don't include qemu-common.h in .h files
- target/arm/neon-dp.decode       |  130 +++++
+      target/hexagon/cpu.h: don't include qemu-common.h
- hw/arm/fsl-imx25.c              |    6 +
+      target/rx/cpu.h: Don't include qemu-common.h
- hw/arm/fsl-imx6.c               |    6 +
+      hw/arm: Don't include qemu-common.h unnecessarily
- hw/arm/fsl-imx6ul.c             |    2 +
+      target/arm: Correct calculation of tlb range invalidate length
  hw/arm/fsl-imx7.c               |    2 +
  hw/misc/imx6ul_ccm.c            |   76 ++-
  hw/net/ftgmac100.c              |   26 +-
  hw/net/imx_fec.c                |  106 ++--
  hw/sd/sdhci.c                   |   18 +-
  target/arm/cpu.c                |    6 +-
  target/arm/cpu64.c              |    1 -
  target/arm/kvm.c                |   21 +-
  target/arm/translate-neon.inc.c | 1148 ++++++++++++++++++++++++++++++++++++++-
  target/arm/translate.c          |  684 +----------------------
  hw/net/trace-events             |   18 +
 files changed, 1495 insertions(+), 766 deletions(-)
+Philippe Mathieu-Daudé (2):
+      hw/intc/arm_gicv3: Extract gicv3_set_gicv3state from arm_gicv3_cpuif.c
+      hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector
+Richard Henderson (10):
+      target/arm: Hoist pc_next to a local variable in aarch64_tr_translate_insn
+      target/arm: Hoist pc_next to a local variable in arm_tr_translate_insn
+      target/arm: Hoist pc_next to a local variable in thumb_tr_translate_insn
+      target/arm: Split arm_pre_translate_insn
+      target/arm: Advance pc for arch single-step exception
+      target/arm: Split compute_fsr_fsc out of arm_deliver_fault
+      target/arm: Take an exception if PC is misaligned
+      target/arm: Assert thumb pc is aligned
+      target/arm: Suppress bp for exceptions with more priority
+      tests/tcg: Add arm and aarch64 pc alignment tests
+ docs/system/arm/aspeed.rst        |  26 ++++++++++++----
+ include/hw/i386/microvm.h         |   1 -
+ include/hw/i386/x86.h             |   1 -
+ target/arm/helper.h               |   1 +
+ target/arm/syndrome.h             |   5 +++
+ target/hexagon/cpu.h              |   1 -
+ target/rx/cpu.h                   |   1 -
+ hw/arm/boot.c                     |   1 -
+ hw/arm/digic_boards.c             |   1 -
+ hw/arm/highbank.c                 |   1 -
+ hw/arm/npcm7xx_boards.c           |   1 -
+ hw/arm/sbsa-ref.c                 |   1 -
+ hw/arm/stm32f405_soc.c            |   1 -
+ hw/arm/vexpress.c                 |   1 -
+ hw/arm/virt-acpi-build.c          |   7 +++++
+ hw/arm/virt.c                     |  21 ++++++-------
+ hw/char/stm32f2xx_usart.c         |   3 +-
+ hw/intc/arm_gicv3.c               |   2 +-
+ hw/intc/arm_gicv3_cpuif.c         |  10 +-----
+ hw/intc/arm_gicv3_cpuif_common.c  |  22 +++++++++++++
+ hw/intc/arm_gicv3_its.c           |  39 +++++++++++++++--------
+ hw/net/npcm7xx_emc.c              |  18 +++++------
+ hw/virtio/virtio-iommu-pci.c      |  12 ++------
+ linux-user/aarch64/cpu_loop.c     |  46 ++++++++++++++++------------
+ linux-user/hexagon/cpu_loop.c     |   1 +
+ target/arm/debug_helper.c         |  23 ++++++++++++++
+ target/arm/gdbstub.c              |   9 ++++--
+ target/arm/helper.c               |   6 ++--
+ target/arm/machine.c              |  10 ++++++
+ target/arm/tlb_helper.c           |  63 ++++++++++++++++++++++++++++----------
+ target/arm/translate-a64.c        |  23 ++++++++++++--
+ target/arm/translate.c            |  58 ++++++++++++++++++++++++++---------
+ target/i386/tcg/translate.c       |  12 ++------
+ tests/qtest/bios-tables-test.c    |  38 +++++++++++++++++++++++
+ tests/tcg/aarch64/pcalign-a64.c   |  37 ++++++++++++++++++++++
+ tests/tcg/arm/pcalign-a32.c       |  46 ++++++++++++++++++++++++++++
+ hw/arm/Kconfig                    |   1 +
+ hw/intc/Kconfig                   |   5 +++
+ hw/intc/meson.build               |  11 ++++---
+ tests/data/acpi/q35/DSDT.viot     | Bin 0 -> 9398 bytes
+ tests/data/acpi/q35/VIOT.viot     | Bin 0 -> 112 bytes
+ tests/data/acpi/virt/VIOT         | Bin 0 -> 88 bytes
+ tests/tcg/aarch64/Makefile.target |   4 +--
+ tests/tcg/arm/Makefile.target     |   4 +++
+files changed, 429 insertions(+), 145 deletions(-)
+ create mode 100644 hw/intc/arm_gicv3_cpuif_common.c
+ create mode 100644 tests/tcg/aarch64/pcalign-a64.c
+ create mode 100644 tests/tcg/arm/pcalign-a32.c
+ create mode 100644 tests/data/acpi/q35/DSDT.viot
+ create mode 100644 tests/data/acpi/q35/VIOT.viot
+ create mode 100644 tests/data/acpi/virt/VIOT

-New patch
+[PULL 01/33] hw/intc: clean-up error reporting for failed ITS cmd
+From: Alex Bennée <alex.bennee@linaro.org>
+While trying to debug a GIC ITS failure I saw some guest errors that
+were poorly formatted and left me confused as to what had failed.
+As most of the checks aren't possible without a valid dte, split that
+check apart and then check the other conditions in steps. This avoids
+relying on undefined data.
+I still get a failure with the current kvm-unit-tests but at least I
+know (partially) why now:
+  Exception return from AArch64 EL1 to AArch64 EL1 PC 0x40080588
+  PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
+  ITS: MAPD devid=2 size = 0x8 itt=0x40430000 valid=0
+  INT dev_id=2 event_id=20
+  process_its_cmd: invalid command attributes: invalid dte: 0 for 2 (MEM_TX: 0)
+  PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
+  SUMMARY: 6 tests, 1 unexpected failures
+Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20211112170454.3158925-1-alex.bennee@linaro.org
+Cc: Shashi Mallela <shashi.mallela@linaro.org>
+Cc: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/intc/arm_gicv3_its.c | 39 +++++++++++++++++++++++++++------------
+file changed, 27 insertions(+), 12 deletions(-)
+diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/arm_gicv3_its.c
++++ b/hw/intc/arm_gicv3_its.c
+@@ -XXX,XX +XXX,XX @@ static bool process_its_cmd(GICv3ITSState *s, uint64_t value, uint32_t offset,
+         if (res != MEMTX_OK) {
+             return result;
+         }
++    } else {
++        qemu_log_mask(LOG_GUEST_ERROR,
++                      "%s: invalid command attributes: "
++                      "invalid dte: %"PRIx64" for %d (MEM_TX: %d)\n",
++                      __func__, dte, devid, res);
++        return result;
+     }
+-    if ((devid > s->dt.maxids.max_devids) || !dte_valid || !ite_valid ||
+-            !cte_valid || (eventid > max_eventid)) {
++
++    /*
++     * In this implementation, in case of guest errors we ignore the
++     * command and move onto the next command in the queue.
++     */
++    if (devid > s->dt.maxids.max_devids) {
+         qemu_log_mask(LOG_GUEST_ERROR,
+-                      "%s: invalid command attributes "
+-                      "devid %d or eventid %d or invalid dte %d or"
+-                      "invalid cte %d or invalid ite %d\n",
+-                      __func__, devid, eventid, dte_valid, cte_valid,
+-                      ite_valid);
+-        /*
+-         * in this implementation, in case of error
+-         * we ignore this command and move onto the next
+-         * command in the queue
+-         */
++                      "%s: invalid command attributes: devid %d>%d",
++                      __func__, devid, s->dt.maxids.max_devids);
++
++    } else if (!dte_valid || !ite_valid || !cte_valid) {
++        qemu_log_mask(LOG_GUEST_ERROR,
++                      "%s: invalid command attributes: "
++                      "dte: %s, ite: %s, cte: %s\n",
++                      __func__,
++                      dte_valid ? "valid" : "invalid",
++                      ite_valid ? "valid" : "invalid",
++                      cte_valid ? "valid" : "invalid");
++    } else if (eventid > max_eventid) {
++        qemu_log_mask(LOG_GUEST_ERROR,
++                      "%s: invalid command attributes: eventid %d > %d\n",
++                      __func__, eventid, max_eventid);
+     } else {
+         /*
+          * Current implementation only supports rdbase == procnum
+--
+.25.1
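
The stepwise checking described in the commit message of patch 01 can be sketched in isolation. This is a minimal illustration, not the QEMU code: the `SketchItsCmd` type, the `SketchStatus` enum, and both functions are hypothetical names, and the limits are passed in as plain parameters, whereas the patch reads them from the ITS state and the dte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one decoded ITS command. */
typedef struct {
    uint32_t devid;
    uint32_t eventid;
    bool dte_valid;   /* device table entry is valid */
    bool ite_valid;   /* interrupt translation entry is valid */
    bool cte_valid;   /* collection table entry is valid */
} SketchItsCmd;

/* One status per check, so a log line can name the exact failure. */
typedef enum {
    SKETCH_OK,
    SKETCH_INVALID_DTE,
    SKETCH_DEVID_RANGE,
    SKETCH_INVALID_ENTRY,
    SKETCH_EVENTID_RANGE,
} SketchStatus;

static SketchStatus sketch_check_cmd(const SketchItsCmd *c,
                                     uint32_t max_devids,
                                     uint32_t max_eventid)
{
    /* Nothing derived from an invalid dte can be trusted: stop first. */
    if (!c->dte_valid) {
        return SKETCH_INVALID_DTE;
    }
    /* The remaining checks run one at a time, in a fixed order. */
    if (c->devid > max_devids) {
        return SKETCH_DEVID_RANGE;
    }
    if (!c->ite_valid || !c->cte_valid) {
        return SKETCH_INVALID_ENTRY;
    }
    if (c->eventid > max_eventid) {
        return SKETCH_EVENTID_RANGE;
    }
    return SKETCH_OK;
}

/* Text for a log message; the wording here is illustrative only. */
static const char *sketch_status_name(SketchStatus st)
{
    switch (st) {
    case SKETCH_OK:            return "ok";
    case SKETCH_INVALID_DTE:   return "invalid dte";
    case SKETCH_DEVID_RANGE:   return "devid out of range";
    case SKETCH_INVALID_ENTRY: return "invalid ite or cte";
    case SKETCH_EVENTID_RANGE: return "eventid out of range";
    }
    return "unknown";
}
```

A caller would log `sketch_status_name(sketch_check_cmd(&cmd, max_devids, max_eventid))` and skip the command on any status other than `SKETCH_OK`, which mirrors the "ignore the command and move on" policy stated in the patch comment.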

-New patch
+[PULL 02/33] docs: aspeed: Add new boards
+From: Joel Stanley <joel@jms.id.au>
+Add X11, FP5280G2, G220A, Rainier and Fuji. Mention that Swift will be
+removed in v7.0.
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+Message-id: 20211117065752.330632-2-joel@jms.id.au
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/aspeed.rst | 7 ++++++-
+file changed, 6 insertions(+), 1 deletion(-)
+diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/aspeed.rst
++++ b/docs/system/arm/aspeed.rst
+@@ -XXX,XX +XXX,XX @@ AST2400 SoC based machines :
+ - ``palmetto-bmc``         OpenPOWER Palmetto POWER8 BMC
+ - ``quanta-q71l-bmc``      OpenBMC Quanta BMC
++- ``supermicrox11-bmc``    Supermicro X11 BMC
+ AST2500 SoC based machines :
+@@ -XXX,XX +XXX,XX @@ AST2500 SoC based machines :
+ - ``romulus-bmc``          OpenPOWER Romulus POWER9 BMC
+ - ``witherspoon-bmc``      OpenPOWER Witherspoon POWER9 BMC
+ - ``sonorapass-bmc``       OCP SonoraPass BMC
+-- ``swift-bmc``            OpenPOWER Swift BMC POWER9
++- ``swift-bmc``            OpenPOWER Swift BMC POWER9 (to be removed in v7.0)
++- ``fp5280g2-bmc``         Inspur FP5280G2 BMC
++- ``g220a-bmc``            Bytedance G220A BMC
+ AST2600 SoC based machines :
+ - ``ast2600-evb``          Aspeed AST2600 Evaluation board (Cortex-A7)
+ - ``tacoma-bmc``           OpenPOWER Witherspoon POWER9 AST2600 BMC
++- ``rainier-bmc``          IBM Rainier POWER10 BMC
++- ``fuji-bmc``             Facebook Fuji BMC
+ Supported devices
+ -----------------
+--
+.25.1

-New patch
+[PULL 03/33] docs: aspeed: Update OpenBMC image URL
+From: Joel Stanley <joel@jms.id.au>
+This is the latest URL for the OpenBMC CI. The old URL still works, but
+redirects.
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Message-id: 20211117065752.330632-3-joel@jms.id.au
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/aspeed.rst | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/aspeed.rst
++++ b/docs/system/arm/aspeed.rst
+@@ -XXX,XX +XXX,XX @@ The Aspeed machines can be started using the ``-kernel`` option to
+ load a Linux kernel or from a firmware. Images can be downloaded from
+ the OpenBMC jenkins :
+-   https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/distro=ubuntu,label=docker-builder
++   https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/
+ or directly from the OpenBMC GitHub release repository :
+--
+.25.1

-New patch
+[PULL 04/33] docs: aspeed: Give an example of booting a kernel
+From: Joel Stanley <joel@jms.id.au>
+A common use case for the ASPEED machine is to boot a Linux kernel.
+Provide a full example command line.
+Reviewed-by: Cédric Le Goater <clg@kaod.org>
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Message-id: 20211117065752.330632-4-joel@jms.id.au
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/aspeed.rst | 15 ++++++++++++---
+file changed, 12 insertions(+), 3 deletions(-)
+diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/aspeed.rst
++++ b/docs/system/arm/aspeed.rst
+@@ -XXX,XX +XXX,XX @@ Missing devices
+ Boot options
+ ------------
+-The Aspeed machines can be started using the ``-kernel`` option to
+-load a Linux kernel or from a firmware. Images can be downloaded from
+-the OpenBMC jenkins :
++The Aspeed machines can be started using the ``-kernel`` and ``-dtb`` options
++to load a Linux kernel or from a firmware. Images can be downloaded from the
++OpenBMC jenkins :
+    https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/
+@@ -XXX,XX +XXX,XX @@ or directly from the OpenBMC GitHub release repository :
+    https://github.com/openbmc/openbmc/releases
++To boot a kernel directly from a Linux build tree:
++
++.. code-block:: bash
++
++  $ qemu-system-arm -M ast2600-evb -nographic \
++        -kernel arch/arm/boot/zImage \
++        -dtb arch/arm/boot/dts/aspeed-ast2600-evb.dtb \
++        -initrd rootfs.cpio
++
+ The image should be attached as an MTD drive. Run :
+ .. code-block:: bash
+--
+.25.1

-New patch
+[PULL 05/33] docs: aspeed: ADC is now modelled
+From: Joel Stanley <joel@jms.id.au>
+Move it to the supported list.
+Signed-off-by: Joel Stanley <joel@jms.id.au>
+Message-id: 20211117065752.330632-5-joel@jms.id.au
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/aspeed.rst | 2 +-
+file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/aspeed.rst
++++ b/docs/system/arm/aspeed.rst
+@@ -XXX,XX +XXX,XX @@ Supported devices
+  * Front LEDs (PCA9552 on I2C bus)
+  * LPC Peripheral Controller (a subset of subdevices are supported)
+  * Hash/Crypto Engine (HACE) - Hash support only. TODO: HMAC and RSA
++ * ADC
+ Missing devices
+ ---------------
+  * Coprocessor support
+- * ADC (out of tree implementation)
+  * PWM and Fan Controller
+  * Slave GPIO Controller
+  * Super I/O Controller
+--
+.25.1

-New patch
+[PULL 06/33] Fix STM32F2XX USART data register readout
+From: Olivier Hériveaux <olivier.heriveaux@ledger.fr>
+Fix an issue where the data register may be overwritten by the reception
+of the next character before it has been read and returned.
+Signed-off-by: Olivier Hériveaux <olivier.heriveaux@ledger.fr>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
+Message-id: 20211128120723.4053-1-olivier.heriveaux@ledger.fr
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/char/stm32f2xx_usart.c | 3 ++-
+file changed, 2 insertions(+), 1 deletion(-)
+diff --git a/hw/char/stm32f2xx_usart.c b/hw/char/stm32f2xx_usart.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/char/stm32f2xx_usart.c
++++ b/hw/char/stm32f2xx_usart.c
+@@ -XXX,XX +XXX,XX @@ static uint64_t stm32f2xx_usart_read(void *opaque, hwaddr addr,
+         return retvalue;
+     case USART_DR:
+         DB_PRINT("Value: 0x%" PRIx32 ", %c\n", s->usart_dr, (char) s->usart_dr);
++        retvalue = s->usart_dr & 0x3FF;
+         s->usart_sr &= ~USART_SR_RXNE;
+         qemu_chr_fe_accept_input(&s->chr);
+         qemu_set_irq(s->irq, 0);
+-        return s->usart_dr & 0x3FF;
++        return retvalue;
+     case USART_BRR:
+         return s->usart_brr;
+     case USART_CR1:
+--
+.25.1
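
The ordering problem fixed by patch 06 can be reproduced with a small stand-alone model. This is a sketch under stated assumptions, not the device code: `SketchUsart`, `sketch_accept_input()` and the two read functions are hypothetical, and the model assumes that accepting input may deliver the next character synchronously, which is what the commit message implies. The `0x3FF` mask is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the "receive register not empty" flag. */
#define SKETCH_SR_RXNE (1u << 5)

/* Hypothetical, minimal USART model with one queued character. */
typedef struct {
    uint32_t sr;          /* status register */
    uint32_t dr;          /* data register */
    bool has_pending;     /* a further character is waiting in the backend */
    uint32_t pending;     /* that waiting character */
} SketchUsart;

/*
 * Stand-in for telling the character backend that more input is welcome.
 * In this model the backend delivers the waiting character immediately,
 * overwriting the data register.
 */
static void sketch_accept_input(SketchUsart *s)
{
    if (s->has_pending) {
        s->dr = s->pending;
        s->sr |= SKETCH_SR_RXNE;
        s->has_pending = false;
    }
}

/* Old ordering: the register is sampled after the side effects. */
static uint32_t sketch_read_dr_old(SketchUsart *s)
{
    s->sr &= ~SKETCH_SR_RXNE;
    sketch_accept_input(s);
    return s->dr & 0x3FF;      /* may already hold the next character */
}

/* New ordering: latch the value first, then run the side effects. */
static uint32_t sketch_read_dr_new(SketchUsart *s)
{
    uint32_t retvalue = s->dr & 0x3FF;

    s->sr &= ~SKETCH_SR_RXNE;
    sketch_accept_input(s);
    return retvalue;
}
```

With `'A'` in the data register and `'B'` queued, the old ordering returns `'B'` and loses `'A'`, while the new ordering returns `'A'` and leaves `'B'` in the register for the next read.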

-New patch
+[PULL 07/33] hw/intc/arm_gicv3: Extract gicv3_set_gicv3state from arm_gicv3_cpuif.c
+From: Philippe Mathieu-Daudé <philmd@redhat.com>
+gicv3_set_gicv3state() is used by arm_gicv3_common.c in
+arm_gicv3_common_realize(). Since we want to restrict
+arm_gicv3_cpuif.c to TCG, extract gicv3_set_gicv3state()
+to a new file. Add this file to the meson 'specific'
+source set, since it needs access to "cpu.h".
+Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20211115223619.2599282-2-philmd@redhat.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/intc/arm_gicv3_cpuif.c        | 10 +---------
+ hw/intc/arm_gicv3_cpuif_common.c | 22 ++++++++++++++++++++++
+ hw/intc/meson.build              |  1 +
+files changed, 24 insertions(+), 9 deletions(-)
+ create mode 100644 hw/intc/arm_gicv3_cpuif_common.c
+diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/arm_gicv3_cpuif.c
++++ b/hw/intc/arm_gicv3_cpuif.c
+@@ -XXX,XX +XXX,XX @@
+ /*
+- * ARM Generic Interrupt Controller v3
++ * ARM Generic Interrupt Controller v3 (emulation)
+  *
+  * Copyright (c) 2016 Linaro Limited
+  * Written by Peter Maydell
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/irq.h"
+ #include "cpu.h"
+-void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
+-{
+-    ARMCPU *arm_cpu = ARM_CPU(cpu);
+-    CPUARMState *env = &arm_cpu->env;
+-
+-    env->gicv3state = (void *)s;
+-};
+-
+ static GICv3CPUState *icc_cs_from_env(CPUARMState *env)
+ {
+     return env->gicv3state;
+diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
+new file mode 100644
+index XXXXXXX..XXXXXXX
+--- /dev/null
++++ b/hw/intc/arm_gicv3_cpuif_common.c
+@@ -XXX,XX +XXX,XX @@
++/* SPDX-License-Identifier: GPL-2.0-or-later */
++/*
++ * ARM Generic Interrupt Controller v3
++ *
++ * Copyright (c) 2016 Linaro Limited
++ * Written by Peter Maydell
++ *
++ * This code is licensed under the GPL, version 2 or (at your option)
++ * any later version.
++ */
++
++#include "qemu/osdep.h"
++#include "gicv3_internal.h"
++#include "cpu.h"
++
++void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
++{
++    ARMCPU *arm_cpu = ARM_CPU(cpu);
++    CPUARMState *env = &arm_cpu->env;
++
++    env->gicv3state = (void *)s;
++};
+diff --git a/hw/intc/meson.build b/hw/intc/meson.build
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/meson.build
++++ b/hw/intc/meson.build
+@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_PMU', if_true: files('xlnx-pmu-iomod-in
+ specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c'))
+ specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c'))
++specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c'))
+ specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c'))
+ specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c'))
+ specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c'))
+--
+.25.1

-New patch
+[PULL 08/33] hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector
+From: Philippe Mathieu-Daudé <philmd@redhat.com>
+The TYPE_ARM_GICV3 device is an emulated one.  When using
+KVM, it is recommended to use the TYPE_KVM_ARM_GICV3 device
+(which uses in-kernel support).
+When using --with-devices-FOO, it is possible to build a
+binary with a specific set of devices. When this binary is
+restricted to the KVM accelerator, the TYPE_ARM_GICV3 device is
+irrelevant, and it is desirable to remove it from the binary.
+Therefore introduce the CONFIG_ARM_GIC_TCG Kconfig selector,
+which selects the files required to build the TYPE_ARM_GICV3
+device, while also allowing this device to be deselected.
+Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Message-id: 20211115223619.2599282-3-philmd@redhat.com
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/intc/arm_gicv3.c |  2 +-
+ hw/intc/Kconfig     |  5 +++++
+ hw/intc/meson.build | 10 ++++++----
+files changed, 12 insertions(+), 5 deletions(-)
+diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/arm_gicv3.c
++++ b/hw/intc/arm_gicv3.c
+@@ -XXX,XX +XXX,XX @@
+ /*
+- * ARM Generic Interrupt Controller v3
++ * ARM Generic Interrupt Controller v3 (emulation)
+  *
+  * Copyright (c) 2015 Huawei.
+  * Copyright (c) 2016 Linaro Limited
+diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/Kconfig
++++ b/hw/intc/Kconfig
+@@ -XXX,XX +XXX,XX @@ config APIC
+     select MSI_NONBROKEN
+     select I8259
++config ARM_GIC_TCG
++    bool
++    default y
++    depends on ARM_GIC && TCG
++
+ config ARM_GIC_KVM
+     bool
+     default y
+diff --git a/hw/intc/meson.build b/hw/intc/meson.build
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/intc/meson.build
++++ b/hw/intc/meson.build
+@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_ARM_GIC', if_true: files(
+   'arm_gic.c',
+   'arm_gic_common.c',
+   'arm_gicv2m.c',
+-  'arm_gicv3.c',
+   'arm_gicv3_common.c',
+-  'arm_gicv3_dist.c',
+   'arm_gicv3_its_common.c',
+-  'arm_gicv3_redist.c',
++))
++softmmu_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files(
++  'arm_gicv3.c',
++  'arm_gicv3_dist.c',
+   'arm_gicv3_its.c',
++  'arm_gicv3_redist.c',
+ ))
+ softmmu_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_pic.c'))
+ softmmu_ss.add(when: 'CONFIG_HEATHROW_PIC', if_true: files('heathrow_pic.c'))
+@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_PMU', if_true: files('xlnx-pmu-iomod-in
+ specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c'))
+ specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c'))
+ specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c'))
+-specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c'))
++specific_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files('arm_gicv3_cpuif.c'))
+ specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c'))
+ specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c'))
+ specific_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m_nvic.c'))
+--
+.25.1

-New patch
+[PULL 09/33] target/arm: Hoist pc_next to a local variable in aarch64_tr_translate_insn
+From: Richard Henderson <richard.henderson@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ target/arm/translate-a64.c | 7 ++++---
+file changed, 4 insertions(+), 3 deletions(-)
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/translate-a64.c
++++ b/target/arm/translate-a64.c
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
+ {
+     DisasContext *s = container_of(dcbase, DisasContext, base);
+     CPUARMState *env = cpu->env_ptr;
++    uint64_t pc = s->base.pc_next;
+     uint32_t insn;
+     if (s->ss_active && !s->pstate_ss) {
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
+         return;
+     }
+-    s->pc_curr = s->base.pc_next;
+-    insn = arm_ldl_code(env, &s->base, s->base.pc_next, s->sctlr_b);
++    s->pc_curr = pc;
++    insn = arm_ldl_code(env, &s->base, pc, s->sctlr_b);
+     s->insn = insn;
+-    s->base.pc_next += 4;
++    s->base.pc_next = pc + 4;
+     s->fp_access_checked = false;
+     s->sve_access_checked = false;
+--
+.25.1

-[PULL 17/23] target/arm: Convert Neon VDUP (scalar) to decodetree
+[PULL 10/33] target/arm: Hoist pc_next to a local variable in arm_tr_translate_insn
-Convert the Neon VDUP (scalar) insn to decodetree.  (Note that we
+From: Richard Henderson <richard.henderson@linaro.org>
 can't call this just "VDUP" as we used that already in vfp.decode for
 the "VDUP (general purpose register)" insn.)
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  7 +++++++
+ target/arm/translate.c | 9 +++++----
- target/arm/translate-neon.inc.c | 26 ++++++++++++++++++++++++++
+file changed, 5 insertions(+), 4 deletions(-)
  target/arm/translate.c          | 25 +------------------------
 files changed, 34 insertions(+), 24 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-     VTBL         1111 001 1 1 . 11 .... .... 10 len:2 . op:1 . 0 .... \
-                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+
-+    VDUP_scalar  1111 001 1 1 . 11 index:3 1 .... 11 000 q:1 . 0 .... \
-+                 vm=%vm_dp vd=%vd_dp size=0
-+    VDUP_scalar  1111 001 1 1 . 11 index:2 10 .... 11 000 q:1 . 0 .... \
-+                 vm=%vm_dp vd=%vd_dp size=1
-+    VDUP_scalar  1111 001 1 1 . 11 index:1 100 .... 11 000 q:1 . 0 .... \
-+                 vm=%vm_dp vd=%vd_dp size=2
-   ]
-   # Subgroup for size != 0b11
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
-     tcg_temp_free_i32(tmp);
-     return true;
- }
-+
-+static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a)
-+{
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if (a->vd & a->q) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    tcg_gen_gvec_dup_mem(a->size, neon_reg_offset(a->vd, 0),
-+                         neon_element_offset(a->vm, a->index, a->size),
-+                         a->q ? 16 : 8, a->q ? 16 : 8);
-+    return true;
-+}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-                     }
+ {
-                     break;
+     DisasContext *dc = container_of(dcbase, DisasContext, base);
-                 }
+     CPUARMState *env = cpu->env_ptr;
--            } else if ((insn & (1 << 10)) == 0) {
++    uint32_t pc = dc->base.pc_next;
--                /* VTBL, VTBX: handled by decodetree */
+     unsigned int insn;
--                return 1;
--            } else if ((insn & 0x380) == 0) {
+     if (arm_pre_translate_insn(dc)) {
--                /* VDUP */
+-        dc->base.pc_next += 4;
--                int element;
++        dc->base.pc_next = pc + 4;
--                MemOp size;
+         return;
--
+     }
--                if ((insn & (7 << 16)) == 0 || (q && (rd & 1))) {
--                    return 1;
+-    dc->pc_curr = dc->base.pc_next;
--                }
+-    insn = arm_ldl_code(env, &dc->base, dc->base.pc_next, dc->sctlr_b);
--                if (insn & (1 << 16)) {
++    dc->pc_curr = pc;
--                    size = MO_8;
++    insn = arm_ldl_code(env, &dc->base, pc, dc->sctlr_b);
--                    element = (insn >> 17) & 7;
+     dc->insn = insn;
--                } else if (insn & (1 << 17)) {
+-    dc->base.pc_next += 4;
--                    size = MO_16;
++    dc->base.pc_next = pc + 4;
--                    element = (insn >> 18) & 3;
+     disas_arm_insn(dc, insn);
--                } else {
--                    size = MO_32;
+     arm_post_translate_insn(dc);
 -                    element = (insn >> 19) & 1;
 -                }
 -                tcg_gen_gvec_dup_mem(size, neon_reg_offset(rd, 0),
 -                                     neon_element_offset(rm, element, size),
 -                                     q ? 16 : 8, q ? 16 : 8);
              } else {
 +                /* VTBL, VTBX, VDUP: handled by decodetree */
                  return 1;
              }
          }
 --
-.20.1
+.25.1

-[PULL 16/23] target/arm: Convert Neon VTBL, VTBX to decodetree
+[PULL 11/33] target/arm: Hoist pc_next to a local variable in thumb_tr_translate_insn
-Convert the Neon VTBL, VTBX instructions to decodetree.  The actual
+From: Richard Henderson <richard.henderson@linaro.org>
 implementation of the insn is copied across to the new trans function
 unchanged except for renaming 'tmp5' to 'tmp4'.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  3 ++
+ target/arm/translate.c | 16 ++++++++--------
- target/arm/translate-neon.inc.c | 56 +++++++++++++++++++++++++++++++++
+file changed, 8 insertions(+), 8 deletions(-)
  target/arm/translate.c          | 41 +++---------------------
 files changed, 63 insertions(+), 37 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-     ##################################################################
-     VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
-                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+
-+    VTBL         1111 001 1 1 . 11 .... .... 10 len:2 . op:1 . 0 .... \
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
-   ]
-   # Subgroup for size != 0b11
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
-     }
-     return true;
- }
-+
-+static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
-+{
-+    int n;
-+    TCGv_i32 tmp, tmp2, tmp3, tmp4;
-+    TCGv_ptr ptr1;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    n = a->len + 1;
-+    if ((a->vn + n) > 32) {
-+        /*
-+         * This is UNPREDICTABLE; we choose to UNDEF to avoid the
-+         * helper function running off the end of the register file.
-+         */
-+        return false;
-+    }
-+    n <<= 3;
-+    if (a->op) {
-+        tmp = neon_load_reg(a->vd, 0);
-+    } else {
-+        tmp = tcg_temp_new_i32();
-+        tcg_gen_movi_i32(tmp, 0);
-+    }
-+    tmp2 = neon_load_reg(a->vm, 0);
-+    ptr1 = vfp_reg_ptr(true, a->vn);
-+    tmp4 = tcg_const_i32(n);
-+    gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp4);
-+    tcg_temp_free_i32(tmp);
-+    if (a->op) {
-+        tmp = neon_load_reg(a->vd, 1);
-+    } else {
-+        tmp = tcg_temp_new_i32();
-+        tcg_gen_movi_i32(tmp, 0);
-+    }
-+    tmp3 = neon_load_reg(a->vm, 1);
-+    gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp4);
-+    tcg_temp_free_i32(tmp4);
-+    tcg_temp_free_ptr(ptr1);
-+    neon_store_reg(a->vd, 0, tmp2);
-+    neon_store_reg(a->vd, 1, tmp3);
-+    tcg_temp_free_i32(tmp);
-+    return true;
-+}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
  {
-     int op;
+     DisasContext *dc = container_of(dcbase, DisasContext, base);
-     int q;
+     CPUARMState *env = cpu->env_ptr;
--    int rd, rn, rm, rd_ofs, rm_ofs;
++    uint32_t pc = dc->base.pc_next;
-+    int rd, rm, rd_ofs, rm_ofs;
+     uint32_t insn;
-     int size;
+     bool is_16bit;
-     int pass;
-     int u;
+     if (arm_pre_translate_insn(dc)) {
-     int vec_size;
+-        dc->base.pc_next += 2;
--    TCGv_i32 tmp, tmp2, tmp3, tmp5;
++        dc->base.pc_next = pc + 2;
--    TCGv_ptr ptr1;
+         return;
-+    TCGv_i32 tmp, tmp2, tmp3;
+     }
-     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+-    dc->pc_curr = dc->base.pc_next;
-         return 1;
+-    insn = arm_lduw_code(env, &dc->base, dc->base.pc_next, dc->sctlr_b);
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++    dc->pc_curr = pc;
-     q = (insn & (1 << 6)) != 0;
++    insn = arm_lduw_code(env, &dc->base, pc, dc->sctlr_b);
-     u = (insn >> 24) & 1;
+     is_16bit = thumb_insn_is_16bit(dc, dc->base.pc_next, insn);
-     VFP_DREG_D(rd, insn);
+-    dc->base.pc_next += 2;
--    VFP_DREG_N(rn, insn);
++    pc += 2;
-     VFP_DREG_M(rm, insn);
+     if (!is_16bit) {
-     size = (insn >> 20) & 3;
+-        uint32_t insn2 = arm_lduw_code(env, &dc->base, dc->base.pc_next,
-     vec_size = q ? 16 : 8;
+-                                       dc->sctlr_b);
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+-
-                     break;
++        uint32_t insn2 = arm_lduw_code(env, &dc->base, pc, dc->sctlr_b);
-                 }
+         insn = insn << 16 | insn2;
-             } else if ((insn & (1 << 10)) == 0) {
+-        dc->base.pc_next += 2;
--                /* VTBL, VTBX.  */
++        pc += 2;
--                int n = ((insn >> 8) & 3) + 1;
+     }
--                if ((rn + n) > 32) {
++    dc->base.pc_next = pc;
--                    /* This is UNPREDICTABLE; we choose to UNDEF to avoid the
+     dc->insn = insn;
--                     * helper function running off the end of the register file.
--                     */
+     if (dc->pstate_il) {
 -                    return 1;
 -                }
 -                n <<= 3;
 -                if (insn & (1 << 6)) {
 -                    tmp = neon_load_reg(rd, 0);
 -                } else {
 -                    tmp = tcg_temp_new_i32();
 -                    tcg_gen_movi_i32(tmp, 0);
 -                }
 -                tmp2 = neon_load_reg(rm, 0);
 -                ptr1 = vfp_reg_ptr(true, rn);
 -                tmp5 = tcg_const_i32(n);
 -                gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp5);
 -                tcg_temp_free_i32(tmp);
 -                if (insn & (1 << 6)) {
 -                    tmp = neon_load_reg(rd, 1);
 -                } else {
 -                    tmp = tcg_temp_new_i32();
 -                    tcg_gen_movi_i32(tmp, 0);
 -                }
 -                tmp3 = neon_load_reg(rm, 1);
 -                gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp5);
 -                tcg_temp_free_i32(tmp5);
 -                tcg_temp_free_ptr(ptr1);
 -                neon_store_reg(rd, 0, tmp2);
 -                neon_store_reg(rd, 1, tmp3);
 -                tcg_temp_free_i32(tmp);
 +                /* VTBL, VTBX: handled by decodetree */
 +                return 1;
              } else if ((insn & 0x380) == 0) {
                  /* VDUP */
                  int element;
 --
-.20.1
+.25.1

-[PULL 14/23] target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
+[PULL 12/33] target/arm: Split arm_pre_translate_insn
-Convert the Neon 2-reg-scalar long multiplies to decodetree.
+From: Richard Henderson <richard.henderson@linaro.org>
 These are the last instructions in the group.
+Create arm_check_ss_active and arm_check_kernelpage.
+Reverse the order of the tests.  While it doesn't matter in practice,
+because only user-only has a kernel page and user-only never sets
+ss_active, ss_active has priority over execution exceptions and it
+is best to keep them in the proper order.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  18 ++++
+ target/arm/translate.c | 10 +++++++---
- target/arm/translate-neon.inc.c | 163 ++++++++++++++++++++++++++++
+file changed, 7 insertions(+), 3 deletions(-)
  target/arm/translate.c          | 182 ++------------------------------
 files changed, 187 insertions(+), 176 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
-     @2scalar     .... ... q:1 . . size:2 .... .... .... . . . . .... \
-                  &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
-+    # For the 'long' ops the Q bit is part of insn decode
-+    @2scalar_q0  .... ... . . . size:2 .... .... .... . . . . .... \
-+                 &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
-     VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
-     VMLA_F_2sc   1111 001 . 1 . .. .... .... 0001 . 1 . 0 .... @2scalar
-+    VMLAL_S_2sc  1111 001 0 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
-+    VMLAL_U_2sc  1111 001 1 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
-+
-+    VQDMLAL_2sc  1111 001 0 1 . .. .... .... 0011 . 1 . 0 .... @2scalar_q0
-+
-     VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
-     VMLS_F_2sc   1111 001 . 1 . .. .... .... 0101 . 1 . 0 .... @2scalar
-+    VMLSL_S_2sc  1111 001 0 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
-+    VMLSL_U_2sc  1111 001 1 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
-+
-+    VQDMLSL_2sc  1111 001 0 1 . .. .... .... 0111 . 1 . 0 .... @2scalar_q0
-+
-     VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
-     VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
-+    VMULL_S_2sc  1111 001 0 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
-+    VMULL_U_2sc  1111 001 1 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
-+
-+    VQDMULL_2sc  1111 001 0 1 . .. .... .... 1011 . 1 . 0 .... @2scalar_q0
-+
-     VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
-     VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
-+++ b/target/arm/translate-neon.inc.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VQRDMLSH_2sc(DisasContext *s, arg_2scalar *a)
-     };
-     return do_vqrdmlah_2sc(s, a, opfn[a->size]);
- }
-+
-+static bool do_2scalar_long(DisasContext *s, arg_2scalar *a,
-+                            NeonGenTwoOpWidenFn *opfn,
-+                            NeonGenTwo64OpFn *accfn)
-+{
-+    /*
-+     * Two registers and a scalar, long operations: perform an
-+     * operation on the input elements and the scalar which produces
-+     * a double-width result, and then possibly perform an accumulation
-+     * operation of that result into the destination.
-+     */
-+    TCGv_i32 scalar, rn;
-+    TCGv_i64 rn0_64, rn1_64;
-+
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-+        return false;
-+    }
-+
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
-+        return false;
-+    }
-+
-+    if (!opfn) {
-+        /* Bad size (including size == 3, which is a different insn group) */
-+        return false;
-+    }
-+
-+    if (a->vd & 1) {
-+        return false;
-+    }
-+
-+    if (!vfp_access_check(s)) {
-+        return true;
-+    }
-+
-+    scalar = neon_get_scalar(a->size, a->vm);
-+
-+    /* Load all inputs before writing any outputs, in case of overlap */
-+    rn = neon_load_reg(a->vn, 0);
-+    rn0_64 = tcg_temp_new_i64();
-+    opfn(rn0_64, rn, scalar);
-+    tcg_temp_free_i32(rn);
-+
-+    rn = neon_load_reg(a->vn, 1);
-+    rn1_64 = tcg_temp_new_i64();
-+    opfn(rn1_64, rn, scalar);
-+    tcg_temp_free_i32(rn);
-+    tcg_temp_free_i32(scalar);
-+
-+    if (accfn) {
-+        TCGv_i64 t64 = tcg_temp_new_i64();
-+        neon_load_reg64(t64, a->vd);
-+        accfn(t64, t64, rn0_64);
-+        neon_store_reg64(t64, a->vd);
-+        neon_load_reg64(t64, a->vd + 1);
-+        accfn(t64, t64, rn1_64);
-+        neon_store_reg64(t64, a->vd + 1);
-+        tcg_temp_free_i64(t64);
-+    } else {
-+        neon_store_reg64(rn0_64, a->vd);
-+        neon_store_reg64(rn1_64, a->vd + 1);
-+    }
-+    tcg_temp_free_i64(rn0_64);
-+    tcg_temp_free_i64(rn1_64);
-+    return true;
-+}
-+
-+static bool trans_VMULL_S_2sc(DisasContext *s, arg_2scalar *a)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
-+        NULL,
-+        gen_helper_neon_mull_s16,
-+        gen_mull_s32,
-+        NULL,
-+    };
-+
-+    return do_2scalar_long(s, a, opfn[a->size], NULL);
-+}
-+
-+static bool trans_VMULL_U_2sc(DisasContext *s, arg_2scalar *a)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
-+        NULL,
-+        gen_helper_neon_mull_u16,
-+        gen_mull_u32,
-+        NULL,
-+    };
-+
-+    return do_2scalar_long(s, a, opfn[a->size], NULL);
-+}
-+
-+#define DO_VMLAL_2SC(INSN, MULL, ACC)                                   \
-+    static bool trans_##INSN##_2sc(DisasContext *s, arg_2scalar *a)     \
-+    {                                                                   \
-+        static NeonGenTwoOpWidenFn * const opfn[] = {                   \
-+            NULL,                                                       \
-+            gen_helper_neon_##MULL##16,                                 \
-+            gen_##MULL##32,                                             \
-+            NULL,                                                       \
-+        };                                                              \
-+        static NeonGenTwo64OpFn * const accfn[] = {                     \
-+            NULL,                                                       \
-+            gen_helper_neon_##ACC##l_u32,                               \
-+            tcg_gen_##ACC##_i64,                                        \
-+            NULL,                                                       \
-+        };                                                              \
-+        return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);    \
-+    }
-+
-+DO_VMLAL_2SC(VMLAL_S, mull_s, add)
-+DO_VMLAL_2SC(VMLAL_U, mull_u, add)
-+DO_VMLAL_2SC(VMLSL_S, mull_s, sub)
-+DO_VMLAL_2SC(VMLSL_U, mull_u, sub)
-+
-+static bool trans_VQDMULL_2sc(DisasContext *s, arg_2scalar *a)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
-+        NULL,
-+        gen_VQDMULL_16,
-+        gen_VQDMULL_32,
-+        NULL,
-+    };
-+
-+    return do_2scalar_long(s, a, opfn[a->size], NULL);
-+}
-+
-+static bool trans_VQDMLAL_2sc(DisasContext *s, arg_2scalar *a)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
-+        NULL,
-+        gen_VQDMULL_16,
-+        gen_VQDMULL_32,
-+        NULL,
-+    };
-+    static NeonGenTwo64OpFn * const accfn[] = {
-+        NULL,
-+        gen_VQDMLAL_acc_16,
-+        gen_VQDMLAL_acc_32,
-+        NULL,
-+    };
-+
-+    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
-+}
-+
-+static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
-+        NULL,
-+        gen_VQDMULL_16,
-+        gen_VQDMULL_32,
-+        NULL,
-+    };
-+    static NeonGenTwo64OpFn * const accfn[] = {
-+        NULL,
-+        gen_VQDMLSL_acc_16,
-+        gen_VQDMLSL_acc_32,
-+        NULL,
-+    };
-+
-+    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
-+}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
-     tcg_gen_ext16s_i32(dest, var);
+     dc->insn_start = tcg_last_op();
  }
--/* 32x32->64 multiply.  Marks inputs as dead.  */
+-static bool arm_pre_translate_insn(DisasContext *dc)
--static TCGv_i64 gen_mulu_i64_i32(TCGv_i32 a, TCGv_i32 b)
++static bool arm_check_kernelpage(DisasContext *dc)
 -{
 -    TCGv_i32 lo = tcg_temp_new_i32();
 -    TCGv_i32 hi = tcg_temp_new_i32();
 -    TCGv_i64 ret;
 -
 -    tcg_gen_mulu2_i32(lo, hi, a, b);
 -    tcg_temp_free_i32(a);
 -    tcg_temp_free_i32(b);
 -
 -    ret = tcg_temp_new_i64();
 -    tcg_gen_concat_i32_i64(ret, lo, hi);
 -    tcg_temp_free_i32(lo);
 -    tcg_temp_free_i32(hi);
 -
 -    return ret;
 -}
 -
 -static TCGv_i64 gen_muls_i64_i32(TCGv_i32 a, TCGv_i32 b)
 -{
 -    TCGv_i32 lo = tcg_temp_new_i32();
 -    TCGv_i32 hi = tcg_temp_new_i32();
 -    TCGv_i64 ret;
 -
 -    tcg_gen_muls2_i32(lo, hi, a, b);
 -    tcg_temp_free_i32(a);
 -    tcg_temp_free_i32(b);
 -
 -    ret = tcg_temp_new_i64();
 -    tcg_gen_concat_i32_i64(ret, lo, hi);
 -    tcg_temp_free_i32(lo);
 -    tcg_temp_free_i32(hi);
 -
 -    return ret;
 -}
 -
  /* Swap low and high halfwords.  */
  static void gen_swap_half(TCGv_i32 var)
  {
-@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
+ #ifdef CONFIG_USER_ONLY
      /* Intercept jump to the magic kernel page.  */
@@ -XXX,XX +XXX,XX @@ static bool arm_pre_translate_insn(DisasContext *dc)
          return true;
      }
- }
+ #endif
++    return false;
--static inline void gen_neon_negl(TCGv_i64 var, int size)
++}
--{
--    switch (size) {
++static bool arm_check_ss_active(DisasContext *dc)
--    case 0: gen_helper_neon_negl_u16(var, var); break;
++{
--    case 1: gen_helper_neon_negl_u32(var, var); break;
+     if (dc->ss_active && !dc->pstate_ss) {
--    case 2:
+         /* Singlestep state is Active-pending.
--        tcg_gen_neg_i64(var, var);
+          * If we're in this state at the start of a TB then either
--        break;
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
--    default: abort();
+     uint32_t pc = dc->base.pc_next;
--    }
+     unsigned int insn;
--}
--
+-    if (arm_pre_translate_insn(dc)) {
--static inline void gen_neon_addl_saturate(TCGv_i64 op0, TCGv_i64 op1, int size)
++    if (arm_check_ss_active(dc) || arm_check_kernelpage(dc)) {
--{
+         dc->base.pc_next = pc + 4;
--    switch (size) {
+         return;
--    case 1: gen_helper_neon_addl_saturate_s32(op0, cpu_env, op0, op1); break;
+     }
--    case 2: gen_helper_neon_addl_saturate_s64(op0, cpu_env, op0, op1); break;
+@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
--    default: abort();
+     uint32_t insn;
--    }
+     bool is_16bit;
--}
--
+-    if (arm_pre_translate_insn(dc)) {
--static inline void gen_neon_mull(TCGv_i64 dest, TCGv_i32 a, TCGv_i32 b,
++    if (arm_check_ss_active(dc) || arm_check_kernelpage(dc)) {
--                                 int size, int u)
+         dc->base.pc_next = pc + 2;
--{
+         return;
--    TCGv_i64 tmp;
+     }
 -
 -    switch ((size << 1) | u) {
 -    case 0: gen_helper_neon_mull_s8(dest, a, b); break;
 -    case 1: gen_helper_neon_mull_u8(dest, a, b); break;
 -    case 2: gen_helper_neon_mull_s16(dest, a, b); break;
 -    case 3: gen_helper_neon_mull_u16(dest, a, b); break;
 -    case 4:
 -        tmp = gen_muls_i64_i32(a, b);
 -        tcg_gen_mov_i64(dest, tmp);
 -        tcg_temp_free_i64(tmp);
 -        break;
 -    case 5:
 -        tmp = gen_mulu_i64_i32(a, b);
 -        tcg_gen_mov_i64(dest, tmp);
 -        tcg_temp_free_i64(tmp);
 -        break;
 -    default: abort();
 -    }
 -
 -    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
 -       Don't forget to clean them now.  */
 -    if (size < 2) {
 -        tcg_temp_free_i32(a);
 -        tcg_temp_free_i32(b);
 -    }
 -}
 -
  static void gen_neon_narrow_op(int op, int u, int size,
                                 TCGv_i32 dest, TCGv_i64 src)
  {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      int u;
      int vec_size;
      uint32_t imm;
 -    TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
 +    TCGv_i32 tmp, tmp2, tmp3, tmp5;
      TCGv_ptr ptr1;
      TCGv_i64 tmp64;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          return 1;
      } else { /* (insn & 0x00800010 == 0x00800000) */
          if (size != 3) {
 -            op = (insn >> 8) & 0xf;
 -            if ((insn & (1 << 6)) == 0) {
 -                /* Three registers of different lengths: handled by decodetree */
 -                return 1;
 -            } else {
 -                /* Two registers and a scalar. NB that for ops of this form
 -                 * the ARM ARM labels bit 24 as Q, but it is in our variable
 -                 * 'u', not 'q'.
 -                 */
 -                if (size == 0) {
 -                    return 1;
 -                }
 -                switch (op) {
 -                case 0: /* Integer VMLA scalar */
 -                case 4: /* Integer VMLS scalar */
 -                case 8: /* Integer VMUL scalar */
 -                case 1: /* Float VMLA scalar */
 -                case 5: /* Floating point VMLS scalar */
 -                case 9: /* Floating point VMUL scalar */
 -                case 12: /* VQDMULH scalar */
 -                case 13: /* VQRDMULH scalar */
 -                case 14: /* VQRDMLAH scalar */
 -                case 15: /* VQRDMLSH scalar */
 -                    return 1; /* handled by decodetree */
 -
 -                case 3: /* VQDMLAL scalar */
 -                case 7: /* VQDMLSL scalar */
 -                case 11: /* VQDMULL scalar */
 -                    if (u == 1) {
 -                        return 1;
 -                    }
 -                    /* fall through */
 -                case 2: /* VMLAL sclar */
 -                case 6: /* VMLSL scalar */
 -                case 10: /* VMULL scalar */
 -                    if (rd & 1) {
 -                        return 1;
 -                    }
 -                    tmp2 = neon_get_scalar(size, rm);
 -                    /* We need a copy of tmp2 because gen_neon_mull
 -                     * deletes it during pass 0.  */
 -                    tmp4 = tcg_temp_new_i32();
 -                    tcg_gen_mov_i32(tmp4, tmp2);
 -                    tmp3 = neon_load_reg(rn, 1);
 -
 -                    for (pass = 0; pass < 2; pass++) {
 -                        if (pass == 0) {
 -                            tmp = neon_load_reg(rn, 0);
 -                        } else {
 -                            tmp = tmp3;
 -                            tmp2 = tmp4;
 -                        }
 -                        gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
 -                        if (op != 11) {
 -                            neon_load_reg64(cpu_V1, rd + pass);
 -                        }
 -                        switch (op) {
 -                        case 6:
 -                            gen_neon_negl(cpu_V0, size);
 -                            /* Fall through */
 -                        case 2:
 -                            gen_neon_addl(size);
 -                            break;
 -                        case 3: case 7:
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            if (op == 7) {
 -                                gen_neon_negl(cpu_V0, size);
 -                            }
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V1, size);
 -                            break;
 -                        case 10:
 -                            /* no-op */
 -                            break;
 -                        case 11:
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            break;
 -                        default:
 -                            abort();
 -                        }
 -                        neon_store_reg64(cpu_V0, rd + pass);
 -                    }
 -                    break;
 -                default:
 -                    g_assert_not_reached();
 -                }
 -            }
 +            /*
 +             * Three registers of different lengths, or two registers and
 +             * a scalar: handled by decodetree
 +             */
 +            return 1;
          } else { /* size == 3 */
              if (!u) {
                  /* Extract.  */
 --
-.20.1
+.25.1

-[PULL 19/23] Implement configurable descriptor size in ftgmac100
+[PULL 13/33] target/arm: Advance pc for arch single-step exception
-From: Erik Smit <erik.lucas.smit@gmail.com>
+From: Richard Henderson <richard.henderson@linaro.org>
-The hardware supports configurable descriptor sizes, configured in the DBLAC
+The size of the code covered by a TranslationBlock cannot be 0;
-register.
+this is checked via assert in tb_gen_code.
-Most drivers use the default 4 word descriptor, which is currently hardcoded,
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-but Aspeed SDK configures 8 words to store extra data.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Erik Smit <erik.lucas.smit@gmail.com>
 Reviewed-by: Cédric Le Goater <clg@kaod.org>
 [PMM: removed unnecessary parens]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/net/ftgmac100.c | 26 ++++++++++++++++++++++++--
+ target/arm/translate-a64.c | 1 +
-file changed, 24 insertions(+), 2 deletions(-)
+file changed, 1 insertion(+)
-diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/net/ftgmac100.c
+--- a/target/arm/translate-a64.c
-+++ b/hw/net/ftgmac100.c
++++ b/target/arm/translate-a64.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
- #define FTGMAC100_APTC_TXPOLL_CNT(x)        (((x) >> 8) & 0xf)
+         assert(s->base.num_insns == 1);
- #define FTGMAC100_APTC_TXPOLL_TIME_SEL      (1 << 12)
+         gen_swstep_exception(s, 0, 0);
+         s->base.is_jmp = DISAS_NORETURN;
-+/*
++        s->base.pc_next = pc + 4;
-+ * DMA burst length and arbitration control register
+         return;
 + */
 +#define FTGMAC100_DBLAC_RXBURST_SIZE(x)     (((x) >> 8) & 0x3)
 +#define FTGMAC100_DBLAC_TXBURST_SIZE(x)     (((x) >> 10) & 0x3)
 +#define FTGMAC100_DBLAC_RXDES_SIZE(x)       ((((x) >> 12) & 0xf) * 8)
 +#define FTGMAC100_DBLAC_TXDES_SIZE(x)       ((((x) >> 16) & 0xf) * 8)
 +#define FTGMAC100_DBLAC_IFG_CNT(x)          (((x) >> 20) & 0x7)
 +#define FTGMAC100_DBLAC_IFG_INC             (1 << 23)
 +
  /*
   * PHY control register
   */
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_do_tx(FTGMAC100State *s, uint32_t tx_ring,
          if (bd.des0 & s->txdes0_edotr) {
              addr = tx_ring;
          } else {
 -            addr += sizeof(FTGMAC100Desc);
 +            addr += FTGMAC100_DBLAC_TXDES_SIZE(s->dblac);
          }
      }
-@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
-         s->phydata = value & 0xffff;
-         break;
-     case FTGMAC100_DBLAC: /* DMA Burst Length and Arbitration Control */
-+        if (FTGMAC100_DBLAC_TXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
-+            qemu_log_mask(LOG_GUEST_ERROR,
-+                          "%s: transmit descriptor too small : %d bytes\n",
-+                          __func__, FTGMAC100_DBLAC_TXDES_SIZE(s->dblac));
-+            break;
-+        }
-+        if (FTGMAC100_DBLAC_RXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
-+            qemu_log_mask(LOG_GUEST_ERROR,
-+                          "%s: receive descriptor too small : %d bytes\n",
-+                          __func__, FTGMAC100_DBLAC_RXDES_SIZE(s->dblac));
-+            break;
-+        }
-         s->dblac = value;
-         break;
-     case FTGMAC100_REVR:  /* Feature Register */
-@@ -XXX,XX +XXX,XX @@ static ssize_t ftgmac100_receive(NetClientState *nc, const uint8_t *buf,
-         if (bd.des0 & s->rxdes0_edorr) {
-             addr = s->rx_ring;
-         } else {
--            addr += sizeof(FTGMAC100Desc);
-+            addr += FTGMAC100_DBLAC_RXDES_SIZE(s->dblac);
-         }
-     }
-     s->rx_descriptor = addr;
 --
-.20.1
+.25.1

-[PULL 13/23] target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
+[PULL 14/33] target/arm: Split compute_fsr_fsc out of arm_deliver_fault
-Convert the VQRDMLAH and VQRDMLSH insns in the 2-reg-scalar
+From: Richard Henderson <richard.henderson@linaro.org>
 group to decodetree.
+We will reuse this section of arm_deliver_fault for
+raising pc alignment faults.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  3 ++
+ target/arm/tlb_helper.c | 45 +++++++++++++++++++++++++----------------
- target/arm/translate-neon.inc.c | 74 +++++++++++++++++++++++++++++++++
+file changed, 28 insertions(+), 17 deletions(-)
  target/arm/translate.c          | 38 +----------------
 files changed, 79 insertions(+), 36 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/tlb_helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/tlb_helper.c
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
+     return syn;
      VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
      VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
 +
 +    VQRDMLAH_2sc 1111 001 . 1 . .. .... .... 1110 . 1 . 0 .... @2scalar
 +    VQRDMLSH_2sc 1111 001 . 1 . .. .... .... 1111 . 1 . 0 .... @2scalar
    ]
  }
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
+-static void QEMU_NORETURN arm_deliver_fault(ARMCPU *cpu, vaddr addr,
---- a/target/arm/translate-neon.inc.c
+-                                            MMUAccessType access_type,
-+++ b/target/arm/translate-neon.inc.c
+-                                            int mmu_idx, ARMMMUFaultInfo *fi)
-@@ -XXX,XX +XXX,XX @@ static bool trans_VQRDMULH_2sc(DisasContext *s, arg_2scalar *a)
++static uint32_t compute_fsr_fsc(CPUARMState *env, ARMMMUFaultInfo *fi,
++                                int target_el, int mmu_idx, uint32_t *ret_fsc)
-     return do_2scalar(s, a, opfn[a->size], NULL);
+ {
- }
+-    CPUARMState *env = &cpu->env;
-+
+-    int target_el;
-+static bool do_vqrdmlah_2sc(DisasContext *s, arg_2scalar *a,
+-    bool same_el;
-+                            NeonGenThreeOpEnvFn *opfn)
+-    uint32_t syn, exc, fsr, fsc;
-+{
+     ARMMMUIdx arm_mmu_idx = core_to_arm_mmu_idx(env, mmu_idx);
-+    /*
+-
-+     * VQRDMLAH/VQRDMLSH: this is like do_2scalar, but the opfn
+-    target_el = exception_target_el(env);
-+     * performs a kind of fused op-then-accumulate using a helper
+-    if (fi->stage2) {
-+     * function that takes all of rd, rn and the scalar at once.
+-        target_el = 2;
-+     */
+-        env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4;
-+    TCGv_i32 scalar;
+-        if (arm_is_secure_below_el3(env) && fi->s1ns) {
-+    int pass;
+-            env->cp15.hpfar_el2 |= HPFAR_NS;
-+
+-        }
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+-    }
-+        return false;
+-    same_el = (arm_current_el(env) == target_el);
-+    }
++    uint32_t fsr, fsc;
-+
-+    if (!dc_isar_feature(aa32_rdm, s)) {
+     if (target_el == 2 || arm_el_is_aa64(env, target_el) ||
-+        return false;
+         arm_s1_regime_using_lpae_format(env, arm_mmu_idx)) {
-+    }
+@@ -XXX,XX +XXX,XX @@ static void QEMU_NORETURN arm_deliver_fault(ARMCPU *cpu, vaddr addr,
-+
+         fsc = 0x3f;
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     }
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
++    *ret_fsc = fsc;
-+        return false;
++    return fsr;
 +    }
 +
 +    if (!opfn) {
 +        /* Bad size (including size == 3, which is a different insn group) */
 +        return false;
 +    }
 +
 +    if (a->q && ((a->vd | a->vn) & 1)) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    scalar = neon_get_scalar(a->size, a->vm);
 +
 +    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
 +        TCGv_i32 rn = neon_load_reg(a->vn, pass);
 +        TCGv_i32 rd = neon_load_reg(a->vd, pass);
 +        opfn(rd, cpu_env, rn, scalar, rd);
 +        tcg_temp_free_i32(rn);
 +        neon_store_reg(a->vd, pass, rd);
 +    }
 +    tcg_temp_free_i32(scalar);
 +
 +    return true;
 +}
 +
-+static bool trans_VQRDMLAH_2sc(DisasContext *s, arg_2scalar *a)
++static void QEMU_NORETURN arm_deliver_fault(ARMCPU *cpu, vaddr addr,
 +                                            MMUAccessType access_type,
 +                                            int mmu_idx, ARMMMUFaultInfo *fi)
 +{
-+    static NeonGenThreeOpEnvFn *opfn[] = {
++    CPUARMState *env = &cpu->env;
-+        NULL,
++    int target_el;
-+        gen_helper_neon_qrdmlah_s16,
++    bool same_el;
-+        gen_helper_neon_qrdmlah_s32,
++    uint32_t syn, exc, fsr, fsc;
 +        NULL,
 +    };
 +    return do_vqrdmlah_2sc(s, a, opfn[a->size]);
 +}
 +
-+static bool trans_VQRDMLSH_2sc(DisasContext *s, arg_2scalar *a)
++    target_el = exception_target_el(env);
-+{
++    if (fi->stage2) {
-+    static NeonGenThreeOpEnvFn *opfn[] = {
++        target_el = 2;
-+        NULL,
++        env->cp15.hpfar_el2 = extract64(fi->s2addr, 12, 47) << 4;
-+        gen_helper_neon_qrdmlsh_s16,
++        if (arm_is_secure_below_el3(env) && fi->s1ns) {
-+        gen_helper_neon_qrdmlsh_s32,
++            env->cp15.hpfar_el2 |= HPFAR_NS;
-+        NULL,
++        }
-+    };
++    }
-+    return do_vqrdmlah_2sc(s, a, opfn[a->size]);
++    same_el = (arm_current_el(env) == target_el);
-+}
++
-diff --git a/target/arm/translate.c b/target/arm/translate.c
++    fsr = compute_fsr_fsc(env, fi, target_el, mmu_idx, &fsc);
-index XXXXXXX..XXXXXXX 100644
++
---- a/target/arm/translate.c
+     if (access_type == MMU_INST_FETCH) {
-+++ b/target/arm/translate.c
+         syn = syn_insn_abort(same_el, fi->ea, fi->s1ptw, fsc);
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+         exc = EXCP_PREFETCH_ABORT;
                  case 9: /* Floating point VMUL scalar */
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
 +                case 14: /* VQRDMLAH scalar */
 +                case 15: /* VQRDMLSH scalar */
                      return 1; /* handled by decodetree */
                  case 3: /* VQDMLAL scalar */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          neon_store_reg64(cpu_V0, rd + pass);
                      }
                      break;
 -                case 14: /* VQRDMLAH scalar */
 -                case 15: /* VQRDMLSH scalar */
 -                    {
 -                        NeonGenThreeOpEnvFn *fn;
 -
 -                        if (!dc_isar_feature(aa32_rdm, s)) {
 -                            return 1;
 -                        }
 -                        if (u && ((rd | rn) & 1)) {
 -                            return 1;
 -                        }
 -                        if (op == 14) {
 -                            if (size == 1) {
 -                                fn = gen_helper_neon_qrdmlah_s16;
 -                            } else {
 -                                fn = gen_helper_neon_qrdmlah_s32;
 -                            }
 -                        } else {
 -                            if (size == 1) {
 -                                fn = gen_helper_neon_qrdmlsh_s16;
 -                            } else {
 -                                fn = gen_helper_neon_qrdmlsh_s32;
 -                            }
 -                        }
 -
 -                        tmp2 = neon_get_scalar(size, rm);
 -                        for (pass = 0; pass < (u ? 4 : 2); pass++) {
 -                            tmp = neon_load_reg(rn, pass);
 -                            tmp3 = neon_load_reg(rd, pass);
 -                            fn(tmp, cpu_env, tmp, tmp2, tmp3);
 -                            tcg_temp_free_i32(tmp3);
 -                            neon_store_reg(rd, pass, tmp);
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                    }
 -                    break;
                  default:
                      g_assert_not_reached();
                  }
 --
-.20.1
+.25.1

-[PULL 06/23] target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
+[PULL 15/33] target/arm: Take an exception if PC is misaligned
-Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL:
+From: Richard Henderson <richard.henderson@linaro.org>
-these are all saturating doubling long multiplies with a possible
-accumulate step.
+For A64, any input to an indirect branch can cause this.
-These are the last insns in the group which use the pass-over-each
+For A32, many indirect branch paths force the branch to be aligned,
-elements loop, so we can delete that code.
+but BXWritePC does not.  This includes the BX instruction but also
+other interworking changes to PC.  Prior to v8, this case is UNDEFINED.
 With v8, this is CONSTRAINED UNPREDICTABLE and may either raise an
 exception or force align the PC.
 We choose to raise an exception because we have the infrastructure,
 it makes the generated code for gen_bx simpler, and it has the
 possibility of catching more guest bugs.
 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  6 +++
+ target/arm/helper.h           |  1 +
- target/arm/translate-neon.inc.c | 82 +++++++++++++++++++++++++++++++++
+ target/arm/syndrome.h         |  5 ++++
- target/arm/translate.c          | 59 ++----------------------
+ linux-user/aarch64/cpu_loop.c | 46 ++++++++++++++++++++---------------
-files changed, 92 insertions(+), 55 deletions(-)
+ target/arm/tlb_helper.c       | 18 ++++++++++++++
+ target/arm/translate-a64.c    | 15 ++++++++++++
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+ target/arm/translate.c        | 22 ++++++++++++++++-
-index XXXXXXX..XXXXXXX 100644
+files changed, 87 insertions(+), 20 deletions(-)
---- a/target/arm/neon-dp.decode
-+++ b/target/arm/neon-dp.decode
+diff --git a/target/arm/helper.h b/target/arm/helper.h
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+index XXXXXXX..XXXXXXX 100644
-     VMLAL_S_3d   1111 001 0 1 . .. .... .... 1000 . 0 . 0 .... @3diff
+--- a/target/arm/helper.h
-     VMLAL_U_3d   1111 001 1 1 . .. .... .... 1000 . 0 . 0 .... @3diff
++++ b/target/arm/helper.h
+@@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(sel_flags, TCG_CALL_NO_RWG_SE,
-+    VQDMLAL_3d   1111 001 0 1 . .. .... .... 1001 . 0 . 0 .... @3diff
+ DEF_HELPER_2(exception_internal, void, env, i32)
-+
+ DEF_HELPER_4(exception_with_syndrome, void, env, i32, i32, i32)
-     VMLSL_S_3d   1111 001 0 1 . .. .... .... 1010 . 0 . 0 .... @3diff
+ DEF_HELPER_2(exception_bkpt_insn, void, env, i32)
-     VMLSL_U_3d   1111 001 1 1 . .. .... .... 1010 . 0 . 0 .... @3diff
++DEF_HELPER_2(exception_pc_alignment, noreturn, env, tl)
+ DEF_HELPER_1(setend, void, env)
-+    VQDMLSL_3d   1111 001 0 1 . .. .... .... 1011 . 0 . 0 .... @3diff
+ DEF_HELPER_2(wfi, void, env, i32)
-+
+ DEF_HELPER_1(wfe, void, env)
-     VMULL_S_3d   1111 001 0 1 . .. .... .... 1100 . 0 . 0 .... @3diff
+diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
-     VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
+index XXXXXXX..XXXXXXX 100644
-+
+--- a/target/arm/syndrome.h
-+    VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
++++ b/target/arm/syndrome.h
-   ]
+@@ -XXX,XX +XXX,XX @@ static inline uint32_t syn_illegalstate(void)
      return (EC_ILLEGALSTATE << ARM_EL_EC_SHIFT) | ARM_EL_IL;
  }
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
++static inline uint32_t syn_pcalignment(void)
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_VMLAL(VMLAL_S,mull_s,add)
  DO_VMLAL(VMLAL_U,mull_u,add)
  DO_VMLAL(VMLSL_S,mull_s,sub)
  DO_VMLAL(VMLSL_U,mull_u,sub)
 +
 +static void gen_VQDMULL_16(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
 +{
-+    gen_helper_neon_mull_s16(rd, rn, rm);
++    return (EC_PCALIGNMENT << ARM_EL_EC_SHIFT) | ARM_EL_IL;
 +    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rd, rd);
 +}
 +
-+static void gen_VQDMULL_32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
+ #endif /* TARGET_ARM_SYNDROME_H */
 diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
 index XXXXXXX..XXXXXXX 100644
 --- a/linux-user/aarch64/cpu_loop.c
 +++ b/linux-user/aarch64/cpu_loop.c
@@ -XXX,XX +XXX,XX @@ void cpu_loop(CPUARMState *env)
              break;
          case EXCP_PREFETCH_ABORT:
          case EXCP_DATA_ABORT:
 -            /* We should only arrive here with EC in {DATAABORT, INSNABORT}. */
              ec = syn_get_ec(env->exception.syndrome);
 -            assert(ec == EC_DATAABORT || ec == EC_INSNABORT);
 -
 -            /* Both EC have the same format for FSC, or close enough. */
 -            fsc = extract32(env->exception.syndrome, 0, 6);
 -            switch (fsc) {
 -            case 0x04 ... 0x07: /* Translation fault, level {0-3} */
 -                si_signo = TARGET_SIGSEGV;
 -                si_code = TARGET_SEGV_MAPERR;
 +            switch (ec) {
 +            case EC_DATAABORT:
 +            case EC_INSNABORT:
 +                /* Both EC have the same format for FSC, or close enough. */
 +                fsc = extract32(env->exception.syndrome, 0, 6);
 +                switch (fsc) {
 +                case 0x04 ... 0x07: /* Translation fault, level {0-3} */
 +                    si_signo = TARGET_SIGSEGV;
 +                    si_code = TARGET_SEGV_MAPERR;
 +                    break;
 +                case 0x09 ... 0x0b: /* Access flag fault, level {1-3} */
 +                case 0x0d ... 0x0f: /* Permission fault, level {1-3} */
 +                    si_signo = TARGET_SIGSEGV;
 +                    si_code = TARGET_SEGV_ACCERR;
 +                    break;
 +                case 0x11: /* Synchronous Tag Check Fault */
 +                    si_signo = TARGET_SIGSEGV;
 +                    si_code = TARGET_SEGV_MTESERR;
 +                    break;
 +                case 0x21: /* Alignment fault */
 +                    si_signo = TARGET_SIGBUS;
 +                    si_code = TARGET_BUS_ADRALN;
 +                    break;
 +                default:
 +                    g_assert_not_reached();
 +                }
                  break;
 -            case 0x09 ... 0x0b: /* Access flag fault, level {1-3} */
 -            case 0x0d ... 0x0f: /* Permission fault, level {1-3} */
 -                si_signo = TARGET_SIGSEGV;
 -                si_code = TARGET_SEGV_ACCERR;
 -                break;
 -            case 0x11: /* Synchronous Tag Check Fault */
 -                si_signo = TARGET_SIGSEGV;
 -                si_code = TARGET_SEGV_MTESERR;
 -                break;
 -            case 0x21: /* Alignment fault */
 +            case EC_PCALIGNMENT:
                  si_signo = TARGET_SIGBUS;
                  si_code = TARGET_BUS_ADRALN;
                  break;
 diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/tlb_helper.c
 +++ b/target/arm/tlb_helper.c
@@ -XXX,XX +XXX,XX @@
  #include "cpu.h"
  #include "internals.h"
  #include "exec/exec-all.h"
 +#include "exec/helper-proto.h"
  static inline uint32_t merge_syn_data_abort(uint32_t template_syn,
                                              unsigned int target_el,
@@ -XXX,XX +XXX,XX @@ void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
      arm_deliver_fault(cpu, vaddr, access_type, mmu_idx, &fi);
  }
 +void helper_exception_pc_alignment(CPUARMState *env, target_ulong pc)
 +{
-+    gen_mull_s32(rd, rn, rm);
++    ARMMMUFaultInfo fi = { .type = ARMFault_Alignment };
-+    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rd, rd);
++    int target_el = exception_target_el(env);
 +    int mmu_idx = cpu_mmu_index(env, true);
 +    uint32_t fsc;
 +
 +    env->exception.vaddress = pc;
 +
 +    /*
 +     * Note that the fsc is not applicable to this exception,
 +     * since any syndrome is pcalignment not insn_abort.
 +     */
 +    env->exception.fsr = compute_fsr_fsc(env, &fi, target_el, mmu_idx, &fsc);
 +    raise_exception(env, EXCP_PREFETCH_ABORT, syn_pcalignment(), target_el);
 +}
 +
-+static bool trans_VQDMULL_3d(DisasContext *s, arg_3diff *a)
+ #if !defined(CONFIG_USER_ONLY)
-+{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
+ /*
-+        NULL,
+diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
-+        gen_VQDMULL_16,
+index XXXXXXX..XXXXXXX 100644
-+        gen_VQDMULL_32,
+--- a/target/arm/translate-a64.c
-+        NULL,
++++ b/target/arm/translate-a64.c
-+    };
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-+
+     uint64_t pc = s->base.pc_next;
-+    return do_long_3d(s, a, opfn[a->size], NULL);
+     uint32_t insn;
-+}
-+
++    /* Singlestep exceptions have the highest priority. */
-+static void gen_VQDMLAL_acc_16(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+     if (s->ss_active && !s->pstate_ss) {
-+{
+         /* Singlestep state is Active-pending.
-+    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rn, rm);
+          * If we're in this state at the start of a TB then either
-+}
+@@ -XXX,XX +XXX,XX @@ static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-+
+         return;
-+static void gen_VQDMLAL_acc_32(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
+     }
-+{
-+    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rn, rm);
++    if (pc & 3) {
-+}
++        /*
-+
++         * PC alignment fault.  This has priority over the instruction abort
-+static bool trans_VQDMLAL_3d(DisasContext *s, arg_3diff *a)
++         * that we would receive from a translation fault via arm_ldl_code.
-+{
++         * This should only be possible after an indirect branch, at the
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
++         * start of the TB.
-+        NULL,
++         */
-+        gen_VQDMULL_16,
++        assert(s->base.num_insns == 1);
-+        gen_VQDMULL_32,
++        gen_helper_exception_pc_alignment(cpu_env, tcg_constant_tl(pc));
-+        NULL,
++        s->base.is_jmp = DISAS_NORETURN;
-+    };
++        s->base.pc_next = QEMU_ALIGN_UP(pc, 4);
-+    static NeonGenTwo64OpFn * const accfn[] = {
++        return;
-+        NULL,
++    }
-+        gen_VQDMLAL_acc_16,
++
-+        gen_VQDMLAL_acc_32,
+     s->pc_curr = pc;
-+        NULL,
+     insn = arm_ldl_code(env, &s->base, pc, s->sctlr_b);
-+    };
+     s->insn = insn;
 +
 +    return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
 +}
 +
 +static void gen_VQDMLSL_acc_16(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    gen_helper_neon_negl_u32(rm, rm);
 +    gen_helper_neon_addl_saturate_s32(rd, cpu_env, rn, rm);
 +}
 +
 +static void gen_VQDMLSL_acc_32(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
 +{
 +    tcg_gen_neg_i64(rm, rm);
 +    gen_helper_neon_addl_saturate_s64(rd, cpu_env, rn, rm);
 +}
 +
 +static bool trans_VQDMLSL_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULL_16,
 +        gen_VQDMULL_32,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const accfn[] = {
 +        NULL,
 +        gen_VQDMLSL_acc_16,
 +        gen_VQDMLSL_acc_32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void arm_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-                     {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
+     uint32_t pc = dc->base.pc_next;
-                     {0, 0, 0, 7}, /* VABDL */
+     unsigned int insn;
-                     {0, 0, 0, 7}, /* VMLAL */
--                    {0, 0, 0, 9}, /* VQDMLAL */
+-    if (arm_check_ss_active(dc) || arm_check_kernelpage(dc)) {
-+                    {0, 0, 0, 7}, /* VQDMLAL */
++    /* Singlestep exceptions have the highest priority. */
-                     {0, 0, 0, 7}, /* VMLSL */
++    if (arm_check_ss_active(dc)) {
--                    {0, 0, 0, 9}, /* VQDMLSL */
++        dc->base.pc_next = pc + 4;
-+                    {0, 0, 0, 7}, /* VQDMLSL */
++        return;
-                     {0, 0, 0, 7}, /* Integer VMULL */
++    }
--                    {0, 0, 0, 9}, /* VQDMULL */
++
-+                    {0, 0, 0, 7}, /* VQDMULL */
++    if (pc & 3) {
-                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
++        /*
-                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
++         * PC alignment fault.  This has priority over the instruction abort
-                 };
++         * that we would receive from a translation fault via arm_ldl_code
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
++         * (or the execution of the kernelpage entrypoint). This should only
-                     }
++         * be possible after an indirect branch, at the start of the TB.
-                     return 0;
++         */
-                 }
++        assert(dc->base.num_insns == 1);
--
++        gen_helper_exception_pc_alignment(cpu_env, tcg_constant_tl(pc));
--                /* Avoid overlapping operands.  Wide source operands are
++        dc->base.is_jmp = DISAS_NORETURN;
--                   always aligned so will never overlap with wide
++        dc->base.pc_next = QEMU_ALIGN_UP(pc, 4);
--                   destinations in problematic ways.  */
++        return;
--                if (rd == rm) {
++    }
--                    tmp = neon_load_reg(rm, 1);
++
--                    neon_store_scratch(2, tmp);
++    if (arm_check_kernelpage(dc)) {
--                } else if (rd == rn) {
+         dc->base.pc_next = pc + 4;
--                    tmp = neon_load_reg(rn, 1);
+         return;
--                    neon_store_scratch(2, tmp);
+     }
 -                }
 -                tmp3 = NULL;
 -                for (pass = 0; pass < 2; pass++) {
 -                    if (pass == 1 && rd == rn) {
 -                        tmp = neon_load_scratch(2);
 -                    } else {
 -                        tmp = neon_load_reg(rn, pass);
 -                    }
 -                    if (pass == 1 && rd == rm) {
 -                        tmp2 = neon_load_scratch(2);
 -                    } else {
 -                        tmp2 = neon_load_reg(rm, pass);
 -                    }
 -                    switch (op) {
 -                    case 9: case 11: case 13:
 -                        /* VQDMLAL, VQDMLSL, VQDMULL */
 -                        gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
 -                        break;
 -                    default: /* 15 is RESERVED: caught earlier  */
 -                        abort();
 -                    }
 -                    if (op == 13) {
 -                        /* VQDMULL */
 -                        gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                        neon_store_reg64(cpu_V0, rd + pass);
 -                    } else {
 -                        /* Accumulate.  */
 -                        neon_load_reg64(cpu_V1, rd + pass);
 -                        switch (op) {
 -                        case 9: case 11: /* VQDMLAL, VQDMLSL */
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
 -                            if (op == 11) {
 -                                gen_neon_negl(cpu_V0, size);
 -                            }
 -                            gen_neon_addl_saturate(cpu_V0, cpu_V1, size);
 -                            break;
 -                        default:
 -                            abort();
 -                        }
 -                        neon_store_reg64(cpu_V0, rd + pass);
 -                    }
 -                }
 +                abort(); /* all others handled by decodetree */
              } else {
                  /* Two registers and a scalar. NB that for ops of this form
                   * the ARM ARM labels bit 24 as Q, but it is in our variable
 --
-.20.1
+.25.1

-[PULL 15/23] target/arm: Convert Neon VEXT to decodetree
+[PULL 16/33] target/arm: Assert thumb pc is aligned
-Convert the Neon VEXT insn to decodetree. Rather than keeping the
+From: Richard Henderson <richard.henderson@linaro.org>
 old implementation which used fixed temporaries cpu_V0 and cpu_V1
 and did the extraction with by-hand shift and logic ops, we use
 the TCG extract2 insn.
-We don't need to special case 0 or 8 immediates any more as the
+Misaligned thumb PC is architecturally impossible.
-optimizer is smart enough to throw away the dead code.
+Assert is better than proceeding, in case we've missed
 something somewhere.
+Expand a comment about aligning the pc in gdbstub.
+Fail an incoming migrate if a thumb pc is misaligned.
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  8 +++-
+ target/arm/gdbstub.c   |  9 +++++++--
- target/arm/translate-neon.inc.c | 76 +++++++++++++++++++++++++++++++++
+ target/arm/machine.c   | 10 ++++++++++
- target/arm/translate.c          | 58 +------------------------
+ target/arm/translate.c |  3 +++
-files changed, 85 insertions(+), 57 deletions(-)
+files changed, 20 insertions(+), 2 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/gdbstub.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/gdbstub.c
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@ int arm_cpu_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
- # return false for size==3.
- ######################################################################
+     tmp = ldl_p(mem_buf);
- {
--  # 0b11 subgroup will go here
+-    /* Mask out low bit of PC to workaround gdb bugs.  This will probably
-+  [
+-       cause problems if we ever implement the Jazelle DBX extensions.  */
-+    ##################################################################
++    /*
-+    # Miscellaneous size=0b11 insns
++     * Mask out low bits of PC to workaround gdb bugs.
-+    ##################################################################
++     * This avoids an assert in thumb_tr_translate_insn, because it is
-+    VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
++     * architecturally impossible to misalign the pc.
-+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
++     * This will probably cause problems if we ever implement the
-+  ]
++     * Jazelle DBX extensions.
++     */
-   # Subgroup for size != 0b11
+     if (n == 15) {
-   [
+         tmp &= ~1;
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+     }
 diff --git a/target/arm/machine.c b/target/arm/machine.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/arm/machine.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/arm/machine.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
+@@ -XXX,XX +XXX,XX @@ static int cpu_post_load(void *opaque, int version_id)
+             return -1;
-     return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
+         }
- }
+     }
 +
-+static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
++    /*
-+{
++     * Misaligned thumb pc is architecturally impossible.
-+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
++     * We have an assert in thumb_tr_translate_insn to verify this.
-+        return false;
++     * Fail an incoming migrate to avoid this assert.
 +     */
 +    if (!is_a64(env) && env->thumb && (env->regs[15] & 1)) {
 +        return -1;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     if (!kvm_enabled()) {
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+         pmu_op_finish(&cpu->env);
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+     }
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm | a->vd) & a->q) {
 +        return false;
 +    }
 +
 +    if (a->imm > 7 && !a->q) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    if (!a->q) {
 +        /* Extract 64 bits from <Vm:Vn> */
 +        TCGv_i64 left, right, dest;
 +
 +        left = tcg_temp_new_i64();
 +        right = tcg_temp_new_i64();
 +        dest = tcg_temp_new_i64();
 +
 +        neon_load_reg64(right, a->vn);
 +        neon_load_reg64(left, a->vm);
 +        tcg_gen_extract2_i64(dest, right, left, a->imm * 8);
 +        neon_store_reg64(dest, a->vd);
 +
 +        tcg_temp_free_i64(left);
 +        tcg_temp_free_i64(right);
 +        tcg_temp_free_i64(dest);
 +    } else {
 +        /* Extract 128 bits from <Vm+1:Vm:Vn+1:Vn> */
 +        TCGv_i64 left, middle, right, destleft, destright;
 +
 +        left = tcg_temp_new_i64();
 +        middle = tcg_temp_new_i64();
 +        right = tcg_temp_new_i64();
 +        destleft = tcg_temp_new_i64();
 +        destright = tcg_temp_new_i64();
 +
 +        if (a->imm < 8) {
 +            neon_load_reg64(right, a->vn);
 +            neon_load_reg64(middle, a->vn + 1);
 +            tcg_gen_extract2_i64(destright, right, middle, a->imm * 8);
 +            neon_load_reg64(left, a->vm);
 +            tcg_gen_extract2_i64(destleft, middle, left, a->imm * 8);
 +        } else {
 +            neon_load_reg64(right, a->vn + 1);
 +            neon_load_reg64(middle, a->vm);
 +            tcg_gen_extract2_i64(destright, right, middle, (a->imm - 8) * 8);
 +            neon_load_reg64(left, a->vm + 1);
 +            tcg_gen_extract2_i64(destleft, middle, left, (a->imm - 8) * 8);
 +        }
 +
 +        neon_store_reg64(destright, a->vd);
 +        neon_store_reg64(destleft, a->vd + 1);
 +
 +        tcg_temp_free_i64(destright);
 +        tcg_temp_free_i64(destleft);
 +        tcg_temp_free_i64(right);
 +        tcg_temp_free_i64(middle);
 +        tcg_temp_free_i64(left);
 +    }
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void thumb_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-     int pass;
+     uint32_t insn;
-     int u;
+     bool is_16bit;
-     int vec_size;
--    uint32_t imm;
++    /* Misaligned thumb PC is architecturally impossible. */
-     TCGv_i32 tmp, tmp2, tmp3, tmp5;
++    assert((dc->base.pc_next & 1) == 0);
-     TCGv_ptr ptr1;
++
--    TCGv_i64 tmp64;
+     if (arm_check_ss_active(dc) || arm_check_kernelpage(dc)) {
+         dc->base.pc_next = pc + 2;
-     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+         return;
          return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
              return 1;
          } else { /* size == 3 */
              if (!u) {
 -                /* Extract.  */
 -                imm = (insn >> 8) & 0xf;
 -
 -                if (imm > 7 && !q)
 -                    return 1;
 -
 -                if (q && ((rd | rn | rm) & 1)) {
 -                    return 1;
 -                }
 -
 -                if (imm == 0) {
 -                    neon_load_reg64(cpu_V0, rn);
 -                    if (q) {
 -                        neon_load_reg64(cpu_V1, rn + 1);
 -                    }
 -                } else if (imm == 8) {
 -                    neon_load_reg64(cpu_V0, rn + 1);
 -                    if (q) {
 -                        neon_load_reg64(cpu_V1, rm);
 -                    }
 -                } else if (q) {
 -                    tmp64 = tcg_temp_new_i64();
 -                    if (imm < 8) {
 -                        neon_load_reg64(cpu_V0, rn);
 -                        neon_load_reg64(tmp64, rn + 1);
 -                    } else {
 -                        neon_load_reg64(cpu_V0, rn + 1);
 -                        neon_load_reg64(tmp64, rm);
 -                    }
 -                    tcg_gen_shri_i64(cpu_V0, cpu_V0, (imm & 7) * 8);
 -                    tcg_gen_shli_i64(cpu_V1, tmp64, 64 - ((imm & 7) * 8));
 -                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
 -                    if (imm < 8) {
 -                        neon_load_reg64(cpu_V1, rm);
 -                    } else {
 -                        neon_load_reg64(cpu_V1, rm + 1);
 -                        imm -= 8;
 -                    }
 -                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
 -                    tcg_gen_shri_i64(tmp64, tmp64, imm * 8);
 -                    tcg_gen_or_i64(cpu_V1, cpu_V1, tmp64);
 -                    tcg_temp_free_i64(tmp64);
 -                } else {
 -                    /* BUGFIX */
 -                    neon_load_reg64(cpu_V0, rn);
 -                    tcg_gen_shri_i64(cpu_V0, cpu_V0, imm * 8);
 -                    neon_load_reg64(cpu_V1, rm);
 -                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
 -                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
 -                }
 -                neon_store_reg64(cpu_V0, rd);
 -                if (q) {
 -                    neon_store_reg64(cpu_V1, rd + 1);
 -                }
 +                /* Extract: handled by decodetree */
 +                return 1;
              } else if ((insn & (1 << 11)) == 0) {
                  /* Two register misc.  */
                  op = ((insn >> 12) & 0x30) | ((insn >> 7) & 0xf);
 --
-.20.1
+.25.1
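
For readers following the removed open-coded path above: its non-Q branch builds the VEXT result by shifting Vn right and Vm left by complementary amounts, then ORing the two. A minimal plain-C sketch of that 64-bit case follows. The name vext64 is hypothetical and not part of QEMU. The imm == 0 special case and the imm < 8 precondition are assumptions added here, because a 64-bit shift by 64 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model (not QEMU code) of VEXT on D registers:
 * take 8 consecutive bytes, starting at byte index imm, from the
 * 16-byte value that has vn as its low half and vm as its high half.
 */
static uint64_t vext64(uint64_t vn, uint64_t vm, unsigned imm)
{
    /* Assumed precondition: the non-Q form only accepts imm in 0..7. */
    assert(imm < 8);
    if (imm == 0) {
        /* Avoid the undefined shift by 64 below. */
        return vn;
    }
    return (vn >> (imm * 8)) | (vm << (64 - imm * 8));
}
```

The Q form in the removed code applies the same shift-and-OR idea to each 64-bit half of the destination.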

-[PULL 10/23] target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
+[PULL 17/33] target/arm: Suppress bp for exceptions with more priority
-Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
+From: Richard Henderson <richard.henderson@linaro.org>
 scalar" group to decodetree.  These are 32x32->32 operations where
 one of the inputs is the scalar, followed by a possible accumulate
 operation of the 32-bit result.
-The refactoring removes some of the oddities of the old decoder:
+Both single-step and pc alignment faults have priority over
- * operands to the operation and accumulation were often
+breakpoint exceptions.
    reversed (taking advantage of the fact that most of these ops
    are commutative); the new code follows the pseudocode order
  * the Q bit in the insn was in a local variable 'u'; in the
    new code it is decoded into a->q
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  15 ++++
+ target/arm/debug_helper.c | 23 +++++++++++++++++++++++
- target/arm/translate-neon.inc.c | 133 ++++++++++++++++++++++++++++++++
+file changed, 23 insertions(+)
  target/arm/translate.c          |  77 ++----------------
 files changed, 154 insertions(+), 71 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/arm/debug_helper.c
-+++ b/target/arm/neon-dp.decode
++++ b/target/arm/debug_helper.c
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@ bool arm_debug_check_breakpoint(CPUState *cs)
-     VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
+ {
+     ARMCPU *cpu = ARM_CPU(cs);
-     VMULL_P_3d   1111 001 0 1 . .. .... .... 1110 . 0 . 0 .... @3diff
+     CPUARMState *env = &cpu->env;
-+
++    target_ulong pc;
-+    ##################################################################
+     int n;
-+    # 2-regs-plus-scalar grouping:
-+    # 1111 001 Q 1 D sz!=11 Vn:4 Vd:4 opc:4 N 1 M 0 Vm:4
+     /*
-+    ##################################################################
+@@ -XXX,XX +XXX,XX @@ bool arm_debug_check_breakpoint(CPUState *cs)
-+    &2scalar vm vn vd size q
+         return false;
-+
+     }
-+    @2scalar     .... ... q:1 . . size:2 .... .... .... . . . . .... \
 +                 &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +    VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
 +
 +    VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
 +
 +    VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a)
 , 16, 0, fn_gvec);
      return true;
  }
 +
 +static void gen_neon_dup_low16(TCGv_i32 var)
 +{
 +    TCGv_i32 tmp = tcg_temp_new_i32();
 +    tcg_gen_ext16u_i32(var, var);
 +    tcg_gen_shli_i32(tmp, var, 16);
 +    tcg_gen_or_i32(var, var, tmp);
 +    tcg_temp_free_i32(tmp);
 +}
 +
 +static void gen_neon_dup_high16(TCGv_i32 var)
 +{
 +    TCGv_i32 tmp = tcg_temp_new_i32();
 +    tcg_gen_andi_i32(var, var, 0xffff0000);
 +    tcg_gen_shri_i32(tmp, var, 16);
 +    tcg_gen_or_i32(var, var, tmp);
 +    tcg_temp_free_i32(tmp);
 +}
 +
 +static inline TCGv_i32 neon_get_scalar(int size, int reg)
 +{
 +    TCGv_i32 tmp;
 +    if (size == 1) {
 +        tmp = neon_load_reg(reg & 7, reg >> 4);
 +        if (reg & 8) {
 +            gen_neon_dup_high16(tmp);
 +        } else {
 +            gen_neon_dup_low16(tmp);
 +        }
 +    } else {
 +        tmp = neon_load_reg(reg & 15, reg >> 4);
 +    }
 +    return tmp;
 +}
 +
 +static bool do_2scalar(DisasContext *s, arg_2scalar *a,
 +                       NeonGenTwoOpFn *opfn, NeonGenTwoOpFn *accfn)
 +{
 +    /*
-+     * Two registers and a scalar: perform an operation between
++     * Single-step exceptions have priority over breakpoint exceptions.
-+     * the input elements and the scalar, and then possibly
++     * If single-step state is active-pending, suppress the bp.
 +     * perform an accumulation operation of that result into the
 +     * destination.
 +     */
-+    TCGv_i32 scalar;
++    if (arm_singlestep_active(env) && !(env->pstate & PSTATE_SS)) {
 +    int pass;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
++    /*
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
++     * PC alignment faults have priority over breakpoint exceptions.
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
++     */
 +    pc = is_a64(env) ? env->pc : env->regs[15];
 +    if ((is_a64(env) || !env->thumb) && (pc & 3) != 0) {
 +        return false;
 +    }
 +
-+    if (!opfn) {
++    /*
-+        /* Bad size (including size == 3, which is a different insn group) */
++     * Instruction aborts have priority over breakpoint exceptions.
-+        return false;
++     * TODO: We would need to look up the page for PC and verify that
-+    }
++     * it is present and executable.
 +     */
 +
-+    if (a->q && ((a->vd | a->vn) & 1)) {
+     for (n = 0; n < ARRAY_SIZE(env->cpu_breakpoint); n++) {
-+        return false;
+         if (bp_wp_matches(cpu, n, false)) {
-+    }
+             return true;
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    scalar = neon_get_scalar(a->size, a->vm);
 +
 +    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
 +        TCGv_i32 tmp = neon_load_reg(a->vn, pass);
 +        opfn(tmp, tmp, scalar);
 +        if (accfn) {
 +            TCGv_i32 rd = neon_load_reg(a->vd, pass);
 +            accfn(tmp, rd, tmp);
 +            tcg_temp_free_i32(rd);
 +        }
 +        neon_store_reg(a->vd, pass, tmp);
 +    }
 +    tcg_temp_free_i32(scalar);
 +    return true;
 +}
 +
 +static bool trans_VMUL_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mul_u16,
 +        tcg_gen_mul_i32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VMLA_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mul_u16,
 +        tcg_gen_mul_i32,
 +        NULL,
 +    };
 +    static NeonGenTwoOpFn * const accfn[] = {
 +        NULL,
 +        gen_helper_neon_add_u16,
 +        tcg_gen_add_i32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +}
 +
 +static bool trans_VMLS_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_helper_neon_mul_u16,
 +        tcg_gen_mul_i32,
 +        NULL,
 +    };
 +    static NeonGenTwoOpFn * const accfn[] = {
 +        NULL,
 +        gen_helper_neon_sub_u16,
 +        tcg_gen_sub_i32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_dsp_insn(DisasContext *s, uint32_t insn)
  #define VFP_DREG_N(reg, insn) VFP_DREG(reg, insn, 16,  7)
  #define VFP_DREG_M(reg, insn) VFP_DREG(reg, insn,  0,  5)
 -static void gen_neon_dup_low16(TCGv_i32 var)
 -{
 -    TCGv_i32 tmp = tcg_temp_new_i32();
 -    tcg_gen_ext16u_i32(var, var);
 -    tcg_gen_shli_i32(tmp, var, 16);
 -    tcg_gen_or_i32(var, var, tmp);
 -    tcg_temp_free_i32(tmp);
 -}
 -
 -static void gen_neon_dup_high16(TCGv_i32 var)
 -{
 -    TCGv_i32 tmp = tcg_temp_new_i32();
 -    tcg_gen_andi_i32(var, var, 0xffff0000);
 -    tcg_gen_shri_i32(tmp, var, 16);
 -    tcg_gen_or_i32(var, var, tmp);
 -    tcg_temp_free_i32(tmp);
 -}
 -
  static inline bool use_goto_tb(DisasContext *s, target_ulong dest)
  {
  #ifndef CONFIG_USER_ONLY
@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
  #define CPU_V001 cpu_V0, cpu_V0, cpu_V1
 -static inline void gen_neon_add(int size, TCGv_i32 t0, TCGv_i32 t1)
 -{
 -    switch (size) {
 -    case 0: gen_helper_neon_add_u8(t0, t0, t1); break;
 -    case 1: gen_helper_neon_add_u16(t0, t0, t1); break;
 -    case 2: tcg_gen_add_i32(t0, t0, t1); break;
 -    default: abort();
 -    }
 -}
 -
 -static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
 -{
 -    switch (size) {
 -    case 0: gen_helper_neon_sub_u8(t0, t1, t0); break;
 -    case 1: gen_helper_neon_sub_u16(t0, t1, t0); break;
 -    case 2: tcg_gen_sub_i32(t0, t1, t0); break;
 -    default: return;
 -    }
 -}
 -
  static TCGv_i32 neon_load_scratch(int scratch)
  {
      TCGv_i32 tmp = tcg_temp_new_i32();
@@ -XXX,XX +XXX,XX @@ static void neon_store_scratch(int scratch, TCGv_i32 var)
      tcg_temp_free_i32(var);
  }
 -static inline TCGv_i32 neon_get_scalar(int size, int reg)
 -{
 -    TCGv_i32 tmp;
 -    if (size == 1) {
 -        tmp = neon_load_reg(reg & 7, reg >> 4);
 -        if (reg & 8) {
 -            gen_neon_dup_high16(tmp);
 -        } else {
 -            gen_neon_dup_low16(tmp);
 -        }
 -    } else {
 -        tmp = neon_load_reg(reg & 15, reg >> 4);
 -    }
 -    return tmp;
 -}
 -
  static int gen_neon_unzip(int rd, int rm, int size, int q)
  {
      TCGv_ptr pd, pm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      return 1;
                  }
                  switch (op) {
 +                case 0: /* Integer VMLA scalar */
 +                case 4: /* Integer VMLS scalar */
 +                case 8: /* Integer VMUL scalar */
 +                    return 1; /* handled by decodetree */
 +
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          return 1;
                      }
                      /* fall through */
 -                case 0: /* Integer VMLA scalar */
 -                case 4: /* Integer VMLS scalar */
 -                case 8: /* Integer VMUL scalar */
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
                      if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              } else {
                                  gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else if (op & 1) {
 +                        } else {
                              TCGv_ptr fpstatus = get_fpstatus_ptr(1);
                              gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
                              tcg_temp_free_ptr(fpstatus);
 -                        } else {
 -                            switch (size) {
 -                            case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
 -                            case 1: gen_helper_neon_mul_u16(tmp, tmp, tmp2); break;
 -                            case 2: tcg_gen_mul_i32(tmp, tmp, tmp2); break;
 -                            default: abort();
 -                            }
                          }
                          tcg_temp_free_i32(tmp2);
                          if (op < 8) {
                              /* Accumulate.  */
                              tmp2 = neon_load_reg(rd, pass);
                              switch (op) {
 -                            case 0:
 -                                gen_neon_add(size, tmp, tmp2);
 -                                break;
                              case 1:
                              {
                                  TCGv_ptr fpstatus = get_fpstatus_ptr(1);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                  tcg_temp_free_ptr(fpstatus);
                                  break;
                              }
 -                            case 4:
 -                                gen_neon_rsb(size, tmp, tmp2);
 -                                break;
                              case 5:
                              {
                                  TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 --
-.20.1
+.25.1
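
The priority ordering described in the "Suppress bp for exceptions with more priority" patch can be modelled outside QEMU as a small predicate. This is only a sketch under stated assumptions: struct bp_model, its fields, and bp_should_fire are hypothetical stand-ins for the CPU state that arm_debug_check_breakpoint() reads, and bp_hit stands in for the result of the bp_wp_matches() loop. The instruction-abort case is left out, as it is a TODO in the patch as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant CPU state. */
struct bp_model {
    bool is_a64;      /* executing AArch64 code */
    bool thumb;       /* AArch32 Thumb state (ignored when is_a64) */
    bool ss_active;   /* single-step is enabled */
    bool pstate_ss;   /* PSTATE.SS; clear means active-pending */
    uint64_t pc;      /* current program counter */
    bool bp_hit;      /* some breakpoint register matches pc */
};

static bool bp_should_fire(const struct bp_model *m)
{
    assert(m != NULL);

    /* Single-step active-pending outranks the breakpoint. */
    if (m->ss_active && !m->pstate_ss) {
        return false;
    }
    /* A misaligned PC in A64 or A32 outranks it too; Thumb is exempt. */
    if ((m->is_a64 || !m->thumb) && (m->pc & 3) != 0) {
        return false;
    }
    return m->bp_hit;
}
```

The checks are ordered so that the higher-priority exceptions return first and the breakpoint match is consulted last.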

-[PULL 05/23] target/arm: Convert Neon 3-reg-diff long multiplies
+[PULL 18/33] tests/tcg: Add arm and aarch64 pc alignment tests
-Convert the Neon 3-reg-diff insns VMULL, VMLAL and VMLSL; these perform
+From: Richard Henderson <richard.henderson@linaro.org>
 a 32x32->64 multiply with possible accumulate.
-Note that for VMLSL we do the accumulate directly with a subtraction
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
-rather than doing a negate-then-add as the old code did.
+Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  tests/tcg/aarch64/pcalign-a64.c   | 37 +++++++++++++++++++++++++
  tests/tcg/arm/pcalign-a32.c       | 46 +++++++++++++++++++++++++++++++
  tests/tcg/aarch64/Makefile.target |  4 +--
  tests/tcg/arm/Makefile.target     |  4 +++
 files changed, 89 insertions(+), 2 deletions(-)
  create mode 100644 tests/tcg/aarch64/pcalign-a64.c
  create mode 100644 tests/tcg/arm/pcalign-a32.c
-Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+diff --git a/tests/tcg/aarch64/pcalign-a64.c b/tests/tcg/aarch64/pcalign-a64.c
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+new file mode 100644
----
+index XXXXXXX..XXXXXXX
- target/arm/neon-dp.decode       |  9 +++++
+--- /dev/null
- target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
++++ b/tests/tcg/aarch64/pcalign-a64.c
- target/arm/translate.c          | 21 +++-------
+@@ -XXX,XX +XXX,XX @@
-files changed, 86 insertions(+), 15 deletions(-)
++/* Test PC misalignment exception */
 diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/neon-dp.decode
 +++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
      VABDL_S_3d   1111 001 0 1 . .. .... .... 0111 . 0 . 0 .... @3diff
      VABDL_U_3d   1111 001 1 1 . .. .... .... 0111 . 0 . 0 .... @3diff
 +
-+    VMLAL_S_3d   1111 001 0 1 . .. .... .... 1000 . 0 . 0 .... @3diff
++#include <assert.h>
-+    VMLAL_U_3d   1111 001 1 1 . .. .... .... 1000 . 0 . 0 .... @3diff
++#include <signal.h>
 +#include <stdlib.h>
 +#include <stdio.h>
 +
-+    VMLSL_S_3d   1111 001 0 1 . .. .... .... 1010 . 0 . 0 .... @3diff
++static void *expected;
 +    VMLSL_U_3d   1111 001 1 1 . .. .... .... 1010 . 0 . 0 .... @3diff
 +
-+    VMULL_S_3d   1111 001 0 1 . .. .... .... 1100 . 0 . 0 .... @3diff
++static void sigbus(int sig, siginfo_t *info, void *vuc)
 +    VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VABAL_U_3d(DisasContext *s, arg_3diff *a)
      return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
  }
 +
 +static void gen_mull_s32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
 +{
-+    TCGv_i32 lo = tcg_temp_new_i32();
++    assert(info->si_code == BUS_ADRALN);
-+    TCGv_i32 hi = tcg_temp_new_i32();
++    assert(info->si_addr == expected);
-+
++    exit(EXIT_SUCCESS);
 +    tcg_gen_muls2_i32(lo, hi, rn, rm);
 +    tcg_gen_concat_i32_i64(rd, lo, hi);
 +
 +    tcg_temp_free_i32(lo);
 +    tcg_temp_free_i32(hi);
 +}
 +
-+static void gen_mull_u32(TCGv_i64 rd, TCGv_i32 rn, TCGv_i32 rm)
++int main()
 +{
-+    TCGv_i32 lo = tcg_temp_new_i32();
++    void *tmp;
 +    TCGv_i32 hi = tcg_temp_new_i32();
 +
-+    tcg_gen_mulu2_i32(lo, hi, rn, rm);
++    struct sigaction sa = {
-+    tcg_gen_concat_i32_i64(rd, lo, hi);
++        .sa_sigaction = sigbus,
 +        .sa_flags = SA_SIGINFO
 +    };
 +
-+    tcg_temp_free_i32(lo);
++    if (sigaction(SIGBUS, &sa, NULL) < 0) {
-+    tcg_temp_free_i32(hi);
++        perror("sigaction");
 +        return EXIT_FAILURE;
 +    }
 +
 +    asm volatile("adr %0, 1f + 1\n\t"
 +                 "str %0, %1\n\t"
 +                 "br  %0\n"
 +                 "1:"
 +                 : "=&r"(tmp), "=m"(expected));
 +    abort();
 +}
 diff --git a/tests/tcg/arm/pcalign-a32.c b/tests/tcg/arm/pcalign-a32.c
 new file mode 100644
 index XXXXXXX..XXXXXXX
 --- /dev/null
 +++ b/tests/tcg/arm/pcalign-a32.c
@@ -XXX,XX +XXX,XX @@
 +/* Test PC misalignment exception */
 +
 +#ifdef __thumb__
 +#error "This test must be compiled for ARM"
 +#endif
 +
 +#include <assert.h>
 +#include <signal.h>
 +#include <stdlib.h>
 +#include <stdio.h>
 +
 +static void *expected;
 +
 +static void sigbus(int sig, siginfo_t *info, void *vuc)
 +{
 +    assert(info->si_code == BUS_ADRALN);
 +    assert(info->si_addr == expected);
 +    exit(EXIT_SUCCESS);
 +}
 +
-+static bool trans_VMULL_S_3d(DisasContext *s, arg_3diff *a)
++int main()
 +{
-+    static NeonGenTwoOpWidenFn * const opfn[] = {
++    void *tmp;
-+        gen_helper_neon_mull_s8,
++
-+        gen_helper_neon_mull_s16,
++    struct sigaction sa = {
-+        gen_mull_s32,
++        .sa_sigaction = sigbus,
-+        NULL,
++        .sa_flags = SA_SIGINFO
 +    };
 +
-+    return do_long_3d(s, a, opfn[a->size], NULL);
++    if (sigaction(SIGBUS, &sa, NULL) < 0) {
-+}
++        perror("sigaction");
-+
++        return EXIT_FAILURE;
 +static bool trans_VMULL_U_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_mull_u8,
 +        gen_helper_neon_mull_u16,
 +        gen_mull_u32,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +#define DO_VMLAL(INSN,MULL,ACC)                                         \
 +    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
 +    {                                                                   \
 +        static NeonGenTwoOpWidenFn * const opfn[] = {                   \
 +            gen_helper_neon_##MULL##8,                                  \
 +            gen_helper_neon_##MULL##16,                                 \
 +            gen_##MULL##32,                                             \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenTwo64OpFn * const accfn[] = {                     \
 +            gen_helper_neon_##ACC##l_u16,                               \
 +            gen_helper_neon_##ACC##l_u32,                               \
 +            tcg_gen_##ACC##_i64,                                        \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_long_3d(s, a, opfn[a->size], accfn[a->size]);         \
 +    }
 +
-+DO_VMLAL(VMLAL_S,mull_s,add)
++    asm volatile("adr %0, 1f + 2\n\t"
-+DO_VMLAL(VMLAL_U,mull_u,add)
++                 "str %0, %1\n\t"
-+DO_VMLAL(VMLSL_S,mull_s,sub)
++                 "bx  %0\n"
-+DO_VMLAL(VMLSL_U,mull_u,sub)
++                 "1:"
-diff --git a/target/arm/translate.c b/target/arm/translate.c
++                 : "=&r"(tmp), "=m"(expected));
 +
 +    /*
 +     * From v8, it is CONSTRAINED UNPREDICTABLE whether BXWritePC aligns
 +     * the address or not.  If so, we can legitimately fall through.
 +     */
 +    return EXIT_SUCCESS;
 +}
 diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/tests/tcg/aarch64/Makefile.target
-+++ b/target/arm/translate.c
++++ b/tests/tcg/aarch64/Makefile.target
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ VPATH         += $(ARM_SRC)
-                     {0, 0, 0, 7}, /* VABAL */
+ AARCH64_SRC=$(SRC_PATH)/tests/tcg/aarch64
-                     {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
+ VPATH         += $(AARCH64_SRC)
-                     {0, 0, 0, 7}, /* VABDL */
--                    {0, 0, 0, 0}, /* VMLAL */
+-# Float-convert Tests
-+                    {0, 0, 0, 7}, /* VMLAL */
+-AARCH64_TESTS=fcvt
-                     {0, 0, 0, 9}, /* VQDMLAL */
++# Base architecture tests
--                    {0, 0, 0, 0}, /* VMLSL */
++AARCH64_TESTS=fcvt pcalign-a64
-+                    {0, 0, 0, 7}, /* VMLSL */
-                     {0, 0, 0, 9}, /* VQDMLSL */
+ fcvt: LDFLAGS+=-lm
--                    {0, 0, 0, 0}, /* Integer VMULL */
-+                    {0, 0, 0, 7}, /* Integer VMULL */
+diff --git a/tests/tcg/arm/Makefile.target b/tests/tcg/arm/Makefile.target
-                     {0, 0, 0, 9}, /* VQDMULL */
+index XXXXXXX..XXXXXXX 100644
-                     {0, 0, 0, 0xa}, /* Polynomial VMULL */
+--- a/tests/tcg/arm/Makefile.target
-                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
++++ b/tests/tcg/arm/Makefile.target
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ run-fcvt: fcvt
-                         tmp2 = neon_load_reg(rm, pass);
+     $(call run-test,fcvt,$(QEMU) $<,"$< on $(TARGET_NAME)")
-                     }
+     $(call diff-out,fcvt,$(ARM_SRC)/fcvt.ref)
-                     switch (op) {
--                    case 8: case 9: case 10: case 11: case 12: case 13:
++# PC alignment test
--                        /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
++ARM_TESTS += pcalign-a32
-+                    case 9: case 11: case 13:
++pcalign-a32: CFLAGS+=-marm
-+                        /* VQDMLAL, VQDMLSL, VQDMULL */
++
-                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
+ ifeq ($(CONFIG_ARM_COMPATIBLE_SEMIHOSTING),y)
-                         break;
-                     default: /* 15 is RESERVED: caught earlier  */
+ # Semihosting smoke test for linux-user
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          /* VQDMULL */
                          gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
                          neon_store_reg64(cpu_V0, rd + pass);
 -                    } else if (op == 5 || (op >= 8 && op <= 11)) {
 +                    } else {
                          /* Accumulate.  */
                          neon_load_reg64(cpu_V1, rd + pass);
                          switch (op) {
 -                        case 10: /* VMLSL */
 -                            gen_neon_negl(cpu_V0, size);
 -                            /* Fall through */
 -                        case 8: /* VABAL, VMLAL */
 -                            gen_neon_addl(size);
 -                            break;
                          case 9: case 11: /* VQDMLAL, VQDMLSL */
                              gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
                              if (op == 11) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              abort();
                          }
                          neon_store_reg64(cpu_V0, rd + pass);
 -                    } else {
 -                        /* Write back the result.  */
 -                        neon_store_reg64(cpu_V0, rd + pass);
                      }
                  }
              } else {
 --
-.20.1
+.25.1
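
The 3-reg-diff conversion above notes that VMLSL now accumulates with a direct subtraction instead of negate-then-add. The two are equivalent in 64-bit modular arithmetic, which the plain-C sketch below illustrates alongside the 32x32->64 widening multiplies. All function names here are hypothetical models, not the TCG generators from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 32x32->64 multiply: widen both inputs before multiplying. */
static uint64_t model_mull_s32(int32_t rn, int32_t rm)
{
    return (uint64_t)((int64_t)rn * (int64_t)rm);
}

/* Unsigned 32x32->64 multiply. */
static uint64_t model_mull_u32(uint32_t rn, uint32_t rm)
{
    return (uint64_t)rn * (uint64_t)rm;
}

/* Accumulate by subtracting the product directly. */
static uint64_t model_vmlsl_sub(uint64_t acc, uint64_t prod)
{
    return acc - prod;
}

/* Older style: negate the product, then add. Same result modulo 2^64. */
static uint64_t model_vmlsl_neg_add(uint64_t acc, uint64_t prod)
{
    uint64_t neg = (uint64_t)0 - prod;
    assert(neg + prod == 0);
    return acc + neg;
}
```

Because unsigned arithmetic wraps, both accumulate forms agree for every input, so the direct form only saves an operation.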

-[PULL 09/23] target/arm: Add missing TCG temp free in do_2shift_env_64()
+[PULL 19/33] target/i386: Use assert() to sanity-check b1 in SSE decode
-In commit 37bfce81b10450071 we accidentally introduced a leak of a TCG
+In the SSE decode function gen_sse(), we combine a byte
-temporary in do_2shift_env_64(); free it.
+'b' and a value 'b1' which can be [0..3], and switch on them:
    b |= (b1 << 8);
    switch (b) {
    ...
    default:
    unknown_op:
        gen_unknown_opcode(env, s);
        return;
    }
+In three cases inside this switch, we were then also checking for
+ "if (b1 >= 2) { goto unknown_op; }".
+However, this can never happen, because the 'case' values in each place
+are 0x0nn or 0x1nn and the switch will have directed the b1 == (2, 3)
+cases to the default already.
+This check was added in commit c045af25a52e9 in 2010; the added code
+was unnecessary then as well, and was apparently intended only to
+ensure that we never accidentally ended up indexing off the end
+of an sse_op_table with only 2 entries as a result of future bugs
+in the decode logic.
+Change the checks to assert() instead, and make sure they're always
+immediately before the array access they are protecting.
+Fixes: Coverity CID 1460207
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/translate-neon.inc.c | 1 +
+ target/i386/tcg/translate.c | 12 +++---------
-file changed, 1 insertion(+)
+file changed, 3 insertions(+), 9 deletions(-)
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/i386/tcg/translate.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/i386/tcg/translate.c
-@@ -XXX,XX +XXX,XX @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
+@@ -XXX,XX +XXX,XX @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-         neon_load_reg64(tmp, a->vm + pass);
+         case 0x171: /* shift xmm, im */
-         fn(tmp, cpu_env, tmp, constimm);
+         case 0x172:
-         neon_store_reg64(tmp, a->vd + pass);
+         case 0x173:
-+        tcg_temp_free_i64(tmp);
+-            if (b1 >= 2) {
-     }
+-                goto unknown_op;
-     tcg_temp_free_i64(constimm);
+-            }
-     return true;
+             val = x86_ldub_code(env, s);
              if (is_xmm) {
                  tcg_gen_movi_tl(s->T0, val);
@@ -XXX,XX +XXX,XX @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                  offsetof(CPUX86State, mmx_t0.MMX_L(1)));
                  op1_offset = offsetof(CPUX86State,mmx_t0);
              }
 +            assert(b1 < 2);
              sse_fn_epp = sse_op_table2[((b - 1) & 3) * 8 +
                                         (((modrm >> 3)) & 7)][b1];
              if (!sse_fn_epp) {
@@ -XXX,XX +XXX,XX @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
              rm = modrm & 7;
              reg = ((modrm >> 3) & 7) | REX_R(s);
              mod = (modrm >> 6) & 3;
 -            if (b1 >= 2) {
 -                goto unknown_op;
 -            }
 +            assert(b1 < 2);
              sse_fn_epp = sse_op_table6[b].op[b1];
              if (!sse_fn_epp) {
                  goto unknown_op;
@@ -XXX,XX +XXX,XX @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
              rm = modrm & 7;
              reg = ((modrm >> 3) & 7) | REX_R(s);
              mod = (modrm >> 6) & 3;
 -            if (b1 >= 2) {
 -                goto unknown_op;
 -            }
 +            assert(b1 < 2);
              sse_fn_eppi = sse_op_table7[b].op[b1];
              if (!sse_fn_eppi) {
                  goto unknown_op;
 --
-.20.1
+.25.1
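
The argument in the gen_sse() commit message can be checked with a small stand-alone model: once b1 is folded into bits [9:8] of b, any case label of the form 0x0nn or 0x1nn can only match when b1 is 0 or 1. The sketch below is illustrative and not QEMU code. The function model_decode and the enum are hypothetical, and the 0x071..0x073 labels are an assumption added for symmetry with the 0x171..0x173 labels visible in the diff.

```c
#include <assert.h>

enum model_result { MODEL_TABLE_LOOKUP, MODEL_UNKNOWN_OP };

/*
 * Hypothetical model of the dispatch: b is the opcode byte, b1 is the
 * prefix selector in 0..3. On a table lookup, *index receives the
 * value that would be used to index a two-entry table.
 */
static enum model_result model_decode(unsigned b, unsigned b1, unsigned *index)
{
    assert(b <= 0xff && b1 <= 3);

    b |= b1 << 8;
    switch (b) {
    case 0x071: case 0x072: case 0x073:   /* assumed labels */
    case 0x171: case 0x172: case 0x173:
        /* Unreachable with b1 >= 2: no label here has bit 9 set. */
        assert(b1 < 2);
        *index = b1;
        return MODEL_TABLE_LOOKUP;
    default:
        return MODEL_UNKNOWN_OP;
    }
}
```

Placing the assert immediately before the index is used mirrors the patch's goal of guarding the array access rather than the decode.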

-[PULL 12/23] target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
+[PULL 20/33] include/hw/i386: Don't include qemu-common.h in .h files
-Convert the VQDMULH and VQRDMULH insns in the 2-reg-scalar group
+The qemu-common.h header is not supposed to be included from any
-to decodetree.
+other header files, only from .c files (as documented in a comment at
 the start of it).
 include/hw/i386/x86.h and include/hw/i386/microvm.h break this rule.
 In fact, the include is not required at all, so we can just drop it
 from both files.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20211129200510.1233037-2-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  3 +++
+ include/hw/i386/microvm.h | 1 -
- target/arm/translate-neon.inc.c | 29 +++++++++++++++++++++++
+ include/hw/i386/x86.h     | 1 -
- target/arm/translate.c          | 42 ++-------------------------------
+files changed, 2 deletions(-)
 files changed, 34 insertions(+), 40 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/include/hw/i386/microvm.h
-+++ b/target/arm/neon-dp.decode
++++ b/include/hw/i386/microvm.h
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@
+ #ifndef HW_I386_MICROVM_H
-     VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
+ #define HW_I386_MICROVM_H
-     VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
-+
+-#include "qemu-common.h"
-+    VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
+ #include "exec/hwaddr.h"
-+    VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
+ #include "qemu/notify.h"
-   ]
- }
+diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/include/hw/i386/x86.h
-+++ b/target/arm/translate-neon.inc.c
++++ b/include/hw/i386/x86.h
-@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_F_2sc(DisasContext *s, arg_2scalar *a)
+@@ -XXX,XX +XXX,XX @@
+ #ifndef HW_I386_X86_H
-     return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
+ #define HW_I386_X86_H
- }
-+
+-#include "qemu-common.h"
-+WRAP_ENV_FN(gen_VQDMULH_16, gen_helper_neon_qdmulh_s16)
+ #include "exec/hwaddr.h"
-+WRAP_ENV_FN(gen_VQDMULH_32, gen_helper_neon_qdmulh_s32)
+ #include "qemu/notify.h"
-+WRAP_ENV_FN(gen_VQRDMULH_16, gen_helper_neon_qrdmulh_s16)
 +WRAP_ENV_FN(gen_VQRDMULH_32, gen_helper_neon_qrdmulh_s32)
 +
 +static bool trans_VQDMULH_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_VQDMULH_16,
 +        gen_VQDMULH_32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VQRDMULH_2sc(DisasContext *s, arg_2scalar *a)
 +{
 +    static NeonGenTwoOpFn * const opfn[] = {
 +        NULL,
 +        gen_VQRDMULH_16,
 +        gen_VQRDMULH_32,
 +        NULL,
 +    };
 +
 +    return do_2scalar(s, a, opfn[a->size], NULL);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
  #define CPU_V001 cpu_V0, cpu_V0, cpu_V1
 -static TCGv_i32 neon_load_scratch(int scratch)
 -{
 -    TCGv_i32 tmp = tcg_temp_new_i32();
 -    tcg_gen_ld_i32(tmp, cpu_env, offsetof(CPUARMState, vfp.scratch[scratch]));
 -    return tmp;
 -}
 -
 -static void neon_store_scratch(int scratch, TCGv_i32 var)
 -{
 -    tcg_gen_st_i32(var, cpu_env, offsetof(CPUARMState, vfp.scratch[scratch]));
 -    tcg_temp_free_i32(var);
 -}
 -
  static int gen_neon_unzip(int rd, int rm, int size, int q)
  {
      TCGv_ptr pd, pm;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
 -                    return 1; /* handled by decodetree */
 -
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
 -                    if (u && ((rd | rn) & 1)) {
 -                        return 1;
 -                    }
 -                    tmp = neon_get_scalar(size, rm);
 -                    neon_store_scratch(0, tmp);
 -                    for (pass = 0; pass < (u ? 4 : 2); pass++) {
 -                        tmp = neon_load_scratch(0);
 -                        tmp2 = neon_load_reg(rn, pass);
 -                        if (op == 12) {
 -                            if (size == 1) {
 -                                gen_helper_neon_qdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                            } else {
 -                                gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                            }
 -                        } else {
 -                            if (size == 1) {
 -                                gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
 -                            } else {
 -                                gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
 -                            }
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                        neon_store_reg(rd, pass, tmp);
 -                    }
 -                    break;
 +                    return 1; /* handled by decodetree */
 +
                  case 3: /* VQDMLAL scalar */
                  case 7: /* VQDMLSL scalar */
                  case 11: /* VQDMULL scalar */
 --
-.20.1
+.25.1

-[PULL 08/23] target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
+[PULL 21/33] target/hexagon/cpu.h: don't include qemu-common.h
-Mark the arrays of function pointers in trans_VSHLL_S_2sh() and
+The qemu-common.h header is not supposed to be included from any
-trans_VSHLL_U_2sh() as both 'static' and 'const'.
+other header files, only from .c files (as documented in a comment at
 the start of it).
 Move the include to linux-user/hexagon/cpu_loop.c, which needs it for
 the declaration of cpu_exec_step_atomic().
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
+Message-id: 20211129200510.1233037-3-peter.maydell@linaro.org
 ---
- target/arm/translate-neon.inc.c | 4 ++--
+ target/hexagon/cpu.h          | 1 -
-file changed, 2 insertions(+), 2 deletions(-)
+ linux-user/hexagon/cpu_loop.c | 1 +
 files changed, 1 insertion(+), 1 deletion(-)
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/target/hexagon/cpu.h
-+++ b/target/arm/translate-neon.inc.c
++++ b/target/hexagon/cpu.h
-@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
+@@ -XXX,XX +XXX,XX @@ typedef struct CPUHexagonState CPUHexagonState;
- static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
+ #include "fpu/softfloat-types.h"
- {
--    NeonGenWidenFn *widenfn[] = {
+-#include "qemu-common.h"
-+    static NeonGenWidenFn * const widenfn[] = {
+ #include "exec/cpu-defs.h"
-         gen_helper_neon_widen_s8,
+ #include "hex_regs.h"
-         gen_helper_neon_widen_s16,
+ #include "mmvec/mmvec.h"
-         tcg_gen_ext_i32_i64,
+diff --git a/linux-user/hexagon/cpu_loop.c b/linux-user/hexagon/cpu_loop.c
-@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
+index XXXXXXX..XXXXXXX 100644
+--- a/linux-user/hexagon/cpu_loop.c
- static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
++++ b/linux-user/hexagon/cpu_loop.c
- {
+@@ -XXX,XX +XXX,XX @@
--    NeonGenWidenFn *widenfn[] = {
+  */
-+    static NeonGenWidenFn * const widenfn[] = {
-         gen_helper_neon_widen_u8,
+ #include "qemu/osdep.h"
-         gen_helper_neon_widen_u16,
++#include "qemu-common.h"
-         tcg_gen_extu_i32_i64,
+ #include "qemu.h"
  #include "user-internals.h"
  #include "cpu_loop-common.h"
 --
-.20.1
+.25.1

-[PULL 07/23] target/arm: Convert Neon 3-reg-diff polynomial VMULL
+[PULL 22/33] target/rx/cpu.h: Don't include qemu-common.h
-Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last
+The qemu-common.h header is not supposed to be included from any
-insn in this group to be converted.
+other header files, only from .c files (as documented in a comment at
 the start of it).
 Nothing actually relies on target/rx/cpu.h including it, so we can
 just drop the include.
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
+Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
+Message-id: 20211129200510.1233037-4-peter.maydell@linaro.org
 ---
- target/arm/neon-dp.decode       |  2 ++
+ target/rx/cpu.h | 1 -
- target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++
+file changed, 1 deletion(-)
  target/arm/translate.c          | 60 ++-------------------------------
 files changed, 48 insertions(+), 57 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/target/rx/cpu.h b/target/rx/cpu.h
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/target/rx/cpu.h
-+++ b/target/arm/neon-dp.decode
++++ b/target/rx/cpu.h
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@
-     VMULL_U_3d   1111 001 1 1 . .. .... .... 1100 . 0 . 0 .... @3diff
+ #define RX_CPU_H
-     VQDMULL_3d   1111 001 0 1 . .. .... .... 1101 . 0 . 0 .... @3diff
+ #include "qemu/bitops.h"
-+
+-#include "qemu-common.h"
-+    VMULL_P_3d   1111 001 0 1 . .. .... .... 1110 . 0 . 0 .... @3diff
+ #include "hw/registerfields.h"
-   ]
+ #include "cpu-qom.h"
- }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMLSL_3d(DisasContext *s, arg_3diff *a)
      return do_long_3d(s, a, opfn[a->size], accfn[a->size]);
  }
 +
 +static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff *a)
 +{
 +    gen_helper_gvec_3 *fn_gvec;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (a->vd & 1) {
 +        return false;
 +    }
 +
 +    switch (a->size) {
 +    case 0:
 +        fn_gvec = gen_helper_neon_pmull_h;
 +        break;
 +    case 2:
 +        if (!dc_isar_feature(aa32_pmull, s)) {
 +            return false;
 +        }
 +        fn_gvec = gen_helper_gvec_pmull_q;
 +        break;
 +    default:
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    tcg_gen_gvec_3_ool(neon_reg_offset(a->vd, 0),
 +                       neon_reg_offset(a->vn, 0),
 +                       neon_reg_offset(a->vm, 0),
 +                       16, 16, 0, fn_gvec);
 +    return true;
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
  {
      int op;
      int q;
 -    int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
 +    int rd, rn, rm, rd_ofs, rm_ofs;
      int size;
      int pass;
      int u;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
      size = (insn >> 20) & 3;
      vec_size = q ? 16 : 8;
      rd_ofs = neon_reg_offset(rd, 0);
 -    rn_ofs = neon_reg_offset(rn, 0);
      rm_ofs = neon_reg_offset(rm, 0);
      if ((insn & (1 << 23)) == 0) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
          if (size != 3) {
              op = (insn >> 8) & 0xf;
              if ((insn & (1 << 6)) == 0) {
 -                /* Three registers of different lengths.  */
 -                /* undefreq: bit 0 : UNDEF if size == 0
 -                 *           bit 1 : UNDEF if size == 1
 -                 *           bit 2 : UNDEF if size == 2
 -                 *           bit 3 : UNDEF if U == 1
 -                 * Note that [2:0] set implies 'always UNDEF'
 -                 */
 -                int undefreq;
 -                /* prewiden, src1_wide, src2_wide, undefreq */
 -                static const int neon_3reg_wide[16][4] = {
 -                    {0, 0, 0, 7}, /* VADDL: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VADDW: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VABAL */
 -                    {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
 -                    {0, 0, 0, 7}, /* VABDL */
 -                    {0, 0, 0, 7}, /* VMLAL */
 -                    {0, 0, 0, 7}, /* VQDMLAL */
 -                    {0, 0, 0, 7}, /* VMLSL */
 -                    {0, 0, 0, 7}, /* VQDMLSL */
 -                    {0, 0, 0, 7}, /* Integer VMULL */
 -                    {0, 0, 0, 7}, /* VQDMULL */
 -                    {0, 0, 0, 0xa}, /* Polynomial VMULL */
 -                    {0, 0, 0, 7}, /* Reserved: always UNDEF */
 -                };
 -
 -                undefreq = neon_3reg_wide[op][3];
 -
 -                if ((undefreq & (1 << size)) ||
 -                    ((undefreq & 8) && u)) {
 -                    return 1;
 -                }
 -                if (rd & 1) {
 -                    return 1;
 -                }
 -
 -                /* Handle polynomial VMULL in a single pass.  */
 -                if (op == 14) {
 -                    if (size == 0) {
 -                        /* VMULL.P8 */
 -                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
 -                                           0, gen_helper_neon_pmull_h);
 -                    } else {
 -                        /* VMULL.P64 */
 -                        if (!dc_isar_feature(aa32_pmull, s)) {
 -                            return 1;
 -                        }
 -                        tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, 16, 16,
 -                                           0, gen_helper_gvec_pmull_q);
 -                    }
 -                    return 0;
 -                }
 -                abort(); /* all others handled by decodetree */
 +                /* Three registers of different lengths: handled by decodetree */
 +                return 1;
              } else {
                  /* Two registers and a scalar. NB that for ops of this form
                   * the ARM ARM labels bit 24 as Q, but it is in our variable
 --
-.20.1
+.25.1

-[PULL 01/23] target/arm: Fix missing temp frees in do_vshll_2sh
+[PULL 23/33] hw/arm: Don't include qemu-common.h unnecessarily
-The widenfn() in do_vshll_2sh() does not free the input 32-bit
+A lot of C files in hw/arm include qemu-common.h when they don't
-TCGv, so we need to do this in the calling code.
+need anything from it. Drop the include lines.
 omap1.c, pxa2xx.c and strongarm.c retain the include because they
 use it for the prototype of qemu_get_timedate().
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
+Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
+Message-id: 20211129200510.1233037-5-peter.maydell@linaro.org
 ---
- target/arm/translate-neon.inc.c | 2 ++
+ hw/arm/boot.c           | 1 -
-file changed, 2 insertions(+)
+ hw/arm/digic_boards.c   | 1 -
  hw/arm/highbank.c       | 1 -
  hw/arm/npcm7xx_boards.c | 1 -
  hw/arm/sbsa-ref.c       | 1 -
  hw/arm/stm32f405_soc.c  | 1 -
  hw/arm/vexpress.c       | 1 -
  hw/arm/virt.c           | 1 -
 files changed, 8 deletions(-)
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
+diff --git a/hw/arm/boot.c b/hw/arm/boot.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate-neon.inc.c
+--- a/hw/arm/boot.c
-+++ b/target/arm/translate-neon.inc.c
++++ b/hw/arm/boot.c
-@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
+@@ -XXX,XX +XXX,XX @@
-     tmp = tcg_temp_new_i64();
+  */
-     widenfn(tmp, rm0);
+ #include "qemu/osdep.h"
-+    tcg_temp_free_i32(rm0);
+-#include "qemu-common.h"
-     if (a->shift != 0) {
+ #include "qemu/datadir.h"
-         tcg_gen_shli_i64(tmp, tmp, a->shift);
+ #include "qemu/error-report.h"
-         tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
+ #include "qapi/error.h"
-@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
+diff --git a/hw/arm/digic_boards.c b/hw/arm/digic_boards.c
-     neon_store_reg64(tmp, a->vd);
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/arm/digic_boards.c
-     widenfn(tmp, rm1);
++++ b/hw/arm/digic_boards.c
-+    tcg_temp_free_i32(rm1);
+@@ -XXX,XX +XXX,XX @@
-     if (a->shift != 0) {
-         tcg_gen_shli_i64(tmp, tmp, a->shift);
+ #include "qemu/osdep.h"
-         tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
+ #include "qapi/error.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "hw/boards.h"
  #include "qemu/error-report.h"
 diff --git a/hw/arm/highbank.c b/hw/arm/highbank.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/highbank.c
 +++ b/hw/arm/highbank.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "qapi/error.h"
  #include "hw/sysbus.h"
 diff --git a/hw/arm/npcm7xx_boards.c b/hw/arm/npcm7xx_boards.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/npcm7xx_boards.c
 +++ b/hw/arm/npcm7xx_boards.c
@@ -XXX,XX +XXX,XX @@
  #include "hw/qdev-core.h"
  #include "hw/qdev-properties.h"
  #include "qapi/error.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "qemu/units.h"
  #include "sysemu/blockdev.h"
 diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/sbsa-ref.c
 +++ b/hw/arm/sbsa-ref.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "qapi/error.h"
  #include "qemu/error-report.h"
 diff --git a/hw/arm/stm32f405_soc.c b/hw/arm/stm32f405_soc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/stm32f405_soc.c
 +++ b/hw/arm/stm32f405_soc.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/osdep.h"
  #include "qapi/error.h"
 -#include "qemu-common.h"
  #include "exec/address-spaces.h"
  #include "sysemu/sysemu.h"
  #include "hw/arm/stm32f405_soc.h"
 diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/vexpress.c
 +++ b/hw/arm/vexpress.c
@@ -XXX,XX +XXX,XX @@
  #include "qemu/osdep.h"
  #include "qapi/error.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "cpu.h"
  #include "hw/sysbus.h"
 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
@@ -XXX,XX +XXX,XX @@
   */
  #include "qemu/osdep.h"
 -#include "qemu-common.h"
  #include "qemu/datadir.h"
  #include "qemu/units.h"
  #include "qemu/option.h"
 --
-.20.1
+.25.1

-[PULL 04/23] target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
+[PULL 24/33] target/arm: Correct calculation of tlb range invalidate length
-Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree.
+The calculation of the length of TLB range invalidate operations
-Like almost all the remaining insns in this group, these are
+in tlbi_aa64_range_get_length() is incorrect in two ways:
-a combination of a two-input operation which returns a double width
+ * the NUM field is 5 bits, but we read only 4 bits
-result and then a possible accumulation of that double width
+ * we miscalculate the page_shift value, because of an
-result into the destination.
+   off-by-one error:
     TG 0b00 is invalid
     TG 0b01 is 4K granule size == 4096 == 2^12
     TG 0b10 is 16K granule size == 16384 == 2^14
     TG 0b11 is 64K granule size == 65536 == 2^16
    so page_shift should be (TG - 1) * 2 + 12
+Thanks to the bug report submitter Cha HyunSoo for identifying
+both these errors.
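
For illustration, the corrected calculation described above can be sketched as a standalone C function. This is a minimal sketch, not the code in target/arm/helper.c: extract_bits() is a hypothetical stand-in for QEMU's extract64(), tlbi_range_length_sketch() is an invented name, and the guest-error logging of the real function is omitted. The field positions (NUM at bit 39, SCALE at bit 44, TG at bit 46) are taken from the hunk in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's extract64(): return `length` bits
 * of `value`, starting at bit `start`. */
static uint64_t extract_bits(uint64_t value, unsigned start, unsigned length)
{
    /* Guard against undefined shifts. */
    assert(length > 0 && length <= 64 && start < 64 && start + length <= 64);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Sketch of the corrected range-length calculation (name is illustrative). */
static uint64_t tlbi_range_length_sketch(uint64_t value)
{
    uint64_t num = extract_bits(value, 39, 5);   /* NUM is 5 bits, not 4 */
    uint64_t scale = extract_bits(value, 44, 2);
    uint64_t tg = extract_bits(value, 46, 2);
    unsigned page_shift;
    unsigned exponent;

    if (tg == 0) {
        /* TG 0b00 is invalid; report a zero-length range. */
        return 0;
    }

    /* TG 1/2/3 map to 4K/16K/64K, i.e. shifts of 12/14/16. */
    page_shift = (unsigned)(tg - 1) * 2 + 12;
    exponent = 5 * (unsigned)scale + 1;

    return (num + 1) << (exponent + page_shift);
}
```

With TG = 0b01, SCALE = 0 and NUM = 0 this gives 1 << (1 + 12), that is two 4K pages. A NUM value of 16 or more only decodes correctly with the 5-bit read, since its top bit is bit 43.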
+Fixes: 84940ed82552d3c ("target/arm: Add support for FEAT_TLBIRANGE")
+Resolves: https://gitlab.com/qemu-project/qemu/-/issues/734
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20211130173257.1274194-1-peter.maydell@linaro.org
 ---
- target/arm/translate.h          |   1 +
+ target/arm/helper.c | 6 +++---
- target/arm/neon-dp.decode       |   6 ++
+file changed, 3 insertions(+), 3 deletions(-)
  target/arm/translate-neon.inc.c | 132 ++++++++++++++++++++++++++++++++
  target/arm/translate.c          |  31 +-------
 files changed, 142 insertions(+), 28 deletions(-)
-diff --git a/target/arm/translate.h b/target/arm/translate.h
+diff --git a/target/arm/helper.c b/target/arm/helper.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.h
+--- a/target/arm/helper.c
-+++ b/target/arm/translate.h
++++ b/target/arm/helper.c
-@@ -XXX,XX +XXX,XX @@ typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
+@@ -XXX,XX +XXX,XX @@ static uint64_t tlbi_aa64_range_get_length(CPUARMState *env,
- typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
+     uint64_t exponent;
- typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
+     uint64_t length;
- typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
-+typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
+-    num = extract64(value, 39, 4);
- typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
++    num = extract64(value, 39, 5);
- typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+     scale = extract64(value, 44, 2);
- typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
+     page_size_granule = extract64(value, 46, 2);
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
-index XXXXXXX..XXXXXXX 100644
+-    page_shift = page_size_granule * 2 + 12;
---- a/target/arm/neon-dp.decode
+-
-+++ b/target/arm/neon-dp.decode
+     if (page_size_granule == 0) {
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+         qemu_log_mask(LOG_GUEST_ERROR, "Invalid page size granule %d\n",
-     VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
+                       page_size_granule);
-     VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
+         return 0;
+     }
-+    VABAL_S_3d   1111 001 0 1 . .. .... .... 0101 . 0 . 0 .... @3diff
-+    VABAL_U_3d   1111 001 1 1 . .. .... .... 0101 . 0 . 0 .... @3diff
++    page_shift = (page_size_granule - 1) * 2 + 12;
 +
-     VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
+     exponent = (5 * scale) + 1;
-     VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
+     length = (num + 1) << (exponent + page_shift);
-+
 +    VABDL_S_3d   1111 001 0 1 . .. .... .... 0111 . 0 . 0 .... @3diff
 +    VABDL_U_3d   1111 001 1 1 . .. .... .... 0111 . 0 . 0 .... @3diff
    ]
  }
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
  DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
  DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
  DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
 +
 +static bool do_long_3d(DisasContext *s, arg_3diff *a,
 +                       NeonGenTwoOpWidenFn *opfn,
 +                       NeonGenTwo64OpFn *accfn)
 +{
 +    /*
 +     * 3-regs different lengths, long operations.
 +     * These perform an operation on two inputs that returns a double-width
 +     * result, and then possibly perform an accumulation operation of
 +     * that result into the double-width destination.
 +     */
 +    TCGv_i64 rd0, rd1, tmp;
 +    TCGv_i32 rn, rm;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!opfn) {
 +        /* size == 3 case, which is an entirely different insn group */
 +        return false;
 +    }
 +
 +    if (a->vd & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    rd0 = tcg_temp_new_i64();
 +    rd1 = tcg_temp_new_i64();
 +
 +    rn = neon_load_reg(a->vn, 0);
 +    rm = neon_load_reg(a->vm, 0);
 +    opfn(rd0, rn, rm);
 +    tcg_temp_free_i32(rn);
 +    tcg_temp_free_i32(rm);
 +
 +    rn = neon_load_reg(a->vn, 1);
 +    rm = neon_load_reg(a->vm, 1);
 +    opfn(rd1, rn, rm);
 +    tcg_temp_free_i32(rn);
 +    tcg_temp_free_i32(rm);
 +
 +    /* Don't store results until after all loads: they might overlap */
 +    if (accfn) {
 +        tmp = tcg_temp_new_i64();
 +        neon_load_reg64(tmp, a->vd);
 +        accfn(tmp, tmp, rd0);
 +        neon_store_reg64(tmp, a->vd);
 +        neon_load_reg64(tmp, a->vd + 1);
 +        accfn(tmp, tmp, rd1);
 +        neon_store_reg64(tmp, a->vd + 1);
 +        tcg_temp_free_i64(tmp);
 +    } else {
 +        neon_store_reg64(rd0, a->vd);
 +        neon_store_reg64(rd1, a->vd + 1);
 +    }
 +
 +    tcg_temp_free_i64(rd0);
 +    tcg_temp_free_i64(rd1);
 +
 +    return true;
 +}
 +
 +static bool trans_VABDL_S_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_abdl_s16,
 +        gen_helper_neon_abdl_s32,
 +        gen_helper_neon_abdl_s64,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VABDL_U_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_abdl_u16,
 +        gen_helper_neon_abdl_u32,
 +        gen_helper_neon_abdl_u64,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], NULL);
 +}
 +
 +static bool trans_VABAL_S_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_abdl_s16,
 +        gen_helper_neon_abdl_s32,
 +        gen_helper_neon_abdl_s64,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const addfn[] = {
 +        gen_helper_neon_addl_u16,
 +        gen_helper_neon_addl_u32,
 +        tcg_gen_add_i64,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
 +}
 +
 +static bool trans_VABAL_U_3d(DisasContext *s, arg_3diff *a)
 +{
 +    static NeonGenTwoOpWidenFn * const opfn[] = {
 +        gen_helper_neon_abdl_u16,
 +        gen_helper_neon_abdl_u32,
 +        gen_helper_neon_abdl_u64,
 +        NULL,
 +    };
 +    static NeonGenTwo64OpFn * const addfn[] = {
 +        gen_helper_neon_addl_u16,
 +        gen_helper_neon_addl_u32,
 +        tcg_gen_add_i64,
 +        NULL,
 +    };
 +
 +    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
 +}
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
                      {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
                      {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
 -                    {0, 0, 0, 0}, /* VABAL */
 +                    {0, 0, 0, 7}, /* VABAL */
                      {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
 -                    {0, 0, 0, 0}, /* VABDL */
 +                    {0, 0, 0, 7}, /* VABDL */
                      {0, 0, 0, 0}, /* VMLAL */
                      {0, 0, 0, 9}, /* VQDMLAL */
                      {0, 0, 0, 0}, /* VMLSL */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          tmp2 = neon_load_reg(rm, pass);
                      }
                      switch (op) {
 -                    case 5: case 7: /* VABAL, VABDL */
 -                        switch ((size << 1) | u) {
 -                        case 0:
 -                            gen_helper_neon_abdl_s16(cpu_V0, tmp, tmp2);
 -                            break;
 -                        case 1:
 -                            gen_helper_neon_abdl_u16(cpu_V0, tmp, tmp2);
 -                            break;
 -                        case 2:
 -                            gen_helper_neon_abdl_s32(cpu_V0, tmp, tmp2);
 -                            break;
 -                        case 3:
 -                            gen_helper_neon_abdl_u32(cpu_V0, tmp, tmp2);
 -                            break;
 -                        case 4:
 -                            gen_helper_neon_abdl_s64(cpu_V0, tmp, tmp2);
 -                            break;
 -                        case 5:
 -                            gen_helper_neon_abdl_u64(cpu_V0, tmp, tmp2);
 -                            break;
 -                        default: abort();
 -                        }
 -                        tcg_temp_free_i32(tmp2);
 -                        tcg_temp_free_i32(tmp);
 -                        break;
                      case 8: case 9: case 10: case 11: case 12: case 13:
                          /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                          gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          case 10: /* VMLSL */
                              gen_neon_negl(cpu_V0, size);
                              /* Fall through */
 -                        case 5: case 8: /* VABAL, VMLAL */
 +                        case 8: /* VABAL, VMLAL */
                              gen_neon_addl(size);
                              break;
                          case 9: case 11: /* VQDMLAL, VQDMLSL */
 --
-.20.1
+.25.1

-[PULL 03/23] target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
+[PULL 25/33] hw/net: npcm7xx_emc fix missing queue_flush
-Convert the narrow-to-high-half insns VADDHN, VSUBHN, VRADDHN,
+From: Patrick Venture <venture@google.com>
 VRSUBHN in the Neon 3-registers-different-lengths group to
 decodetree.
+The rx_active boolean change to true should always trigger a try_read
+call that flushes the queue.
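
The idea can be sketched in isolation as a single helper that pairs the state change with the flush, so no caller can set the flag without also draining the queue. This is a rough sketch under stated assumptions: PacketQueueSketch, EmcSketch, flush_queued_sketch() and emc_enable_rx_and_flush_sketch() are hypothetical simplifications, not the QEMU types or the real qemu_flush_queued_packets() API. Only the shape of the helper mirrors emc_enable_rx_and_flush() in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a NIC receive queue. */
typedef struct PacketQueueSketch {
    size_t pending;    /* packets held back while rx was inactive */
    size_t delivered;  /* packets handed to the device model */
} PacketQueueSketch;

/* Hypothetical, simplified stand-in for the device state. */
typedef struct EmcSketch {
    bool rx_active;
    PacketQueueSketch queue;
} EmcSketch;

/* Deliver everything that was queued while the receiver was halted. */
static void flush_queued_sketch(PacketQueueSketch *q)
{
    q->delivered += q->pending;
    q->pending = 0;
}

/* Enabling rx and flushing always happen together. */
static void emc_enable_rx_and_flush_sketch(EmcSketch *emc)
{
    emc->rx_active = true;
    flush_queued_sketch(&emc->queue);
}
```

Routing every transition of rx_active to true through one helper is what keeps a call site from enabling reception while leaving packets stuck in the queue.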
+Signed-off-by: Patrick Venture <venture@google.com>
+Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Message-id: 20211203221002.1719306-1-venture@google.com
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  6 +++
+ hw/net/npcm7xx_emc.c | 18 ++++++++----------
- target/arm/translate-neon.inc.c | 87 +++++++++++++++++++++++++++++++
+file changed, 8 insertions(+), 10 deletions(-)
  target/arm/translate.c          | 91 ++++-----------------------------
 files changed, 104 insertions(+), 80 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/hw/net/npcm7xx_emc.c b/hw/net/npcm7xx_emc.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/hw/net/npcm7xx_emc.c
-+++ b/target/arm/neon-dp.decode
++++ b/hw/net/npcm7xx_emc.c
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@ static void emc_halt_rx(NPCM7xxEMCState *emc, uint32_t mista_flag)
+     emc_set_mista(emc, mista_flag);
      VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
      VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +
 +    VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
 +    VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
 +
 +    VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
 +    VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
    ]
  }
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
++static void emc_enable_rx_and_flush(NPCM7xxEMCState *emc)
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_PREWIDEN(VADDW_S, s, ext, add, true)
  DO_PREWIDEN(VADDW_U, u, extu, add, true)
  DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
  DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
 +
 +static bool do_narrow_3d(DisasContext *s, arg_3diff *a,
 +                         NeonGenTwo64OpFn *opfn, NeonGenNarrowFn *narrowfn)
 +{
-+    /* 3-regs different lengths, narrowing (VADDHN/VSUBHN/VRADDHN/VRSUBHN) */
++    emc->rx_active = true;
-+    TCGv_i64 rn_64, rm_64;
++    qemu_flush_queued_packets(qemu_get_queue(emc->nic));
 +    TCGv_i32 rd0, rd1;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
 +    /* UNDEF accesses to D16-D31 if they don't exist. */
 +    if (!dc_isar_feature(aa32_simd_r32, s) &&
 +        ((a->vd | a->vn | a->vm) & 0x10)) {
 +        return false;
 +    }
 +
 +    if (!opfn || !narrowfn) {
 +        /* size == 3 case, which is an entirely different insn group */
 +        return false;
 +    }
 +
 +    if ((a->vn | a->vm) & 1) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    rn_64 = tcg_temp_new_i64();
 +    rm_64 = tcg_temp_new_i64();
 +    rd0 = tcg_temp_new_i32();
 +    rd1 = tcg_temp_new_i32();
 +
 +    neon_load_reg64(rn_64, a->vn);
 +    neon_load_reg64(rm_64, a->vm);
 +
 +    opfn(rn_64, rn_64, rm_64);
 +
 +    narrowfn(rd0, rn_64);
 +
 +    neon_load_reg64(rn_64, a->vn + 1);
 +    neon_load_reg64(rm_64, a->vm + 1);
 +
 +    opfn(rn_64, rn_64, rm_64);
 +
 +    narrowfn(rd1, rn_64);
 +
 +    neon_store_reg(a->vd, 0, rd0);
 +    neon_store_reg(a->vd, 1, rd1);
 +
 +    tcg_temp_free_i64(rn_64);
 +    tcg_temp_free_i64(rm_64);
 +
 +    return true;
 +}
 +
-+#define DO_NARROW_3D(INSN, OP, NARROWTYPE, EXTOP)                       \
+ static void emc_set_next_tx_descriptor(NPCM7xxEMCState *emc,
-+    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
+                                        const NPCM7xxEMCTxDesc *tx_desc,
-+    {                                                                   \
+                                        uint32_t desc_addr)
-+        static NeonGenTwo64OpFn * const addfn[] = {                     \
+@@ -XXX,XX +XXX,XX @@ static ssize_t emc_receive(NetClientState *nc, const uint8_t *buf, size_t len1)
-+            gen_helper_neon_##OP##l_u16,                                \
+     return len;
 +            gen_helper_neon_##OP##l_u32,                                \
 +            tcg_gen_##OP##_i64,                                         \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenNarrowFn * const narrowfn[] = {                   \
 +            gen_helper_neon_##NARROWTYPE##_high_u8,                     \
 +            gen_helper_neon_##NARROWTYPE##_high_u16,                    \
 +            EXTOP,                                                      \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_narrow_3d(s, a, addfn[a->size], narrowfn[a->size]);   \
 +    }
 +
 +static void gen_narrow_round_high_u32(TCGv_i32 rd, TCGv_i64 rn)
 +{
 +    tcg_gen_addi_i64(rn, rn, 1u << 31);
 +    tcg_gen_extrh_i64_i32(rd, rn);
 +}
 +
 +DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
 +DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
 +DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
 +DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
      }
  }
--static inline void gen_neon_subl(int size)
+-static void emc_try_receive_next_packet(NPCM7xxEMCState *emc)
 -{
--    switch (size) {
+-    if (emc_can_receive(qemu_get_queue(emc->nic))) {
--    case 0: gen_helper_neon_subl_u16(CPU_V001); break;
+-        qemu_flush_queued_packets(qemu_get_queue(emc->nic));
 -    case 1: gen_helper_neon_subl_u32(CPU_V001); break;
 -    case 2: tcg_gen_sub_i64(CPU_V001); break;
 -    default: abort();
 -    }
 -}
 -
- static inline void gen_neon_negl(TCGv_i64 var, int size)
+ static uint64_t npcm7xx_emc_read(void *opaque, hwaddr offset, unsigned size)
  {
-     switch (size) {
+     NPCM7xxEMCState *emc = opaque;
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void npcm7xx_emc_write(void *opaque, hwaddr offset,
-             op = (insn >> 8) & 0xf;
+             emc->regs[REG_MGSTA] |= REG_MGSTA_RXHA;
-             if ((insn & (1 << 6)) == 0) {
+         }
-                 /* Three registers of different lengths.  */
+         if (value & REG_MCMDR_RXON) {
--                int src1_wide;
+-            emc->rx_active = true;
--                int src2_wide;
++            emc_enable_rx_and_flush(emc);
-                 /* undefreq: bit 0 : UNDEF if size == 0
+         } else {
-                  *           bit 1 : UNDEF if size == 1
+             emc_halt_rx(emc, 0);
-                  *           bit 2 : UNDEF if size == 2
+         }
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ static void npcm7xx_emc_write(void *opaque, hwaddr offset,
-                     {0, 0, 0, 7}, /* VADDW: handled by decodetree */
+         break;
-                     {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
+     case REG_RSDR:
-                     {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
+         if (emc->regs[REG_MCMDR] & REG_MCMDR_RXON) {
--                    {0, 1, 1, 0}, /* VADDHN */
+-            emc->rx_active = true;
-+                    {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
+-            emc_try_receive_next_packet(emc);
-                     {0, 0, 0, 0}, /* VABAL */
++            emc_enable_rx_and_flush(emc);
--                    {0, 1, 1, 0}, /* VSUBHN */
+         }
-+                    {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
+         break;
-                     {0, 0, 0, 0}, /* VABDL */
+     case REG_MIIDA:
                      {0, 0, 0, 0}, /* VMLAL */
                      {0, 0, 0, 9}, /* VQDMLAL */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      {0, 0, 0, 7}, /* Reserved: always UNDEF */
                  };
 -                src1_wide = neon_3reg_wide[op][1];
 -                src2_wide = neon_3reg_wide[op][2];
                  undefreq = neon_3reg_wide[op][3];
                  if ((undefreq & (1 << size)) ||
                      ((undefreq & 8) && u)) {
                      return 1;
                  }
 -                if ((src1_wide && (rn & 1)) ||
 -                    (src2_wide && (rm & 1)) ||
 -                    (!src2_wide && (rd & 1))) {
 +                if (rd & 1) {
                      return 1;
                  }
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  /* Avoid overlapping operands.  Wide source operands are
                     always aligned so will never overlap with wide
                     destinations in problematic ways.  */
 -                if (rd == rm && !src2_wide) {
 +                if (rd == rm) {
                      tmp = neon_load_reg(rm, 1);
                      neon_store_scratch(2, tmp);
 -                } else if (rd == rn && !src1_wide) {
 +                } else if (rd == rn) {
                      tmp = neon_load_reg(rn, 1);
                      neon_store_scratch(2, tmp);
                  }
                  tmp3 = NULL;
                  for (pass = 0; pass < 2; pass++) {
 -                    if (src1_wide) {
 -                        neon_load_reg64(cpu_V0, rn + pass);
 -                        tmp = NULL;
 +                    if (pass == 1 && rd == rn) {
 +                        tmp = neon_load_scratch(2);
                      } else {
 -                        if (pass == 1 && rd == rn) {
 -                            tmp = neon_load_scratch(2);
 -                        } else {
 -                            tmp = neon_load_reg(rn, pass);
 -                        }
 +                        tmp = neon_load_reg(rn, pass);
                      }
 -                    if (src2_wide) {
 -                        neon_load_reg64(cpu_V1, rm + pass);
 -                        tmp2 = NULL;
 +                    if (pass == 1 && rd == rm) {
 +                        tmp2 = neon_load_scratch(2);
                      } else {
 -                        if (pass == 1 && rd == rm) {
 -                            tmp2 = neon_load_scratch(2);
 -                        } else {
 -                            tmp2 = neon_load_reg(rm, pass);
 -                        }
 +                        tmp2 = neon_load_reg(rm, pass);
                      }
                      switch (op) {
 -                    case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
 -                        gen_neon_addl(size);
 -                        break;
 -                    case 2: case 3: case 6: /* VSUBL, VSUBW, VSUBHN, VRSUBHN */
 -                        gen_neon_subl(size);
 -                        break;
                      case 5: case 7: /* VABAL, VABDL */
                          switch ((size << 1) | u) {
                          case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              abort();
                          }
                          neon_store_reg64(cpu_V0, rd + pass);
 -                    } else if (op == 4 || op == 6) {
 -                        /* Narrowing operation.  */
 -                        tmp = tcg_temp_new_i32();
 -                        if (!u) {
 -                            switch (size) {
 -                            case 0:
 -                                gen_helper_neon_narrow_high_u8(tmp, cpu_V0);
 -                                break;
 -                            case 1:
 -                                gen_helper_neon_narrow_high_u16(tmp, cpu_V0);
 -                                break;
 -                            case 2:
 -                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
 -                                break;
 -                            default: abort();
 -                            }
 -                        } else {
 -                            switch (size) {
 -                            case 0:
 -                                gen_helper_neon_narrow_round_high_u8(tmp, cpu_V0);
 -                                break;
 -                            case 1:
 -                                gen_helper_neon_narrow_round_high_u16(tmp, cpu_V0);
 -                                break;
 -                            case 2:
 -                                tcg_gen_addi_i64(cpu_V0, cpu_V0, 1u << 31);
 -                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
 -                                break;
 -                            default: abort();
 -                            }
 -                        }
 -                        if (pass == 0) {
 -                            tmp3 = tmp;
 -                        } else {
 -                            neon_store_reg(rd, 0, tmp3);
 -                            neon_store_reg(rd, 1, tmp);
 -                        }
                      } else {
                          /* Write back the result.  */
                          neon_store_reg64(cpu_V0, rd + pass);
 --
-.20.1
+.25.1

-[PULL 02/23] target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
+[PULL 26/33] hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu
-Convert the "pre-widening" insns VADDL, VSUBL, VADDW and VSUBW
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
 in the Neon 3-registers-different-lengths group to decodetree.
 These insns work by widening one or both inputs to double their
 size, performing an add or subtract at the doubled size and
 then storing the double-size result.
-As usual, rather than copying the loop of the original decoder
+When a virtio-iommu is instantiated, describe it using the ACPI VIOT
-(which needs awkward code to avoid problems when source and
+table.
 destination registers overlap) we just unroll the two passes.
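
As an illustration of the pre-widening semantics described above, here is a minimal scalar sketch in plain C. It is not QEMU code: the function names vaddl_u8 and vaddw_u8 and the LANES constant are hypothetical, and it models only the unsigned 8-bit case of one 64-bit input register. Each narrow lane is widened to 16 bits before the add, so the result keeps the carry that an 8-bit add would lose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* eight 8-bit lanes in one 64-bit input register */

/* Model of VADDL.U8: widen both narrow inputs, then add at 16 bits. */
static void vaddl_u8(uint16_t dst[LANES], const uint8_t n[LANES],
                     const uint8_t m[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        dst[i] = (uint16_t)((uint16_t)n[i] + (uint16_t)m[i]);
    }
}

/* Model of VADDW.U8: the first input is already wide, so only the
 * second input is widened before the 16-bit add. */
static void vaddw_u8(uint16_t dst[LANES], const uint16_t n[LANES],
                     const uint8_t m[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        dst[i] = (uint16_t)(n[i] + (uint16_t)m[i]);
    }
}
```

The sketch writes to a separate destination array, so it sidesteps the source/destination overlap that the real translator has to handle by loading the second-pass inputs before storing the first-pass result.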
+Acked-by: Igor Mammedov <imammedo@redhat.com>
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
+Message-id: 20211210170415.583179-2-jean-philippe@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
- target/arm/neon-dp.decode       |  43 +++++++++++++
+ hw/arm/virt-acpi-build.c | 7 +++++++
- target/arm/translate-neon.inc.c | 104 ++++++++++++++++++++++++++++++++
+ hw/arm/Kconfig           | 1 +
- target/arm/translate.c          |  16 ++---
+files changed, 8 insertions(+)
 files changed, 151 insertions(+), 12 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/hw/arm/virt-acpi-build.c
-+++ b/target/arm/neon-dp.decode
++++ b/hw/arm/virt-acpi-build.c
-@@ -XXX,XX +XXX,XX @@ VCVT_FU_2sh      1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
+@@ -XXX,XX +XXX,XX @@
- # So we have a single decode line and check the cmode/op in the
+ #include "kvm_arm.h"
- # trans function.
+ #include "migration/vmstate.h"
- Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+ #include "hw/acpi/ghes.h"
-+
++#include "hw/acpi/viot.h"
-+######################################################################
-+# Within the "two registers, or three registers of different lengths"
+ #define ARM_SPI_BASE 32
-+# grouping ([23,4]=0b10), bits [21:20] are either part of the opcode
-+# decode: 0b11 for VEXT, two-reg-misc, VTBL, and duplicate-scalar;
+@@ -XXX,XX +XXX,XX @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 +# or they are a size field for the three-reg-different-lengths and
 +# two-reg-and-scalar insn groups (where size cannot be 0b11). This
 +# is slightly awkward for decodetree: we handle it with this
 +# non-exclusive group which contains within it two exclusive groups:
 +# one for the size=0b11 patterns, and one for the size-not-0b11
 +# patterns. This allows us to check that none of the insns within
 +# each subgroup accidentally overlap each other. Note that all the
 +# trans functions for the size-not-0b11 patterns must check and
 +# return false for size==3.
 +######################################################################
 +{
 +  # 0b11 subgroup will go here
 +
 +  # Subgroup for size != 0b11
 +  [
 +    ##################################################################
 +    # 3-reg-different-length grouping:
 +    # 1111 001 U 1 D sz!=11 Vn:4 Vd:4 opc:4 N 0 M 0 Vm:4
 +    ##################################################################
 +
 +    &3diff vm vn vd size
 +
 +    @3diff       .... ... . . . size:2 .... .... .... . . . . .... \
 +                 &3diff vm=%vm_dp vn=%vn_dp vd=%vd_dp
 +
 +    VADDL_S_3d   1111 001 0 1 . .. .... .... 0000 . 0 . 0 .... @3diff
 +    VADDL_U_3d   1111 001 1 1 . .. .... .... 0000 . 0 . 0 .... @3diff
 +
 +    VADDW_S_3d   1111 001 0 1 . .. .... .... 0001 . 0 . 0 .... @3diff
 +    VADDW_U_3d   1111 001 1 1 . .. .... .... 0001 . 0 . 0 .... @3diff
 +
 +    VSUBL_S_3d   1111 001 0 1 . .. .... .... 0010 . 0 . 0 .... @3diff
 +    VSUBL_U_3d   1111 001 1 1 . .. .... .... 0010 . 0 . 0 .... @3diff
 +
 +    VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +    VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
 +  ]
 +}
 diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
      }
-     return do_1reg_imm(s, a, fn);
+ #endif
- }
-+
++    if (vms->iommu == VIRT_IOMMU_VIRTIO) {
-+static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
++        acpi_add_table(table_offsets, tables_blob);
-+                           NeonGenWidenFn *widenfn,
++        build_viot(ms, tables_blob, tables->linker, vms->virtio_iommu_bdf,
-+                           NeonGenTwo64OpFn *opfn,
++                   vms->oem_id, vms->oem_table_id);
 +                           bool src1_wide)
 +{
 +    /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VADDW/VSUBW) */
 +    TCGv_i64 rn0_64, rn1_64, rm_64;
 +    TCGv_i32 rm;
 +
 +    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 +        return false;
 +    }
 +
-+    /* UNDEF accesses to D16-D31 if they don't exist. */
+     /* XSDT is pointed to by RSDP */
-+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+     xsdt = tables_blob->len;
-+        ((a->vd | a->vn | a->vm) & 0x10)) {
+     build_xsdt(tables_blob, tables->linker, table_offsets, vms->oem_id,
-+        return false;
+diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
 +    }
 +
 +    if (!widenfn || !opfn) {
 +        /* size == 3 case, which is an entirely different insn group */
 +        return false;
 +    }
 +
 +    if ((a->vd & 1) || (src1_wide && (a->vn & 1))) {
 +        return false;
 +    }
 +
 +    if (!vfp_access_check(s)) {
 +        return true;
 +    }
 +
 +    rn0_64 = tcg_temp_new_i64();
 +    rn1_64 = tcg_temp_new_i64();
 +    rm_64 = tcg_temp_new_i64();
 +
 +    if (src1_wide) {
 +        neon_load_reg64(rn0_64, a->vn);
 +    } else {
 +        TCGv_i32 tmp = neon_load_reg(a->vn, 0);
 +        widenfn(rn0_64, tmp);
 +        tcg_temp_free_i32(tmp);
 +    }
 +    rm = neon_load_reg(a->vm, 0);
 +
 +    widenfn(rm_64, rm);
 +    tcg_temp_free_i32(rm);
 +    opfn(rn0_64, rn0_64, rm_64);
 +
 +    /*
 +     * Load second pass inputs before storing the first pass result, to
 +     * avoid incorrect results if a narrow input overlaps with the result.
 +     */
 +    if (src1_wide) {
 +        neon_load_reg64(rn1_64, a->vn + 1);
 +    } else {
 +        TCGv_i32 tmp = neon_load_reg(a->vn, 1);
 +        widenfn(rn1_64, tmp);
 +        tcg_temp_free_i32(tmp);
 +    }
 +    rm = neon_load_reg(a->vm, 1);
 +
 +    neon_store_reg64(rn0_64, a->vd);
 +
 +    widenfn(rm_64, rm);
 +    tcg_temp_free_i32(rm);
 +    opfn(rn1_64, rn1_64, rm_64);
 +    neon_store_reg64(rn1_64, a->vd + 1);
 +
 +    tcg_temp_free_i64(rn0_64);
 +    tcg_temp_free_i64(rn1_64);
 +    tcg_temp_free_i64(rm_64);
 +
 +    return true;
 +}
 +
 +#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE)                         \
 +    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
 +    {                                                                   \
 +        static NeonGenWidenFn * const widenfn[] = {                     \
 +            gen_helper_neon_widen_##S##8,                               \
 +            gen_helper_neon_widen_##S##16,                              \
 +            tcg_gen_##EXT##_i32_i64,                                    \
 +            NULL,                                                       \
 +        };                                                              \
 +        static NeonGenTwo64OpFn * const addfn[] = {                     \
 +            gen_helper_neon_##OP##l_u16,                                \
 +            gen_helper_neon_##OP##l_u32,                                \
 +            tcg_gen_##OP##_i64,                                         \
 +            NULL,                                                       \
 +        };                                                              \
 +        return do_prewiden_3d(s, a, widenfn[a->size],                   \
 +                              addfn[a->size], SRC1WIDE);                \
 +    }
 +
 +DO_PREWIDEN(VADDL_S, s, ext, add, false)
 +DO_PREWIDEN(VADDL_U, u, extu, add, false)
 +DO_PREWIDEN(VSUBL_S, s, ext, sub, false)
 +DO_PREWIDEN(VSUBL_U, u, extu, sub, false)
 +DO_PREWIDEN(VADDW_S, s, ext, add, true)
 +DO_PREWIDEN(VADDW_U, u, extu, add, true)
 +DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
 +DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
 diff --git a/target/arm/translate.c b/target/arm/translate.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/translate.c
+--- a/hw/arm/Kconfig
-+++ b/target/arm/translate.c
++++ b/hw/arm/Kconfig
-@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
+@@ -XXX,XX +XXX,XX @@ config ARM_VIRT
-                 /* Three registers of different lengths.  */
+     select DIMM
-                 int src1_wide;
+     select ACPI_HW_REDUCED
-                 int src2_wide;
+     select ACPI_APEI
--                int prewiden;
++    select ACPI_VIOT
-                 /* undefreq: bit 0 : UNDEF if size == 0
-                  *           bit 1 : UNDEF if size == 1
+ config CHEETAH
-                  *           bit 2 : UNDEF if size == 2
+     bool
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  int undefreq;
                  /* prewiden, src1_wide, src2_wide, undefreq */
                  static const int neon_3reg_wide[16][4] = {
 -                    {1, 0, 0, 0}, /* VADDL */
 -                    {1, 1, 0, 0}, /* VADDW */
 -                    {1, 0, 0, 0}, /* VSUBL */
 -                    {1, 1, 0, 0}, /* VSUBW */
 +                    {0, 0, 0, 7}, /* VADDL: handled by decodetree */
 +                    {0, 0, 0, 7}, /* VADDW: handled by decodetree */
 +                    {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
 +                    {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
                      {0, 1, 1, 0}, /* VADDHN */
                      {0, 0, 0, 0}, /* VABAL */
                      {0, 1, 1, 0}, /* VSUBHN */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                      {0, 0, 0, 7}, /* Reserved: always UNDEF */
                  };
 -                prewiden = neon_3reg_wide[op][0];
                  src1_wide = neon_3reg_wide[op][1];
                  src2_wide = neon_3reg_wide[op][2];
                  undefreq = neon_3reg_wide[op][3];
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          } else {
                              tmp = neon_load_reg(rn, pass);
                          }
 -                        if (prewiden) {
 -                            gen_neon_widen(cpu_V0, tmp, size, u);
 -                        }
                      }
                      if (src2_wide) {
                          neon_load_reg64(cpu_V1, rm + pass);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                          } else {
                              tmp2 = neon_load_reg(rm, pass);
                          }
 -                        if (prewiden) {
 -                            gen_neon_widen(cpu_V1, tmp2, size, u);
 -                        }
                      }
                      switch (op) {
                      case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
 --
-.20.1
+.25.1

-[PULL 20/23] target/arm/cpu: adjust virtual time for all KVM arm cpus
+[PULL 27/33] hw/arm/virt: Remove device tree restriction for virtio-iommu
-From: fangying <fangying1@huawei.com>
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
-Virtual time adjustment was implemented for virt-5.0 machine type,
+virtio-iommu is now supported with ACPI VIOT as well as device tree.
-but the cpu property was enabled only for host-passthrough and max
+Remove the restriction that prevents instantiating a virtio-iommu
-cpu model.  Let's add it for any KVM arm cpu which has the generic
+device under ACPI.
 timer feature enabled.
-Signed-off-by: Ying Fang <fangying1@huawei.com>
+Acked-by: Igor Mammedov <imammedo@redhat.com>
-Reviewed-by: Andrew Jones <drjones@redhat.com>
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
-Message-id: 20200608121243.2076-1-fangying1@huawei.com
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
-[PMM: minor commit message tweak, removed inaccurate
+Message-id: 20211210170415.583179-3-jean-philippe@linaro.org
  suggested-by tag]
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- target/arm/cpu.c   |  6 ++++--
+ hw/arm/virt.c                | 10 ++--------
- target/arm/cpu64.c |  1 -
+ hw/virtio/virtio-iommu-pci.c | 12 ++----------
- target/arm/kvm.c   | 21 +++++++++++----------
+files changed, 4 insertions(+), 18 deletions(-)
 files changed, 15 insertions(+), 13 deletions(-)
-diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/cpu.c
+--- a/hw/arm/virt.c
-+++ b/target/arm/cpu.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
+@@ -XXX,XX +XXX,XX @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
-     if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
+     MachineClass *mc = MACHINE_GET_CLASS(machine);
-         qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
      if (device_is_dynamic_sysbus(mc, dev) ||
 -       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
 +        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
 +        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
          return HOTPLUG_HANDLER(machine);
      }
-+
+-    if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
-+    if (kvm_enabled()) {
+-        VirtMachineState *vms = VIRT_MACHINE(machine);
-+        kvm_arm_add_vcpu_properties(obj);
+-
-+    }
+-        if (!vms->bootinfo.firmware_loaded || !virt_is_acpi_enabled(vms)) {
 -            return HOTPLUG_HANDLER(machine);
 -        }
 -    }
      return NULL;
  }
- static void arm_cpu_finalizefn(Object *obj)
+diff --git a/hw/virtio/virtio-iommu-pci.c b/hw/virtio/virtio-iommu-pci.c
-@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/virtio/virtio-iommu-pci.c
-     if (kvm_enabled()) {
++++ b/hw/virtio/virtio-iommu-pci.c
-         kvm_arm_set_cpu_features_from_host(cpu);
+@@ -XXX,XX +XXX,XX @@ static void virtio_iommu_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
--        kvm_arm_add_vcpu_properties(obj);
+     VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
-     } else {
-         cortex_a15_initfn(obj);
+     if (!qdev_get_machine_hotplug_handler(DEVICE(vpci_dev))) {
+-        MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
-@@ -XXX,XX +XXX,XX @@ static void arm_host_initfn(Object *obj)
+-
-     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+-        error_setg(errp,
-         aarch64_add_sve_properties(obj);
+-                   "%s machine fails to create iommu-map device tree bindings",
 -                   mc->name);
 -        error_append_hint(errp,
 -                          "Check your machine implements a hotplug handler "
 -                          "for the virtio-iommu-pci device\n");
 -        error_append_hint(errp, "Check the guest is booted without FW or with "
 -                          "-no-acpi\n");
 +        error_setg(errp, "Check your machine implements a hotplug handler "
 +                         "for the virtio-iommu-pci device");
          return;
      }
--    kvm_arm_add_vcpu_properties(obj);
+     for (int i = 0; i < s->nb_reserved_regions; i++) {
      arm_cpu_post_init(obj);
  }
 diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/cpu64.c
 +++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
      if (kvm_enabled()) {
          kvm_arm_set_cpu_features_from_host(cpu);
 -        kvm_arm_add_vcpu_properties(obj);
      } else {
          uint64_t t;
          uint32_t u;
 diff --git a/target/arm/kvm.c b/target/arm/kvm.c
 index XXXXXXX..XXXXXXX 100644
 --- a/target/arm/kvm.c
 +++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ static void kvm_no_adjvtime_set(Object *obj, bool value, Error **errp)
  /* KVM VCPU properties should be prefixed with "kvm-". */
  void kvm_arm_add_vcpu_properties(Object *obj)
  {
 -    if (!kvm_enabled()) {
 -        return;
 -    }
 +    ARMCPU *cpu = ARM_CPU(obj);
 +    CPUARMState *env = &cpu->env;
 -    ARM_CPU(obj)->kvm_adjvtime = true;
 -    object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
 -                             kvm_no_adjvtime_set);
 -    object_property_set_description(obj, "kvm-no-adjvtime",
 -                                    "Set on to disable the adjustment of "
 -                                    "the virtual counter. VM stopped time "
 -                                    "will be counted.");
 +    if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
 +        cpu->kvm_adjvtime = true;
 +        object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
 +                                 kvm_no_adjvtime_set);
 +        object_property_set_description(obj, "kvm-no-adjvtime",
 +                                        "Set on to disable the adjustment of "
 +                                        "the virtual counter. VM stopped time "
 +                                        "will be counted.");
 +    }
  }
  bool kvm_arm_pmu_supported(CPUState *cpu)
 --
-.20.1
+.25.1

-[PULL 23/23] hw: arm: Set vendor property for IMX SDHCI emulations
+[PULL 28/33] hw/arm/virt: Reject instantiation of multiple IOMMUs
-From: Guenter Roeck <linux@roeck-us.net>
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
-Set vendor property to IMX to enable IMX specific functionality
+We do not support instantiating multiple IOMMUs. Before adding a
-in sdhci code.
+virtio-iommu, check that no other IOMMU is present. This will detect
 both "iommu=smmuv3" machine parameter and another virtio-iommu instance.
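
The reason a single check covers both cases can be sketched in standalone C. This is a simplified model, not the QEMU implementation: the types and function names below are hypothetical, and it assumes the machine records the active IOMMU kind in one field that is set either by the machine parameter or when a virtio-iommu is plugged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the machine's IOMMU state. */
typedef enum { IOMMU_NONE, IOMMU_SMMUV3, IOMMU_VIRTIO } IommuKind;

typedef struct {
    IommuKind iommu; /* which IOMMU, if any, the machine already has */
} Machine;

/* Pre-plug check: refuse a virtio-iommu if any IOMMU is already present.
 * Returns true on success; on failure stores a message in *err. */
static bool pre_plug_virtio_iommu(const Machine *m, const char **err)
{
    if (m->iommu != IOMMU_NONE) {
        *err = "machine does not support multiple IOMMUs";
        return false;
    }
    return true;
}

/* Plug step: record that a virtio-iommu is now present, so that a
 * later pre-plug check for a second instance fails. */
static void plug_virtio_iommu(Machine *m)
{
    m->iommu = IOMMU_VIRTIO;
}
```

Because both the machine parameter and a successful plug end up in the same field, comparing that field against the "none" value rejects an SMMUv3 plus virtio-iommu combination and a second virtio-iommu alike.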
-Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Fixes: 70e89132c9 ("hw/arm/virt: Add the virtio-iommu device tree mappings")
-Signed-off-by: Guenter Roeck <linux@roeck-us.net>
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-Message-id: 20200603145258.195920-3-linux@roeck-us.net
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
 Message-id: 20211210170415.583179-4-jean-philippe@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/arm/fsl-imx25.c  | 6 ++++++
+ hw/arm/virt.c | 5 +++++
- hw/arm/fsl-imx6.c   | 6 ++++++
+file changed, 5 insertions(+)
  hw/arm/fsl-imx6ul.c | 2 ++
  hw/arm/fsl-imx7.c   | 2 ++
 files changed, 16 insertions(+)
-diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/arm/fsl-imx25.c
+--- a/hw/arm/virt.c
-+++ b/hw/arm/fsl-imx25.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
+@@ -XXX,XX +XXX,XX @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
-                                  &err);
+         hwaddr db_start = 0, db_end = 0;
-         object_property_set_uint(OBJECT(&s->esdhc[i]), IMX25_ESDHC_CAPABILITIES,
+         char *resv_prop_str;
-                                  "capareg", &err);
-+        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
++        if (vms->iommu != VIRT_IOMMU_NONE) {
-+                                 "vendor", &err);
++            error_setg(errp, "virt machine does not support multiple IOMMUs");
 +        if (err) {
 +            error_propagate(errp, err);
 +            return;
 +        }
-         object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
++
-         if (err) {
+         switch (vms->msi_controller) {
-             error_propagate(errp, err);
+         case VIRT_MSI_CTRL_NONE:
-diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
+             return;
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/fsl-imx6.c
 +++ b/hw/arm/fsl-imx6.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
                                   &err);
          object_property_set_uint(OBJECT(&s->esdhc[i]), IMX6_ESDHC_CAPABILITIES,
                                   "capareg", &err);
 +        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
 +                                 "vendor", &err);
 +        if (err) {
 +            error_propagate(errp, err);
 +            return;
 +        }
          object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
          if (err) {
              error_propagate(errp, err);
 diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/fsl-imx6ul.c
 +++ b/hw/arm/fsl-imx6ul.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
              FSL_IMX6UL_USDHC2_IRQ,
          };
 +        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
 +                                        "vendor", &error_abort);
          object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                   &error_abort);
 diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/arm/fsl-imx7.c
 +++ b/hw/arm/fsl-imx7.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
              FSL_IMX7_USDHC3_IRQ,
          };
 +        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
 +                                 "vendor", &error_abort);
          object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                   &error_abort);
 --
-.20.1
+.25.1

-[PULL 18/23] hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
+[PULL 29/33] hw/arm/virt: Use object_property_set instead of qdev_prop_set
-From: Jean-Christophe Dubois <jcd@tribudubois.net>
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
-Some bits of the CCM registers are non writable.
+To propagate errors to the caller of the pre_plug callback, use the
 object_property_set*() functions directly instead of the qdev_prop_set*()
 helpers.
-This was left undone in the initial commit (all bits of registers were
+Suggested-by: Igor Mammedov <imammedo@redhat.com>
-writable).
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
-This patch adds the required code to protect the non writable bits.
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
+Message-id: 20211210170415.583179-5-jean-philippe@linaro.org
 Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
 Message-id: 20200608133508.550046-1-jcd@tribudubois.net
 Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/misc/imx6ul_ccm.c | 76 ++++++++++++++++++++++++++++++++++++--------
+ hw/arm/virt.c | 5 +++--
-file changed, 63 insertions(+), 13 deletions(-)
+file changed, 3 insertions(+), 2 deletions(-)
-diff --git a/hw/misc/imx6ul_ccm.c b/hw/misc/imx6ul_ccm.c
+diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/misc/imx6ul_ccm.c
+--- a/hw/arm/virt.c
-+++ b/hw/misc/imx6ul_ccm.c
++++ b/hw/arm/virt.c
-@@ -XXX,XX +XXX,XX @@
+@@ -XXX,XX +XXX,XX @@ static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+                                         db_start, db_end,
- #include "trace.h"
+                                         VIRTIO_IOMMU_RESV_MEM_T_MSI);
-+static const uint32_t ccm_mask[CCM_MAX] = {
+-        qdev_prop_set_uint32(dev, "len-reserved-regions", 1);
-+    [CCM_CCR] = 0xf01fef80,
+-        qdev_prop_set_string(dev, "reserved-regions[0]", resv_prop_str);
-+    [CCM_CCDR] = 0xfffeffff,
++        object_property_set_uint(OBJECT(dev), "len-reserved-regions", 1, errp);
-+    [CCM_CSR] = 0xffffffff,
++        object_property_set_str(OBJECT(dev), "reserved-regions[0]",
-+    [CCM_CCSR] = 0xfffffef2,
++                                resv_prop_str, errp);
-+    [CCM_CACRR] = 0xfffffff8,
+         g_free(resv_prop_str);
 +    [CCM_CBCDR] = 0xc1f8e000,
 +    [CCM_CBCMR] = 0xfc03cfff,
 +    [CCM_CSCMR1] = 0x80700000,
 +    [CCM_CSCMR2] = 0xe01ff003,
 +    [CCM_CSCDR1] = 0xfe00c780,
 +    [CCM_CS1CDR] = 0xfe00fe00,
 +    [CCM_CS2CDR] = 0xf8007000,
 +    [CCM_CDCDR] = 0xf00fffff,
 +    [CCM_CHSCCDR] = 0xfffc01ff,
 +    [CCM_CSCDR2] = 0xfe0001ff,
 +    [CCM_CSCDR3] = 0xffffc1ff,
 +    [CCM_CDHIPR] = 0xffffffff,
 +    [CCM_CTOR] = 0x00000000,
 +    [CCM_CLPCR] = 0xf39ff01c,
 +    [CCM_CISR] = 0xfb85ffbe,
 +    [CCM_CIMR] = 0xfb85ffbf,
 +    [CCM_CCOSR] = 0xfe00fe00,
 +    [CCM_CGPR] = 0xfffc3fea,
 +    [CCM_CCGR0] = 0x00000000,
 +    [CCM_CCGR1] = 0x00000000,
 +    [CCM_CCGR2] = 0x00000000,
 +    [CCM_CCGR3] = 0x00000000,
 +    [CCM_CCGR4] = 0x00000000,
 +    [CCM_CCGR5] = 0x00000000,
 +    [CCM_CCGR6] = 0x00000000,
 +    [CCM_CMEOR] = 0xafffff1f,
 +};
 +
 +static const uint32_t analog_mask[CCM_ANALOG_MAX] = {
 +    [CCM_ANALOG_PLL_ARM] = 0xfff60f80,
 +    [CCM_ANALOG_PLL_USB1] = 0xfffe0fbc,
 +    [CCM_ANALOG_PLL_USB2] = 0xfffe0fbc,
 +    [CCM_ANALOG_PLL_SYS] = 0xfffa0ffe,
 +    [CCM_ANALOG_PLL_SYS_SS] = 0x00000000,
 +    [CCM_ANALOG_PLL_SYS_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_SYS_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_AUDIO] = 0xffe20f80,
 +    [CCM_ANALOG_PLL_AUDIO_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_AUDIO_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_VIDEO] = 0xffe20f80,
 +    [CCM_ANALOG_PLL_VIDEO_NUM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_VIDEO_DENOM] = 0xc0000000,
 +    [CCM_ANALOG_PLL_ENET] = 0xffc20ff0,
 +    [CCM_ANALOG_PFD_480] = 0x40404040,
 +    [CCM_ANALOG_PFD_528] = 0x40404040,
 +    [PMU_MISC0] = 0x01fe8306,
 +    [PMU_MISC1] = 0x07fcede0,
 +    [PMU_MISC2] = 0x005f5f5f,
 +};
 +
  static const char *imx6ul_ccm_reg_name(uint32_t reg)
  {
      static char unknown[20];
@@ -XXX,XX +XXX,XX @@ static void imx6ul_ccm_write(void *opaque, hwaddr offset, uint64_t value,
      trace_ccm_write_reg(imx6ul_ccm_reg_name(index), (uint32_t)value);
 -    /*
 -     * We will do a better implementation later. In particular some bits
 -     * cannot be written to.
 -     */
 -    s->ccm[index] = (uint32_t)value;
 +    s->ccm[index] = (s->ccm[index] & ccm_mask[index]) |
 +                           ((uint32_t)value & ~ccm_mask[index]);
  }
  static uint64_t imx6ul_analog_read(void *opaque, hwaddr offset, unsigned size)
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
           * the REG_NAME register. So we change the value of the
           * REG_NAME register, setting bits passed in the value.
           */
 -        s->analog[index - 1] |= value;
 +        s->analog[index - 1] |= (value & ~analog_mask[index - 1]);
          break;
      case CCM_ANALOG_PLL_ARM_CLR:
      case CCM_ANALOG_PLL_USB1_CLR:
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
           * the REG_NAME register. So we change the value of the
           * REG_NAME register, unsetting bits passed in the value.
           */
 -        s->analog[index - 2] &= ~value;
 +        s->analog[index - 2] &= ~(value & ~analog_mask[index - 2]);
          break;
      case CCM_ANALOG_PLL_ARM_TOG:
      case CCM_ANALOG_PLL_USB1_TOG:
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
           * the REG_NAME register. So we change the value of the
           * REG_NAME register, toggling bits passed in the value.
           */
 -        s->analog[index - 3] ^= value;
 +        s->analog[index - 3] ^= (value & ~analog_mask[index - 3]);
          break;
      default:
 -        /*
 -         * We will do a better implementation later. In particular some bits
 -         * cannot be written to.
 -         */
 -        s->analog[index] = value;
 +        s->analog[index] = (s->analog[index] & analog_mask[index]) |
 +                           (value & ~analog_mask[index]);
          break;
      }
  }
 --
-.20.1
+.25.1
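
The masking in the imx6ul_ccm hunks above follows one rule: a bit set in the mask is read-only and keeps its old value, and only the remaining bits take the guest's value. A minimal sketch of the four write flavours (plain write, SET, CLR, TOG) is given below. The function names are hypothetical and chosen for this note; the expressions mirror the ones in the diff.

```c
#include <stdint.h>

/* Plain write: read-only bits (set in ro_mask) keep their old value. */
static uint32_t masked_write(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return (old & ro_mask) | (value & ~ro_mask);
}

/* SET alias: only writable bits named in 'value' are set. */
static uint32_t masked_set(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old | (value & ~ro_mask);
}

/* CLR alias: only writable bits named in 'value' are cleared. */
static uint32_t masked_clr(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old & ~(value & ~ro_mask);
}

/* TOG alias: only writable bits named in 'value' are toggled. */
static uint32_t masked_tog(uint32_t old, uint32_t value, uint32_t ro_mask)
{
    return old ^ (value & ~ro_mask);
}
```

With the `CCM_CACRR` mask `0xfffffff8` from the table above, a guest write of `0xffffffff` can change only the low three bits. A mask of `0x00000000`, as used for the `CCM_CCGRn` registers, leaves the whole register writable.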

-[PULL 22/23] sd: sdhci: Implement basic vendor specific register support
+[PULL 30/33] tests/acpi: allow updates of VIOT expected data files
-From: Guenter Roeck <linux@roeck-us.net>
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
-The Linux kernel's IMX code now uses vendor specific commands.
+Create empty data files and allow updates for the upcoming VIOT tests.
 This results in endless warnings when booting the Linux kernel.
-sdhci-esdhc-imx 2194000.usdhc: esdhc_wait_for_card_clock_gate_off:
+Acked-by: Igor Mammedov <imammedo@redhat.com>
-    card clock still not gate off in 100us!.
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
-Implement support for the vendor specific command implemented in IMX hardware
+Message-id: 20211210170415.583179-6-jean-philippe@linaro.org
 to be able to avoid this warning.
 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
 Signed-off-by: Guenter Roeck <linux@roeck-us.net>
 Message-id: 20200603145258.195920-2-linux@roeck-us.net
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/sd/sdhci-internal.h |  5 +++++
+ tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
- include/hw/sd/sdhci.h  |  5 +++++
+ tests/data/acpi/q35/DSDT.viot               | 0
- hw/sd/sdhci.c          | 18 +++++++++++++++++-
+ tests/data/acpi/q35/VIOT.viot               | 0
-files changed, 27 insertions(+), 1 deletion(-)
+ tests/data/acpi/virt/VIOT                   | 0
 files changed, 3 insertions(+)
  create mode 100644 tests/data/acpi/q35/DSDT.viot
  create mode 100644 tests/data/acpi/q35/VIOT.viot
  create mode 100644 tests/data/acpi/virt/VIOT
-diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
+diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/sd/sdhci-internal.h
+--- a/tests/qtest/bios-tables-test-allowed-diff.h
-+++ b/hw/sd/sdhci-internal.h
++++ b/tests/qtest/bios-tables-test-allowed-diff.h
-@@ -XXX,XX +XXX,XX @@
+@@ -1 +1,4 @@
- #define SDHC_CMD_INHIBIT               0x00000001
+ /* List of comma-separated changed AML files to ignore */
- #define SDHC_DATA_INHIBIT              0x00000002
++"tests/data/acpi/virt/VIOT",
- #define SDHC_DAT_LINE_ACTIVE           0x00000004
++"tests/data/acpi/q35/DSDT.viot",
-+#define SDHC_IMX_CLOCK_GATE_OFF        0x00000080
++"tests/data/acpi/q35/VIOT.viot",
- #define SDHC_DOING_WRITE               0x00000100
+diff --git a/tests/data/acpi/q35/DSDT.viot b/tests/data/acpi/q35/DSDT.viot
- #define SDHC_DOING_READ                0x00000200
+new file mode 100644
- #define SDHC_SPACE_AVAILABLE           0x00000400
+index XXXXXXX..XXXXXXX
-@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
+diff --git a/tests/data/acpi/q35/VIOT.viot b/tests/data/acpi/q35/VIOT.viot
+new file mode 100644
+index XXXXXXX..XXXXXXX
- #define ESDHC_MIX_CTRL                  0x48
+diff --git a/tests/data/acpi/virt/VIOT b/tests/data/acpi/virt/VIOT
-+
+new file mode 100644
- #define ESDHC_VENDOR_SPEC               0xc0
+index XXXXXXX..XXXXXXX
 +#define ESDHC_IMX_FRC_SDCLK_ON          (1 << 8)
 +
  #define ESDHC_DLL_CTRL                  0x60
  #define ESDHC_TUNING_CTRL               0xcc
@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
  #define DEFINE_SDHCI_COMMON_PROPERTIES(_state) \
      DEFINE_PROP_UINT8("sd-spec-version", _state, sd_spec_version, 2), \
      DEFINE_PROP_UINT8("uhs", _state, uhs_mode, UHS_NOT_SUPPORTED), \
 +    DEFINE_PROP_UINT8("vendor", _state, vendor, SDHCI_VENDOR_NONE), \
      \
      /* Capabilities registers provide information on supported
       * features of this specific host controller implementation */ \
 diff --git a/include/hw/sd/sdhci.h b/include/hw/sd/sdhci.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/hw/sd/sdhci.h
 +++ b/include/hw/sd/sdhci.h
@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
      uint16_t acmd12errsts; /* Auto CMD12 error status register */
      uint16_t hostctl2;     /* Host Control 2 */
      uint64_t admasysaddr;  /* ADMA System Address Register */
 +    uint16_t vendor_spec;  /* Vendor specific register */
      /* Read-only registers */
      uint64_t capareg;      /* Capabilities Register */
@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
      uint32_t quirks;
      uint8_t sd_spec_version;
      uint8_t uhs_mode;
 +    uint8_t vendor;        /* For vendor specific functionality */
  } SDHCIState;
 +#define SDHCI_VENDOR_NONE       0
 +#define SDHCI_VENDOR_IMX        1
 +
  /*
   * Controller does not provide transfer-complete interrupt when not
   * busy.
 diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
 index XXXXXXX..XXXXXXX 100644
 --- a/hw/sd/sdhci.c
 +++ b/hw/sd/sdhci.c
@@ -XXX,XX +XXX,XX @@ static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
          }
          break;
 +    case ESDHC_VENDOR_SPEC:
 +        ret = s->vendor_spec;
 +        break;
      case ESDHC_DLL_CTRL:
      case ESDHC_TUNE_CTRL_STATUS:
      case ESDHC_UNDOCUMENTED_REG27:
      case ESDHC_TUNING_CTRL:
 -    case ESDHC_VENDOR_SPEC:
      case ESDHC_MIX_CTRL:
      case ESDHC_WTMK_LVL:
          ret = 0;
@@ -XXX,XX +XXX,XX @@ usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
      case ESDHC_UNDOCUMENTED_REG27:
      case ESDHC_TUNING_CTRL:
      case ESDHC_WTMK_LVL:
 +        break;
 +
      case ESDHC_VENDOR_SPEC:
 +        s->vendor_spec = value;
 +        switch (s->vendor) {
 +        case SDHCI_VENDOR_IMX:
 +            if (value & ESDHC_IMX_FRC_SDCLK_ON) {
 +                s->prnsts &= ~SDHC_IMX_CLOCK_GATE_OFF;
 +            } else {
 +                s->prnsts |= SDHC_IMX_CLOCK_GATE_OFF;
 +            }
 +            break;
 +        default:
 +            break;
 +        }
          break;
      case SDHC_HOSTCTL:
 --
-.20.1
+.25.1
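
The vendor-specific register handling in the sdhci.c hunk above can be sketched in isolation. The constants are copied from the diff; `ToySdhci` and `toy_write_vendor_spec` are hypothetical, trimmed-down names used only for this illustration and do not exist in QEMU.

```c
#include <stdint.h>

/* Values as they appear in the diff above. */
#define SDHC_IMX_CLOCK_GATE_OFF 0x00000080u
#define ESDHC_IMX_FRC_SDCLK_ON  (1u << 8)
#define SDHCI_VENDOR_NONE       0
#define SDHCI_VENDOR_IMX        1

/* Hypothetical, reduced controller state for illustration only. */
typedef struct ToySdhci {
    uint32_t prnsts;      /* present state register */
    uint16_t vendor_spec; /* vendor specific register */
    uint8_t vendor;       /* SDHCI_VENDOR_* selector */
} ToySdhci;

/* Latch the written value; on IMX, mirror FRC_SDCLK_ON into prnsts. */
static void toy_write_vendor_spec(ToySdhci *s, uint32_t value)
{
    s->vendor_spec = (uint16_t)value;
    if (s->vendor == SDHCI_VENDOR_IMX) {
        if (value & ESDHC_IMX_FRC_SDCLK_ON) {
            s->prnsts &= ~SDHC_IMX_CLOCK_GATE_OFF;
        } else {
            s->prnsts |= SDHC_IMX_CLOCK_GATE_OFF;
        }
    }
}
```

A guest that clears `FRC_SDCLK_ON` and then polls the present state register sees the clock-gate-off bit set at once, which is what lets the kernel's wait loop quoted in the commit message finish without the warning.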

-[PULL 11/23] target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
+[PULL 31/33] tests/acpi: add test case for VIOT
-Convert the float versions of VMLA, VMLS and VMUL in the Neon
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
 -reg-scalar group to decodetree.
+Add two test cases for VIOT, one on the q35 machine and the other on
+virt. To test complex topologies the q35 test has two PCIe buses that
+bypass the IOMMU (and are therefore not described by VIOT), and two
+buses that are translated by virtio-iommu.
+Reviewed-by: Eric Auger <eric.auger@redhat.com>
+Reviewed-by: Igor Mammedov <imammedo@redhat.com>
+Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
+Message-id: 20211210170415.583179-7-jean-philippe@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
-As noted in the comment on the WRAP_FP_FN macro, we could have
+ tests/qtest/bios-tables-test.c | 38 ++++++++++++++++++++++++++++++++++
-had a do_2scalar_fp() function, but for 3 insns it seemed
+file changed, 38 insertions(+)
 simpler to just do the wrapping to get hold of the fpstatus ptr.
 (These are the only fp insns in the group.)
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 ---
  target/arm/neon-dp.decode       |  3 ++
  target/arm/translate-neon.inc.c | 65 +++++++++++++++++++++++++++++++++
  target/arm/translate.c          | 37 ++-----------------
 files changed, 71 insertions(+), 34 deletions(-)
-diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
+diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
 index XXXXXXX..XXXXXXX 100644
---- a/target/arm/neon-dp.decode
+--- a/tests/qtest/bios-tables-test.c
-+++ b/target/arm/neon-dp.decode
++++ b/tests/qtest/bios-tables-test.c
-@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+@@ -XXX,XX +XXX,XX @@ static void test_acpi_virt_tcg(void)
-                  &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
+     free_test_data(&data);
      VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
 +    VMLA_F_2sc   1111 001 . 1 . .. .... .... 0001 . 1 . 0 .... @2scalar
      VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
 +    VMLS_F_2sc   1111 001 . 1 . .. .... .... 0101 . 1 . 0 .... @2scalar
      VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
 +    VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
    ]
  }
-diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
-index XXXXXXX..XXXXXXX 100644
++static void test_acpi_q35_viot(void)
 --- a/target/arm/translate-neon.inc.c
 +++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VMLS_2sc(DisasContext *s, arg_2scalar *a)
      return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
  }
 +
 +/*
 + * Rather than have a float-specific version of do_2scalar just for
 + * three insns, we wrap a NeonGenTwoSingleOpFn to turn it into
 + * a NeonGenTwoOpFn.
 + */
 +#define WRAP_FP_FN(WRAPNAME, FUNC)                              \
 +    static void WRAPNAME(TCGv_i32 rd, TCGv_i32 rn, TCGv_i32 rm) \
 +    {                                                           \
 +        TCGv_ptr fpstatus = get_fpstatus_ptr(1);                \
 +        FUNC(rd, rn, rm, fpstatus);                             \
 +        tcg_temp_free_ptr(fpstatus);                            \
 +    }
 +
 +WRAP_FP_FN(gen_VMUL_F_mul, gen_helper_vfp_muls)
 +WRAP_FP_FN(gen_VMUL_F_add, gen_helper_vfp_adds)
 +WRAP_FP_FN(gen_VMUL_F_sub, gen_helper_vfp_subs)
 +
 +static bool trans_VMUL_F_2sc(DisasContext *s, arg_2scalar *a)
 +{
-+    static NeonGenTwoOpFn * const opfn[] = {
++    test_data data = {
-+        NULL,
++        .machine = MACHINE_Q35,
-+        NULL, /* TODO: fp16 support */
++        .variant = ".viot",
 +        gen_VMUL_F_mul,
 +        NULL,
 +    };
 +
-+    return do_2scalar(s, a, opfn[a->size], NULL);
++    /*
 +     * To keep things interesting, two buses bypass the IOMMU.
 +     * VIOT should only describe the other two buses.
 +     */
 +    test_acpi_one("-machine default_bus_bypass_iommu=on "
 +                  "-device virtio-iommu-pci "
 +                  "-device pxb-pcie,bus_nr=0x10,id=pcie.100,bus=pcie.0 "
 +                  "-device pxb-pcie,bus_nr=0x20,id=pcie.200,bus=pcie.0,bypass_iommu=on "
 +                  "-device pxb-pcie,bus_nr=0x30,id=pcie.300,bus=pcie.0",
 +                  &data);
 +    free_test_data(&data);
 +}
 +
-+static bool trans_VMLA_F_2sc(DisasContext *s, arg_2scalar *a)
++static void test_acpi_virt_viot(void)
 +{
-+    static NeonGenTwoOpFn * const opfn[] = {
++    test_data data = {
-+        NULL,
++        .machine = "virt",
-+        NULL, /* TODO: fp16 support */
++        .uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
-+        gen_VMUL_F_mul,
++        .uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
-+        NULL,
++        .cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
-+    };
++        .ram_start = 0x40000000ULL,
-+    static NeonGenTwoOpFn * const accfn[] = {
++        .scan_len = 128ULL * 1024 * 1024,
 +        NULL,
 +        NULL, /* TODO: fp16 support */
 +        gen_VMUL_F_add,
 +        NULL,
 +    };
 +
-+    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
++    test_acpi_one("-cpu cortex-a57 "
 +                  "-device virtio-iommu-pci", &data);
 +    free_test_data(&data);
 +}
 +
-+static bool trans_VMLS_F_2sc(DisasContext *s, arg_2scalar *a)
+ static void test_oem_fields(test_data *data)
-+{
+ {
-+    static NeonGenTwoOpFn * const opfn[] = {
+     int i;
-+        NULL,
+@@ -XXX,XX +XXX,XX @@ int main(int argc, char *argv[])
-+        NULL, /* TODO: fp16 support */
+             qtest_add_func("acpi/q35/kvm/xapic", test_acpi_q35_kvm_xapic);
-+        gen_VMUL_F_mul,
+             qtest_add_func("acpi/q35/kvm/dmar", test_acpi_q35_kvm_dmar);
-+        NULL,
+         }
-+    };
++        qtest_add_func("acpi/q35/viot", test_acpi_q35_viot);
-+    static NeonGenTwoOpFn * const accfn[] = {
+     } else if (strcmp(arch, "aarch64") == 0) {
-+        NULL,
+         if (has_tcg) {
-+        NULL, /* TODO: fp16 support */
+             qtest_add_func("acpi/virt", test_acpi_virt_tcg);
-+        gen_VMUL_F_sub,
+@@ -XXX,XX +XXX,XX @@ int main(int argc, char *argv[])
-+        NULL,
+             qtest_add_func("acpi/virt/memhp", test_acpi_virt_tcg_memhp);
-+    };
+             qtest_add_func("acpi/virt/pxb", test_acpi_virt_tcg_pxb);
-+
+             qtest_add_func("acpi/virt/oem-fields", test_acpi_oem_fields_virt);
-+    return do_2scalar(s, a, opfn[a->size], accfn[a->size]);
++            qtest_add_func("acpi/virt/viot", test_acpi_virt_viot);
-+}
+         }
-diff --git a/target/arm/translate.c b/target/arm/translate.c
+     }
-index XXXXXXX..XXXXXXX 100644
+     ret = g_test_run();
 --- a/target/arm/translate.c
 +++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                  case 0: /* Integer VMLA scalar */
                  case 4: /* Integer VMLS scalar */
                  case 8: /* Integer VMUL scalar */
 -                    return 1; /* handled by decodetree */
 -
                  case 1: /* Float VMLA scalar */
                  case 5: /* Floating point VMLS scalar */
                  case 9: /* Floating point VMUL scalar */
 -                    if (size == 1) {
 -                        return 1;
 -                    }
 -                    /* fall through */
 +                    return 1; /* handled by decodetree */
 +
                  case 12: /* VQDMULH scalar */
                  case 13: /* VQRDMULH scalar */
                      if (u && ((rd | rn) & 1)) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                              } else {
                                  gen_helper_neon_qdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else if (op == 13) {
 +                        } else {
                              if (size == 1) {
                                  gen_helper_neon_qrdmulh_s16(tmp, cpu_env, tmp, tmp2);
                              } else {
                                  gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
                              }
 -                        } else {
 -                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                            gen_helper_vfp_muls(tmp, tmp, tmp2, fpstatus);
 -                            tcg_temp_free_ptr(fpstatus);
                          }
                          tcg_temp_free_i32(tmp2);
 -                        if (op < 8) {
 -                            /* Accumulate.  */
 -                            tmp2 = neon_load_reg(rd, pass);
 -                            switch (op) {
 -                            case 1:
 -                            {
 -                                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                                gen_helper_vfp_adds(tmp, tmp, tmp2, fpstatus);
 -                                tcg_temp_free_ptr(fpstatus);
 -                                break;
 -                            }
 -                            case 5:
 -                            {
 -                                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
 -                                gen_helper_vfp_subs(tmp, tmp2, tmp, fpstatus);
 -                                tcg_temp_free_ptr(fpstatus);
 -                                break;
 -                            }
 -                            default:
 -                                abort();
 -                            }
 -                            tcg_temp_free_i32(tmp2);
 -                        }
                          neon_store_reg(rd, pass, tmp);
                      }
                      break;
 --
-.20.1
+.25.1
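
The `WRAP_FP_FN` idea from the Neon hunk above (adapt a helper that needs an extra status argument to the three-argument shape a generic driver expects) can be shown without any TCG machinery. Everything below is a hypothetical plain-C analogue: `ToyStatus`, `toy_mul`, `toy_add`, `WRAP_STATUS_FN` and `toy_2scalar` are invented for this sketch and only imitate the structure of the real code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the floating-point status the helpers need. */
typedef struct ToyStatus {
    unsigned calls;
} ToyStatus;

static ToyStatus toy_status;

/* Four-argument helpers: destination, two inputs, plus status. */
static void toy_mul(int32_t *rd, int32_t rn, int32_t rm, ToyStatus *st)
{
    st->calls++;
    *rd = rn * rm;
}

static void toy_add(int32_t *rd, int32_t rn, int32_t rm, ToyStatus *st)
{
    st->calls++;
    *rd = rn + rm;
}

/* The three-argument shape that the generic driver expects. */
typedef void ToyTwoOpFn(int32_t *rd, int32_t rn, int32_t rm);

/* Generate a three-argument wrapper that supplies the status itself. */
#define WRAP_STATUS_FN(WRAPNAME, FUNC)                          \
    static void WRAPNAME(int32_t *rd, int32_t rn, int32_t rm)   \
    {                                                           \
        ToyStatus *st = &toy_status;                            \
        FUNC(rd, rn, rm, st);                                   \
    }

WRAP_STATUS_FN(wrapped_mul, toy_mul)
WRAP_STATUS_FN(wrapped_add, toy_add)

/* Generic driver: compute op(rn, rm), then optionally accumulate into rd. */
static int32_t toy_2scalar(int32_t rd, int32_t rn, int32_t rm,
                           ToyTwoOpFn *opfn, ToyTwoOpFn *accfn)
{
    int32_t tmp;

    opfn(&tmp, rn, rm);
    if (accfn) {
        accfn(&tmp, rd, tmp);
    }
    return tmp;
}
```

The trade-off matches the one described in the commit notes: a few small wrappers avoid a second, status-aware copy of the driver when only a handful of operations need it.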

-New patch
+[PULL 32/33] tests/acpi: add expected blobs for VIOT test on q35 machine
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
 Add expected blobs of the VIOT and DSDT table for the VIOT test on the
 q35 machine.
 Since the test instantiates a virtio device and two PCIe expander
 bridges, DSDT.viot has more blocks than the base DSDT.
 The VIOT table generated for the q35 test is:
 [000h 0000   4]                    Signature : "VIOT"    [Virtual I/O Translation Table]
 [004h 0004   4]                 Table Length : 00000070
 [008h 0008   1]                     Revision : 00
 [009h 0009   1]                     Checksum : 3D
 [00Ah 0010   6]                       Oem ID : "BOCHS "
 [010h 0016   8]                 Oem Table ID : "BXPC    "
 [018h 0024   4]                 Oem Revision : 00000001
 [01Ch 0028   4]              Asl Compiler ID : "BXPC"
 [020h 0032   4]        Asl Compiler Revision : 00000001
 [024h 0036   2]                   Node count : 0003
 [026h 0038   2]                  Node offset : 0030
 [028h 0040   8]                     Reserved : 0000000000000000
 [030h 0048   1]                         Type : 03 [VirtIO-PCI IOMMU]
 [031h 0049   1]                     Reserved : 00
 [032h 0050   2]                       Length : 0010
 [034h 0052   2]                  PCI Segment : 0000
 [036h 0054   2]               PCI BDF number : 0010
 [038h 0056   8]                     Reserved : 0000000000000000
 [040h 0064   1]                         Type : 01 [PCI Range]
 [041h 0065   1]                     Reserved : 00
 [042h 0066   2]                       Length : 0018
 [044h 0068   4]               Endpoint start : 00003000
 [048h 0072   2]            PCI Segment start : 0000
 [04Ah 0074   2]              PCI Segment end : 0000
 [04Ch 0076   2]                PCI BDF start : 3000
 [04Eh 0078   2]                  PCI BDF end : 30FF
 [050h 0080   2]                  Output node : 0030
 [052h 0082   6]                     Reserved : 000000000000
 [058h 0088   1]                         Type : 01 [PCI Range]
 [059h 0089   1]                     Reserved : 00
 [05Ah 0090   2]                       Length : 0018
 [05Ch 0092   4]               Endpoint start : 00001000
 [060h 0096   2]            PCI Segment start : 0000
 [062h 0098   2]              PCI Segment end : 0000
 [064h 0100   2]                PCI BDF start : 1000
 [066h 0102   2]                  PCI BDF end : 10FF
 [068h 0104   2]                  Output node : 0030
 [06Ah 0106   6]                     Reserved : 000000000000
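
As a cross-check of the dump above, the table length and the BDF values can be recomputed from the node sizes and the standard PCI bus/device/function packing. This is only a sketch: the enum and function names are hypothetical, and the sizes are read off the dump rather than taken from any header file.

```c
#include <stdint.h>

/* Sizes read off the table dump above. */
enum {
    VIOT_HEADER_LEN    = 0x30, /* "Node offset": first node starts here */
    VIOT_PCI_IOMMU_LEN = 0x10, /* length of a Type 03 node */
    VIOT_PCI_RANGE_LEN = 0x18  /* length of a Type 01 node */
};

/* Total table length for a given number of IOMMU and PCI range nodes. */
static uint32_t viot_table_len(unsigned iommu_nodes, unsigned range_nodes)
{
    return VIOT_HEADER_LEN + iommu_nodes * VIOT_PCI_IOMMU_LEN
                           + range_nodes * VIOT_PCI_RANGE_LEN;
}

/* Standard PCI packing: bus in bits 15:8, device in 7:3, function in 2:0. */
static uint16_t pci_bdf(unsigned bus, unsigned dev, unsigned fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}
```

One IOMMU node plus two range nodes gives 0x30 + 0x10 + 2 * 0x18 = 0x70, matching "Table Length". The ranges 3000-30FF and 1000-10FF cover every device and function on buses 0x30 and 0x10, and the IOMMU's BDF 0010 decodes as bus 0, device 2, function 0. Bus 0x20 and the root bus are absent because the test command line makes them bypass the IOMMU.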
 And the DSDT diff is:
@@ -XXX,XX +XXX,XX @@
   *
   * Disassembling to symbolic ASL+ operators
   *
 - * Disassembly of tests/data/acpi/q35/DSDT, Fri Dec 10 15:03:08 2021
 + * Disassembly of /tmp/aml-H9Y5D1, Fri Dec 10 15:02:27 2021
   *
   * Original Table Header:
   *     Signature        "DSDT"
 - *     Length           0x00002061 (8289)
 + *     Length           0x000024B6 (9398)
   *     Revision         0x01 **** 32-bit table (V1), no 64-bit math support
 - *     Checksum         0xFA
 + *     Checksum         0xA7
   *     OEM ID           "BOCHS "
   *     OEM Table ID     "BXPC    "
   *     OEM Revision     0x00000001 (1)
@@ -XXX,XX +XXX,XX @@
          }
      }
 +    Scope (\_SB)
 +    {
 +        Device (PC30)
 +        {
 +            Name (_UID, 0x30)  // _UID: Unique ID
 +            Name (_BBN, 0x30)  // _BBN: BIOS Bus Number
 +            Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
 +            Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
 +            Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
 +            {
 +                CreateDWordField (Arg3, Zero, CDW1)
 +                If ((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */))
 +                {
 +                    CreateDWordField (Arg3, 0x04, CDW2)
 +                    CreateDWordField (Arg3, 0x08, CDW3)
 +                    Local0 = CDW3 /* \_SB_.PC30._OSC.CDW3 */
 +                    Local0 &= 0x1F
 +                    If ((Arg1 != One))
 +                    {
 +                        CDW1 |= 0x08
 +                    }
 +
 +                    If ((CDW3 != Local0))
 +                    {
 +                        CDW1 |= 0x10
 +                    }
 +
 +                    CDW3 = Local0
 +                }
 +                Else
 +                {
 +                    CDW1 |= 0x04
 +                }
 +
 +                Return (Arg3)
 +            }
 +
 +            Method (_PRT, 0, NotSerialized)  // _PRT: PCI Routing Table
 +            {
 +                Local0 = Package (0x80){}
 +                Local1 = Zero
 +                While ((Local1 < 0x80))
 +                {
 +                    Local2 = (Local1 >> 0x02)
 +                    Local3 = ((Local1 + Local2) & 0x03)
 +                    If ((Local3 == Zero))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKD,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == One))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKA,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x02))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKB,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x03))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKC,
 +                                Zero
 +                            }
 +                    }
 +
 +                    Local4 [Zero] = ((Local2 << 0x10) | 0xFFFF)
 +                    Local4 [One] = (Local1 & 0x03)
 +                    Local0 [Local1] = Local4
 +                    Local1++
 +                }
 +
 +                Return (Local0)
 +            }
 +
 +            Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
 +            {
 +                WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
 +                    0x0000,             // Granularity
 +                    0x0030,             // Range Minimum
 +                    0x0030,             // Range Maximum
 +                    0x0000,             // Translation Offset
 +                    0x0001,             // Length
 +                    ,, )
 +            })
 +        }
 +    }
 +
 +    Scope (\_SB)
 +    {
 +        Device (PC20)
 +        {
 +            Name (_UID, 0x20)  // _UID: Unique ID
 +            Name (_BBN, 0x20)  // _BBN: BIOS Bus Number
 +            Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
 +            Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
 +            Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
 +            {
 +                CreateDWordField (Arg3, Zero, CDW1)
 +                If ((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */))
 +                {
 +                    CreateDWordField (Arg3, 0x04, CDW2)
 +                    CreateDWordField (Arg3, 0x08, CDW3)
 +                    Local0 = CDW3 /* \_SB_.PC20._OSC.CDW3 */
 +                    Local0 &= 0x1F
 +                    If ((Arg1 != One))
 +                    {
 +                        CDW1 |= 0x08
 +                    }
 +
 +                    If ((CDW3 != Local0))
 +                    {
 +                        CDW1 |= 0x10
 +                    }
 +
 +                    CDW3 = Local0
 +                }
 +                Else
 +                {
 +                    CDW1 |= 0x04
 +                }
 +
 +                Return (Arg3)
 +            }
 +
 +            Method (_PRT, 0, NotSerialized)  // _PRT: PCI Routing Table
 +            {
 +                Local0 = Package (0x80){}
 +                Local1 = Zero
 +                While ((Local1 < 0x80))
 +                {
 +                    Local2 = (Local1 >> 0x02)
 +                    Local3 = ((Local1 + Local2) & 0x03)
 +                    If ((Local3 == Zero))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKD,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == One))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKA,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x02))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKB,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x03))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKC,
 +                                Zero
 +                            }
 +                    }
 +
 +                    Local4 [Zero] = ((Local2 << 0x10) | 0xFFFF)
 +                    Local4 [One] = (Local1 & 0x03)
 +                    Local0 [Local1] = Local4
 +                    Local1++
 +                }
 +
 +                Return (Local0)
 +            }
 +
 +            Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
 +            {
 +                WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
 +                    0x0000,             // Granularity
 +                    0x0020,             // Range Minimum
 +                    0x0020,             // Range Maximum
 +                    0x0000,             // Translation Offset
 +                    0x0001,             // Length
 +                    ,, )
 +            })
 +        }
 +    }
 +
 +    Scope (\_SB)
 +    {
 +        Device (PC10)
 +        {
 +            Name (_UID, 0x10)  // _UID: Unique ID
 +            Name (_BBN, 0x10)  // _BBN: BIOS Bus Number
 +            Name (_HID, EisaId ("PNP0A08") /* PCI Express Bus */)  // _HID: Hardware ID
 +            Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
 +            Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
 +            {
 +                CreateDWordField (Arg3, Zero, CDW1)
 +                If ((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */))
 +                {
 +                    CreateDWordField (Arg3, 0x04, CDW2)
 +                    CreateDWordField (Arg3, 0x08, CDW3)
 +                    Local0 = CDW3 /* \_SB_.PC10._OSC.CDW3 */
 +                    Local0 &= 0x1F
 +                    If ((Arg1 != One))
 +                    {
 +                        CDW1 |= 0x08
 +                    }
 +
 +                    If ((CDW3 != Local0))
 +                    {
 +                        CDW1 |= 0x10
 +                    }
 +
 +                    CDW3 = Local0
 +                }
 +                Else
 +                {
 +                    CDW1 |= 0x04
 +                }
 +
 +                Return (Arg3)
 +            }
 +
 +            Method (_PRT, 0, NotSerialized)  // _PRT: PCI Routing Table
 +            {
 +                Local0 = Package (0x80){}
 +                Local1 = Zero
 +                While ((Local1 < 0x80))
 +                {
 +                    Local2 = (Local1 >> 0x02)
 +                    Local3 = ((Local1 + Local2) & 0x03)
 +                    If ((Local3 == Zero))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKD,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == One))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKA,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x02))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKB,
 +                                Zero
 +                            }
 +                    }
 +
 +                    If ((Local3 == 0x03))
 +                    {
 +                        Local4 = Package (0x04)
 +                            {
 +                                Zero,
 +                                Zero,
 +                                LNKC,
 +                                Zero
 +                            }
 +                    }
 +
 +                    Local4 [Zero] = ((Local2 << 0x10) | 0xFFFF)
 +                    Local4 [One] = (Local1 & 0x03)
 +                    Local0 [Local1] = Local4
 +                    Local1++
 +                }
 +
 +                Return (Local0)
 +            }
 +
 +            Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
 +            {
 +                WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
 +                    0x0000,             // Granularity
 +                    0x0010,             // Range Minimum
 +                    0x0010,             // Range Maximum
 +                    0x0000,             // Translation Offset
 +                    0x0001,             // Length
 +                    ,, )
 +            })
 +        }
 +    }
 +
      Scope (\_SB.PCI0)
      {
          Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
@@ -XXX,XX +XXX,XX @@
              WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
                  0x0000,             // Granularity
                  0x0000,             // Range Minimum
 -                0x00FF,             // Range Maximum
 +                0x000F,             // Range Maximum
                  0x0000,             // Translation Offset
 -                0x0100,             // Length
 +                0x0010,             // Length
                  ,, )
              IO (Decode16,
                  0x0CF8,             // Range Minimum
@@ -XXX,XX +XXX,XX @@
                  }
              }
 +            Device (S10)
 +            {
 +                Name (_ADR, 0x00020000)  // _ADR: Address
 +            }
 +
 +            Device (S18)
 +            {
 +                Name (_ADR, 0x00030000)  // _ADR: Address
 +            }
 +
 +            Device (S20)
 +            {
 +                Name (_ADR, 0x00040000)  // _ADR: Address
 +            }
 +
 +            Device (S28)
 +            {
 +                Name (_ADR, 0x00050000)  // _ADR: Address
 +            }
 +
              Method (PCNT, 0, NotSerialized)
              {
              }
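
The _PRT methods in the DSDT diff above build a 128-entry routing table
with a small amount of arithmetic per entry. As a reading aid, the same
loop can be restated in C. This is only a sketch of the arithmetic
visible in the ASL; the names prt_entry, build_prt and PRT_ENTRIES are
hypothetical and do not correspond to QEMU code.

```c
#include <stdint.h>

#define PRT_ENTRIES 128  /* Package (0x80) in the ASL */

/* One _PRT package: { address, pin, link source, source index }. */
struct prt_entry {
    uint32_t address;       /* (slot << 16) | 0xFFFF: any function of the slot */
    uint32_t pin;           /* 0..3, i.e. INTA..INTD */
    const char *link;       /* name of the interrupt link device */
    uint32_t source_index;  /* always Zero in the ASL above */
};

/* Link chosen for Local3 == 0, 1, 2, 3 respectively in the ASL. */
static const char *const prt_links[4] = { "LNKD", "LNKA", "LNKB", "LNKC" };

static void build_prt(struct prt_entry table[PRT_ENTRIES])
{
    for (uint32_t i = 0; i < PRT_ENTRIES; i++) {
        uint32_t slot = i >> 2;          /* Local2 = Local1 >> 0x02 */
        uint32_t sel = (i + slot) & 3;   /* Local3 = (Local1 + Local2) & 0x03 */

        table[i].address = (slot << 16) | 0xFFFF;
        table[i].pin = i & 3;
        table[i].link = prt_links[sel];
        table[i].source_index = 0;
    }
}
```

With this model, entry 4 (slot 1, pin 0) has address 0x1FFFF and routes
to LNKA, so each successive slot rotates the pin-to-link assignment by
one position.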
 Reviewed-by: Eric Auger <eric.auger@redhat.com>
 Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
 Message-id: 20211210170415.583179-8-jean-philippe@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
  tests/qtest/bios-tables-test-allowed-diff.h |   2 --
  tests/data/acpi/q35/DSDT.viot               | Bin 0 -> 9398 bytes
  tests/data/acpi/q35/VIOT.viot               | Bin 0 -> 112 bytes
  3 files changed, 2 deletions(-)
 diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
 index XXXXXXX..XXXXXXX 100644
 --- a/tests/qtest/bios-tables-test-allowed-diff.h
 +++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -XXX,XX +XXX,XX @@
  /* List of comma-separated changed AML files to ignore */
  "tests/data/acpi/virt/VIOT",
 -"tests/data/acpi/q35/DSDT.viot",
 -"tests/data/acpi/q35/VIOT.viot",
 diff --git a/tests/data/acpi/q35/DSDT.viot b/tests/data/acpi/q35/DSDT.viot
 index XXXXXXX..XXXXXXX 100644
 GIT binary patch
 literal 9398
 zcmeHNO>7&-8J*>iv|O&FB}G~Oi$yp||57BBoWHhc5OS9yDTx$CQgH$r;8Idr*-4Q_
 z5(9Az1F`}niVsB-)<KW7p`g9Br(A2Gm-gmc1N78GFS!;)e2V(MnH_0{q<{#yMgn&C
 zn|*J-d9yqFhO_H6z19~`FlPL*u<DkZ*}|)JH;X@mF-FI<cPg<fti9tEN*yB^i5czN
 zNq&q?!OZ;BE3B7{KWzJ-`Tn~f`9?Qj8~2^N8{Oc8J%57{==w%rS#;nOCp*nTr@iZ1
 zb+?i;JLQUJ=O0?8*>S~D)a>NF1~WVB6^~_B#yhJ`H+JU@=6aXs`?Yv)J2h=N?drcS
 zeLZ*n<<Bm^n}6`jfBx#u8&(W}1?)}iF9o#mZ~E2+zwdn7yK3AbIzKnxpZ>JRPm3~#
 z&ICS{+_OayRW-l=Mtk=~uaS3o8z<_udd|(wqg`&JnVPfCe>BUOO`Su3e>pff_^UW%
 z&JE^NO`)=Amg~iqRB1pPscP?(>#ZuY8GHCmlEvD$9g3%4Db~Dfz2SATnddvrR-Oe^
 z;s;dJec!hnzi)ri^I6YN9vtkm{^TdUF8h7gX8-<Qe4p)GQ=)AtYx2VcwdLVAEXEjG
 z^Mj|UHPqkj-LsWuzQem1>F3atdZn=zv3$#RmZzSHN+6-yyU#8cJb=YDilX&sl}vNm
 znkgAR^O<3kj4if>{ly5fwRfMWuC5=lrlvKPX~i#654Cp}R_d*JS$9laZ$ra6)<ns8
 zFZy28G%xP(nit&F>LDi%G<tIc=TY=gl$jSD&Uv!Yat~XR46h%rI$!}a%!|xG7u8Zn
 zeY8_|n=K>xz_v_W8VX$W-Fg-qFWcT}7MCyz{%%{ia7hZ>Law-k6NOr}VI&_48U=2l
 zwqDKFE8eTwwozDdms#e?x?5a|v>&JF;2_v0L~z5n%BYU^52<*cWuD4|GYUm@1+?))
 zte^45>Rz)t*<T5V#={r>@t@{%?^i#W{i=HAZ*Dc9y59Va-+#P!jrGs;u38a{fLr`N
 zvT@rUu>DljxJ?^&Z?-?vyJn3C>3D=qux{Y*bs5|5n)Qmi$TD^Zdn4GU$ocJS2Hh-<
 z`xPI^^+v0nUVdjMos8k`WGl7hA`{03ju%<lrgAHSpd^DRf-*}_#Ly0mB!LSfVgWcQ
 z&T$@~G9)JI=hz5m0vkrel+Xy{Oh7pkAu-V!j*W7rY(bO}Q$nMH2`FbGB&N)QaV4<4
 zo)~9JXiP9=;}NPl<C@MmXG&;XFlFNrsyfFsonxFSp<}vEgsRSQP3O3#b6nSnP}ON_
 zI!#Tdsp~|j>ckUB>FI=~GokB5sOq#dotCE4(sd$KbtW~PNlj-`*NIToiD#j5J#9^=
 zt?NXn>YUJYPG~wObe#xQos*i*NloXZt`niEb4t@WrRki~bs|)CI+{*L)9L6s5vn><
 zn$DD_Go|Z9sOn5>I@6lYw5}7Os&iV?Ij!lO)^#FOb!If38BJ$K*NIToIiu;E(R9w}
 zIuWWmPiZ<&X*y5oIuWWmF_XaEC!a&Jn$B5WCqh-{X-(&8P3LJ{Cqh-{8P3dyPr@^t
 zSqL9?X9Uwd3W@23*s~h*tj0X6GZCuHa~kuU#yqDp5vt7d8uPryJg+kms?5hU=3^T3
 zF`bD}WnSP+=`t5MQ$FJ_2&Q~+BP6E0f^%BVIW6a$o)e+SX~IDBih-7z6{O~7YTy`&
 zLjy&Cv?7QikV#>n0>>@MV8oK`Gmun34-FKdlm-J8SZSaNlnhir4-FI{S|bfqV8e)V
 zss<{chX#reE#g=hsKAC%sF6d-Km}BWs!kZFsFpKfpbC@>6rprQGEjt4Ck#|zITHq|
 zK*>M_l;<P^MJRQ`Kn0dFVW0|>3{*fllMEE0)CmI>Sk8ojDo`>|0p(0GP=xY&!axO<
 zGhv_#lnhirIg<<&q0|Wj6<E%MfhtfkPyyvkGEjt4Ck#|zITHq|K*>M_lrzad5lWpf
 zP=V!47^ngz0~JutBm+e#b;3XemNQ|X3X}{~Ksl2P6rt1!0~J`#gn=qhGEf2KOfpb}
 zQYQ>lU^x>8szAv=1(Y+%KoLrvFi?TzOc<yFB?A>u&LjgxD0RX>1(q{mpbC@>R6seC
 z3>2Z%2?G^a&V+#~P%=;f<xDbAgi<FARA4z12C6{GKn0XD$v_cGoiI>=<xCi;0wn_#
 zP|hR+MJRQ`Kn0dFVW0|>3{*fllMEE0)CmI>Sk8ojDo`>|0p(0GP=rz^3{+q_69%e4
 z$v_2^Gs!>^N}VuJf#pmXr~)Me6;RG314Srx!axxz28u{EP=u<1B2)}iVZuNaCK;&0
 zBm-5LFi?dF167!0pbC==RAItE6($T+VUmF=Ofpb~2?JG_Fi?d_2C6X0Kouqo6p_5T
 zFi=FeV!SiSKoR0H$dH(_Z(*Q_WZ%L-5y`$K14StNmJAdjmWs}HV4<vU_xO+1efmLq
 zZ;W>N_U)fP6Qy6Nw5mbt9Y(#emWSi66=>tq#xoh#Ue=0qyhxi8ZOUe5y0V7VfPUhp
 zwX=;ymc+i5%sg9Ja~lZ&8oAV@mHc>&CHP9v4R(jhtT?un;O4e9#pno)Xkh7OWgK&a
 zyj=3Iv0OuoK_;5rOr5f(Kb~ZXDBO+V`OWYo#_C08imwChQxnjdd?wZLDou8aj;$SD
 zGDYiA3<$Tu<JnHL(KPOChi#zrR32t83}naR$+ym4P_h?z_5#|cW-nw$XD_sOtE62l
 zrD3@*)NVyiklt0&yF9%+klsBey&I<Y2E<!f(E8TuJte)z(|ZHyy<^gQVfx}=`q&B5
 z7nSryp1wGczIaUfVwiq$Fn#<4=@*ssi#+|}K>EdF(l3VTOM~ghPLRH&q%ZOGrGfON
 zW73zx^yR_y<0nX8R??Sw`tm^f@-gYlNFSp|*<gA{q?Zp5Oe-+l#rmyYmKozi9y=P>
 zVReJU*h=ZuVXiS$ohTbw-O#v9>(yZbGE|)?8(H1ZIKvV!jWa0>vy!3eMA^vdhQ>`s
 zuMSg{q3T50$m)j1!HixV<}X9liL#N^4c*tL^y)CF8LCc{jjV3yKAqL8!%SzWI#H%q
 z=bSrQ&)%JCRttF5g4Zf`6l?y@>PzD7MA^D>wBlcH6r1ucwJ<p0O%rZ?JzIY3-QdmZ
 zzs|n>`a5r3e|z)wcUaqS>nqFQ-8x}eCF4u`OWUxqst-@1rSmUs%WmKP5e0dcb?e2N
 z;Z|x*!);VwF|Yuhqs^khqOM!@u*jY!WYldISF(V6`BoNd&6Qfk3>X#SuD^7J>p_D=
 zBPa51y^_n#=cpOt#Zf$ya$Ae9Mfz56n|<i!a=ELS@)%a{^NIH3SDuN<R~sah1km#P
 zU@?*f%<rG=4W1wgfi;C?_n|W@%lm$&8YfvNOJodIg&IcIpIJQRHr<+ej11GQ6)&eF
 z2Lam*jIH}#y0>KnY%4JQfOYS$*uU%f#@$U6`N8I3N-lV?5ErFCdv~xDmu2(wexld4
 z4v^;aVAT2k6GJ^m*FD(Wqc(Qg^)6a<?}h$zLoj}4;PP!+(O{@!a1y-hoAhF_7!z+6
 zslpAmNtYbjHrw-~#SPVk_FUf>-Obg6yV`8o$8_`PyJe_;bY5_EMBfBfWU!Q=*9HsG
 z%_Cda{@_Krr!oHVhv9+y+T5qR8zZ2aZ>5r!$*|f$^U%yBUYfR&B!+EYy_PwL!BeUi
 zJH^}r3r9Q+B)X@Z)fk=P13w&7x#wBtXTZ)g>WITPg5r&pQc!nmyrmk#S(>>b9xnNr
 zx_b#v9Xv-Y><Wb%?S^0Xe&<)bbKl_=Z|3C$tf|F<bYzE*mfHB;uC)`q-?buaBe?l?
 zcLTpK*k<49Z32`K?|nSBMFqxTK^_IE-li2fEGdK~(ZdoKBl6ab4a;Hler#`xvEXJG
 zb?<E%EZExfX>jcOVhS*0rS~RS1dA#xhkv@Nct@#q?LyeKS<$uFec!bw>{@uu$gZ6a
 zyVen1i{1BKd%~`D7|m$;U0a<I*3I7%^N%N%lGYdU_GS!gaR8T$NA@GzFi~z`l7hdl
 zarZy6590|88pi(1zq;V(>38zM0sT&<zX;R5$1w3;`_JMG`;&I&0Y23DMx1%@(w(R9
 z4M$j;D5J+Gy%fijRQsctzFKf&cv|BAz#YLq3CZJWDdtL4u1u1|mkdcUp7|sxJC+?Y
 z_@@s`v3j}Q7*z>6X~cwUxUL8G1KT)_XTp!KAbs;vCp{K3&~_X@+ew=-D}v`2MbFV0
 zQsVsL=rXi-pI*G|iiz;VTCutgUs)hDzV1+4?8KcoP3xROf<M%qC6lgVdpFt4<-|uM
 z=#rl_b1#YjSIl6Toj2z_hOZcKupkdE(LozC(fN=FY(x|sk)ym|;Rq2E1xJWD%Z!ol
 Gu>S+TT-130
 literal 0
 HcmV?d00001
 diff --git a/tests/data/acpi/q35/VIOT.viot b/tests/data/acpi/q35/VIOT.viot
 index XXXXXXX..XXXXXXX 100644
 GIT binary patch
 literal 112
 zcmWIZ^baXu00LVle`k+i1*eDrX9XZ&1PX!JAex!M0Hgv8m>C3sGzdcgBZCA3T-xBj
 Q0Zb)W9Hva*zW_`e0M!8s0RR91
 literal 0
 HcmV?d00001
 --
 2.25.1

-[PULL 21/23] hw/net/imx_fec: Convert debug fprintf() to trace events
+[PULL 33/33] tests/acpi: add expected blob for VIOT test on virt machine
-From: Jean-Christophe Dubois <jcd@tribudubois.net>
+From: Jean-Philippe Brucker <jean-philippe@linaro.org>
-Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
+The VIOT blob contains the following:
-Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
-Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+[000h 0000   4]                    Signature : "VIOT"    [Virtual I/O Translation Table]
-[PMD: Fixed 32-bit format string using PRIx32/PRIx64]
+[004h 0004   4]                 Table Length : 00000058
-Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
+[008h 0008   1]                     Revision : 00
 [009h 0009   1]                     Checksum : 66
 [00Ah 0010   6]                       Oem ID : "BOCHS "
 [010h 0016   8]                 Oem Table ID : "BXPC    "
 [018h 0024   4]                 Oem Revision : 00000001
 [01Ch 0028   4]              Asl Compiler ID : "BXPC"
 [020h 0032   4]        Asl Compiler Revision : 00000001
 [024h 0036   2]                   Node count : 0002
 [026h 0038   2]                  Node offset : 0030
 [028h 0040   8]                     Reserved : 0000000000000000
 [030h 0048   1]                         Type : 03 [VirtIO-PCI IOMMU]
 [031h 0049   1]                     Reserved : 00
 [032h 0050   2]                       Length : 0010
 [034h 0052   2]                  PCI Segment : 0000
 [036h 0054   2]               PCI BDF number : 0008
 [038h 0056   8]                     Reserved : 0000000000000000
 [040h 0064   1]                         Type : 01 [PCI Range]
 [041h 0065   1]                     Reserved : 00
 [042h 0066   2]                       Length : 0018
 [044h 0068   4]               Endpoint start : 00000000
 [048h 0072   2]            PCI Segment start : 0000
 [04Ah 0074   2]              PCI Segment end : 0000
 [04Ch 0076   2]                PCI BDF start : 0000
 [04Eh 0078   2]                  PCI BDF end : 00FF
 [050h 0080   2]                  Output node : 0030
 [052h 0082   6]                     Reserved : 000000000000
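
The offsets in the dump above are self-consistent: the node offset is
0x30, the two nodes are 0x10 and 0x18 bytes long, and 0x30 + 0x10 + 0x18
gives the table length 0x58 (88 bytes). A minimal C sketch of walking
the node list follows. It assumes only the field offsets shown in the
dump (node count at 0x24, node offset at 0x26, per-node length at +2)
and little-endian encoding; rd16 and viot_walk are hypothetical helper
names, not part of QEMU or the test suite.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit field. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the VIOT node list. Returns the number of nodes visited and
 * stores the offset just past the last node in *end, or returns -1 if
 * a node would run past the end of the table.
 */
static int viot_walk(const uint8_t *table, size_t len, size_t *end)
{
    if (len < 0x30) {
        return -1;                        /* header is 0x30 bytes in the dump */
    }

    uint16_t count = rd16(table + 0x24);  /* Node count */
    size_t off = rd16(table + 0x26);      /* Node offset */

    for (uint16_t i = 0; i < count; i++) {
        if (off + 4 > len) {
            return -1;
        }
        uint16_t node_len = rd16(table + off + 2);  /* Length field */
        if (node_len < 4 || off + node_len > len) {
            return -1;
        }
        off += node_len;
    }

    *end = off;
    return count;
}
```

Run against a buffer carrying the values from the dump, the walk visits
two nodes and ends at offset 0x58, matching the Table Length field.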
 Acked-by: Ani Sinha <ani@anisinha.ca>
 Reviewed-by: Eric Auger <eric.auger@redhat.com>
 Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
 Message-id: 20211210170415.583179-9-jean-philippe@linaro.org
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/net/imx_fec.c    | 106 +++++++++++++++++++-------------------------
+ tests/qtest/bios-tables-test-allowed-diff.h |   1 -
- hw/net/trace-events |  18 ++++++++
+ tests/data/acpi/virt/VIOT                   | Bin 0 -> 88 bytes
- 2 files changed, 63 insertions(+), 61 deletions(-)
+ 2 files changed, 1 deletion(-)
-diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
+diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
 index XXXXXXX..XXXXXXX 100644
---- a/hw/net/imx_fec.c
+--- a/tests/qtest/bios-tables-test-allowed-diff.h
-+++ b/hw/net/imx_fec.c
++++ b/tests/qtest/bios-tables-test-allowed-diff.h
-@@ -XXX,XX +XXX,XX @@
+@@ -1,2 +1 @@
- #include "qemu/module.h"
+ /* List of comma-separated changed AML files to ignore */
- #include "net/checksum.h"
+-"tests/data/acpi/virt/VIOT",
- #include "net/eth.h"
+diff --git a/tests/data/acpi/virt/VIOT b/tests/data/acpi/virt/VIOT
 +#include "trace.h"
  /* For crc32 */
  #include <zlib.h>
 -#ifndef DEBUG_IMX_FEC
 -#define DEBUG_IMX_FEC 0
 -#endif
 -
 -#define FEC_PRINTF(fmt, args...) \
 -    do { \
 -        if (DEBUG_IMX_FEC) { \
 -            fprintf(stderr, "[%s]%s: " fmt , TYPE_IMX_FEC, \
 -                                             __func__, ##args); \
 -        } \
 -    } while (0)
 -
 -#ifndef DEBUG_IMX_PHY
 -#define DEBUG_IMX_PHY 0
 -#endif
 -
 -#define PHY_PRINTF(fmt, args...) \
 -    do { \
 -        if (DEBUG_IMX_PHY) { \
 -            fprintf(stderr, "[%s.phy]%s: " fmt , TYPE_IMX_FEC, \
 -                                                 __func__, ##args); \
 -        } \
 -    } while (0)
 -
  #define IMX_MAX_DESC    1024
  static const char *imx_default_reg_name(IMXFECState *s, uint32_t index)
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
   * For now we don't handle any GPIO/interrupt line, so the OS will
   * have to poll for the PHY status.
   */
 -static void phy_update_irq(IMXFECState *s)
 +static void imx_phy_update_irq(IMXFECState *s)
  {
      imx_eth_update(s);
  }
 -static void phy_update_link(IMXFECState *s)
 +static void imx_phy_update_link(IMXFECState *s)
  {
      /* Autonegotiation status mirrors link status.  */
      if (qemu_get_queue(s->nic)->link_down) {
 -        PHY_PRINTF("link is down\n");
 +        trace_imx_phy_update_link("down");
          s->phy_status &= ~0x0024;
          s->phy_int |= PHY_INT_DOWN;
      } else {
 -        PHY_PRINTF("link is up\n");
 +        trace_imx_phy_update_link("up");
          s->phy_status |= 0x0024;
          s->phy_int |= PHY_INT_ENERGYON;
          s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
      }
 -    phy_update_irq(s);
 +    imx_phy_update_irq(s);
  }
  static void imx_eth_set_link(NetClientState *nc)
  {
 -    phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
 +    imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
  }
 -static void phy_reset(IMXFECState *s)
 +static void imx_phy_reset(IMXFECState *s)
  {
 +    trace_imx_phy_reset();
 +
      s->phy_status = 0x7809;
      s->phy_control = 0x3000;
      s->phy_advertise = 0x01e1;
      s->phy_int_mask = 0;
      s->phy_int = 0;
 -    phy_update_link(s);
 +    imx_phy_update_link(s);
  }
 -static uint32_t do_phy_read(IMXFECState *s, int reg)
 +static uint32_t imx_phy_read(IMXFECState *s, int reg)
  {
      uint32_t val;
@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
      case 29:    /* Interrupt source.  */
          val = s->phy_int;
          s->phy_int = 0;
 -        phy_update_irq(s);
 +        imx_phy_update_irq(s);
          break;
      case 30:    /* Interrupt mask */
          val = s->phy_int_mask;
@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
          break;
      }
 -    PHY_PRINTF("read 0x%04x @ %d\n", val, reg);
 +    trace_imx_phy_read(val, reg);
      return val;
  }
 -static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
 +static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
  {
 -    PHY_PRINTF("write 0x%04x @ %d\n", val, reg);
 +    trace_imx_phy_write(val, reg);
      if (reg > 31) {
          /* we only advertise one phy */
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
      switch (reg) {
      case 0:     /* Basic Control */
          if (val & 0x8000) {
 -            phy_reset(s);
 +            imx_phy_reset(s);
          } else {
              s->phy_control = val & 0x7980;
              /* Complete autonegotiation immediately.  */
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
          break;
      case 30:    /* Interrupt mask */
          s->phy_int_mask = val & 0xff;
 -        phy_update_irq(s);
 +        imx_phy_update_irq(s);
          break;
      case 17:
      case 18:
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
  static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
  {
      dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
 +
 +    trace_imx_fec_read_bd(addr, bd->flags, bd->length, bd->data);
  }
  static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
@@ -XXX,XX +XXX,XX @@ static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
  static void imx_enet_read_bd(IMXENETBufDesc *bd, dma_addr_t addr)
  {
      dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
 +
 +    trace_imx_enet_read_bd(addr, bd->flags, bd->length, bd->data,
 +                   bd->option, bd->status);
  }
  static void imx_enet_write_bd(IMXENETBufDesc *bd, dma_addr_t addr)
@@ -XXX,XX +XXX,XX @@ static void imx_fec_do_tx(IMXFECState *s)
          int len;
          imx_fec_read_bd(&bd, addr);
 -        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x\n",
 -                   addr, bd.flags, bd.length, bd.data);
          if ((bd.flags & ENET_BD_R) == 0) {
 +
              /* Run out of descriptors to transmit.  */
 -            FEC_PRINTF("tx_bd ran out of descriptors to transmit\n");
 +            trace_imx_eth_tx_bd_busy();
 +
              break;
          }
          len = bd.length;
@@ -XXX,XX +XXX,XX @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
          int len;
          imx_enet_read_bd(&bd, addr);
 -        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x option %04x "
 -                   "status %04x\n", addr, bd.flags, bd.length, bd.data,
 -                   bd.option, bd.status);
          if ((bd.flags & ENET_BD_R) == 0) {
              /* Run out of descriptors to transmit.  */
 +
 +            trace_imx_eth_tx_bd_busy();
 +
              break;
          }
          len = bd.length;
@@ -XXX,XX +XXX,XX @@ static void imx_eth_enable_rx(IMXFECState *s, bool flush)
      s->regs[ENET_RDAR] = (bd.flags & ENET_BD_E) ? ENET_RDAR_RDAR : 0;
      if (!s->regs[ENET_RDAR]) {
 -        FEC_PRINTF("RX buffer full\n");
 +        trace_imx_eth_rx_bd_full();
      } else if (flush) {
          qemu_flush_queued_packets(qemu_get_queue(s->nic));
      }
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
      memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
      /* We also reset the PHY */
 -    phy_reset(s);
 +    imx_phy_reset(s);
  }
  static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
@@ -XXX,XX +XXX,XX @@ static uint64_t imx_eth_read(void *opaque, hwaddr offset, unsigned size)
          break;
      }
 -    FEC_PRINTF("reg[%s] => 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
 -                                              value);
 +    trace_imx_eth_read(index, imx_eth_reg_name(s, index), value);
      return value;
  }
@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
      const bool single_tx_ring = !imx_eth_is_multi_tx_ring(s);
      uint32_t index = offset >> 2;
 -    FEC_PRINTF("reg[%s] <= 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
 -                (uint32_t)value);
 +    trace_imx_eth_write(index, imx_eth_reg_name(s, index), value);
      switch (index) {
      case ENET_EIR:
@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
          if (extract32(value, 29, 1)) {
              /* This is a read operation */
              s->regs[ENET_MMFR] = deposit32(s->regs[ENET_MMFR], 0, 16,
 -                                           do_phy_read(s,
 +                                           imx_phy_read(s,
                                                         extract32(value,
                                                                   18, 10)));
          } else {
              /* This a write operation */
 -            do_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
 +            imx_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
          }
          /* raise the interrupt as the PHY operation is done */
          s->regs[ENET_EIR] |= ENET_INT_MII;
@@ -XXX,XX +XXX,XX @@ static bool imx_eth_can_receive(NetClientState *nc)
  {
      IMXFECState *s = IMX_FEC(qemu_get_nic_opaque(nc));
 -    FEC_PRINTF("\n");
 -
      return !!s->regs[ENET_RDAR];
  }
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
      unsigned int buf_len;
      size_t size = len;
 -    FEC_PRINTF("len %d\n", (int)size);
 +    trace_imx_fec_receive(size);
      if (!s->regs[ENET_RDAR]) {
          qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
          bd.length = buf_len;
          size -= buf_len;
 -        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
 +        trace_imx_fec_receive_len(addr, bd.length);
          /* The last 4 bytes are the CRC.  */
          if (size < 4) {
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
          if (size == 0) {
              /* Last buffer in frame.  */
              bd.flags |= flags | ENET_BD_L;
 -            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
 +
 +            trace_imx_fec_receive_last(bd.flags);
 +
              s->regs[ENET_EIR] |= ENET_INT_RXF;
          } else {
              s->regs[ENET_EIR] |= ENET_INT_RXB;
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
      size_t size = len;
      bool shift16 = s->regs[ENET_RACC] & ENET_RACC_SHIFT16;
 -    FEC_PRINTF("len %d\n", (int)size);
 +    trace_imx_enet_receive(size);
      if (!s->regs[ENET_RDAR]) {
          qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
          bd.length = buf_len;
          size -= buf_len;
 -        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
 +        trace_imx_enet_receive_len(addr, bd.length);
          /* The last 4 bytes are the CRC.  */
          if (size < 4) {
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
          if (size == 0) {
              /* Last buffer in frame.  */
              bd.flags |= flags | ENET_BD_L;
 -            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
 +
 +            trace_imx_enet_receive_last(bd.flags);
 +
              /* Indicate that we've updated the last buffer descriptor. */
              bd.last_buffer = ENET_BD_BDU;
              if (bd.option & ENET_BD_RX_INT) {
 diff --git a/hw/net/trace-events b/hw/net/trace-events
 index XXXXXXX..XXXXXXX 100644
---- a/hw/net/trace-events
+GIT binary patch
-+++ b/hw/net/trace-events
+literal 88
-@@ -XXX,XX +XXX,XX @@ i82596_receive_packet(size_t sz) "len=%zu"
+zcmWIZ^bd((0D?3pe`k+i1*eDrX9XZ&1PX!JAexE60Hgv8m>C3sGzXN&z`)2L0cSHX
- i82596_new_mac(const char *id_with_mac) "New MAC for: %s"
+I{D-Rq0Q5fy0RR91
- i82596_set_multicast(uint16_t count) "Added %d multicast entries"
- i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
+literal 0
-+
+HcmV?d00001
-+# imx_fec.c
 +imx_phy_read(uint32_t val, int reg) "0x%04"PRIx32" <= reg[%d]"
 +imx_phy_write(uint32_t val, int reg) "0x%04"PRIx32" => reg[%d]"
 +imx_phy_update_link(const char *s) "%s"
 +imx_phy_reset(void) ""
 +imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
 +imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
 +imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
 +imx_eth_rx_bd_full(void) "RX buffer is full"
 +imx_eth_read(int reg, const char *reg_name, uint32_t value) "reg[%d:%s] => 0x%08"PRIx32
 +imx_eth_write(int reg, const char *reg_name, uint64_t value) "reg[%d:%s] <= 0x%08"PRIx64
 +imx_fec_receive(size_t size) "len %zu"
 +imx_fec_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
 +imx_fec_receive_last(int last) "rx frame flags 0x%04x"
 +imx_enet_receive(size_t size) "len %zu"
 +imx_enet_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
 +imx_enet_receive_last(int last) "rx frame flags 0x%04x"
 --
-2.20.1
+2.25.1

Mostly my decodetree stuff, but also some patches for various
smaller bugs/features from others.

thanks
-- PMM

The following changes since commit 53550e81e2cafe7c03a39526b95cd21b5194d9b1:

Merge remote-tracking branch 'remotes/berrange/tags/qcrypto-next-pull-request' into staging (2020-06-15 16:36:34 +0100)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20200616

for you to fetch changes up to 64b397417a26509bcdff44ab94356a35c7901c79:

hw: arm: Set vendor property for IMX SDHCI emulations (2020-06-16 10:32:29 +0100)

----------------------------------------------------------------
 * hw: arm: Set vendor property for IMX SDHCI emulations
 * sd: sdhci: Implement basic vendor specific register support
 * hw/net/imx_fec: Convert debug fprintf() to trace events
 * target/arm/cpu: adjust virtual time for all KVM arm cpus
 * Implement configurable descriptor size in ftgmac100
 * hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
 * target/arm: More Neon decodetree conversion work

----------------------------------------------------------------
Erik Smit (1):
      Implement configurable descriptor size in ftgmac100

Guenter Roeck (2):
      sd: sdhci: Implement basic vendor specific register support
      hw: arm: Set vendor property for IMX SDHCI emulations

Jean-Christophe Dubois (2):
      hw/misc/imx6ul_ccm: Implement non writable bits in CCM registers
      hw/net/imx_fec: Convert debug fprintf() to trace events

Peter Maydell (17):
      target/arm: Fix missing temp frees in do_vshll_2sh
      target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree
      target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree
      target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree
      target/arm: Convert Neon 3-reg-diff long multiplies
      target/arm: Convert Neon 3-reg-diff saturating doubling multiplies
      target/arm: Convert Neon 3-reg-diff polynomial VMULL
      target/arm: Add 'static' and 'const' annotations to VSHLL function arrays
      target/arm: Add missing TCG temp free in do_2shift_env_64()
      target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree
      target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree
      target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree
      target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree
      target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree
      target/arm: Convert Neon VEXT to decodetree
      target/arm: Convert Neon VTBL, VTBX to decodetree
      target/arm: Convert Neon VDUP (scalar) to decodetree

fangying (1):
      target/arm/cpu: adjust virtual time for all KVM arm cpus

The widenfn() in do_vshll_2sh() does not free the input 32-bit
TCGv, so we need to do this in the calling code.

diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
     tmp = tcg_temp_new_i64();
 
     widenfn(tmp, rm0);
+    tcg_temp_free_i32(rm0);
     if (a->shift != 0) {
         tcg_gen_shli_i64(tmp, tmp, a->shift);
         tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
     neon_store_reg64(tmp, a->vd);
 
     widenfn(tmp, rm1);
+    tcg_temp_free_i32(rm1);
     if (a->shift != 0) {
         tcg_gen_shli_i64(tmp, tmp, a->shift);
         tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
-- 
2.20.1

Convert the "pre-widening" insns VADDL, VSUBL, VADDW and VSUBW
in the Neon 3-registers-different-lengths group to decodetree.
These insns work by widening one or both inputs to double their
size, performing an add or subtract at the doubled size and
then storing the double-size result.

As usual, rather than copying the loop of the original decoder
(which needs awkward code to avoid problems when source and
destination registers overlap) we just unroll the two passes.
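
To illustrate why unrolling sidesteps the overlap problem, here is a
plain-C model of a signed 32-bit VADDL. It is a sketch under stated
assumptions, not the TCG code in this patch: the register file is
modelled as an array of 64-bit D registers, the Q destination occupies
dregs[vd] and dregs[vd + 1], and the names dregs, lane32 and vaddl_s32
are hypothetical.

```c
#include <stdint.h>

/* Toy register file: 32 D registers of 64 bits each. */
static uint64_t dregs[32];

/* Sign-extend 32-bit lane 'pass' (0 or 1) of a D register to 64 bits. */
static int64_t lane32(uint64_t d, int pass)
{
    return (int32_t)(uint32_t)(d >> (32 * pass));
}

/*
 * Qd = widen(Dn) + widen(Dm), two passes unrolled.
 * All four inputs are read before anything is stored, so the result is
 * correct even when Dn or Dm is one half of the destination Q register.
 */
static void vaddl_s32(int vd, int vn, int vm)
{
    int64_t n0 = lane32(dregs[vn], 0);
    int64_t m0 = lane32(dregs[vm], 0);
    int64_t n1 = lane32(dregs[vn], 1);
    int64_t m1 = lane32(dregs[vm], 1);

    dregs[vd] = (uint64_t)(n0 + m0);      /* pass 0 result */
    dregs[vd + 1] = (uint64_t)(n1 + m1);  /* pass 1 result */
}
```

A loop that stored the pass 0 result before loading the pass 1 inputs
would read a clobbered Dn whenever vn == vd; reading everything up front
makes the ordering explicit without extra temporaries for that case.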

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  43 +++++++++++++
 target/arm/translate-neon.inc.c | 104 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  16 ++---
 3 files changed, 151 insertions(+), 12 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ VCVT_FU_2sh      1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
 # So we have a single decode line and check the cmode/op in the
 # trans function.
 Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
+
+######################################################################
+# Within the "two registers, or three registers of different lengths"
+# grouping ([23,4]=0b10), bits [21:20] are either part of the opcode
+# decode: 0b11 for VEXT, two-reg-misc, VTBL, and duplicate-scalar;
+# or they are a size field for the three-reg-different-lengths and
+# two-reg-and-scalar insn groups (where size cannot be 0b11). This
+# is slightly awkward for decodetree: we handle it with this
+# non-exclusive group which contains within it two exclusive groups:
+# one for the size=0b11 patterns, and one for the size-not-0b11
+# patterns. This allows us to check that none of the insns within
+# each subgroup accidentally overlap each other. Note that all the
+# trans functions for the size-not-0b11 patterns must check and
+# return false for size==3.
+######################################################################
+{
+  # 0b11 subgroup will go here
+
+  # Subgroup for size != 0b11
+  [
+    ##################################################################
+    # 3-reg-different-length grouping:
+    # 1111 001 U 1 D sz!=11 Vn:4 Vd:4 opc:4 N 0 M 0 Vm:4
+    ##################################################################
+
+    &3diff vm vn vd size
+
+    @3diff       .... ... . . . size:2 .... .... .... . . . . .... \
+                 &3diff vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+    VADDL_S_3d   1111 001 0 1 . .. .... .... 0000 . 0 . 0 .... @3diff
+    VADDL_U_3d   1111 001 1 1 . .. .... .... 0000 . 0 . 0 .... @3diff
+
+    VADDW_S_3d   1111 001 0 1 . .. .... .... 0001 . 0 . 0 .... @3diff
+    VADDW_U_3d   1111 001 1 1 . .. .... .... 0001 . 0 . 0 .... @3diff
+
+    VSUBL_S_3d   1111 001 0 1 . .. .... .... 0010 . 0 . 0 .... @3diff
+    VSUBL_U_3d   1111 001 1 1 . .. .... .... 0010 . 0 . 0 .... @3diff
+
+    VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
+    VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
+  ]
+}
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
     }
     return do_1reg_imm(s, a, fn);
 }
+
+static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
+                           NeonGenWidenFn *widenfn,
+                           NeonGenTwo64OpFn *opfn,
+                           bool src1_wide)
+{
+    /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VADDW/VSUBW) */
+    TCGv_i64 rn0_64, rn1_64, rm_64;
+    TCGv_i32 rm;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!widenfn || !opfn) {
+        /* size == 3 case, which is an entirely different insn group */
+        return false;
+    }
+
+    if ((a->vd & 1) || (src1_wide && (a->vn & 1))) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rn0_64 = tcg_temp_new_i64();
+    rn1_64 = tcg_temp_new_i64();
+    rm_64 = tcg_temp_new_i64();
+
+    if (src1_wide) {
+        neon_load_reg64(rn0_64, a->vn);
+    } else {
+        TCGv_i32 tmp = neon_load_reg(a->vn, 0);
+        widenfn(rn0_64, tmp);
+        tcg_temp_free_i32(tmp);
+    }
+    rm = neon_load_reg(a->vm, 0);
+
+    widenfn(rm_64, rm);
+    tcg_temp_free_i32(rm);
+    opfn(rn0_64, rn0_64, rm_64);
+
+    /*
+     * Load second pass inputs before storing the first pass result, to
+     * avoid incorrect results if a narrow input overlaps with the result.
+     */
+    if (src1_wide) {
+        neon_load_reg64(rn1_64, a->vn + 1);
+    } else {
+        TCGv_i32 tmp = neon_load_reg(a->vn, 1);
+        widenfn(rn1_64, tmp);
+        tcg_temp_free_i32(tmp);
+    }
+    rm = neon_load_reg(a->vm, 1);
+
+    neon_store_reg64(rn0_64, a->vd);
+
+    widenfn(rm_64, rm);
+    tcg_temp_free_i32(rm);
+    opfn(rn1_64, rn1_64, rm_64);
+    neon_store_reg64(rn1_64, a->vd + 1);
+
+    tcg_temp_free_i64(rn0_64);
+    tcg_temp_free_i64(rn1_64);
+    tcg_temp_free_i64(rm_64);
+
+    return true;
+}
+
+#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE)                         \
+    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
+    {                                                                   \
+        static NeonGenWidenFn * const widenfn[] = {                     \
+            gen_helper_neon_widen_##S##8,                               \
+            gen_helper_neon_widen_##S##16,                              \
+            tcg_gen_##EXT##_i32_i64,                                    \
+            NULL,                                                       \
+        };                                                              \
+        static NeonGenTwo64OpFn * const addfn[] = {                     \
+            gen_helper_neon_##OP##l_u16,                                \
+            gen_helper_neon_##OP##l_u32,                                \
+            tcg_gen_##OP##_i64,                                         \
+            NULL,                                                       \
+        };                                                              \
+        return do_prewiden_3d(s, a, widenfn[a->size],                   \
+                              addfn[a->size], SRC1WIDE);                \
+    }
+
+DO_PREWIDEN(VADDL_S, s, ext, add, false)
+DO_PREWIDEN(VADDL_U, u, extu, add, false)
+DO_PREWIDEN(VSUBL_S, s, ext, sub, false)
+DO_PREWIDEN(VSUBL_U, u, extu, sub, false)
+DO_PREWIDEN(VADDW_S, s, ext, add, true)
+DO_PREWIDEN(VADDW_U, u, extu, add, true)
+DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
+DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 /* Three registers of different lengths.  */
                 int src1_wide;
                 int src2_wide;
-                int prewiden;
                 /* undefreq: bit 0 : UNDEF if size == 0
                  *           bit 1 : UNDEF if size == 1
                  *           bit 2 : UNDEF if size == 2
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 int undefreq;
                 /* prewiden, src1_wide, src2_wide, undefreq */
                 static const int neon_3reg_wide[16][4] = {
-                    {1, 0, 0, 0}, /* VADDL */
-                    {1, 1, 0, 0}, /* VADDW */
-                    {1, 0, 0, 0}, /* VSUBL */
-                    {1, 1, 0, 0}, /* VSUBW */
+                    {0, 0, 0, 7}, /* VADDL: handled by decodetree */
+                    {0, 0, 0, 7}, /* VADDW: handled by decodetree */
+                    {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
+                    {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
                     {0, 1, 1, 0}, /* VADDHN */
                     {0, 0, 0, 0}, /* VABAL */
                     {0, 1, 1, 0}, /* VSUBHN */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
                 };
 
-                prewiden = neon_3reg_wide[op][0];
                 src1_wide = neon_3reg_wide[op][1];
                 src2_wide = neon_3reg_wide[op][2];
                 undefreq = neon_3reg_wide[op][3];
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         } else {
                             tmp = neon_load_reg(rn, pass);
                         }
-                        if (prewiden) {
-                            gen_neon_widen(cpu_V0, tmp, size, u);
-                        }
                     }
                     if (src2_wide) {
                         neon_load_reg64(cpu_V1, rm + pass);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         } else {
                             tmp2 = neon_load_reg(rm, pass);
                         }
-                        if (prewiden) {
-                            gen_neon_widen(cpu_V1, tmp2, size, u);
-                        }
                     }
                     switch (op) {
                     case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
-- 
2.20.1

Convert the narrow-to-high-half insns VADDHN, VSUBHN, VRADDHN,
VRSUBHN in the Neon 3-registers-different-lengths group to
decodetree.
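
For readers unfamiliar with these insns, here is a minimal plain-C sketch of one 64-bit to 32-bit lane (the size == 2 case). The function name is hypothetical; the rounding constant mirrors the 1u << 31 used by gen_narrow_round_high_u32() in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of one lane of VADDHN (round == 0) or VRADDHN
 * (round == 1) with 64-bit inputs and a 32-bit result.
 */
static uint32_t addhn_u32_model(uint64_t n, uint64_t m, int round)
{
    uint64_t sum = n + m;              /* add at the wide size, mod 2^64 */

    assert(round == 0 || round == 1);
    if (round) {
        sum += UINT64_C(1) << 31;      /* half of the discarded low part */
    }
    return (uint32_t)(sum >> 32);      /* keep only the high half */
}
```

The subtracting forms (VSUBHN, VRSUBHN) differ only in replacing the add with a subtract.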

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  6 +++
 target/arm/translate-neon.inc.c | 87 +++++++++++++++++++++++++++++++
 target/arm/translate.c          | 91 ++++-----------------------------
 3 files changed, 104 insertions(+), 80 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VSUBW_S_3d   1111 001 0 1 . .. .... .... 0011 . 0 . 0 .... @3diff
     VSUBW_U_3d   1111 001 1 1 . .. .... .... 0011 . 0 . 0 .... @3diff
+
+    VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
+    VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
+
+    VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
+    VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
   ]
 }
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_PREWIDEN(VADDW_S, s, ext, add, true)
 DO_PREWIDEN(VADDW_U, u, extu, add, true)
 DO_PREWIDEN(VSUBW_S, s, ext, sub, true)
 DO_PREWIDEN(VSUBW_U, u, extu, sub, true)
+
+static bool do_narrow_3d(DisasContext *s, arg_3diff *a,
+                         NeonGenTwo64OpFn *opfn, NeonGenNarrowFn *narrowfn)
+{
+    /* 3-regs different lengths, narrowing (VADDHN/VSUBHN/VRADDHN/VRSUBHN) */
+    TCGv_i64 rn_64, rm_64;
+    TCGv_i32 rd0, rd1;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!opfn || !narrowfn) {
+        /* size == 3 case, which is an entirely different insn group */
+        return false;
+    }
+
+    if ((a->vn | a->vm) & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rn_64 = tcg_temp_new_i64();
+    rm_64 = tcg_temp_new_i64();
+    rd0 = tcg_temp_new_i32();
+    rd1 = tcg_temp_new_i32();
+
+    neon_load_reg64(rn_64, a->vn);
+    neon_load_reg64(rm_64, a->vm);
+
+    opfn(rn_64, rn_64, rm_64);
+
+    narrowfn(rd0, rn_64);
+
+    neon_load_reg64(rn_64, a->vn + 1);
+    neon_load_reg64(rm_64, a->vm + 1);
+
+    opfn(rn_64, rn_64, rm_64);
+
+    narrowfn(rd1, rn_64);
+
+    neon_store_reg(a->vd, 0, rd0);
+    neon_store_reg(a->vd, 1, rd1);
+
+    tcg_temp_free_i64(rn_64);
+    tcg_temp_free_i64(rm_64);
+
+    return true;
+}
+
+#define DO_NARROW_3D(INSN, OP, NARROWTYPE, EXTOP)                       \
+    static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)        \
+    {                                                                   \
+        static NeonGenTwo64OpFn * const addfn[] = {                     \
+            gen_helper_neon_##OP##l_u16,                                \
+            gen_helper_neon_##OP##l_u32,                                \
+            tcg_gen_##OP##_i64,                                         \
+            NULL,                                                       \
+        };                                                              \
+        static NeonGenNarrowFn * const narrowfn[] = {                   \
+            gen_helper_neon_##NARROWTYPE##_high_u8,                     \
+            gen_helper_neon_##NARROWTYPE##_high_u16,                    \
+            EXTOP,                                                      \
+            NULL,                                                       \
+        };                                                              \
+        return do_narrow_3d(s, a, addfn[a->size], narrowfn[a->size]);   \
+    }
+
+static void gen_narrow_round_high_u32(TCGv_i32 rd, TCGv_i64 rn)
+{
+    tcg_gen_addi_i64(rn, rn, 1u << 31);
+    tcg_gen_extrh_i64_i32(rd, rn);
+}
+
+DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
+DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
+DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
+DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
     }
 }
 
-static inline void gen_neon_subl(int size)
-{
-    switch (size) {
-    case 0: gen_helper_neon_subl_u16(CPU_V001); break;
-    case 1: gen_helper_neon_subl_u32(CPU_V001); break;
-    case 2: tcg_gen_sub_i64(CPU_V001); break;
-    default: abort();
-    }
-}
-
 static inline void gen_neon_negl(TCGv_i64 var, int size)
 {
     switch (size) {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             op = (insn >> 8) & 0xf;
             if ((insn & (1 << 6)) == 0) {
                 /* Three registers of different lengths.  */
-                int src1_wide;
-                int src2_wide;
                 /* undefreq: bit 0 : UNDEF if size == 0
                  *           bit 1 : UNDEF if size == 1
                  *           bit 2 : UNDEF if size == 2
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 7}, /* VADDW: handled by decodetree */
                     {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
                     {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
-                    {0, 1, 1, 0}, /* VADDHN */
+                    {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
                     {0, 0, 0, 0}, /* VABAL */
-                    {0, 1, 1, 0}, /* VSUBHN */
+                    {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
                     {0, 0, 0, 0}, /* VABDL */
                     {0, 0, 0, 0}, /* VMLAL */
                     {0, 0, 0, 9}, /* VQDMLAL */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 7}, /* Reserved: always UNDEF */
                 };
 
-                src1_wide = neon_3reg_wide[op][1];
-                src2_wide = neon_3reg_wide[op][2];
                 undefreq = neon_3reg_wide[op][3];
 
                 if ((undefreq & (1 << size)) ||
                     ((undefreq & 8) && u)) {
                     return 1;
                 }
-                if ((src1_wide && (rn & 1)) ||
-                    (src2_wide && (rm & 1)) ||
-                    (!src2_wide && (rd & 1))) {
+                if (rd & 1) {
                     return 1;
                 }
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 /* Avoid overlapping operands.  Wide source operands are
                    always aligned so will never overlap with wide
                    destinations in problematic ways.  */
-                if (rd == rm && !src2_wide) {
+                if (rd == rm) {
                     tmp = neon_load_reg(rm, 1);
                     neon_store_scratch(2, tmp);
-                } else if (rd == rn && !src1_wide) {
+                } else if (rd == rn) {
                     tmp = neon_load_reg(rn, 1);
                     neon_store_scratch(2, tmp);
                 }
                 tmp3 = NULL;
                 for (pass = 0; pass < 2; pass++) {
-                    if (src1_wide) {
-                        neon_load_reg64(cpu_V0, rn + pass);
-                        tmp = NULL;
+                    if (pass == 1 && rd == rn) {
+                        tmp = neon_load_scratch(2);
                     } else {
-                        if (pass == 1 && rd == rn) {
-                            tmp = neon_load_scratch(2);
-                        } else {
-                            tmp = neon_load_reg(rn, pass);
-                        }
+                        tmp = neon_load_reg(rn, pass);
                     }
-                    if (src2_wide) {
-                        neon_load_reg64(cpu_V1, rm + pass);
-                        tmp2 = NULL;
+                    if (pass == 1 && rd == rm) {
+                        tmp2 = neon_load_scratch(2);
                     } else {
-                        if (pass == 1 && rd == rm) {
-                            tmp2 = neon_load_scratch(2);
-                        } else {
-                            tmp2 = neon_load_reg(rm, pass);
-                        }
+                        tmp2 = neon_load_reg(rm, pass);
                     }
                     switch (op) {
-                    case 0: case 1: case 4: /* VADDL, VADDW, VADDHN, VRADDHN */
-                        gen_neon_addl(size);
-                        break;
-                    case 2: case 3: case 6: /* VSUBL, VSUBW, VSUBHN, VRSUBHN */
-                        gen_neon_subl(size);
-                        break;
                     case 5: case 7: /* VABAL, VABDL */
                         switch ((size << 1) | u) {
                         case 0:
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             abort();
                         }
                         neon_store_reg64(cpu_V0, rd + pass);
-                    } else if (op == 4 || op == 6) {
-                        /* Narrowing operation.  */
-                        tmp = tcg_temp_new_i32();
-                        if (!u) {
-                            switch (size) {
-                            case 0:
-                                gen_helper_neon_narrow_high_u8(tmp, cpu_V0);
-                                break;
-                            case 1:
-                                gen_helper_neon_narrow_high_u16(tmp, cpu_V0);
-                                break;
-                            case 2:
-                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
-                                break;
-                            default: abort();
-                            }
-                        } else {
-                            switch (size) {
-                            case 0:
-                                gen_helper_neon_narrow_round_high_u8(tmp, cpu_V0);
-                                break;
-                            case 1:
-                                gen_helper_neon_narrow_round_high_u16(tmp, cpu_V0);
-                                break;
-                            case 2:
-                                tcg_gen_addi_i64(cpu_V0, cpu_V0, 1u << 31);
-                                tcg_gen_extrh_i64_i32(tmp, cpu_V0);
-                                break;
-                            default: abort();
-                            }
-                        }
-                        if (pass == 0) {
-                            tmp3 = tmp;
-                        } else {
-                            neon_store_reg(rd, 0, tmp3);
-                            neon_store_reg(rd, 1, tmp);
-                        }
                     } else {
                         /* Write back the result.  */
                         neon_store_reg64(cpu_V0, rd + pass);
-- 
2.20.1

Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree.
Like almost all the remaining insns in this group, these combine
a two-input operation that returns a double-width result with
an optional accumulation of that double-width result into the
destination.
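
A simplified single-lane model in plain C may help here (signed 16-bit inputs, 32-bit result). The names are hypothetical and this is not the helper code; the real helpers operate on packed lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one lane of VABDL.S16: |n - m| at double width */
static uint32_t abdl_s16_model(int16_t n, int16_t m)
{
    int32_t wn = n;
    int32_t wm = m;
    int32_t diff = wn > wm ? wn - wm : wm - wn;

    assert(diff >= 0 && diff <= 65535);   /* always fits the wide lane */
    return (uint32_t)diff;
}

/* Hypothetical model of one lane of VABAL.S16: accumulate into the dest */
static uint32_t abal_s16_model(uint32_t acc, int16_t n, int16_t m)
{
    return acc + abdl_s16_model(n, m);    /* wraps mod 2^32 */
}
```

This corresponds to do_long_3d() being called with accfn == NULL for VABDL and with an add function for VABAL.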

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate.h          |   1 +
 target/arm/neon-dp.decode       |   6 ++
 target/arm/translate-neon.inc.c | 132 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  31 +-------
 4 files changed, 142 insertions(+), 28 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -XXX,XX +XXX,XX @@ typedef void NeonGenTwo64OpEnvFn(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64);
 typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
+typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
 typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VADDHN_3d    1111 001 0 1 . .. .... .... 0100 . 0 . 0 .... @3diff
     VRADDHN_3d   1111 001 1 1 . .. .... .... 0100 . 0 . 0 .... @3diff
 
+    VABAL_S_3d   1111 001 0 1 . .. .... .... 0101 . 0 . 0 .... @3diff
+    VABAL_U_3d   1111 001 1 1 . .. .... .... 0101 . 0 . 0 .... @3diff
+
     VSUBHN_3d    1111 001 0 1 . .. .... .... 0110 . 0 . 0 .... @3diff
     VRSUBHN_3d   1111 001 1 1 . .. .... .... 0110 . 0 . 0 .... @3diff
+
+    VABDL_S_3d   1111 001 0 1 . .. .... .... 0111 . 0 . 0 .... @3diff
+    VABDL_U_3d   1111 001 1 1 . .. .... .... 0111 . 0 . 0 .... @3diff
   ]
 }
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ DO_NARROW_3D(VADDHN, add, narrow, tcg_gen_extrh_i64_i32)
 DO_NARROW_3D(VSUBHN, sub, narrow, tcg_gen_extrh_i64_i32)
 DO_NARROW_3D(VRADDHN, add, narrow_round, gen_narrow_round_high_u32)
 DO_NARROW_3D(VRSUBHN, sub, narrow_round, gen_narrow_round_high_u32)
+
+static bool do_long_3d(DisasContext *s, arg_3diff *a,
+                       NeonGenTwoOpWidenFn *opfn,
+                       NeonGenTwo64OpFn *accfn)
+{
+    /*
+     * 3-regs different lengths, long operations.
+     * These perform an operation on two inputs that returns a double-width
+     * result, and then possibly perform an accumulation operation of
+     * that result into the double-width destination.
+     */
+    TCGv_i64 rd0, rd1, tmp;
+    TCGv_i32 rn, rm;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!opfn) {
+        /* size == 3 case, which is an entirely different insn group */
+        return false;
+    }
+
+    if (a->vd & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rd0 = tcg_temp_new_i64();
+    rd1 = tcg_temp_new_i64();
+
+    rn = neon_load_reg(a->vn, 0);
+    rm = neon_load_reg(a->vm, 0);
+    opfn(rd0, rn, rm);
+    tcg_temp_free_i32(rn);
+    tcg_temp_free_i32(rm);
+
+    rn = neon_load_reg(a->vn, 1);
+    rm = neon_load_reg(a->vm, 1);
+    opfn(rd1, rn, rm);
+    tcg_temp_free_i32(rn);
+    tcg_temp_free_i32(rm);
+
+    /* Don't store results until after all loads: they might overlap */
+    if (accfn) {
+        tmp = tcg_temp_new_i64();
+        neon_load_reg64(tmp, a->vd);
+        accfn(tmp, tmp, rd0);
+        neon_store_reg64(tmp, a->vd);
+        neon_load_reg64(tmp, a->vd + 1);
+        accfn(tmp, tmp, rd1);
+        neon_store_reg64(tmp, a->vd + 1);
+        tcg_temp_free_i64(tmp);
+    } else {
+        neon_store_reg64(rd0, a->vd);
+        neon_store_reg64(rd1, a->vd + 1);
+    }
+
+    tcg_temp_free_i64(rd0);
+    tcg_temp_free_i64(rd1);
+
+    return true;
+}
+
+static bool trans_VABDL_S_3d(DisasContext *s, arg_3diff *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        gen_helper_neon_abdl_s16,
+        gen_helper_neon_abdl_s32,
+        gen_helper_neon_abdl_s64,
+        NULL,
+    };
+
+    return do_long_3d(s, a, opfn[a->size], NULL);
+}
+
+static bool trans_VABDL_U_3d(DisasContext *s, arg_3diff *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        gen_helper_neon_abdl_u16,
+        gen_helper_neon_abdl_u32,
+        gen_helper_neon_abdl_u64,
+        NULL,
+    };
+
+    return do_long_3d(s, a, opfn[a->size], NULL);
+}
+
+static bool trans_VABAL_S_3d(DisasContext *s, arg_3diff *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        gen_helper_neon_abdl_s16,
+        gen_helper_neon_abdl_s32,
+        gen_helper_neon_abdl_s64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const addfn[] = {
+        gen_helper_neon_addl_u16,
+        gen_helper_neon_addl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
+}
+
+static bool trans_VABAL_U_3d(DisasContext *s, arg_3diff *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        gen_helper_neon_abdl_u16,
+        gen_helper_neon_abdl_u32,
+        gen_helper_neon_abdl_u64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const addfn[] = {
+        gen_helper_neon_addl_u16,
+        gen_helper_neon_addl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_long_3d(s, a, opfn[a->size], addfn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     {0, 0, 0, 7}, /* VSUBL: handled by decodetree */
                     {0, 0, 0, 7}, /* VSUBW: handled by decodetree */
                     {0, 0, 0, 7}, /* VADDHN: handled by decodetree */
-                    {0, 0, 0, 0}, /* VABAL */
+                    {0, 0, 0, 7}, /* VABAL: handled by decodetree */
                     {0, 0, 0, 7}, /* VSUBHN: handled by decodetree */
-                    {0, 0, 0, 0}, /* VABDL */
+                    {0, 0, 0, 7}, /* VABDL: handled by decodetree */
                     {0, 0, 0, 0}, /* VMLAL */
                     {0, 0, 0, 9}, /* VQDMLAL */
                     {0, 0, 0, 0}, /* VMLSL */
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         tmp2 = neon_load_reg(rm, pass);
                     }
                     switch (op) {
-                    case 5: case 7: /* VABAL, VABDL */
-                        switch ((size << 1) | u) {
-                        case 0:
-                            gen_helper_neon_abdl_s16(cpu_V0, tmp, tmp2);
-                            break;
-                        case 1:
-                            gen_helper_neon_abdl_u16(cpu_V0, tmp, tmp2);
-                            break;
-                        case 2:
-                            gen_helper_neon_abdl_s32(cpu_V0, tmp, tmp2);
-                            break;
-                        case 3:
-                            gen_helper_neon_abdl_u32(cpu_V0, tmp, tmp2);
-                            break;
-                        case 4:
-                            gen_helper_neon_abdl_s64(cpu_V0, tmp, tmp2);
-                            break;
-                        case 5:
-                            gen_helper_neon_abdl_u64(cpu_V0, tmp, tmp2);
-                            break;
-                        default: abort();
-                        }
-                        tcg_temp_free_i32(tmp2);
-                        tcg_temp_free_i32(tmp);
-                        break;
                     case 8: case 9: case 10: case 11: case 12: case 13:
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         case 10: /* VMLSL */
                             gen_neon_negl(cpu_V0, size);
                             /* Fall through */
-                        case 5: case 8: /* VABAL, VMLAL */
+                        case 8: /* VMLAL */
                             gen_neon_addl(size);
                             break;
                         case 9: case 11: /* VQDMLAL, VQDMLSL */
-- 
2.20.1

Convert the Neon 3-reg-diff insns VMULL, VMLAL and VMLSL; these perform
a 32x32->64 multiply with possible accumulate.

Note that for VMLSL we do the accumulate directly with a subtraction
rather than doing a negate-then-add as the old code did.
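
The two approaches give the same result because the arithmetic is modular. A small plain-C check for the 32x32->64 unsigned case, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Accumulate by subtracting the product directly */
static uint64_t mlsl_subtract(uint64_t acc, uint32_t n, uint32_t m)
{
    return acc - (uint64_t)n * m;
}

/* Accumulate by negating the product and then adding it */
static uint64_t mlsl_negate_then_add(uint64_t acc, uint32_t n, uint32_t m)
{
    uint64_t prod = (uint64_t)n * m;

    prod = 0 - prod;          /* two's complement negate, mod 2^64 */
    return acc + prod;
}

/* Both forms agree for any inputs, since both work mod 2^64 */
static void mlsl_check_equivalent(uint64_t acc, uint32_t n, uint32_t m)
{
    assert(mlsl_subtract(acc, n, m) == mlsl_negate_then_add(acc, n, m));
}
```

The direct subtraction simply saves one generated operation per pass.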

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  9 +++++
 target/arm/translate-neon.inc.c | 71 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 21 +++-------
 3 files changed, 86 insertions(+), 15 deletions(-)

Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL:
these are all saturating doubling long multiplies with a possible
accumulate step.
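
As a hedged illustration of "saturating doubling long multiply", here is a single-lane plain-C model for signed 16-bit inputs. The function name and the sat out-parameter (standing in for the cumulative saturation flag) are assumptions of this sketch, not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of one lane of VQDMULL.S16: compute 2 * n * m at
 * double width and saturate to the signed 32-bit range.
 */
static int32_t qdmull_s16_model(int16_t n, int16_t m, int *sat)
{
    int64_t p = 2 * (int64_t)n * m;

    assert(sat != 0);
    /*
     * The only overflowing case is n == m == INT16_MIN, which gives
     * 2^31; the most negative product still fits in 32 bits.
     */
    if (p > INT32_MAX) {
        p = INT32_MAX;
        *sat = 1;
    }
    return (int32_t)p;
}
```

VQDMLAL and VQDMLSL then add or subtract this doubled product into the destination, again with saturation.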

These are the last insns in the group which use the
pass-over-each-element loop, so we can delete that code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  6 +++
 target/arm/translate-neon.inc.c | 82 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 59 ++----------------------
 3 files changed, 92 insertions(+), 55 deletions(-)

Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last
insn in this group to be converted.
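
For reference, a polynomial (carry-less) multiply replaces the additions of an ordinary multiply with XOR. A minimal plain-C model of the 8x8->16 case over eight lanes follows; the function name is hypothetical and this is not the helper QEMU uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of VMULL.P8: each pair of 8-bit inputs is treated
 * as a polynomial over GF(2) and multiplied into a 16-bit result.
 */
static void vmull_p8_model(uint16_t dst[8], const uint8_t n[8],
                           const uint8_t m[8])
{
    assert(dst && n && m);
    for (int i = 0; i < 8; i++) {
        uint16_t r = 0;

        for (int bit = 0; bit < 8; bit++) {
            if (m[i] & (1u << bit)) {
                /* XOR in a shifted copy instead of adding it */
                r ^= (uint16_t)((uint16_t)n[i] << bit);
            }
        }
        dst[i] = r;
    }
}
```

For example, 3 * 3 is (x + 1)^2 = x^2 + 1, so the result is 5 rather than 9.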

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 43 +++++++++++++++++++++++
 target/arm/translate.c          | 60 ++-------------------------------
 3 files changed, 48 insertions(+), 57 deletions(-)

Mark the arrays of function pointers in trans_VSHLL_S_2sh() and
trans_VSHLL_U_2sh() as both 'static' and 'const'.
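
The pattern in question, sketched in standalone C with hypothetical stand-in functions (the real table entries are TCG generator functions operating on packed lanes): a 'static const' table is initialised once and can live in read-only data, whereas a plain automatic array is rebuilt on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t WidenFn(uint32_t);

/* Hypothetical stand-ins for the per-size widening functions */
static uint64_t widen_u8(uint32_t x)  { return x & 0xff; }
static uint64_t widen_u16(uint32_t x) { return x & 0xffff; }
static uint64_t widen_u32(uint32_t x) { return x; }

/* Returns false for size == 3, which has no entry in the table */
static bool widen_by_size(unsigned size, uint32_t in, uint64_t *out)
{
    /* static: built once; const: the pointers themselves are read-only */
    static WidenFn * const table[] = {
        widen_u8,
        widen_u16,
        widen_u32,
        NULL,
    };

    assert(size < 4 && out != NULL);
    if (!table[size]) {
        return false;
    }
    *out = table[size](in);
    return true;
}
```

The NULL entry plays the same role as in the trans functions, where size == 3 must be rejected.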

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-neon.inc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
 
 static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
 {
-    NeonGenWidenFn *widenfn[] = {
+    static NeonGenWidenFn * const widenfn[] = {
         gen_helper_neon_widen_s8,
         gen_helper_neon_widen_s16,
         tcg_gen_ext_i32_i64,
@@ -XXX,XX +XXX,XX @@ static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
 
 static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
 {
-    NeonGenWidenFn *widenfn[] = {
+    static NeonGenWidenFn * const widenfn[] = {
         gen_helper_neon_widen_u8,
         gen_helper_neon_widen_u16,
         tcg_gen_extu_i32_i64,
-- 
2.20.1

Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
scalar" group to decodetree.  These are 32x32->32 operations where
one of the inputs is the scalar, followed by a possible accumulate
operation of the 32-bit result.

The refactoring removes some of the oddities of the old decoder:
 * operands to the operation and accumulation were often
   reversed (taking advantage of the fact that most of these ops
   are commutative); the new code follows the pseudocode order
 * the Q bit in the insn was in a local variable 'u'; in the
   new code it is decoded into a->q

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  15 ++++
 target/arm/translate-neon.inc.c | 133 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  77 ++----------------
 3 files changed, 154 insertions(+), 71 deletions(-)
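
A rough scalar model of the integer VMLA (by scalar) behaviour described
above, for 32-bit elements, might look like this. The function name and
the array-based interface are hypothetical; the sketch only shows the
order of operations (multiply the element by the scalar, then accumulate
into the destination) and the 32x32->32 wrapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * For each element: vd[i] = vd[i] + (vn[i] * scalar), with the product
 * and the sum both truncated to 32 bits.
 */
static void model_vmla_scalar_u32(uint32_t *vd, const uint32_t *vn,
                                  uint32_t scalar, size_t nelem)
{
    assert(vd && vn);
    for (size_t i = 0; i < nelem; i++) {
        uint32_t prod = vn[i] * scalar;   /* wraps modulo 2^32 */
        vd[i] = vd[i] + prod;             /* accumulate step */
    }
}
```

VMLS would subtract the product instead, and VMUL would store the product
without reading the old destination value.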

Convert the float versions of VMLA, VMLS and VMUL in the Neon
2-reg-scalar group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
As noted in the comment on the WRAP_FP_FN macro, we could have
had a do_2scalar_fp() function, but for 3 insns it seemed
simpler to just do the wrapping to get hold of the fpstatus ptr.
(These are the only fp insns in the group.)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 65 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 37 ++-----------------
 3 files changed, 71 insertions(+), 34 deletions(-)

Convert the VQDMULH and VQRDMULH insns in the 2-reg-scalar group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 29 +++++++++++++++++++++++
 target/arm/translate.c          | 42 ++-------------------------------
 3 files changed, 34 insertions(+), 40 deletions(-)

Convert the VQRDMLAH and VQRDMLSH insns in the 2-reg-scalar
group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 74 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 38 +----------------
 3 files changed, 79 insertions(+), 36 deletions(-)

Convert the Neon 2-reg-scalar long multiplies to decodetree.
These are the last instructions in the group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  18 ++++
 target/arm/translate-neon.inc.c | 163 ++++++++++++++++++++++++++++
 target/arm/translate.c          | 182 ++------------------------------
 3 files changed, 187 insertions(+), 176 deletions(-)
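
The trans functions in this patch pick their operation from a table
indexed by the size field, with NULL marking sizes that must UNDEF. The
following standalone sketch shows that pattern for an unsigned widening
multiply. All names are hypothetical, and the packed two-lane layout
assumed for the 16-bit case is an assumption about how such a helper
would behave, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t WidenMulFn(uint32_t elem, uint32_t scalar);

/* Two 16-bit lanes per 32-bit input, two 32-bit products per result. */
static uint64_t model_mull_u16(uint32_t a, uint32_t b)
{
    uint64_t lo = (uint64_t)(a & 0xffff) * (b & 0xffff);
    uint64_t hi = (uint64_t)(a >> 16) * (b >> 16);
    return lo | (hi << 32);
}

/* One 32-bit lane, one 64-bit product. */
static uint64_t model_mull_u32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Indexed by size; NULL means "no such encoding". */
static WidenMulFn * const mull_u_table[4] = {
    NULL, model_mull_u16, model_mull_u32, NULL,
};

/* Returns false (the analogue of UNDEF) when the size has no handler. */
static bool model_vmull_u_2sc(unsigned size, uint32_t elem,
                              uint32_t scalar, uint64_t *out)
{
    assert(size < 4);
    WidenMulFn *fn = mull_u_table[size];
    if (!fn) {
        return false;
    }
    *out = fn(elem, scalar);
    return true;
}
```

Keeping the table 'static const' means it is built once at compile time
rather than being initialised on the stack for every translated insn.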

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     @2scalar     .... ... q:1 . . size:2 .... .... .... . . . . .... \
                  &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp
+    # For the 'long' ops the Q bit is part of insn decode
+    @2scalar_q0  .... ... . . . size:2 .... .... .... . . . . .... \
+                 &2scalar vm=%vm_dp vn=%vn_dp vd=%vd_dp q=0
 
     VMLA_2sc     1111 001 . 1 . .. .... .... 0000 . 1 . 0 .... @2scalar
     VMLA_F_2sc   1111 001 . 1 . .. .... .... 0001 . 1 . 0 .... @2scalar
 
+    VMLAL_S_2sc  1111 001 0 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
+    VMLAL_U_2sc  1111 001 1 1 . .. .... .... 0010 . 1 . 0 .... @2scalar_q0
+
+    VQDMLAL_2sc  1111 001 0 1 . .. .... .... 0011 . 1 . 0 .... @2scalar_q0
+
     VMLS_2sc     1111 001 . 1 . .. .... .... 0100 . 1 . 0 .... @2scalar
     VMLS_F_2sc   1111 001 . 1 . .. .... .... 0101 . 1 . 0 .... @2scalar
 
+    VMLSL_S_2sc  1111 001 0 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
+    VMLSL_U_2sc  1111 001 1 1 . .. .... .... 0110 . 1 . 0 .... @2scalar_q0
+
+    VQDMLSL_2sc  1111 001 0 1 . .. .... .... 0111 . 1 . 0 .... @2scalar_q0
+
     VMUL_2sc     1111 001 . 1 . .. .... .... 1000 . 1 . 0 .... @2scalar
     VMUL_F_2sc   1111 001 . 1 . .. .... .... 1001 . 1 . 0 .... @2scalar
 
+    VMULL_S_2sc  1111 001 0 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
+    VMULL_U_2sc  1111 001 1 1 . .. .... .... 1010 . 1 . 0 .... @2scalar_q0
+
+    VQDMULL_2sc  1111 001 0 1 . .. .... .... 1011 . 1 . 0 .... @2scalar_q0
+
     VQDMULH_2sc  1111 001 . 1 . .. .... .... 1100 . 1 . 0 .... @2scalar
     VQRDMULH_2sc 1111 001 . 1 . .. .... .... 1101 . 1 . 0 .... @2scalar
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VQRDMLSH_2sc(DisasContext *s, arg_2scalar *a)
     };
     return do_vqrdmlah_2sc(s, a, opfn[a->size]);
 }
+
+static bool do_2scalar_long(DisasContext *s, arg_2scalar *a,
+                            NeonGenTwoOpWidenFn *opfn,
+                            NeonGenTwo64OpFn *accfn)
+{
+    /*
+     * Two registers and a scalar, long operations: perform an
+     * operation on the input elements and the scalar which produces
+     * a double-width result, and then possibly perform an accumulation
+     * operation of that result into the destination.
+     */
+    TCGv_i32 scalar, rn;
+    TCGv_i64 rn0_64, rn1_64;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!opfn) {
+        /* Bad size (including size == 3, which is a different insn group) */
+        return false;
+    }
+
+    if (a->vd & 1) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    scalar = neon_get_scalar(a->size, a->vm);
+
+    /* Load all inputs before writing any outputs, in case of overlap */
+    rn = neon_load_reg(a->vn, 0);
+    rn0_64 = tcg_temp_new_i64();
+    opfn(rn0_64, rn, scalar);
+    tcg_temp_free_i32(rn);
+
+    rn = neon_load_reg(a->vn, 1);
+    rn1_64 = tcg_temp_new_i64();
+    opfn(rn1_64, rn, scalar);
+    tcg_temp_free_i32(rn);
+    tcg_temp_free_i32(scalar);
+
+    if (accfn) {
+        TCGv_i64 t64 = tcg_temp_new_i64();
+        neon_load_reg64(t64, a->vd);
+        accfn(t64, t64, rn0_64);
+        neon_store_reg64(t64, a->vd);
+        neon_load_reg64(t64, a->vd + 1);
+        accfn(t64, t64, rn1_64);
+        neon_store_reg64(t64, a->vd + 1);
+        tcg_temp_free_i64(t64);
+    } else {
+        neon_store_reg64(rn0_64, a->vd);
+        neon_store_reg64(rn1_64, a->vd + 1);
+    }
+    tcg_temp_free_i64(rn0_64);
+    tcg_temp_free_i64(rn1_64);
+    return true;
+}
+
+static bool trans_VMULL_S_2sc(DisasContext *s, arg_2scalar *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        NULL,
+        gen_helper_neon_mull_s16,
+        gen_mull_s32,
+        NULL,
+    };
+
+    return do_2scalar_long(s, a, opfn[a->size], NULL);
+}
+
+static bool trans_VMULL_U_2sc(DisasContext *s, arg_2scalar *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        NULL,
+        gen_helper_neon_mull_u16,
+        gen_mull_u32,
+        NULL,
+    };
+
+    return do_2scalar_long(s, a, opfn[a->size], NULL);
+}
+
+#define DO_VMLAL_2SC(INSN, MULL, ACC)                                   \
+    static bool trans_##INSN##_2sc(DisasContext *s, arg_2scalar *a)     \
+    {                                                                   \
+        static NeonGenTwoOpWidenFn * const opfn[] = {                   \
+            NULL,                                                       \
+            gen_helper_neon_##MULL##16,                                 \
+            gen_##MULL##32,                                             \
+            NULL,                                                       \
+        };                                                              \
+        static NeonGenTwo64OpFn * const accfn[] = {                     \
+            NULL,                                                       \
+            gen_helper_neon_##ACC##l_u32,                               \
+            tcg_gen_##ACC##_i64,                                        \
+            NULL,                                                       \
+        };                                                              \
+        return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);    \
+    }
+
+DO_VMLAL_2SC(VMLAL_S, mull_s, add)
+DO_VMLAL_2SC(VMLAL_U, mull_u, add)
+DO_VMLAL_2SC(VMLSL_S, mull_s, sub)
+DO_VMLAL_2SC(VMLSL_U, mull_u, sub)
+
+static bool trans_VQDMULL_2sc(DisasContext *s, arg_2scalar *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        NULL,
+        gen_VQDMULL_16,
+        gen_VQDMULL_32,
+        NULL,
+    };
+
+    return do_2scalar_long(s, a, opfn[a->size], NULL);
+}
+
+static bool trans_VQDMLAL_2sc(DisasContext *s, arg_2scalar *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        NULL,
+        gen_VQDMULL_16,
+        gen_VQDMULL_32,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const accfn[] = {
+        NULL,
+        gen_VQDMLAL_acc_16,
+        gen_VQDMLAL_acc_32,
+        NULL,
+    };
+
+    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
+}
+
+static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
+{
+    static NeonGenTwoOpWidenFn * const opfn[] = {
+        NULL,
+        gen_VQDMULL_16,
+        gen_VQDMULL_32,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const accfn[] = {
+        NULL,
+        gen_VQDMLSL_acc_16,
+        gen_VQDMLSL_acc_32,
+        NULL,
+    };
+
+    return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
     tcg_gen_ext16s_i32(dest, var);
 }
 
-/* 32x32->64 multiply.  Marks inputs as dead.  */
-static TCGv_i64 gen_mulu_i64_i32(TCGv_i32 a, TCGv_i32 b)
-{
-    TCGv_i32 lo = tcg_temp_new_i32();
-    TCGv_i32 hi = tcg_temp_new_i32();
-    TCGv_i64 ret;
-
-    tcg_gen_mulu2_i32(lo, hi, a, b);
-    tcg_temp_free_i32(a);
-    tcg_temp_free_i32(b);
-
-    ret = tcg_temp_new_i64();
-    tcg_gen_concat_i32_i64(ret, lo, hi);
-    tcg_temp_free_i32(lo);
-    tcg_temp_free_i32(hi);
-
-    return ret;
-}
-
-static TCGv_i64 gen_muls_i64_i32(TCGv_i32 a, TCGv_i32 b)
-{
-    TCGv_i32 lo = tcg_temp_new_i32();
-    TCGv_i32 hi = tcg_temp_new_i32();
-    TCGv_i64 ret;
-
-    tcg_gen_muls2_i32(lo, hi, a, b);
-    tcg_temp_free_i32(a);
-    tcg_temp_free_i32(b);
-
-    ret = tcg_temp_new_i64();
-    tcg_gen_concat_i32_i64(ret, lo, hi);
-    tcg_temp_free_i32(lo);
-    tcg_temp_free_i32(hi);
-
-    return ret;
-}
-
 /* Swap low and high halfwords.  */
 static void gen_swap_half(TCGv_i32 var)
 {
@@ -XXX,XX +XXX,XX @@ static inline void gen_neon_addl(int size)
     }
 }
 
-static inline void gen_neon_negl(TCGv_i64 var, int size)
-{
-    switch (size) {
-    case 0: gen_helper_neon_negl_u16(var, var); break;
-    case 1: gen_helper_neon_negl_u32(var, var); break;
-    case 2:
-        tcg_gen_neg_i64(var, var);
-        break;
-    default: abort();
-    }
-}
-
-static inline void gen_neon_addl_saturate(TCGv_i64 op0, TCGv_i64 op1, int size)
-{
-    switch (size) {
-    case 1: gen_helper_neon_addl_saturate_s32(op0, cpu_env, op0, op1); break;
-    case 2: gen_helper_neon_addl_saturate_s64(op0, cpu_env, op0, op1); break;
-    default: abort();
-    }
-}
-
-static inline void gen_neon_mull(TCGv_i64 dest, TCGv_i32 a, TCGv_i32 b,
-                                 int size, int u)
-{
-    TCGv_i64 tmp;
-
-    switch ((size << 1) | u) {
-    case 0: gen_helper_neon_mull_s8(dest, a, b); break;
-    case 1: gen_helper_neon_mull_u8(dest, a, b); break;
-    case 2: gen_helper_neon_mull_s16(dest, a, b); break;
-    case 3: gen_helper_neon_mull_u16(dest, a, b); break;
-    case 4:
-        tmp = gen_muls_i64_i32(a, b);
-        tcg_gen_mov_i64(dest, tmp);
-        tcg_temp_free_i64(tmp);
-        break;
-    case 5:
-        tmp = gen_mulu_i64_i32(a, b);
-        tcg_gen_mov_i64(dest, tmp);
-        tcg_temp_free_i64(tmp);
-        break;
-    default: abort();
-    }
-
-    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
-       Don't forget to clean them now.  */
-    if (size < 2) {
-        tcg_temp_free_i32(a);
-        tcg_temp_free_i32(b);
-    }
-}
-
 static void gen_neon_narrow_op(int op, int u, int size,
                                TCGv_i32 dest, TCGv_i64 src)
 {
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int u;
     int vec_size;
     uint32_t imm;
-    TCGv_i32 tmp, tmp2, tmp3, tmp4, tmp5;
+    TCGv_i32 tmp, tmp2, tmp3, tmp5;
     TCGv_ptr ptr1;
     TCGv_i64 tmp64;
 
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
         return 1;
     } else { /* (insn & 0x00800010 == 0x00800000) */
         if (size != 3) {
-            op = (insn >> 8) & 0xf;
-            if ((insn & (1 << 6)) == 0) {
-                /* Three registers of different lengths: handled by decodetree */
-                return 1;
-            } else {
-                /* Two registers and a scalar. NB that for ops of this form
-                 * the ARM ARM labels bit 24 as Q, but it is in our variable
-                 * 'u', not 'q'.
-                 */
-                if (size == 0) {
-                    return 1;
-                }
-                switch (op) {
-                case 0: /* Integer VMLA scalar */
-                case 4: /* Integer VMLS scalar */
-                case 8: /* Integer VMUL scalar */
-                case 1: /* Float VMLA scalar */
-                case 5: /* Floating point VMLS scalar */
-                case 9: /* Floating point VMUL scalar */
-                case 12: /* VQDMULH scalar */
-                case 13: /* VQRDMULH scalar */
-                case 14: /* VQRDMLAH scalar */
-                case 15: /* VQRDMLSH scalar */
-                    return 1; /* handled by decodetree */
-
-                case 3: /* VQDMLAL scalar */
-                case 7: /* VQDMLSL scalar */
-                case 11: /* VQDMULL scalar */
-                    if (u == 1) {
-                        return 1;
-                    }
-                    /* fall through */
-                case 2: /* VMLAL sclar */
-                case 6: /* VMLSL scalar */
-                case 10: /* VMULL scalar */
-                    if (rd & 1) {
-                        return 1;
-                    }
-                    tmp2 = neon_get_scalar(size, rm);
-                    /* We need a copy of tmp2 because gen_neon_mull
-                     * deletes it during pass 0.  */
-                    tmp4 = tcg_temp_new_i32();
-                    tcg_gen_mov_i32(tmp4, tmp2);
-                    tmp3 = neon_load_reg(rn, 1);
-
-                    for (pass = 0; pass < 2; pass++) {
-                        if (pass == 0) {
-                            tmp = neon_load_reg(rn, 0);
-                        } else {
-                            tmp = tmp3;
-                            tmp2 = tmp4;
-                        }
-                        gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        if (op != 11) {
-                            neon_load_reg64(cpu_V1, rd + pass);
-                        }
-                        switch (op) {
-                        case 6:
-                            gen_neon_negl(cpu_V0, size);
-                            /* Fall through */
-                        case 2:
-                            gen_neon_addl(size);
-                            break;
-                        case 3: case 7:
-                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
-                            if (op == 7) {
-                                gen_neon_negl(cpu_V0, size);
-                            }
-                            gen_neon_addl_saturate(cpu_V0, cpu_V1, size);
-                            break;
-                        case 10:
-                            /* no-op */
-                            break;
-                        case 11:
-                            gen_neon_addl_saturate(cpu_V0, cpu_V0, size);
-                            break;
-                        default:
-                            abort();
-                        }
-                        neon_store_reg64(cpu_V0, rd + pass);
-                    }
-                    break;
-                default:
-                    g_assert_not_reached();
-                }
-            }
+            /*
+             * Three registers of different lengths, or two registers and
+             * a scalar: handled by decodetree
+             */
+            return 1;
         } else { /* size == 3 */
             if (!u) {
                 /* Extract.  */
-- 
2.20.1

Convert the Neon VEXT insn to decodetree. Rather than keeping the
old implementation which used fixed temporaries cpu_V0 and cpu_V1
and did the extraction with by-hand shift and logic ops, we use
the TCG extract2 insn.

We don't need to special case 0 or 8 immediates any more as the
optimizer is smart enough to throw away the dead code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  8 +++-
 target/arm/translate-neon.inc.c | 76 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 58 +------------------------
 3 files changed, 85 insertions(+), 57 deletions(-)
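
For readers unfamiliar with extract2, the sketch below models what the
operation computes on plain integers and how the 64-bit (Q == 0) form of
VEXT maps onto it. The function names are hypothetical. Unlike the TCG
op, the C model has to special-case an offset of 0, because shifting a
64-bit value by 64 is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 64 bits taken from the 128-bit value hi:lo, starting at bit
 * offset 'ofs' (0..63) counted from the least significant bit of lo.
 */
static uint64_t model_extract2_u64(uint64_t lo, uint64_t hi, unsigned ofs)
{
    assert(ofs < 64);
    if (ofs == 0) {
        return lo;   /* avoid an undefined shift by 64 below */
    }
    return (lo >> ofs) | (hi << (64 - ofs));
}

/* VEXT.8 Dd, Dn, Dm, #imm: take 8 bytes from Vm:Vn starting at byte imm. */
static uint64_t model_vext_d(uint64_t vn, uint64_t vm, unsigned imm)
{
    assert(imm < 8);
    return model_extract2_u64(vn, vm, imm * 8);
}
```

The 128-bit (Q == 1) form is the same idea applied twice, sliding a
64-bit window across three adjacent 64-bit inputs.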

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 # return false for size==3.
 ######################################################################
 {
-  # 0b11 subgroup will go here
+  [
+    ##################################################################
+    # Miscellaneous size=0b11 insns
+    ##################################################################
+    VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
+  ]
 
   # Subgroup for size != 0b11
   [
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VQDMLSL_2sc(DisasContext *s, arg_2scalar *a)
 
     return do_2scalar_long(s, a, opfn[a->size], accfn[a->size]);
 }
+
+static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vn | a->vm | a->vd) & a->q) {
+        return false;
+    }
+
+    if (a->imm > 7 && !a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (!a->q) {
+        /* Extract 64 bits from <Vm:Vn> */
+        TCGv_i64 left, right, dest;
+
+        left = tcg_temp_new_i64();
+        right = tcg_temp_new_i64();
+        dest = tcg_temp_new_i64();
+
+        neon_load_reg64(right, a->vn);
+        neon_load_reg64(left, a->vm);
+        tcg_gen_extract2_i64(dest, right, left, a->imm * 8);
+        neon_store_reg64(dest, a->vd);
+
+        tcg_temp_free_i64(left);
+        tcg_temp_free_i64(right);
+        tcg_temp_free_i64(dest);
+    } else {
+        /* Extract 128 bits from <Vm+1:Vm:Vn+1:Vn> */
+        TCGv_i64 left, middle, right, destleft, destright;
+
+        left = tcg_temp_new_i64();
+        middle = tcg_temp_new_i64();
+        right = tcg_temp_new_i64();
+        destleft = tcg_temp_new_i64();
+        destright = tcg_temp_new_i64();
+
+        if (a->imm < 8) {
+            neon_load_reg64(right, a->vn);
+            neon_load_reg64(middle, a->vn + 1);
+            tcg_gen_extract2_i64(destright, right, middle, a->imm * 8);
+            neon_load_reg64(left, a->vm);
+            tcg_gen_extract2_i64(destleft, middle, left, a->imm * 8);
+        } else {
+            neon_load_reg64(right, a->vn + 1);
+            neon_load_reg64(middle, a->vm);
+            tcg_gen_extract2_i64(destright, right, middle, (a->imm - 8) * 8);
+            neon_load_reg64(left, a->vm + 1);
+            tcg_gen_extract2_i64(destleft, middle, left, (a->imm - 8) * 8);
+        }
+
+        neon_store_reg64(destright, a->vd);
+        neon_store_reg64(destleft, a->vd + 1);
+
+        tcg_temp_free_i64(destright);
+        tcg_temp_free_i64(destleft);
+        tcg_temp_free_i64(right);
+        tcg_temp_free_i64(middle);
+        tcg_temp_free_i64(left);
+    }
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int pass;
     int u;
     int vec_size;
-    uint32_t imm;
     TCGv_i32 tmp, tmp2, tmp3, tmp5;
     TCGv_ptr ptr1;
-    TCGv_i64 tmp64;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
             return 1;
         } else { /* size == 3 */
             if (!u) {
-                /* Extract.  */
-                imm = (insn >> 8) & 0xf;
-
-                if (imm > 7 && !q)
-                    return 1;
-
-                if (q && ((rd | rn | rm) & 1)) {
-                    return 1;
-                }
-
-                if (imm == 0) {
-                    neon_load_reg64(cpu_V0, rn);
-                    if (q) {
-                        neon_load_reg64(cpu_V1, rn + 1);
-                    }
-                } else if (imm == 8) {
-                    neon_load_reg64(cpu_V0, rn + 1);
-                    if (q) {
-                        neon_load_reg64(cpu_V1, rm);
-                    }
-                } else if (q) {
-                    tmp64 = tcg_temp_new_i64();
-                    if (imm < 8) {
-                        neon_load_reg64(cpu_V0, rn);
-                        neon_load_reg64(tmp64, rn + 1);
-                    } else {
-                        neon_load_reg64(cpu_V0, rn + 1);
-                        neon_load_reg64(tmp64, rm);
-                    }
-                    tcg_gen_shri_i64(cpu_V0, cpu_V0, (imm & 7) * 8);
-                    tcg_gen_shli_i64(cpu_V1, tmp64, 64 - ((imm & 7) * 8));
-                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
-                    if (imm < 8) {
-                        neon_load_reg64(cpu_V1, rm);
-                    } else {
-                        neon_load_reg64(cpu_V1, rm + 1);
-                        imm -= 8;
-                    }
-                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
-                    tcg_gen_shri_i64(tmp64, tmp64, imm * 8);
-                    tcg_gen_or_i64(cpu_V1, cpu_V1, tmp64);
-                    tcg_temp_free_i64(tmp64);
-                } else {
-                    /* BUGFIX */
-                    neon_load_reg64(cpu_V0, rn);
-                    tcg_gen_shri_i64(cpu_V0, cpu_V0, imm * 8);
-                    neon_load_reg64(cpu_V1, rm);
-                    tcg_gen_shli_i64(cpu_V1, cpu_V1, 64 - (imm * 8));
-                    tcg_gen_or_i64(cpu_V0, cpu_V0, cpu_V1);
-                }
-                neon_store_reg64(cpu_V0, rd);
-                if (q) {
-                    neon_store_reg64(cpu_V1, rd + 1);
-                }
+                /* Extract: handled by decodetree */
+                return 1;
             } else if ((insn & (1 << 11)) == 0) {
                 /* Two register misc.  */
                 op = ((insn >> 12) & 0x30) | ((insn >> 7) & 0xf);
-- 
2.20.1

Convert the Neon VTBL, VTBX instructions to decodetree.  The actual
implementation of the insn is copied across to the new trans function
unchanged except for renaming 'tmp5' to 'tmp4'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 56 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 41 +++---------------------
 3 files changed, 63 insertions(+), 37 deletions(-)
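
Since the implementation is moved unchanged, it may help to recall what
the insn computes. The following is a byte-level reference model of
VTBL/VTBX written for illustration only; the name model_vtbl and its
interface are hypothetical and it does not reflect how the existing
helper is structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * For each of the 8 index bytes: if the index is inside the table, the
 * result byte is the table entry; otherwise it is 0 for VTBL, or the
 * old destination byte for VTBX. table_len is 8, 16, 24 or 32.
 */
static void model_vtbl(uint8_t dest[8], const uint8_t idx[8],
                       const uint8_t *table, size_t table_len, bool is_vtbx)
{
    uint8_t result[8];

    assert(table_len >= 8 && table_len <= 32 && table_len % 8 == 0);
    /* Build the result in a temporary in case dest overlaps an input. */
    for (size_t i = 0; i < 8; i++) {
        if (idx[i] < table_len) {
            result[i] = table[idx[i]];
        } else {
            result[i] = is_vtbx ? dest[i] : 0;
        }
    }
    for (size_t i = 0; i < 8; i++) {
        dest[i] = result[i];
    }
}
```

In the patch, the 'op' field selects between the two behaviours and
'len' + 1 gives the number of table registers, i.e. table_len / 8 here.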

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -XXX,XX +XXX,XX @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     ##################################################################
     VEXT         1111 001 0 1 . 11 .... .... imm:4 . q:1 . 0 .... \
                  vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
+    VTBL         1111 001 1 1 . 11 .... .... 10 len:2 . op:1 . 0 .... \
+                 vm=%vm_dp vn=%vn_dp vd=%vd_dp
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -XXX,XX +XXX,XX @@ static bool trans_VEXT(DisasContext *s, arg_VEXT *a)
     }
     return true;
 }
+
+static bool trans_VTBL(DisasContext *s, arg_VTBL *a)
+{
+    int n;
+    TCGv_i32 tmp, tmp2, tmp3, tmp4;
+    TCGv_ptr ptr1;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    n = a->len + 1;
+    if ((a->vn + n) > 32) {
+        /*
+         * This is UNPREDICTABLE; we choose to UNDEF to avoid the
+         * helper function running off the end of the register file.
+         */
+        return false;
+    }
+    n <<= 3;
+    if (a->op) {
+        tmp = neon_load_reg(a->vd, 0);
+    } else {
+        tmp = tcg_temp_new_i32();
+        tcg_gen_movi_i32(tmp, 0);
+    }
+    tmp2 = neon_load_reg(a->vm, 0);
+    ptr1 = vfp_reg_ptr(true, a->vn);
+    tmp4 = tcg_const_i32(n);
+    gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp4);
+    tcg_temp_free_i32(tmp);
+    if (a->op) {
+        tmp = neon_load_reg(a->vd, 1);
+    } else {
+        tmp = tcg_temp_new_i32();
+        tcg_gen_movi_i32(tmp, 0);
+    }
+    tmp3 = neon_load_reg(a->vm, 1);
+    gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp4);
+    tcg_temp_free_i32(tmp4);
+    tcg_temp_free_ptr(ptr1);
+    neon_store_reg(a->vd, 0, tmp2);
+    neon_store_reg(a->vd, 1, tmp3);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
 {
     int op;
     int q;
-    int rd, rn, rm, rd_ofs, rm_ofs;
+    int rd, rm, rd_ofs, rm_ofs;
     int size;
     int pass;
     int u;
     int vec_size;
-    TCGv_i32 tmp, tmp2, tmp3, tmp5;
-    TCGv_ptr ptr1;
+    TCGv_i32 tmp, tmp2, tmp3;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return 1;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     q = (insn & (1 << 6)) != 0;
     u = (insn >> 24) & 1;
     VFP_DREG_D(rd, insn);
-    VFP_DREG_N(rn, insn);
     VFP_DREG_M(rm, insn);
     size = (insn >> 20) & 3;
     vec_size = q ? 16 : 8;
@@ -XXX,XX +XXX,XX @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     break;
                 }
             } else if ((insn & (1 << 10)) == 0) {
-                /* VTBL, VTBX.  */
-                int n = ((insn >> 8) & 3) + 1;
-                if ((rn + n) > 32) {
-                    /* This is UNPREDICTABLE; we choose to UNDEF to avoid the
-                     * helper function running off the end of the register file.
-                     */
-                    return 1;
-                }
-                n <<= 3;
-                if (insn & (1 << 6)) {
-                    tmp = neon_load_reg(rd, 0);
-                } else {
-                    tmp = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(tmp, 0);
-                }
-                tmp2 = neon_load_reg(rm, 0);
-                ptr1 = vfp_reg_ptr(true, rn);
-                tmp5 = tcg_const_i32(n);
-                gen_helper_neon_tbl(tmp2, tmp2, tmp, ptr1, tmp5);
-                tcg_temp_free_i32(tmp);
-                if (insn & (1 << 6)) {
-                    tmp = neon_load_reg(rd, 1);
-                } else {
-                    tmp = tcg_temp_new_i32();
-                    tcg_gen_movi_i32(tmp, 0);
-                }
-                tmp3 = neon_load_reg(rm, 1);
-                gen_helper_neon_tbl(tmp3, tmp3, tmp, ptr1, tmp5);
-                tcg_temp_free_i32(tmp5);
-                tcg_temp_free_ptr(ptr1);
-                neon_store_reg(rd, 0, tmp2);
-                neon_store_reg(rd, 1, tmp3);
-                tcg_temp_free_i32(tmp);
+                /* VTBL, VTBX: handled by decodetree */
+                return 1;
             } else if ((insn & 0x380) == 0) {
                 /* VDUP */
                 int element;
-- 
2.20.1

Convert the Neon VDUP (scalar) insn to decodetree.  (Note that we
can't call this just "VDUP" as we used that already in vfp.decode for
the "VDUP (general purpose register)" insn.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/neon-dp.decode       |  7 +++++++
 target/arm/translate-neon.inc.c | 26 ++++++++++++++++++++++++++
 target/arm/translate.c          | 25 +------------------------
 3 files changed, 34 insertions(+), 24 deletions(-)

From: Jean-Christophe Dubois <jcd@tribudubois.net>

Some bits of the CCM registers are non-writable.

The initial commit did not handle this: all bits of every register were
writable.

This patch adds the code required to protect the non-writable bits.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 20200608133508.550046-1-jcd@tribudubois.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/imx6ul_ccm.c | 76 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 13 deletions(-)
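
The usual way to protect such bits is a per-register mask applied on
every guest write. The sketch below assumes that a set bit in the mask
marks a read-only bit; that polarity is an assumption made for this
example (it would be consistent with a status register having mask
0xffffffff and a fully writable register having mask 0), and the
function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the current register contents with a guest write: bits set
 * in ro_mask keep their old value, all other bits take the new value.
 */
static uint32_t model_masked_write(uint32_t old, uint32_t value,
                                   uint32_t ro_mask)
{
    uint32_t result = (old & ro_mask) | (value & ~ro_mask);

    /* Postcondition: read-only bits never change. */
    assert(((result ^ old) & ro_mask) == 0);
    return result;
}
```

With ro_mask == 0 the write goes through unmodified, and with
ro_mask == 0xffffffff it is ignored entirely.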

diff --git a/hw/misc/imx6ul_ccm.c b/hw/misc/imx6ul_ccm.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/imx6ul_ccm.c
+++ b/hw/misc/imx6ul_ccm.c
@@ -XXX,XX +XXX,XX @@
 
 #include "trace.h"
 
+static const uint32_t ccm_mask[CCM_MAX] = {
+    [CCM_CCR] = 0xf01fef80,
+    [CCM_CCDR] = 0xfffeffff,
+    [CCM_CSR] = 0xffffffff,
+    [CCM_CCSR] = 0xfffffef2,
+    [CCM_CACRR] = 0xfffffff8,
+    [CCM_CBCDR] = 0xc1f8e000,
+    [CCM_CBCMR] = 0xfc03cfff,
+    [CCM_CSCMR1] = 0x80700000,
+    [CCM_CSCMR2] = 0xe01ff003,
+    [CCM_CSCDR1] = 0xfe00c780,
+    [CCM_CS1CDR] = 0xfe00fe00,
+    [CCM_CS2CDR] = 0xf8007000,
+    [CCM_CDCDR] = 0xf00fffff,
+    [CCM_CHSCCDR] = 0xfffc01ff,
+    [CCM_CSCDR2] = 0xfe0001ff,
+    [CCM_CSCDR3] = 0xffffc1ff,
+    [CCM_CDHIPR] = 0xffffffff,
+    [CCM_CTOR] = 0x00000000,
+    [CCM_CLPCR] = 0xf39ff01c,
+    [CCM_CISR] = 0xfb85ffbe,
+    [CCM_CIMR] = 0xfb85ffbf,
+    [CCM_CCOSR] = 0xfe00fe00,
+    [CCM_CGPR] = 0xfffc3fea,
+    [CCM_CCGR0] = 0x00000000,
+    [CCM_CCGR1] = 0x00000000,
+    [CCM_CCGR2] = 0x00000000,
+    [CCM_CCGR3] = 0x00000000,
+    [CCM_CCGR4] = 0x00000000,
+    [CCM_CCGR5] = 0x00000000,
+    [CCM_CCGR6] = 0x00000000,
+    [CCM_CMEOR] = 0xafffff1f,
+};
+
+static const uint32_t analog_mask[CCM_ANALOG_MAX] = {
+    [CCM_ANALOG_PLL_ARM] = 0xfff60f80,
+    [CCM_ANALOG_PLL_USB1] = 0xfffe0fbc,
+    [CCM_ANALOG_PLL_USB2] = 0xfffe0fbc,
+    [CCM_ANALOG_PLL_SYS] = 0xfffa0ffe,
+    [CCM_ANALOG_PLL_SYS_SS] = 0x00000000,
+    [CCM_ANALOG_PLL_SYS_NUM] = 0xc0000000,
+    [CCM_ANALOG_PLL_SYS_DENOM] = 0xc0000000,
+    [CCM_ANALOG_PLL_AUDIO] = 0xffe20f80,
+    [CCM_ANALOG_PLL_AUDIO_NUM] = 0xc0000000,
+    [CCM_ANALOG_PLL_AUDIO_DENOM] = 0xc0000000,
+    [CCM_ANALOG_PLL_VIDEO] = 0xffe20f80,
+    [CCM_ANALOG_PLL_VIDEO_NUM] = 0xc0000000,
+    [CCM_ANALOG_PLL_VIDEO_DENOM] = 0xc0000000,
+    [CCM_ANALOG_PLL_ENET] = 0xffc20ff0,
+    [CCM_ANALOG_PFD_480] = 0x40404040,
+    [CCM_ANALOG_PFD_528] = 0x40404040,
+    [PMU_MISC0] = 0x01fe8306,
+    [PMU_MISC1] = 0x07fcede0,
+    [PMU_MISC2] = 0x005f5f5f,
+};
+
 static const char *imx6ul_ccm_reg_name(uint32_t reg)
 {
     static char unknown[20];
@@ -XXX,XX +XXX,XX @@ static void imx6ul_ccm_write(void *opaque, hwaddr offset, uint64_t value,
 
     trace_ccm_write_reg(imx6ul_ccm_reg_name(index), (uint32_t)value);
 
-    /*
-     * We will do a better implementation later. In particular some bits
-     * cannot be written to.
-     */
-    s->ccm[index] = (uint32_t)value;
+    s->ccm[index] = (s->ccm[index] & ccm_mask[index]) |
+                           ((uint32_t)value & ~ccm_mask[index]);
 }
 
 static uint64_t imx6ul_analog_read(void *opaque, hwaddr offset, unsigned size)
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
          * the REG_NAME register. So we change the value of the
          * REG_NAME register, setting bits passed in the value.
          */
-        s->analog[index - 1] |= value;
+        s->analog[index - 1] |= (value & ~analog_mask[index - 1]);
         break;
     case CCM_ANALOG_PLL_ARM_CLR:
     case CCM_ANALOG_PLL_USB1_CLR:
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
          * the REG_NAME register. So we change the value of the
          * REG_NAME register, unsetting bits passed in the value.
          */
-        s->analog[index - 2] &= ~value;
+        s->analog[index - 2] &= ~(value & ~analog_mask[index - 2]);
         break;
     case CCM_ANALOG_PLL_ARM_TOG:
     case CCM_ANALOG_PLL_USB1_TOG:
@@ -XXX,XX +XXX,XX @@ static void imx6ul_analog_write(void *opaque, hwaddr offset, uint64_t value,
          * the REG_NAME register. So we change the value of the
          * REG_NAME register, toggling bits passed in the value.
          */
-        s->analog[index - 3] ^= value;
+        s->analog[index - 3] ^= (value & ~analog_mask[index - 3]);
         break;
     default:
-        /*
-         * We will do a better implementation later. In particular some bits
-         * cannot be written to.
-         */
-        s->analog[index] = value;
+        s->analog[index] = (s->analog[index] & analog_mask[index]) |
+                           (value & ~analog_mask[index]);
         break;
     }
 }
-- 
2.20.1
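
The write paths in the patch above all follow one rule: a bit set in the
mask table is read-only and keeps its stored value, and only the remaining
bits take effect. A minimal standalone sketch of that rule follows. The
helper names (masked_write, masked_set, masked_clr, masked_tog) are
hypothetical and do not exist in QEMU; only the bit arithmetic is taken
from the hunks above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not QEMU API. In every function, bits that are
 * set in ro_mask are treated as read-only and are left unchanged.
 */

/* Plain write: keep read-only bits from old, take the rest from value. */
static inline uint32_t masked_write(uint32_t old, uint32_t value,
                                    uint32_t ro_mask)
{
    return (old & ro_mask) | (value & ~ro_mask);
}

/* REG_NAME_SET alias: set only the writable bits passed in value. */
static inline uint32_t masked_set(uint32_t old, uint32_t value,
                                  uint32_t ro_mask)
{
    return old | (value & ~ro_mask);
}

/* REG_NAME_CLR alias: clear only the writable bits passed in value. */
static inline uint32_t masked_clr(uint32_t old, uint32_t value,
                                  uint32_t ro_mask)
{
    return old & ~(value & ~ro_mask);
}

/* REG_NAME_TOG alias: toggle only the writable bits passed in value. */
static inline uint32_t masked_tog(uint32_t old, uint32_t value,
                                  uint32_t ro_mask)
{
    return old ^ (value & ~ro_mask);
}
```

With a mask of 0xffffffff (as for CCM_CSR) every write is ignored; with a
mask of 0x00000000 (as for the CCM_CCGRn registers) every bit is writable.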

From: Erik Smit <erik.lucas.smit@gmail.com>

The hardware supports configurable descriptor sizes, configured in the DBLAC
register.

Most drivers use the default 4 word descriptor, which is currently hardcoded,
but Aspeed SDK configures 8 words to store extra data.

Signed-off-by: Erik Smit <erik.lucas.smit@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[PMM: removed unnecessary parens]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/ftgmac100.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/ftgmac100.c
+++ b/hw/net/ftgmac100.c
@@ -XXX,XX +XXX,XX @@
 #define FTGMAC100_APTC_TXPOLL_CNT(x)        (((x) >> 8) & 0xf)
 #define FTGMAC100_APTC_TXPOLL_TIME_SEL      (1 << 12)
 
+/*
+ * DMA burst length and arbitration control register
+ */
+#define FTGMAC100_DBLAC_RXBURST_SIZE(x)     (((x) >> 8) & 0x3)
+#define FTGMAC100_DBLAC_TXBURST_SIZE(x)     (((x) >> 10) & 0x3)
+#define FTGMAC100_DBLAC_RXDES_SIZE(x)       ((((x) >> 12) & 0xf) * 8)
+#define FTGMAC100_DBLAC_TXDES_SIZE(x)       ((((x) >> 16) & 0xf) * 8)
+#define FTGMAC100_DBLAC_IFG_CNT(x)          (((x) >> 20) & 0x7)
+#define FTGMAC100_DBLAC_IFG_INC             (1 << 23)
+
 /*
  * PHY control register
  */
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_do_tx(FTGMAC100State *s, uint32_t tx_ring,
         if (bd.des0 & s->txdes0_edotr) {
             addr = tx_ring;
         } else {
-            addr += sizeof(FTGMAC100Desc);
+            addr += FTGMAC100_DBLAC_TXDES_SIZE(s->dblac);
         }
     }
 
@@ -XXX,XX +XXX,XX @@ static void ftgmac100_write(void *opaque, hwaddr addr,
         s->phydata = value & 0xffff;
         break;
     case FTGMAC100_DBLAC: /* DMA Burst Length and Arbitration Control */
+        if (FTGMAC100_DBLAC_TXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: transmit descriptor too small : %d bytes\n",
+                          __func__, FTGMAC100_DBLAC_TXDES_SIZE(s->dblac));
+            break;
+        }
+        if (FTGMAC100_DBLAC_RXDES_SIZE(s->dblac) < sizeof(FTGMAC100Desc)) {
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          "%s: receive descriptor too small : %d bytes\n",
+                          __func__, FTGMAC100_DBLAC_RXDES_SIZE(s->dblac));
+            break;
+        }
         s->dblac = value;
         break;
     case FTGMAC100_REVR:  /* Feature Register */
@@ -XXX,XX +XXX,XX @@ static ssize_t ftgmac100_receive(NetClientState *nc, const uint8_t *buf,
         if (bd.des0 & s->rxdes0_edorr) {
             addr = s->rx_ring;
         } else {
-            addr += sizeof(FTGMAC100Desc);
+            addr += FTGMAC100_DBLAC_RXDES_SIZE(s->dblac);
         }
     }
     s->rx_descriptor = addr;
-- 
2.20.1
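
As an illustration of the descriptor-size handling in the patch above, here
is a small standalone sketch. The two size macros copy the field layout
from the hunk (a 4-bit field counted in 8-byte units). MIN_DESC_BYTES,
dblac_value_ok() and next_desc_addr() are hypothetical names introduced for
this example, and the 16-byte minimum is an assumption based on the
"default 4 word descriptor" mentioned in the commit message. Note one
deliberate difference: the sketch validates the incoming value, whereas the
hunk above evaluates the macros on s->dblac, i.e. the value stored before
the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout as in the patch: 4-bit fields, in units of 8 bytes. */
#define DBLAC_RXDES_SIZE(x) ((((x) >> 12) & 0xf) * 8)
#define DBLAC_TXDES_SIZE(x) ((((x) >> 16) & 0xf) * 8)

/* Assumption: the smallest usable descriptor is four 32-bit words. */
#define MIN_DESC_BYTES 16u

/* Hypothetical check: would this DBLAC value give usable descriptors? */
static bool dblac_value_ok(uint32_t value)
{
    return DBLAC_TXDES_SIZE(value) >= MIN_DESC_BYTES &&
           DBLAC_RXDES_SIZE(value) >= MIN_DESC_BYTES;
}

/*
 * Hypothetical ring walk: wrap to the ring base on the end-of-ring flag,
 * otherwise step by the configured descriptor size instead of a
 * hardcoded sizeof().
 */
static uint32_t next_desc_addr(uint32_t addr, uint32_t ring_base,
                               bool end_of_ring, uint32_t desc_bytes)
{
    return end_of_ring ? ring_base : addr + desc_bytes;
}
```

A field value of 2 selects the 4-word (16-byte) layout and a field value of
4 selects the 8-word (32-byte) layout that the commit message attributes to
the Aspeed SDK.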

From: fangying <fangying1@huawei.com>

Virtual time adjustment was implemented for virt-5.0 machine type,
but the cpu property was enabled only for host-passthrough and max
cpu model.  Let's add it for any KVM arm cpu which has the generic
timer feature enabled.

Signed-off-by: Ying Fang <fangying1@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20200608121243.2076-1-fangying1@huawei.com
[PMM: minor commit message tweak, removed inaccurate
 suggested-by tag]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/cpu.c   |  6 ++++--
 target/arm/cpu64.c |  1 -
 target/arm/kvm.c   | 21 +++++++++++----------
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_post_init(Object *obj)
     if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
         qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
     }
+
+    if (kvm_enabled()) {
+        kvm_arm_add_vcpu_properties(obj);
+    }
 }
 
 static void arm_cpu_finalizefn(Object *obj)
@@ -XXX,XX +XXX,XX @@ static void arm_max_initfn(Object *obj)
 
     if (kvm_enabled()) {
         kvm_arm_set_cpu_features_from_host(cpu);
-        kvm_arm_add_vcpu_properties(obj);
     } else {
         cortex_a15_initfn(obj);
 
@@ -XXX,XX +XXX,XX @@ static void arm_host_initfn(Object *obj)
     if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
         aarch64_add_sve_properties(obj);
     }
-    kvm_arm_add_vcpu_properties(obj);
     arm_cpu_post_init(obj);
 }
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ static void aarch64_max_initfn(Object *obj)
 
     if (kvm_enabled()) {
         kvm_arm_set_cpu_features_from_host(cpu);
-        kvm_arm_add_vcpu_properties(obj);
     } else {
         uint64_t t;
         uint32_t u;
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -XXX,XX +XXX,XX @@ static void kvm_no_adjvtime_set(Object *obj, bool value, Error **errp)
 /* KVM VCPU properties should be prefixed with "kvm-". */
 void kvm_arm_add_vcpu_properties(Object *obj)
 {
-    if (!kvm_enabled()) {
-        return;
-    }
+    ARMCPU *cpu = ARM_CPU(obj);
+    CPUARMState *env = &cpu->env;
 
-    ARM_CPU(obj)->kvm_adjvtime = true;
-    object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
-                             kvm_no_adjvtime_set);
-    object_property_set_description(obj, "kvm-no-adjvtime",
-                                    "Set on to disable the adjustment of "
-                                    "the virtual counter. VM stopped time "
-                                    "will be counted.");
+    if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
+        cpu->kvm_adjvtime = true;
+        object_property_add_bool(obj, "kvm-no-adjvtime", kvm_no_adjvtime_get,
+                                 kvm_no_adjvtime_set);
+        object_property_set_description(obj, "kvm-no-adjvtime",
+                                        "Set on to disable the adjustment of "
+                                        "the virtual counter. VM stopped time "
+                                        "will be counted.");
+    }
 }
 
 bool kvm_arm_pmu_supported(CPUState *cpu)
-- 
2.20.1

From: Jean-Christophe Dubois <jcd@tribudubois.net>

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMD: Fixed 32-bit format string using PRIx32/PRIx64]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/net/imx_fec.c    | 106 +++++++++++++++++++-------------------------
 hw/net/trace-events |  18 ++++++++
 2 files changed, 63 insertions(+), 61 deletions(-)

diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/module.h"
 #include "net/checksum.h"
 #include "net/eth.h"
+#include "trace.h"
 
 /* For crc32 */
 #include <zlib.h>
 
-#ifndef DEBUG_IMX_FEC
-#define DEBUG_IMX_FEC 0
-#endif
-
-#define FEC_PRINTF(fmt, args...) \
-    do { \
-        if (DEBUG_IMX_FEC) { \
-            fprintf(stderr, "[%s]%s: " fmt , TYPE_IMX_FEC, \
-                                             __func__, ##args); \
-        } \
-    } while (0)
-
-#ifndef DEBUG_IMX_PHY
-#define DEBUG_IMX_PHY 0
-#endif
-
-#define PHY_PRINTF(fmt, args...) \
-    do { \
-        if (DEBUG_IMX_PHY) { \
-            fprintf(stderr, "[%s.phy]%s: " fmt , TYPE_IMX_FEC, \
-                                                 __func__, ##args); \
-        } \
-    } while (0)
-
 #define IMX_MAX_DESC    1024
 
 static const char *imx_default_reg_name(IMXFECState *s, uint32_t index)
@@ -XXX,XX +XXX,XX @@ static void imx_eth_update(IMXFECState *s);
  * For now we don't handle any GPIO/interrupt line, so the OS will
  * have to poll for the PHY status.
  */
-static void phy_update_irq(IMXFECState *s)
+static void imx_phy_update_irq(IMXFECState *s)
 {
     imx_eth_update(s);
 }
 
-static void phy_update_link(IMXFECState *s)
+static void imx_phy_update_link(IMXFECState *s)
 {
     /* Autonegotiation status mirrors link status.  */
     if (qemu_get_queue(s->nic)->link_down) {
-        PHY_PRINTF("link is down\n");
+        trace_imx_phy_update_link("down");
         s->phy_status &= ~0x0024;
         s->phy_int |= PHY_INT_DOWN;
     } else {
-        PHY_PRINTF("link is up\n");
+        trace_imx_phy_update_link("up");
         s->phy_status |= 0x0024;
         s->phy_int |= PHY_INT_ENERGYON;
         s->phy_int |= PHY_INT_AUTONEG_COMPLETE;
     }
-    phy_update_irq(s);
+    imx_phy_update_irq(s);
 }
 
 static void imx_eth_set_link(NetClientState *nc)
 {
-    phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
+    imx_phy_update_link(IMX_FEC(qemu_get_nic_opaque(nc)));
 }
 
-static void phy_reset(IMXFECState *s)
+static void imx_phy_reset(IMXFECState *s)
 {
+    trace_imx_phy_reset();
+
     s->phy_status = 0x7809;
     s->phy_control = 0x3000;
     s->phy_advertise = 0x01e1;
     s->phy_int_mask = 0;
     s->phy_int = 0;
-    phy_update_link(s);
+    imx_phy_update_link(s);
 }
 
-static uint32_t do_phy_read(IMXFECState *s, int reg)
+static uint32_t imx_phy_read(IMXFECState *s, int reg)
 {
     uint32_t val;
 
@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
     case 29:    /* Interrupt source.  */
         val = s->phy_int;
         s->phy_int = 0;
-        phy_update_irq(s);
+        imx_phy_update_irq(s);
         break;
     case 30:    /* Interrupt mask */
         val = s->phy_int_mask;
@@ -XXX,XX +XXX,XX @@ static uint32_t do_phy_read(IMXFECState *s, int reg)
         break;
     }
 
-    PHY_PRINTF("read 0x%04x @ %d\n", val, reg);
+    trace_imx_phy_read(val, reg);
 
     return val;
 }
 
-static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
+static void imx_phy_write(IMXFECState *s, int reg, uint32_t val)
 {
-    PHY_PRINTF("write 0x%04x @ %d\n", val, reg);
+    trace_imx_phy_write(val, reg);
 
     if (reg > 31) {
         /* we only advertise one phy */
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
     switch (reg) {
     case 0:     /* Basic Control */
         if (val & 0x8000) {
-            phy_reset(s);
+            imx_phy_reset(s);
         } else {
             s->phy_control = val & 0x7980;
             /* Complete autonegotiation immediately.  */
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
         break;
     case 30:    /* Interrupt mask */
         s->phy_int_mask = val & 0xff;
-        phy_update_irq(s);
+        imx_phy_update_irq(s);
         break;
     case 17:
     case 18:
@@ -XXX,XX +XXX,XX @@ static void do_phy_write(IMXFECState *s, int reg, uint32_t val)
 static void imx_fec_read_bd(IMXFECBufDesc *bd, dma_addr_t addr)
 {
     dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
+
+    trace_imx_fec_read_bd(addr, bd->flags, bd->length, bd->data);
 }
 
 static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
@@ -XXX,XX +XXX,XX @@ static void imx_fec_write_bd(IMXFECBufDesc *bd, dma_addr_t addr)
 static void imx_enet_read_bd(IMXENETBufDesc *bd, dma_addr_t addr)
 {
     dma_memory_read(&address_space_memory, addr, bd, sizeof(*bd));
+
+    trace_imx_enet_read_bd(addr, bd->flags, bd->length, bd->data,
+                   bd->option, bd->status);
 }
 
 static void imx_enet_write_bd(IMXENETBufDesc *bd, dma_addr_t addr)
@@ -XXX,XX +XXX,XX @@ static void imx_fec_do_tx(IMXFECState *s)
         int len;
 
         imx_fec_read_bd(&bd, addr);
-        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x\n",
-                   addr, bd.flags, bd.length, bd.data);
         if ((bd.flags & ENET_BD_R) == 0) {
+
             /* Run out of descriptors to transmit.  */
-            FEC_PRINTF("tx_bd ran out of descriptors to transmit\n");
+            trace_imx_eth_tx_bd_busy();
+
             break;
         }
         len = bd.length;
@@ -XXX,XX +XXX,XX @@ static void imx_enet_do_tx(IMXFECState *s, uint32_t index)
         int len;
 
         imx_enet_read_bd(&bd, addr);
-        FEC_PRINTF("tx_bd %x flags %04x len %d data %08x option %04x "
-                   "status %04x\n", addr, bd.flags, bd.length, bd.data,
-                   bd.option, bd.status);
         if ((bd.flags & ENET_BD_R) == 0) {
             /* Run out of descriptors to transmit.  */
+
+            trace_imx_eth_tx_bd_busy();
+
             break;
         }
         len = bd.length;
@@ -XXX,XX +XXX,XX @@ static void imx_eth_enable_rx(IMXFECState *s, bool flush)
     s->regs[ENET_RDAR] = (bd.flags & ENET_BD_E) ? ENET_RDAR_RDAR : 0;
 
     if (!s->regs[ENET_RDAR]) {
-        FEC_PRINTF("RX buffer full\n");
+        trace_imx_eth_rx_bd_full();
     } else if (flush) {
         qemu_flush_queued_packets(qemu_get_queue(s->nic));
     }
@@ -XXX,XX +XXX,XX @@ static void imx_eth_reset(DeviceState *d)
     memset(s->tx_descriptor, 0, sizeof(s->tx_descriptor));
 
     /* We also reset the PHY */
-    phy_reset(s);
+    imx_phy_reset(s);
 }
 
 static uint32_t imx_default_read(IMXFECState *s, uint32_t index)
@@ -XXX,XX +XXX,XX @@ static uint64_t imx_eth_read(void *opaque, hwaddr offset, unsigned size)
         break;
     }
 
-    FEC_PRINTF("reg[%s] => 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
-                                              value);
+    trace_imx_eth_read(index, imx_eth_reg_name(s, index), value);
 
     return value;
 }
@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
     const bool single_tx_ring = !imx_eth_is_multi_tx_ring(s);
     uint32_t index = offset >> 2;
 
-    FEC_PRINTF("reg[%s] <= 0x%" PRIx32 "\n", imx_eth_reg_name(s, index),
-                (uint32_t)value);
+    trace_imx_eth_write(index, imx_eth_reg_name(s, index), value);
 
     switch (index) {
     case ENET_EIR:
@@ -XXX,XX +XXX,XX @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
         if (extract32(value, 29, 1)) {
             /* This is a read operation */
             s->regs[ENET_MMFR] = deposit32(s->regs[ENET_MMFR], 0, 16,
-                                           do_phy_read(s,
+                                           imx_phy_read(s,
                                                        extract32(value,
                                                                  18, 10)));
         } else {
             /* This a write operation */
-            do_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
+            imx_phy_write(s, extract32(value, 18, 10), extract32(value, 0, 16));
         }
         /* raise the interrupt as the PHY operation is done */
         s->regs[ENET_EIR] |= ENET_INT_MII;
@@ -XXX,XX +XXX,XX @@ static bool imx_eth_can_receive(NetClientState *nc)
 {
     IMXFECState *s = IMX_FEC(qemu_get_nic_opaque(nc));
 
-    FEC_PRINTF("\n");
-
     return !!s->regs[ENET_RDAR];
 }
 
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
     unsigned int buf_len;
     size_t size = len;
 
-    FEC_PRINTF("len %d\n", (int)size);
+    trace_imx_fec_receive(size);
 
     if (!s->regs[ENET_RDAR]) {
         qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
         bd.length = buf_len;
         size -= buf_len;
 
-        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
+        trace_imx_fec_receive_len(addr, bd.length);
 
         /* The last 4 bytes are the CRC.  */
         if (size < 4) {
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
         if (size == 0) {
             /* Last buffer in frame.  */
             bd.flags |= flags | ENET_BD_L;
-            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
+
+            trace_imx_fec_receive_last(bd.flags);
+
             s->regs[ENET_EIR] |= ENET_INT_RXF;
         } else {
             s->regs[ENET_EIR] |= ENET_INT_RXB;
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
     size_t size = len;
     bool shift16 = s->regs[ENET_RACC] & ENET_RACC_SHIFT16;
 
-    FEC_PRINTF("len %d\n", (int)size);
+    trace_imx_enet_receive(size);
 
     if (!s->regs[ENET_RDAR]) {
         qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Unexpected packet\n",
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
         bd.length = buf_len;
         size -= buf_len;
 
-        FEC_PRINTF("rx_bd 0x%x length %d\n", addr, bd.length);
+        trace_imx_enet_receive_len(addr, bd.length);
 
         /* The last 4 bytes are the CRC.  */
         if (size < 4) {
@@ -XXX,XX +XXX,XX @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
         if (size == 0) {
             /* Last buffer in frame.  */
             bd.flags |= flags | ENET_BD_L;
-            FEC_PRINTF("rx frame flags %04x\n", bd.flags);
+
+            trace_imx_enet_receive_last(bd.flags);
+
             /* Indicate that we've updated the last buffer descriptor. */
             bd.last_buffer = ENET_BD_BDU;
             if (bd.option & ENET_BD_RX_INT) {
diff --git a/hw/net/trace-events b/hw/net/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -XXX,XX +XXX,XX @@ i82596_receive_packet(size_t sz) "len=%zu"
 i82596_new_mac(const char *id_with_mac) "New MAC for: %s"
 i82596_set_multicast(uint16_t count) "Added %d multicast entries"
 i82596_channel_attention(void *s) "%p: Received CHANNEL ATTENTION"
+
+# imx_fec.c
+imx_phy_read(uint32_t val, int reg) "0x%04"PRIx32" <= reg[%d]"
+imx_phy_write(uint32_t val, int reg) "0x%04"PRIx32" => reg[%d]"
+imx_phy_update_link(const char *s) "%s"
+imx_phy_reset(void) ""
+imx_fec_read_bd(uint64_t addr, int flags, int len, int data) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x"
+imx_enet_read_bd(uint64_t addr, int flags, int len, int data, int options, int status) "tx_bd 0x%"PRIx64" flags 0x%04x len %d data 0x%08x option 0x%04x status 0x%04x"
+imx_eth_tx_bd_busy(void) "tx_bd ran out of descriptors to transmit"
+imx_eth_rx_bd_full(void) "RX buffer is full"
+imx_eth_read(int reg, const char *reg_name, uint32_t value) "reg[%d:%s] => 0x%08"PRIx32
+imx_eth_write(int reg, const char *reg_name, uint64_t value) "reg[%d:%s] <= 0x%08"PRIx64
+imx_fec_receive(size_t size) "len %zu"
+imx_fec_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
+imx_fec_receive_last(int last) "rx frame flags 0x%04x"
+imx_enet_receive(size_t size) "len %zu"
+imx_enet_receive_len(uint64_t addr, int len) "rx_bd 0x%"PRIx64" length %d"
+imx_enet_receive_last(int last) "rx frame flags 0x%04x"
-- 
2.20.1

From: Guenter Roeck <linux@roeck-us.net>

The Linux kernel's IMX code now uses vendor specific commands.
This results in endless warnings when booting the Linux kernel.

sdhci-esdhc-imx 2194000.usdhc: esdhc_wait_for_card_clock_gate_off:
	card clock still not gate off in 100us!.

Implement support for the vendor specific command provided by IMX hardware
to avoid this warning.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200603145258.195920-2-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/sd/sdhci-internal.h |  5 +++++
 include/hw/sd/sdhci.h  |  5 +++++
 hw/sd/sdhci.c          | 18 +++++++++++++++++-
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/sdhci-internal.h
+++ b/hw/sd/sdhci-internal.h
@@ -XXX,XX +XXX,XX @@
 #define SDHC_CMD_INHIBIT               0x00000001
 #define SDHC_DATA_INHIBIT              0x00000002
 #define SDHC_DAT_LINE_ACTIVE           0x00000004
+#define SDHC_IMX_CLOCK_GATE_OFF        0x00000080
 #define SDHC_DOING_WRITE               0x00000100
 #define SDHC_DOING_READ                0x00000200
 #define SDHC_SPACE_AVAILABLE           0x00000400
@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
 
 
 #define ESDHC_MIX_CTRL                  0x48
+
 #define ESDHC_VENDOR_SPEC               0xc0
+#define ESDHC_IMX_FRC_SDCLK_ON          (1 << 8)
+
 #define ESDHC_DLL_CTRL                  0x60
 
 #define ESDHC_TUNING_CTRL               0xcc
@@ -XXX,XX +XXX,XX @@ extern const VMStateDescription sdhci_vmstate;
 #define DEFINE_SDHCI_COMMON_PROPERTIES(_state) \
     DEFINE_PROP_UINT8("sd-spec-version", _state, sd_spec_version, 2), \
     DEFINE_PROP_UINT8("uhs", _state, uhs_mode, UHS_NOT_SUPPORTED), \
+    DEFINE_PROP_UINT8("vendor", _state, vendor, SDHCI_VENDOR_NONE), \
     \
     /* Capabilities registers provide information on supported
      * features of this specific host controller implementation */ \
diff --git a/include/hw/sd/sdhci.h b/include/hw/sd/sdhci.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/sd/sdhci.h
+++ b/include/hw/sd/sdhci.h
@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
     uint16_t acmd12errsts; /* Auto CMD12 error status register */
     uint16_t hostctl2;     /* Host Control 2 */
     uint64_t admasysaddr;  /* ADMA System Address Register */
+    uint16_t vendor_spec;  /* Vendor specific register */
 
     /* Read-only registers */
     uint64_t capareg;      /* Capabilities Register */
@@ -XXX,XX +XXX,XX @@ typedef struct SDHCIState {
     uint32_t quirks;
     uint8_t sd_spec_version;
     uint8_t uhs_mode;
+    uint8_t vendor;        /* For vendor specific functionality */
 } SDHCIState;
 
+#define SDHCI_VENDOR_NONE       0
+#define SDHCI_VENDOR_IMX        1
+
 /*
  * Controller does not provide transfer-complete interrupt when not
  * busy.
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -XXX,XX +XXX,XX @@ static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
         }
         break;
 
+    case ESDHC_VENDOR_SPEC:
+        ret = s->vendor_spec;
+        break;
     case ESDHC_DLL_CTRL:
     case ESDHC_TUNE_CTRL_STATUS:
     case ESDHC_UNDOCUMENTED_REG27:
     case ESDHC_TUNING_CTRL:
-    case ESDHC_VENDOR_SPEC:
     case ESDHC_MIX_CTRL:
     case ESDHC_WTMK_LVL:
         ret = 0;
@@ -XXX,XX +XXX,XX @@ usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
     case ESDHC_UNDOCUMENTED_REG27:
     case ESDHC_TUNING_CTRL:
     case ESDHC_WTMK_LVL:
+        break;
+
     case ESDHC_VENDOR_SPEC:
+        s->vendor_spec = value;
+        switch (s->vendor) {
+        case SDHCI_VENDOR_IMX:
+            if (value & ESDHC_IMX_FRC_SDCLK_ON) {
+                s->prnsts &= ~SDHC_IMX_CLOCK_GATE_OFF;
+            } else {
+                s->prnsts |= SDHC_IMX_CLOCK_GATE_OFF;
+            }
+            break;
+        default:
+            break;
+        }
         break;
 
     case SDHC_HOSTCTL:
-- 
2.20.1
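
The effect of the ESDHC_VENDOR_SPEC write handler above can be summarised
in a few lines. This is a standalone sketch, not the QEMU code: the two
constants copy the values defined in the patch, while imx_update_prnsts()
is a hypothetical helper that models only the SDHCI_VENDOR_IMX case.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the patch above. */
#define SDHC_IMX_CLOCK_GATE_OFF 0x00000080
#define ESDHC_IMX_FRC_SDCLK_ON  (1 << 8)

/*
 * Hypothetical helper: given the current present-state register and the
 * value written to the vendor specific register, return the new present
 * state. Forcing the SD clock on clears the "clock gated off" status bit;
 * otherwise the bit is set, which is what the guest driver polls for.
 */
static uint32_t imx_update_prnsts(uint32_t prnsts, uint32_t vendor_spec)
{
    if (vendor_spec & ESDHC_IMX_FRC_SDCLK_ON) {
        return prnsts & ~(uint32_t)SDHC_IMX_CLOCK_GATE_OFF;
    }
    return prnsts | SDHC_IMX_CLOCK_GATE_OFF;
}
```

All other present-state bits pass through unchanged, so the helper can be
applied on every write to the vendor specific register.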

From: Guenter Roeck <linux@roeck-us.net>

Set vendor property to IMX to enable IMX specific functionality
in sdhci code.

Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200603145258.195920-3-linux@roeck-us.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/fsl-imx25.c  | 6 ++++++
 hw/arm/fsl-imx6.c   | 6 ++++++
 hw/arm/fsl-imx6ul.c | 2 ++
 hw/arm/fsl-imx7.c   | 2 ++
 4 files changed, 16 insertions(+)

diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx25.c
+++ b/hw/arm/fsl-imx25.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
                                  &err);
         object_property_set_uint(OBJECT(&s->esdhc[i]), IMX25_ESDHC_CAPABILITIES,
                                  "capareg", &err);
+        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
+                                 "vendor", &err);
+        if (err) {
+            error_propagate(errp, err);
+            return;
+        }
         object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
         if (err) {
             error_propagate(errp, err);
diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
                                  &err);
         object_property_set_uint(OBJECT(&s->esdhc[i]), IMX6_ESDHC_CAPABILITIES,
                                  "capareg", &err);
+        object_property_set_uint(OBJECT(&s->esdhc[i]), SDHCI_VENDOR_IMX,
+                                 "vendor", &err);
+        if (err) {
+            error_propagate(errp, err);
+            return;
+        }
         object_property_set_bool(OBJECT(&s->esdhc[i]), true, "realized", &err);
         if (err) {
             error_propagate(errp, err);
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx6ul.c
+++ b/hw/arm/fsl-imx6ul.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
             FSL_IMX6UL_USDHC2_IRQ,
         };
 
+        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
+                                        "vendor", &error_abort);
         object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                  &error_abort);
 
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -XXX,XX +XXX,XX @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
             FSL_IMX7_USDHC3_IRQ,
         };
 
+        object_property_set_uint(OBJECT(&s->usdhc[i]), SDHCI_VENDOR_IMX,
+                                 "vendor", &error_abort);
         object_property_set_bool(OBJECT(&s->usdhc[i]), true, "realized",
                                  &error_abort);
 
-- 
2.20.1

Hi; here's the first target-arm pullreq for the 7.0 cycle.

thanks
-- PMM

The following changes since commit 76b56fdfc9fa43ec6e5986aee33f108c6c6a511e:

Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2021-12-14 12:46:18 -0800)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20211215

for you to fetch changes up to aed176558806674d030a8305d989d4e6a5073359:

tests/acpi: add expected blob for VIOT test on virt machine (2021-12-15 10:35:26 +0000)

----------------------------------------------------------------
target-arm queue:
 * ITS: error reporting cleanup
 * aspeed: improve documentation
 * Fix STM32F2XX USART data register readout
 * allow emulated GICv3 to be disabled in non-TCG builds
 * fix exception priority for singlestep, misaligned PC, bp, etc
 * Correct calculation of tlb range invalidate length
 * npcm7xx_emc: fix missing queue_flush
 * virt: Add VIOT ACPI table for virtio-iommu
 * target/i386: Use assert() to sanity-check b1 in SSE decode
 * Don't include qemu-common unnecessarily

----------------------------------------------------------------
Alex Bennée (1):
      hw/intc: clean-up error reporting for failed ITS cmd

Jean-Philippe Brucker (8):
      hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu
      hw/arm/virt: Remove device tree restriction for virtio-iommu
      hw/arm/virt: Reject instantiation of multiple IOMMUs
      hw/arm/virt: Use object_property_set instead of qdev_prop_set
      tests/acpi: allow updates of VIOT expected data files
      tests/acpi: add test case for VIOT
      tests/acpi: add expected blobs for VIOT test on q35 machine
      tests/acpi: add expected blob for VIOT test on virt machine

Joel Stanley (4):
      docs: aspeed: Add new boards
      docs: aspeed: Update OpenBMC image URL
      docs: aspeed: Give an example of booting a kernel
      docs: aspeed: ADC is now modelled

Olivier Hériveaux (1):
      Fix STM32F2XX USART data register readout

Patrick Venture (1):
      hw/net: npcm7xx_emc fix missing queue_flush

Peter Maydell (6):
      target/i386: Use assert() to sanity-check b1 in SSE decode
      include/hw/i386: Don't include qemu-common.h in .h files
      target/hexagon/cpu.h: don't include qemu-common.h
      target/rx/cpu.h: Don't include qemu-common.h
      hw/arm: Don't include qemu-common.h unnecessarily
      target/arm: Correct calculation of tlb range invalidate length

Philippe Mathieu-Daudé (2):
      hw/intc/arm_gicv3: Extract gicv3_set_gicv3state from arm_gicv3_cpuif.c
      hw/intc/arm_gicv3: Introduce CONFIG_ARM_GIC_TCG Kconfig selector

Richard Henderson (10):
      target/arm: Hoist pc_next to a local variable in aarch64_tr_translate_insn
      target/arm: Hoist pc_next to a local variable in arm_tr_translate_insn
      target/arm: Hoist pc_next to a local variable in thumb_tr_translate_insn
      target/arm: Split arm_pre_translate_insn
      target/arm: Advance pc for arch single-step exception
      target/arm: Split compute_fsr_fsc out of arm_deliver_fault
      target/arm: Take an exception if PC is misaligned
      target/arm: Assert thumb pc is aligned
      target/arm: Suppress bp for exceptions with more priority
      tests/tcg: Add arm and aarch64 pc alignment tests

 docs/system/arm/aspeed.rst        |  26 ++++++++++++----
 include/hw/i386/microvm.h         |   1 -
 include/hw/i386/x86.h             |   1 -
 target/arm/helper.h               |   1 +
 target/arm/syndrome.h             |   5 +++
 target/hexagon/cpu.h              |   1 -
 target/rx/cpu.h                   |   1 -
 hw/arm/boot.c                     |   1 -
 hw/arm/digic_boards.c             |   1 -
 hw/arm/highbank.c                 |   1 -
 hw/arm/npcm7xx_boards.c           |   1 -
 hw/arm/sbsa-ref.c                 |   1 -
 hw/arm/stm32f405_soc.c            |   1 -
 hw/arm/vexpress.c                 |   1 -
 hw/arm/virt-acpi-build.c          |   7 +++++
 hw/arm/virt.c                     |  21 ++++++-------
 hw/char/stm32f2xx_usart.c         |   3 +-
 hw/intc/arm_gicv3.c               |   2 +-
 hw/intc/arm_gicv3_cpuif.c         |  10 +-----
 hw/intc/arm_gicv3_cpuif_common.c  |  22 +++++++++++++
 hw/intc/arm_gicv3_its.c           |  39 +++++++++++++++--------
 hw/net/npcm7xx_emc.c              |  18 +++++------
 hw/virtio/virtio-iommu-pci.c      |  12 ++------
 linux-user/aarch64/cpu_loop.c     |  46 ++++++++++++++++------------
 linux-user/hexagon/cpu_loop.c     |   1 +
 target/arm/debug_helper.c         |  23 ++++++++++++++
 target/arm/gdbstub.c              |   9 ++++--
 target/arm/helper.c               |   6 ++--
 target/arm/machine.c              |  10 ++++++
 target/arm/tlb_helper.c           |  63 ++++++++++++++++++++++++++++----------
 target/arm/translate-a64.c        |  23 ++++++++++++--
 target/arm/translate.c            |  58 ++++++++++++++++++++++++++---------
 target/i386/tcg/translate.c       |  12 ++------
 tests/qtest/bios-tables-test.c    |  38 +++++++++++++++++++++++
 tests/tcg/aarch64/pcalign-a64.c   |  37 ++++++++++++++++++++++
 tests/tcg/arm/pcalign-a32.c       |  46 ++++++++++++++++++++++++++++
 hw/arm/Kconfig                    |   1 +
 hw/intc/Kconfig                   |   5 +++
 hw/intc/meson.build               |  11 ++++---
 tests/data/acpi/q35/DSDT.viot     | Bin 0 -> 9398 bytes
 tests/data/acpi/q35/VIOT.viot     | Bin 0 -> 112 bytes
 tests/data/acpi/virt/VIOT         | Bin 0 -> 88 bytes
 tests/tcg/aarch64/Makefile.target |   4 +--
 tests/tcg/arm/Makefile.target     |   4 +++
 44 files changed, 429 insertions(+), 145 deletions(-)
 create mode 100644 hw/intc/arm_gicv3_cpuif_common.c
 create mode 100644 tests/tcg/aarch64/pcalign-a64.c
 create mode 100644 tests/tcg/arm/pcalign-a32.c
 create mode 100644 tests/data/acpi/q35/DSDT.viot
 create mode 100644 tests/data/acpi/q35/VIOT.viot
 create mode 100644 tests/data/acpi/virt/VIOT

From: Alex Bennée <alex.bennee@linaro.org>

While trying to debug a GIC ITS failure I saw some guest errors that
were poorly formatted and left me confused as to what had failed.
Most of the checks aren't possible without a valid DTE, so split that
check out and then check the other conditions in steps. This avoids
relying on undefined data.
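
The stepwise shape of the checks can be sketched in isolation. This is
only an illustration under stated assumptions, not the QEMU code:
check_cmd(), struct cmd_state and the enum cmd_error values are
hypothetical names, and the real function logs each case with
qemu_log_mask() rather than returning a distinct code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, one per failure class. */
enum cmd_error {
    CMD_OK,
    CMD_BAD_DTE,
    CMD_BAD_DEVID,
    CMD_BAD_ENTRY,
    CMD_BAD_EVENTID,
};

/* Hypothetical snapshot of the values the command checks look at. */
struct cmd_state {
    uint32_t devid, max_devids;
    uint32_t eventid, max_eventid;
    bool dte_valid, ite_valid, cte_valid;
};

static enum cmd_error check_cmd(const struct cmd_state *c)
{
    assert(c != NULL);

    /* Without a valid DTE the later fields are meaningless: stop here. */
    if (!c->dte_valid) {
        return CMD_BAD_DTE;
    }
    if (c->devid > c->max_devids) {
        return CMD_BAD_DEVID;
    }
    if (!c->ite_valid || !c->cte_valid) {
        return CMD_BAD_ENTRY;
    }
    if (c->eventid > c->max_eventid) {
        return CMD_BAD_EVENTID;
    }
    return CMD_OK;
}
```

Returning on the first failed check means each message can name exactly
one cause, and nothing derived from the DTE is inspected unless the DTE
itself was valid.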

I still get a failure with the current kvm-unit-tests but at least I
know (partially) why now:

  Exception return from AArch64 EL1 to AArch64 EL1 PC 0x40080588
  PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
  ITS: MAPD devid=2 size = 0x8 itt=0x40430000 valid=0
  INT dev_id=2 event_id=20
  process_its_cmd: invalid command attributes: invalid dte: 0 for 2 (MEM_TX: 0)
  PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
  SUMMARY: 6 tests, 1 unexpected failures

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20211112170454.3158925-1-alex.bennee@linaro.org
Cc: Shashi Mallela <shashi.mallela@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gicv3_its.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_its.c
+++ b/hw/intc/arm_gicv3_its.c
@@ -XXX,XX +XXX,XX @@ static bool process_its_cmd(GICv3ITSState *s, uint64_t value, uint32_t offset,
         if (res != MEMTX_OK) {
             return result;
         }
+    } else {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: invalid command attributes: "
+                      "invalid dte: %"PRIx64" for %d (MEM_TX: %d)\n",
+                      __func__, dte, devid, res);
+        return result;
     }
 
-    if ((devid > s->dt.maxids.max_devids) || !dte_valid || !ite_valid ||
-            !cte_valid || (eventid > max_eventid)) {
+
+    /*
+     * In this implementation, in case of guest errors we ignore the
+     * command and move onto the next command in the queue.
+     */
+    if (devid > s->dt.maxids.max_devids) {
         qemu_log_mask(LOG_GUEST_ERROR,
-                      "%s: invalid command attributes "
-                      "devid %d or eventid %d or invalid dte %d or"
-                      "invalid cte %d or invalid ite %d\n",
-                      __func__, devid, eventid, dte_valid, cte_valid,
-                      ite_valid);
-        /*
-         * in this implementation, in case of error
-         * we ignore this command and move onto the next
-         * command in the queue
-         */
+                      "%s: invalid command attributes: devid %d > %d\n",
+                      __func__, devid, s->dt.maxids.max_devids);
+
+    } else if (!dte_valid || !ite_valid || !cte_valid) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: invalid command attributes: "
+                      "dte: %s, ite: %s, cte: %s\n",
+                      __func__,
+                      dte_valid ? "valid" : "invalid",
+                      ite_valid ? "valid" : "invalid",
+                      cte_valid ? "valid" : "invalid");
+    } else if (eventid > max_eventid) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: invalid command attributes: eventid %d > %d\n",
+                      __func__, eventid, max_eventid);
     } else {
         /*
          * Current implementation only supports rdbase == procnum
-- 
2.25.1

From: Joel Stanley <joel@jms.id.au>

Add X11, FP5280G2, G220A, Rainier and Fuji. Mention that Swift will be
removed in v7.0.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20211117065752.330632-2-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/aspeed.rst | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/aspeed.rst
+++ b/docs/system/arm/aspeed.rst
@@ -XXX,XX +XXX,XX @@ AST2400 SoC based machines :
 
 - ``palmetto-bmc``         OpenPOWER Palmetto POWER8 BMC
 - ``quanta-q71l-bmc``      OpenBMC Quanta BMC
+- ``supermicrox11-bmc``    Supermicro X11 BMC
 
 AST2500 SoC based machines :
 
@@ -XXX,XX +XXX,XX @@ AST2500 SoC based machines :
 - ``romulus-bmc``          OpenPOWER Romulus POWER9 BMC
 - ``witherspoon-bmc``      OpenPOWER Witherspoon POWER9 BMC
 - ``sonorapass-bmc``       OCP SonoraPass BMC
-- ``swift-bmc``            OpenPOWER Swift BMC POWER9
+- ``swift-bmc``            OpenPOWER Swift BMC POWER9 (to be removed in v7.0)
+- ``fp5280g2-bmc``         Inspur FP5280G2 BMC
+- ``g220a-bmc``            Bytedance G220A BMC
 
 AST2600 SoC based machines :
 
 - ``ast2600-evb``          Aspeed AST2600 Evaluation board (Cortex-A7)
 - ``tacoma-bmc``           OpenPOWER Witherspoon POWER9 AST2600 BMC
+- ``rainier-bmc``          IBM Rainier POWER10 BMC
+- ``fuji-bmc``             Facebook Fuji BMC
 
 Supported devices
 -----------------
-- 
2.25.1

From: Joel Stanley <joel@jms.id.au>

A common use case for the ASPEED machine is to boot a Linux kernel.
Provide a full example command line.

Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Message-id: 20211117065752.330632-4-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/aspeed.rst | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/aspeed.rst
+++ b/docs/system/arm/aspeed.rst
@@ -XXX,XX +XXX,XX @@ Missing devices
 Boot options
 ------------
 
-The Aspeed machines can be started using the ``-kernel`` option to
-load a Linux kernel or from a firmware. Images can be downloaded from
-the OpenBMC jenkins :
+The Aspeed machines can be started using the ``-kernel`` and ``-dtb`` options
+to load a Linux kernel or from a firmware. Images can be downloaded from the
+OpenBMC jenkins :
 
    https://jenkins.openbmc.org/job/ci-openbmc/lastSuccessfulBuild/
 
@@ -XXX,XX +XXX,XX @@ or directly from the OpenBMC GitHub release repository :
 
    https://github.com/openbmc/openbmc/releases
 
+To boot a kernel directly from a Linux build tree:
+
+.. code-block:: bash
+
+  $ qemu-system-arm -M ast2600-evb -nographic \
+        -kernel arch/arm/boot/zImage \
+        -dtb arch/arm/boot/dts/aspeed-ast2600-evb.dtb \
+        -initrd rootfs.cpio
+
 The image should be attached as an MTD drive. Run :
 
 .. code-block:: bash
-- 
2.25.1

From: Olivier Hériveaux <olivier.heriveaux@ledger.fr>

Fix an issue where the data register may be overwritten by the next
received character before its value has been read and returned.
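
The ordering problem can be shown with a toy model. This is a sketch
under an assumption, namely that accepting input may deliver the next
character synchronously and so overwrite the data register; struct
toy_usart and the toy_*() functions are hypothetical and are not part
of the QEMU STM32F2XX model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy device; not the QEMU STM32F2XX USART. */
struct toy_usart {
    uint32_t dr;        /* data register */
    uint32_t pending;   /* next character waiting in the backend */
    int has_pending;    /* non-zero if 'pending' holds a character */
};

/* Stands in for the backend delivering the next character on accept. */
static void toy_accept_input(struct toy_usart *s)
{
    assert(s != NULL);
    if (s->has_pending) {
        s->dr = s->pending;
        s->has_pending = 0;
    }
}

/* Old order: the register is sampled after input has been accepted. */
static uint32_t toy_read_dr_late(struct toy_usart *s)
{
    toy_accept_input(s);
    return s->dr & 0x3FF;
}

/* New order: latch the value first, then allow new input. */
static uint32_t toy_read_dr_latched(struct toy_usart *s)
{
    uint32_t retvalue = s->dr & 0x3FF;

    toy_accept_input(s);
    return retvalue;
}
```

With 'A' in the register and 'B' pending, the late read returns 'B' and
'A' is lost, while the latched read returns 'A' and leaves 'B' in the
register for the next read.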

Signed-off-by: Olivier Hériveaux <olivier.heriveaux@ledger.fr>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20211128120723.4053-1-olivier.heriveaux@ledger.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/char/stm32f2xx_usart.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/char/stm32f2xx_usart.c b/hw/char/stm32f2xx_usart.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/char/stm32f2xx_usart.c
+++ b/hw/char/stm32f2xx_usart.c
@@ -XXX,XX +XXX,XX @@ static uint64_t stm32f2xx_usart_read(void *opaque, hwaddr addr,
         return retvalue;
     case USART_DR:
         DB_PRINT("Value: 0x%" PRIx32 ", %c\n", s->usart_dr, (char) s->usart_dr);
+        retvalue = s->usart_dr & 0x3FF;
         s->usart_sr &= ~USART_SR_RXNE;
         qemu_chr_fe_accept_input(&s->chr);
         qemu_set_irq(s->irq, 0);
-        return s->usart_dr & 0x3FF;
+        return retvalue;
     case USART_BRR:
         return s->usart_brr;
     case USART_CR1:
-- 
2.25.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

gicv3_set_gicv3state() is used by arm_gicv3_common.c in
arm_gicv3_common_realize(). Since we want to restrict
arm_gicv3_cpuif.c to TCG, extract gicv3_set_gicv3state()
to a new file. Add this file to the meson 'specific'
source set, since it needs access to "cpu.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20211115223619.2599282-2-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gicv3_cpuif.c        | 10 +---------
 hw/intc/arm_gicv3_cpuif_common.c | 22 ++++++++++++++++++++++
 hw/intc/meson.build              |  1 +
 3 files changed, 24 insertions(+), 9 deletions(-)
 create mode 100644 hw/intc/arm_gicv3_cpuif_common.c

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -XXX,XX +XXX,XX @@
 /*
- * ARM Generic Interrupt Controller v3
+ * ARM Generic Interrupt Controller v3 (emulation)
  *
  * Copyright (c) 2016 Linaro Limited
  * Written by Peter Maydell
@@ -XXX,XX +XXX,XX @@
 #include "hw/irq.h"
 #include "cpu.h"
 
-void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
-{
-    ARMCPU *arm_cpu = ARM_CPU(cpu);
-    CPUARMState *env = &arm_cpu->env;
-
-    env->gicv3state = (void *)s;
-};
-
 static GICv3CPUState *icc_cs_from_env(CPUARMState *env)
 {
     return env->gicv3state;
diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/hw/intc/arm_gicv3_cpuif_common.c
@@ -XXX,XX +XXX,XX @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * ARM Generic Interrupt Controller v3
+ *
+ * Copyright (c) 2016 Linaro Limited
+ * Written by Peter Maydell
+ *
+ * This code is licensed under the GPL, version 2 or (at your option)
+ * any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "gicv3_internal.h"
+#include "cpu.h"
+
+void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
+{
+    ARMCPU *arm_cpu = ARM_CPU(cpu);
+    CPUARMState *env = &arm_cpu->env;
+
+    env->gicv3state = (void *)s;
+}
diff --git a/hw/intc/meson.build b/hw/intc/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/meson.build
+++ b/hw/intc/meson.build
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_PMU', if_true: files('xlnx-pmu-iomod-in
 
 specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c'))
 specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c'))
+specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c'))
 specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c'))
 specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c'))
 specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c'))
-- 
2.25.1

From: Philippe Mathieu-Daudé <philmd@redhat.com>

The TYPE_ARM_GICV3 device is an emulated one.  When using
KVM, it is recommended to use the TYPE_KVM_ARM_GICV3 device
(which uses in-kernel support).

When using --with-devices-FOO, it is possible to build a
binary with a specific set of devices. When this binary is
restricted to the KVM accelerator, the TYPE_ARM_GICV3 device
is irrelevant, and it is desirable to remove it from the binary.

Therefore introduce the CONFIG_ARM_GIC_TCG Kconfig selector,
which selects the files required for the TYPE_ARM_GICV3
device while also allowing this device to be deselected.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20211115223619.2599282-3-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gicv3.c |  2 +-
 hw/intc/Kconfig     |  5 +++++
 hw/intc/meson.build | 10 ++++++----
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -XXX,XX +XXX,XX @@
 /*
- * ARM Generic Interrupt Controller v3
+ * ARM Generic Interrupt Controller v3 (emulation)
  *
  * Copyright (c) 2015 Huawei.
  * Copyright (c) 2016 Linaro Limited
diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/Kconfig
+++ b/hw/intc/Kconfig
@@ -XXX,XX +XXX,XX @@ config APIC
     select MSI_NONBROKEN
     select I8259
 
+config ARM_GIC_TCG
+    bool
+    default y
+    depends on ARM_GIC && TCG
+
 config ARM_GIC_KVM
     bool
     default y
diff --git a/hw/intc/meson.build b/hw/intc/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/meson.build
+++ b/hw/intc/meson.build
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_ARM_GIC', if_true: files(
   'arm_gic.c',
   'arm_gic_common.c',
   'arm_gicv2m.c',
-  'arm_gicv3.c',
   'arm_gicv3_common.c',
-  'arm_gicv3_dist.c',
   'arm_gicv3_its_common.c',
-  'arm_gicv3_redist.c',
+))
+softmmu_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files(
+  'arm_gicv3.c',
+  'arm_gicv3_dist.c',
   'arm_gicv3_its.c',
+  'arm_gicv3_redist.c',
 ))
 softmmu_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_pic.c'))
 softmmu_ss.add(when: 'CONFIG_HEATHROW_PIC', if_true: files('heathrow_pic.c'))
@@ -XXX,XX +XXX,XX @@ softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_PMU', if_true: files('xlnx-pmu-iomod-in
 specific_ss.add(when: 'CONFIG_ALLWINNER_A10_PIC', if_true: files('allwinner-a10-pic.c'))
 specific_ss.add(when: 'CONFIG_APIC', if_true: files('apic.c', 'apic_common.c'))
 specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif_common.c'))
-specific_ss.add(when: 'CONFIG_ARM_GIC', if_true: files('arm_gicv3_cpuif.c'))
+specific_ss.add(when: 'CONFIG_ARM_GIC_TCG', if_true: files('arm_gicv3_cpuif.c'))
 specific_ss.add(when: 'CONFIG_ARM_GIC_KVM', if_true: files('arm_gic_kvm.c'))
 specific_ss.add(when: ['CONFIG_ARM_GIC_KVM', 'TARGET_AARCH64'], if_true: files('arm_gicv3_kvm.c', 'arm_gicv3_its_kvm.c'))
 specific_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m_nvic.c'))
-- 
2.25.1