Series comparison

-[PULL for-6.2 0/1] target-arm queue
+[PULL 00/11] target-arm queue
-Last minute pullreq with one patch, fixing the GICv3 ICH_MISR_EL2.LRENP
+The following changes since commit 3214bec13d8d4c40f707d21d8350d04e4123ae97:
 calculation. I went back-and-forth on whether to put this in, but:
  * it's an effective regression from 6.1 (the bug itself has been
    present since before then, but it was previously masked by the
    other bug which we fixed in 9cee1efe92)
  * I just realized it could cause a screaming maintenance interrupt
    even for hypervisors like KVM that don't set LRENPIE
-On the other hand this is very late and we haven't seen it be a
+  Merge tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu into staging (2025-01-10 13:39:19 -0500)
 problem with any guest except Qualcomm's hypervisor. So if you want
 to decide it's better not going in that's OK too.
 Tested on the gitlab CI and with a local test of nested KVM.
 -- PMM
 The following changes since commit 7635eff97104242d618400e4b6746d0a5c97af82:
   Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2021-12-06 11:18:06 -0800)
 are available in the Git repository at:
-  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20211207
+  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20250113
-for you to fetch changes up to 2958e5150dfa297dd5a51fe57a29156b8744f07f:
+for you to fetch changes up to 435d260e7ec5ff9c79e3e62f1d66ec82d2d691ae:
-  gicv3: fix ICH_MISR's LRENP computation (2021-12-07 15:30:08 +0000)
+  docs/system/arm/virt: mention specific migration information (2025-01-13 12:35:35 +0000)
 ----------------------------------------------------------------
 target-arm queue:
- * Fix calculation of ICH_MISR_EL2.LRENP to avoid incorrect generation
+ * hw/arm_sysctl: fix extracting 31th bit of val
-   of maintenance interrupts
+ * hw/misc: cast rpm to uint64_t
  * tests/qtest/boot-serial-test: Improve ASM
  * target/arm: Move minor arithmetic helpers out of helper.c
  * target/arm: change default pauth algorithm to impdef
 ----------------------------------------------------------------
-Damien Hedde (1):
+Anastasia Belova (1):
-      gicv3: fix ICH_MISR's LRENP computation
+      hw/arm_sysctl: fix extracting 31th bit of val
- hw/intc/arm_gicv3_cpuif.c | 3 ++-
+Peter Maydell (2):
- 1 file changed, 2 insertions(+), 1 deletion(-)
+      target/arm: Move minor arithmetic helpers out of helper.c
       tests/tcg/aarch64: force qarma5 for pauth-3 test
+Philippe Mathieu-Daudé (4):
+      tests/qtest/boot-serial-test: Improve ASM comments of PL011 tests
+      tests/qtest/boot-serial-test: Reduce for() loop in PL011 tests
+      tests/qtest/boot-serial-test: Reorder pair of instructions in PL011 test
+      tests/qtest/boot-serial-test: Initialize PL011 Control register
+Pierrick Bouvier (3):
+      target/arm: add new property to select pauth-qarma5
+      target/arm: change default pauth algorithm to impdef
+      docs/system/arm/virt: mention specific migration information
+Tigran Sogomonian (1):
+      hw/misc: cast rpm to uint64_t
+ docs/system/arm/cpu-features.rst                |   7 +-
+ docs/system/arm/virt.rst                        |   4 +
+ docs/system/introduction.rst                    |   2 +-
+ target/arm/cpu.h                                |   4 +
+ hw/core/machine.c                               |   4 +-
+ hw/misc/arm_sysctl.c                            |   2 +-
+ hw/misc/npcm7xx_mft.c                           |   5 +-
+ target/arm/arm-qmp-cmds.c                       |   2 +-
+ target/arm/cpu.c                                |   2 +
+ target/arm/cpu64.c                              |  38 ++-
+ target/arm/helper.c                             | 285 -----------------------
+ target/arm/tcg/arith_helper.c                   | 296 ++++++++++++++++++++++++
+ tests/qtest/arm-cpu-features.c                  |  15 +-
+ tests/qtest/boot-serial-test.c                  |  23 +-
+ target/arm/{op_addsub.h => tcg/op_addsub.c.inc} |   0
+ target/arm/tcg/meson.build                      |   1 +
+ tests/tcg/aarch64/Makefile.softmmu-target       |   3 +
+ 17 files changed, 377 insertions(+), 316 deletions(-)
+ create mode 100644 target/arm/tcg/arith_helper.c
+ rename target/arm/{op_addsub.h => tcg/op_addsub.c.inc} (100%)

-New patch
+[PULL 01/11] hw/arm_sysctl: fix extracting 31th bit of val
+From: Anastasia Belova <abelova@astralinux.ru>
+1 << 31 is cast to uint64_t when it is bitwise ANDed with val.
+So this value may become 0xffffffff80000000, but only
+bit 31, the "start" bit, is required.
+This is not possible in practice because the MemoryRegionOps
+uses the default max access size of 4 bytes and so none
+of the upper bytes of val will be set, but the bitfield
+extract API is clearer anyway.
+Use the bitfield extract() API instead.
+Found by Linux Verification Center (linuxtesting.org) with SVACE.
+Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
+Message-id: 20241220125429.7552-1-abelova@astralinux.ru
+Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+[PMM: add clarification to commit message]
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/misc/arm_sysctl.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/hw/misc/arm_sysctl.c b/hw/misc/arm_sysctl.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/misc/arm_sysctl.c
++++ b/hw/misc/arm_sysctl.c
+@@ -XXX,XX +XXX,XX @@ static void arm_sysctl_write(void *opaque, hwaddr offset,
+          * as zero.
+          */
+         s->sys_cfgctrl = val & ~((3 << 18) | (1 << 31));
+-        if (val & (1 << 31)) {
++        if (extract64(val, 31, 1)) {
+             /* Start bit set -- actually do something */
+             unsigned int dcc = extract32(s->sys_cfgctrl, 26, 4);
+             unsigned int function = extract32(s->sys_cfgctrl, 20, 6);
+--
+2.34.1
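The widening described in the commit message above can be reproduced outside QEMU. The following is a minimal sketch, not QEMU code: `demo_extract64()` is a hypothetical stand-in modelled on the bitfield extract helper the patch switches to, and `demo_widened_bit31_mask()` uses `INT32_MIN` in place of `(1 << 31)` so that the demo does not itself shift into the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in modelled on a 64-bit bitfield extract helper:
 * return the 'length'-bit field of 'value' that starts at bit 'start'.
 */
static inline uint64_t demo_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * What a signed 32-bit mask with only bit 31 set becomes once it is
 * converted to uint64_t for the '&'. The sign bit is replicated into
 * bits 63..32, so the mask covers far more than bit 31.
 */
static inline uint64_t demo_widened_bit31_mask(void)
{
    return (uint64_t)INT32_MIN;
}
```

With these helpers, a value that has only bit 32 set still passes the old-style mask test, while the extract-based test looks at bit 31 alone. As the commit message notes, this cannot happen in practice for this device because accesses are at most 4 bytes wide.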

-New patch
+[PULL 02/11] hw/misc: cast rpm to uint64_t
+From: Tigran Sogomonian <tsogomonian@astralinux.ru>
+The value of the arithmetic expression
+'rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION' is subject
+to overflow because its operands are not cast to
+a larger data type before the multiplication. Thus, we need
+to cast rpm to uint64_t.
+Found by Linux Verification Center (linuxtesting.org) with SVACE.
+Signed-off-by: Tigran Sogomonian <tsogomonian@astralinux.ru>
+Reviewed-by: Patrick Leis <venture@google.com>
+Reviewed-by: Hao Wu <wuhaotsh@google.com>
+Message-id: 20241226130311.1349-1-tsogomonian@astralinux.ru
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ hw/misc/npcm7xx_mft.c | 5 +++--
+ 1 file changed, 3 insertions(+), 2 deletions(-)
+diff --git a/hw/misc/npcm7xx_mft.c b/hw/misc/npcm7xx_mft.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/misc/npcm7xx_mft.c
++++ b/hw/misc/npcm7xx_mft.c
+@@ -XXX,XX +XXX,XX @@ static NPCM7xxMFTCaptureState npcm7xx_mft_compute_cnt(
+          * RPM = revolution/min. The time for one revlution (in ns) is
+          * MINUTE_TO_NANOSECOND / RPM.
+          */
+-        count = clock_ns_to_ticks(clock, (60 * NANOSECONDS_PER_SECOND) /
+-            (rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION));
++        count = clock_ns_to_ticks(clock,
++            (uint64_t)(60 * NANOSECONDS_PER_SECOND) /
++            ((uint64_t)rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION));
+     }
+     if (count > NPCM7XX_MFT_MAX_CNT) {
+--
+2.34.1
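The overflow this patch guards against can be shown with a small sketch. The names and the constant below are hypothetical (the real value of NPCM7XX_MFT_PULSE_PER_REVOLUTION is not shown in this series), and the sketch assumes rpm is a 32-bit unsigned value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pulse count per revolution, chosen only for illustration. */
#define DEMO_PULSES_PER_REVOLUTION 2u

/* Multiplication carried out in 32 bits: the result wraps modulo 2^32. */
static inline uint32_t demo_pulses_narrow(uint32_t rpm)
{
    return (uint32_t)(rpm * DEMO_PULSES_PER_REVOLUTION);
}

/* Widen one operand first, so the multiplication is done in 64 bits. */
static inline uint64_t demo_pulses_wide(uint32_t rpm)
{
    return (uint64_t)rpm * DEMO_PULSES_PER_REVOLUTION;
}
```

For ordinary inputs such as 3000 both functions agree. They diverge only when the 32-bit product would exceed UINT32_MAX, where the narrow version wraps (possibly to zero, which would then be used as a divisor in the expression being patched).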

-New patch
+[PULL 03/11] tests/qtest/boot-serial-test: Improve ASM comments of PL011 tests
+From: Philippe Mathieu-Daudé <philmd@linaro.org>
+Re-indent ASM comments adding the 'loop:' label.
+Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Fabiano Rosas <farosas@suse.de>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ tests/qtest/boot-serial-test.c | 18 +++++++++---------
+ 1 file changed, 9 insertions(+), 9 deletions(-)
+diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/qtest/boot-serial-test.c
++++ b/tests/qtest/boot-serial-test.c
+@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
+ };
+ static const uint8_t bios_raspi2[] = {
+-    0x08, 0x30, 0x9f, 0xe5,                 /* ldr   r3,[pc,#8]    Get base */
+-    0x54, 0x20, 0xa0, 0xe3,                 /* mov     r2,#'T' */
+-    0x00, 0x20, 0xc3, 0xe5,                 /* strb    r2,[r3] */
+-    0xfb, 0xff, 0xff, 0xea,                 /* b       loop */
+-    0x00, 0x10, 0x20, 0x3f,                 /* 0x3f201000 = UART0 base addr */
++    0x08, 0x30, 0x9f, 0xe5,                 /* loop:  ldr     r3, [pc, #8]   Get &UART0 */
++    0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
++    0x00, 0x20, 0xc3, 0xe5,                 /*        strb    r2, [r3]       *TXDAT = 'T' */
++    0xfb, 0xff, 0xff, 0xea,                 /*        b       -12            (loop) */
++    0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
+ };
+ static const uint8_t kernel_aarch64[] = {
+-    0x81, 0x0a, 0x80, 0x52,                 /* mov     w1, #0x54 */
+-    0x02, 0x20, 0xa1, 0xd2,                 /* mov     x2, #0x9000000 */
+-    0x41, 0x00, 0x00, 0x39,                 /* strb    w1, [x2] */
+-    0xfd, 0xff, 0xff, 0x17,                 /* b       -12 (loop) */
++    0x81, 0x0a, 0x80, 0x52,                 /* loop:  mov    w1, #'T' */
++    0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
++    0x41, 0x00, 0x00, 0x39,                 /*        strb   w1, [x2]        *TXDAT = 'T' */
++    0xfd, 0xff, 0xff, 0x17,                 /*        b      -12             (loop) */
+ };
+ static const uint8_t kernel_nrf51[] = {
+--
+2.34.1

-New patch
+[PULL 04/11] tests/qtest/boot-serial-test: Reduce for() loop in PL011 tests
+From: Philippe Mathieu-Daudé <philmd@linaro.org>
+Since registers are not modified, we don't need
+to refill their values. Directly jump to the previous
+store instruction to keep filling the TXDAT register.
+The equivalent C code remains:
+  while (true) {
+      *UART_DATA = 'T';
+  }
+Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Fabiano Rosas <farosas@suse.de>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ tests/qtest/boot-serial-test.c | 12 ++++++------
+ 1 file changed, 6 insertions(+), 6 deletions(-)
+diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/qtest/boot-serial-test.c
++++ b/tests/qtest/boot-serial-test.c
+@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
+ };
+ static const uint8_t bios_raspi2[] = {
+-    0x08, 0x30, 0x9f, 0xe5,                 /* loop:  ldr     r3, [pc, #8]   Get &UART0 */
++    0x08, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #8]   Get &UART0 */
+     0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
+-    0x00, 0x20, 0xc3, 0xe5,                 /*        strb    r2, [r3]       *TXDAT = 'T' */
+-    0xfb, 0xff, 0xff, 0xea,                 /*        b       -12            (loop) */
++    0x00, 0x20, 0xc3, 0xe5,                 /* loop:  strb    r2, [r3]       *TXDAT = 'T' */
++    0xff, 0xff, 0xff, 0xea,                 /*        b       -4             (loop) */
+     0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
+ };
+ static const uint8_t kernel_aarch64[] = {
+-    0x81, 0x0a, 0x80, 0x52,                 /* loop:  mov    w1, #'T' */
++    0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
+     0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
+-    0x41, 0x00, 0x00, 0x39,                 /*        strb   w1, [x2]        *TXDAT = 'T' */
+-    0xfd, 0xff, 0xff, 0x17,                 /*        b      -12             (loop) */
++    0x41, 0x00, 0x00, 0x39,                 /* loop:  strb   w1, [x2]        *TXDAT = 'T' */
++    0xff, 0xff, 0xff, 0x17,                 /*        b      -4              (loop) */
+ };
+ static const uint8_t kernel_nrf51[] = {
+--
+2.34.1
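The byte changes in the kernel_aarch64 array can be checked by decoding the branch by hand. The sketch below covers only the AArch64 unconditional B encoding (a 26-bit signed word offset in bits [25:0], relative to the branch instruction itself); the helper names are hypothetical and not part of the test file.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit instruction word from four little-endian bytes. */
static inline uint32_t demo_le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/*
 * Byte offset encoded by an AArch64 unconditional "B" instruction:
 * imm26 lives in bits [25:0], is sign-extended, and is multiplied by 4.
 */
static inline int64_t demo_a64_b_offset(uint32_t insn)
{
    int64_t imm26;

    assert((insn & 0xfc000000u) == 0x14000000u); /* opcode bits of B */
    imm26 = insn & 0x03ffffffu;
    if (imm26 & 0x02000000) {
        imm26 -= 0x04000000;                     /* sign-extend 26 bits */
    }
    return imm26 * 4;
}
```

Feeding in the old bytes 0xfd, 0xff, 0xff, 0x17 yields an offset of -12 (back to the first instruction), and the new bytes 0xff, 0xff, 0xff, 0x17 yield -4 (back to the store immediately before the branch), which matches the updated comments in the AArch64 array.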

-New patch
+[PULL 05/11] tests/qtest/boot-serial-test: Reorder pair of instructions in PL011 test
+From: Philippe Mathieu-Daudé <philmd@linaro.org>
+In the next commit we are going to use a different value
+for the $w1 register, maintaining the same $x2 value. In
+order to keep the next commit trivial to review, set $x2
+before $w1.
+Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Reviewed-by: Fabiano Rosas <farosas@suse.de>
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ tests/qtest/boot-serial-test.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/qtest/boot-serial-test.c
++++ b/tests/qtest/boot-serial-test.c
+@@ -XXX,XX +XXX,XX @@ static const uint8_t bios_raspi2[] = {
+ };
+ static const uint8_t kernel_aarch64[] = {
+-    0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
+     0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
++    0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
+     0x41, 0x00, 0x00, 0x39,                 /* loop:  strb   w1, [x2]        *TXDAT = 'T' */
+     0xff, 0xff, 0xff, 0x17,                 /*        b      -4              (loop) */
+ };
+--
+2.34.1

-[PULL 1/1] gicv3: fix ICH_MISR's LRENP computation
+[PULL 06/11] tests/qtest/boot-serial-test: Initialize PL011 Control register
-From: Damien Hedde <damien.hedde@greensocs.com>
+From: Philippe Mathieu-Daudé <philmd@linaro.org>
-According to the "Arm Generic Interrupt Controller Architecture
+The tests using the PL011 UART of the virt and raspi machines
-Specification GIC architecture version 3 and 4" (version G: page 345
+weren't properly enabling the UART and its transmitter previous
-for aarch64 or 509 for aarch32):
+to sending characters. Follow the PL011 manual initialization
-LRENP bit of ICH_MISR is set when ICH_HCR.LRENPIE==1 and
+recommendation by setting the proper bits of the control register.
 ICH_HCR.EOIcount is non-zero.
-When only LRENPIE was set (and EOI count was zero), the LRENP bit was
+Update the ASM code prefixing:
 wrongly set and MISR value was wrong.
-As an additional consequence, if an hypervisor set ICH_HCR.LRENPIE,
+  *UART_CTRL = UART_ENABLE | TX_ENABLE;
 the maintenance interrupt was constantly fired. It happens since patch
 9cee1efe92 ("hw/intc: Set GIC maintenance interrupt level to only 0 or 1")
 which fixed another bug about maintenance interrupt (most significant
 bits of misr, including this one, were ignored in the interrupt trigger).
-Fixes: 83f036fe3d ("hw/intc/arm_gicv3: Add accessors for ICH_ system registers")
+to:
-Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
-Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
+  while (true) {
-Message-id: 20211207094427.3473-1-damien.hedde@greensocs.com
+      *UART_DATA = 'T';
   }
 Note, since commit 51b61dd4d56 ("hw/char/pl011: Warn when using
 disabled transmitter") incomplete PL011 initialization can be
 logged using the '-d guest_errors' command line option.
 Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
 ---
- hw/intc/arm_gicv3_cpuif.c | 3 ++-
+ tests/qtest/boot-serial-test.c | 7 ++++++-
- 1 file changed, 2 insertions(+), 1 deletion(-)
+ 1 file changed, 6 insertions(+), 1 deletion(-)
-diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
+diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
 index XXXXXXX..XXXXXXX 100644
---- a/hw/intc/arm_gicv3_cpuif.c
+--- a/tests/qtest/boot-serial-test.c
-+++ b/hw/intc/arm_gicv3_cpuif.c
++++ b/tests/qtest/boot-serial-test.c
-@@ -XXX,XX +XXX,XX @@ static uint32_t maintenance_interrupt_state(GICv3CPUState *cs)
+@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
-     /* Scan list registers and fill in the U, NP and EOI bits */
+ };
-     eoi_maintenance_interrupt_state(cs, &value);
+ static const uint8_t bios_raspi2[] = {
--    if (cs->ich_hcr_el2 & (ICH_HCR_EL2_LRENPIE | ICH_HCR_EL2_EOICOUNT_MASK)) {
+-    0x08, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #8]   Get &UART0 */
-+    if ((cs->ich_hcr_el2 & ICH_HCR_EL2_LRENPIE) &&
++    0x10, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #16]  Get &UART0 */
-+        (cs->ich_hcr_el2 & ICH_HCR_EL2_EOICOUNT_MASK)) {
++    0x10, 0x20, 0x9f, 0xe5,                 /*        ldr     r2, [pc, #16]  Get &CR */
-         value |= ICH_MISR_EL2_LRENP;
++    0xb0, 0x23, 0xc3, 0xe1,                 /*        strh    r2, [r3, #48]  Set CR */
-     }
+     0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
+     0x00, 0x20, 0xc3, 0xe5,                 /* loop:  strb    r2, [r3]       *TXDAT = 'T' */
      0xff, 0xff, 0xff, 0xea,                 /*        b       -4             (loop) */
      0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
 +    0x01, 0x01, 0x00, 0x00,                 /* CR:    0x101 = UARTEN|TXE */
  };
  static const uint8_t kernel_aarch64[] = {
      0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
 +    0x21, 0x20, 0x80, 0x52,                 /*        mov    w1, 0x101       CR = UARTEN|TXE */
 +    0x41, 0x60, 0x00, 0x79,                 /*        strh   w1, [x2, #48]   Set CR */
      0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
      0x41, 0x00, 0x00, 0x39,                 /* loop:  strb   w1, [x2]        *TXDAT = 'T' */
      0xff, 0xff, 0xff, 0x17,                 /*        b      -4              (loop) */
 --
-2.25.1
+2.34.1
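The initialization this patch adds to the test stubs amounts to one 16-bit store to the PL011 control register before the data register is written. A rough C sketch follows, assuming the standard PL011 layout (UARTCR at offset 0x30, UARTEN in bit 0, TXE in bit 8). The DemoUart type and helper names are hypothetical and stand in for the memory-mapped device, and a single store replaces the endless loop of the real stub.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PL011_DR_OFFSET 0x00u      /* data register */
#define DEMO_PL011_CR_OFFSET 0x30u      /* control register, byte offset 48 */
#define DEMO_PL011_CR_UARTEN (1u << 0)  /* UART enable */
#define DEMO_PL011_CR_TXE    (1u << 8)  /* transmit enable */

/* Hypothetical in-memory stand-in for the UART register window. */
typedef struct {
    uint8_t regs[0x40];
} DemoUart;

/* Little-endian 16-bit store, mirroring the 'strh' in the stub. */
static inline void demo_write16(DemoUart *u, uint32_t off, uint16_t val)
{
    u->regs[off] = (uint8_t)(val & 0xff);
    u->regs[off + 1] = (uint8_t)(val >> 8);
}

/* Enable the UART and its transmitter, then send one character. */
static inline void demo_uart_init_and_send(DemoUart *u, char c)
{
    demo_write16(u, DEMO_PL011_CR_OFFSET,
                 DEMO_PL011_CR_UARTEN | DEMO_PL011_CR_TXE);
    u->regs[DEMO_PL011_DR_OFFSET] = (uint8_t)c;
}
```

The two enable bits combine to 0x101, which is the literal the stubs load before the store to offset 48.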

-New patch
+[PULL 07/11] target/arm: Move minor arithmetic helpers out of helper.c
+helper.c includes some small TCG helper functions used for mostly
+arithmetic instructions.  These are TCG only and there's no need for
+them to be in the large and unwieldy helper.c.  Move them out to
+their own source file in the tcg/ subdirectory, together with the
+op_addsub.h multiply-included template header that they use.
+Since we are moving op_addsub.h, we take the opportunity to
+give it a name which matches our convention for files which
+are not true header files but which are #included from other
+C files: op_addsub.c.inc.
+(Ironically, this means that helper.c no longer contains
+any TCG helper function definitions at all.)
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20250110131211.2546314-1-peter.maydell@linaro.org
+Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
+---
+ target/arm/helper.c                           | 285 -----------------
+ target/arm/tcg/arith_helper.c                 | 296 ++++++++++++++++++
+ .../arm/{op_addsub.h => tcg/op_addsub.c.inc}  |   0
+ target/arm/tcg/meson.build                    |   1 +
+ 4 files changed, 297 insertions(+), 285 deletions(-)
+ create mode 100644 target/arm/tcg/arith_helper.c
+ rename target/arm/{op_addsub.h => tcg/op_addsub.c.inc} (100%)
+diff --git a/target/arm/helper.c b/target/arm/helper.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/helper.c
++++ b/target/arm/helper.c
+@@ -XXX,XX +XXX,XX @@
+ #include "qemu/main-loop.h"
+ #include "qemu/timer.h"
+ #include "qemu/bitops.h"
+-#include "qemu/crc32c.h"
+ #include "qemu/qemu-print.h"
+ #include "exec/exec-all.h"
+ #include "exec/translation-block.h"
+-#include <zlib.h> /* for crc32 */
+ #include "hw/irq.h"
+ #include "system/cpu-timers.h"
+ #include "system/kvm.h"
+@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
+     };
+ }
+-/*
+- * Note that signed overflow is undefined in C.  The following routines are
+- * careful to use unsigned types where modulo arithmetic is required.
+- * Failure to do so _will_ break on newer gcc.
+- */
+-
+-/* Signed saturating arithmetic.  */
+-
+-/* Perform 16-bit signed saturating addition.  */
+-static inline uint16_t add16_sat(uint16_t a, uint16_t b)
+-{
+-    uint16_t res;
+-
+-    res = a + b;
+-    if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
+-        if (a & 0x8000) {
+-            res = 0x8000;
+-        } else {
+-            res = 0x7fff;
+-        }
+-    }
+-    return res;
+-}
+-
+-/* Perform 8-bit signed saturating addition.  */
+-static inline uint8_t add8_sat(uint8_t a, uint8_t b)
+-{
+-    uint8_t res;
+-
+-    res = a + b;
+-    if (((res ^ a) & 0x80) && !((a ^ b) & 0x80)) {
+-        if (a & 0x80) {
+-            res = 0x80;
+-        } else {
+-            res = 0x7f;
+-        }
+-    }
+-    return res;
+-}
+-
+-/* Perform 16-bit signed saturating subtraction.  */
+-static inline uint16_t sub16_sat(uint16_t a, uint16_t b)
+-{
+-    uint16_t res;
+-
+-    res = a - b;
+-    if (((res ^ a) & 0x8000) && ((a ^ b) & 0x8000)) {
+-        if (a & 0x8000) {
+-            res = 0x8000;
+-        } else {
+-            res = 0x7fff;
+-        }
+-    }
+-    return res;
+-}
+-
+-/* Perform 8-bit signed saturating subtraction.  */
+-static inline uint8_t sub8_sat(uint8_t a, uint8_t b)
+-{
+-    uint8_t res;
+-
+-    res = a - b;
+-    if (((res ^ a) & 0x80) && ((a ^ b) & 0x80)) {
+-        if (a & 0x80) {
+-            res = 0x80;
+-        } else {
+-            res = 0x7f;
+-        }
+-    }
+-    return res;
+-}
+-
+-#define ADD16(a, b, n) RESULT(add16_sat(a, b), n, 16);
+-#define SUB16(a, b, n) RESULT(sub16_sat(a, b), n, 16);
+-#define ADD8(a, b, n)  RESULT(add8_sat(a, b), n, 8);
+-#define SUB8(a, b, n)  RESULT(sub8_sat(a, b), n, 8);
+-#define PFX q
+-
+-#include "op_addsub.h"
+-
+-/* Unsigned saturating arithmetic.  */
+-static inline uint16_t add16_usat(uint16_t a, uint16_t b)
+-{
+-    uint16_t res;
+-    res = a + b;
+-    if (res < a) {
+-        res = 0xffff;
+-    }
+-    return res;
+-}
+-
+-static inline uint16_t sub16_usat(uint16_t a, uint16_t b)
+-{
+-    if (a > b) {
+-        return a - b;
+-    } else {
+-        return 0;
+-    }
+-}
+-
+-static inline uint8_t add8_usat(uint8_t a, uint8_t b)
+-{
+-    uint8_t res;
+-    res = a + b;
+-    if (res < a) {
+-        res = 0xff;
+-    }
+-    return res;
+-}
+-
+-static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
+-{
+-    if (a > b) {
+-        return a - b;
+-    } else {
+-        return 0;
+-    }
+-}
+-
+-#define ADD16(a, b, n) RESULT(add16_usat(a, b), n, 16);
+-#define SUB16(a, b, n) RESULT(sub16_usat(a, b), n, 16);
+-#define ADD8(a, b, n)  RESULT(add8_usat(a, b), n, 8);
+-#define SUB8(a, b, n)  RESULT(sub8_usat(a, b), n, 8);
+-#define PFX uq
+-
+-#include "op_addsub.h"
+-
+-/* Signed modulo arithmetic.  */
+-#define SARITH16(a, b, n, op) do { \
+-    int32_t sum; \
+-    sum = (int32_t)(int16_t)(a) op (int32_t)(int16_t)(b); \
+-    RESULT(sum, n, 16); \
+-    if (sum >= 0) \
+-        ge |= 3 << (n * 2); \
+-    } while (0)
+-
+-#define SARITH8(a, b, n, op) do { \
+-    int32_t sum; \
+-    sum = (int32_t)(int8_t)(a) op (int32_t)(int8_t)(b); \
+-    RESULT(sum, n, 8); \
+-    if (sum >= 0) \
+-        ge |= 1 << n; \
+-    } while (0)
+-
+-
+-#define ADD16(a, b, n) SARITH16(a, b, n, +)
+-#define SUB16(a, b, n) SARITH16(a, b, n, -)
+-#define ADD8(a, b, n)  SARITH8(a, b, n, +)
+-#define SUB8(a, b, n)  SARITH8(a, b, n, -)
+-#define PFX s
+-#define ARITH_GE
+-
+-#include "op_addsub.h"
+-
+-/* Unsigned modulo arithmetic.  */
+-#define ADD16(a, b, n) do { \
+-    uint32_t sum; \
+-    sum = (uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b); \
+-    RESULT(sum, n, 16); \
+-    if ((sum >> 16) == 1) \
+-        ge |= 3 << (n * 2); \
+-    } while (0)
+-
+-#define ADD8(a, b, n) do { \
+-    uint32_t sum; \
+-    sum = (uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b); \
+-    RESULT(sum, n, 8); \
+-    if ((sum >> 8) == 1) \
+-        ge |= 1 << n; \
+-    } while (0)
+-
+-#define SUB16(a, b, n) do { \
+-    uint32_t sum; \
+-    sum = (uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b); \
+-    RESULT(sum, n, 16); \
+-    if ((sum >> 16) == 0) \
+-        ge |= 3 << (n * 2); \
+-    } while (0)
+-
+-#define SUB8(a, b, n) do { \
+-    uint32_t sum; \
+-    sum = (uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b); \
+-    RESULT(sum, n, 8); \
+-    if ((sum >> 8) == 0) \
+-        ge |= 1 << n; \
+-    } while (0)
+-
+-#define PFX u
+-#define ARITH_GE
+-
+-#include "op_addsub.h"
+-
+-/* Halved signed arithmetic.  */
+-#define ADD16(a, b, n) \
+-  RESULT(((int32_t)(int16_t)(a) + (int32_t)(int16_t)(b)) >> 1, n, 16)
+-#define SUB16(a, b, n) \
+-  RESULT(((int32_t)(int16_t)(a) - (int32_t)(int16_t)(b)) >> 1, n, 16)
+-#define ADD8(a, b, n) \
+-  RESULT(((int32_t)(int8_t)(a) + (int32_t)(int8_t)(b)) >> 1, n, 8)
+-#define SUB8(a, b, n) \
+-  RESULT(((int32_t)(int8_t)(a) - (int32_t)(int8_t)(b)) >> 1, n, 8)
+-#define PFX sh
+-
+-#include "op_addsub.h"
+-
+-/* Halved unsigned arithmetic.  */
+-#define ADD16(a, b, n) \
+-  RESULT(((uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b)) >> 1, n, 16)
+-#define SUB16(a, b, n) \
+-  RESULT(((uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b)) >> 1, n, 16)
+-#define ADD8(a, b, n) \
+-  RESULT(((uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b)) >> 1, n, 8)
+-#define SUB8(a, b, n) \
+-  RESULT(((uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b)) >> 1, n, 8)
+-#define PFX uh
+-
+-#include "op_addsub.h"
+-
+-static inline uint8_t do_usad(uint8_t a, uint8_t b)
+-{
+-    if (a > b) {
+-        return a - b;
+-    } else {
+-        return b - a;
+-    }
+-}
+-
+-/* Unsigned sum of absolute byte differences.  */
+-uint32_t HELPER(usad8)(uint32_t a, uint32_t b)
+-{
+-    uint32_t sum;
+-    sum = do_usad(a, b);
+-    sum += do_usad(a >> 8, b >> 8);
+-    sum += do_usad(a >> 16, b >> 16);
+-    sum += do_usad(a >> 24, b >> 24);
+-    return sum;
+-}
+-
+-/* For ARMv6 SEL instruction.  */
+-uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
+-{
+-    uint32_t mask;
+-
+-    mask = 0;
+-    if (flags & 1) {
+-        mask |= 0xff;
+-    }
+-    if (flags & 2) {
+-        mask |= 0xff00;
+-    }
+-    if (flags & 4) {
+-        mask |= 0xff0000;
+-    }
+-    if (flags & 8) {
+-        mask |= 0xff000000;
+-    }
+-    return (a & mask) | (b & ~mask);
+-}
+-
+-/*
+- * CRC helpers.
+- * The upper bytes of val (above the number specified by 'bytes') must have
+- * been zeroed out by the caller.
+- */
+-uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+-{
+-    uint8_t buf[4];
+-
+-    stl_le_p(buf, val);
+-
+-    /* zlib crc32 converts the accumulator and output to one's complement.  */
+-    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
+-}
+-
+-uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+-{
+-    uint8_t buf[4];
+-
+-    stl_le_p(buf, val);
+-
+-    /* Linux crc32c converts the output to one's complement.  */
+-    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+-}
+ /*
+  * Return the exception level to which FP-disabled exceptions should
+diff --git a/target/arm/tcg/arith_helper.c b/target/arm/tcg/arith_helper.c
+new file mode 100644
+index XXXXXXX..XXXXXXX
+--- /dev/null
++++ b/target/arm/tcg/arith_helper.c
+@@ -XXX,XX +XXX,XX @@
++/*
++ * ARM generic helpers for various arithmetical operations.
++ *
++ * This code is licensed under the GNU GPL v2 or later.
++ *
++ * SPDX-License-Identifier: GPL-2.0-or-later
++ */
++#include "qemu/osdep.h"
++#include "cpu.h"
++#include "exec/helper-proto.h"
++#include "qemu/crc32c.h"
++#include <zlib.h> /* for crc32 */
++
++/*
++ * Note that signed overflow is undefined in C.  The following routines are
++ * careful to use unsigned types where modulo arithmetic is required.
++ * Failure to do so _will_ break on newer gcc.
++ */
++
++/* Signed saturating arithmetic.  */
++
++/* Perform 16-bit signed saturating addition.  */
++static inline uint16_t add16_sat(uint16_t a, uint16_t b)
++{
++    uint16_t res;
++
++    res = a + b;
++    if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
++        if (a & 0x8000) {
++            res = 0x8000;
++        } else {
++            res = 0x7fff;
++        }
++    }
++    return res;
++}
++
++/* Perform 8-bit signed saturating addition.  */
++static inline uint8_t add8_sat(uint8_t a, uint8_t b)
++{
++    uint8_t res;
++
++    res = a + b;
++    if (((res ^ a) & 0x80) && !((a ^ b) & 0x80)) {
++        if (a & 0x80) {
++            res = 0x80;
++        } else {
++            res = 0x7f;
++        }
++    }
++    return res;
++}
++
++/* Perform 16-bit signed saturating subtraction.  */
++static inline uint16_t sub16_sat(uint16_t a, uint16_t b)
++{
++    uint16_t res;
++
++    res = a - b;
++    if (((res ^ a) & 0x8000) && ((a ^ b) & 0x8000)) {
++        if (a & 0x8000) {
++            res = 0x8000;
++        } else {
++            res = 0x7fff;
++        }
++    }
++    return res;
++}
++
++/* Perform 8-bit signed saturating subtraction.  */
++static inline uint8_t sub8_sat(uint8_t a, uint8_t b)
++{
++    uint8_t res;
++
++    res = a - b;
++    if (((res ^ a) & 0x80) && ((a ^ b) & 0x80)) {
++        if (a & 0x80) {
++            res = 0x80;
++        } else {
++            res = 0x7f;
++        }
++    }
++    return res;
++}
++
++#define ADD16(a, b, n) RESULT(add16_sat(a, b), n, 16);
++#define SUB16(a, b, n) RESULT(sub16_sat(a, b), n, 16);
++#define ADD8(a, b, n)  RESULT(add8_sat(a, b), n, 8);
++#define SUB8(a, b, n)  RESULT(sub8_sat(a, b), n, 8);
++#define PFX q
++
++#include "op_addsub.c.inc"
++
++/* Unsigned saturating arithmetic.  */
++static inline uint16_t add16_usat(uint16_t a, uint16_t b)
++{
++    uint16_t res;
++    res = a + b;
++    if (res < a) {
++        res = 0xffff;
++    }
++    return res;
++}
++
++static inline uint16_t sub16_usat(uint16_t a, uint16_t b)
++{
++    if (a > b) {
++        return a - b;
++    } else {
++        return 0;
++    }
++}
++
++static inline uint8_t add8_usat(uint8_t a, uint8_t b)
++{
++    uint8_t res;
++    res = a + b;
++    if (res < a) {
++        res = 0xff;
++    }
++    return res;
++}
++
++static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
++{
++    if (a > b) {
++        return a - b;
++    } else {
++        return 0;
++    }
++}
++
++#define ADD16(a, b, n) RESULT(add16_usat(a, b), n, 16);
++#define SUB16(a, b, n) RESULT(sub16_usat(a, b), n, 16);
++#define ADD8(a, b, n)  RESULT(add8_usat(a, b), n, 8);
++#define SUB8(a, b, n)  RESULT(sub8_usat(a, b), n, 8);
++#define PFX uq
++
++#include "op_addsub.c.inc"
++
++/* Signed modulo arithmetic.  */
++#define SARITH16(a, b, n, op) do { \
++    int32_t sum; \
++    sum = (int32_t)(int16_t)(a) op (int32_t)(int16_t)(b); \
++    RESULT(sum, n, 16); \
++    if (sum >= 0) \
++        ge |= 3 << (n * 2); \
++    } while (0)
++
++#define SARITH8(a, b, n, op) do { \
++    int32_t sum; \
++    sum = (int32_t)(int8_t)(a) op (int32_t)(int8_t)(b); \
++    RESULT(sum, n, 8); \
++    if (sum >= 0) \
++        ge |= 1 << n; \
++    } while (0)
++
++
++#define ADD16(a, b, n) SARITH16(a, b, n, +)
++#define SUB16(a, b, n) SARITH16(a, b, n, -)
++#define ADD8(a, b, n)  SARITH8(a, b, n, +)
++#define SUB8(a, b, n)  SARITH8(a, b, n, -)
++#define PFX s
++#define ARITH_GE
++
++#include "op_addsub.c.inc"
++
++/* Unsigned modulo arithmetic.  */
++#define ADD16(a, b, n) do { \
++    uint32_t sum; \
++    sum = (uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b); \
++    RESULT(sum, n, 16); \
++    if ((sum >> 16) == 1) \
++        ge |= 3 << (n * 2); \
++    } while (0)
++
++#define ADD8(a, b, n) do { \
++    uint32_t sum; \
++    sum = (uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b); \
++    RESULT(sum, n, 8); \
++    if ((sum >> 8) == 1) \
++        ge |= 1 << n; \
++    } while (0)
++
++#define SUB16(a, b, n) do { \
++    uint32_t sum; \
++    sum = (uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b); \
++    RESULT(sum, n, 16); \
++    if ((sum >> 16) == 0) \
++        ge |= 3 << (n * 2); \
++    } while (0)
++
++#define SUB8(a, b, n) do { \
++    uint32_t sum; \
++    sum = (uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b); \
++    RESULT(sum, n, 8); \
++    if ((sum >> 8) == 0) \
++        ge |= 1 << n; \
++    } while (0)
++
++#define PFX u
++#define ARITH_GE
++
++#include "op_addsub.c.inc"
++
++/* Halved signed arithmetic.  */
++#define ADD16(a, b, n) \
++  RESULT(((int32_t)(int16_t)(a) + (int32_t)(int16_t)(b)) >> 1, n, 16)
++#define SUB16(a, b, n) \
++  RESULT(((int32_t)(int16_t)(a) - (int32_t)(int16_t)(b)) >> 1, n, 16)
++#define ADD8(a, b, n) \
++  RESULT(((int32_t)(int8_t)(a) + (int32_t)(int8_t)(b)) >> 1, n, 8)
++#define SUB8(a, b, n) \
++  RESULT(((int32_t)(int8_t)(a) - (int32_t)(int8_t)(b)) >> 1, n, 8)
++#define PFX sh
++
++#include "op_addsub.c.inc"
++
++/* Halved unsigned arithmetic.  */
++#define ADD16(a, b, n) \
++  RESULT(((uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b)) >> 1, n, 16)
++#define SUB16(a, b, n) \
++  RESULT(((uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b)) >> 1, n, 16)
++#define ADD8(a, b, n) \
++  RESULT(((uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b)) >> 1, n, 8)
++#define SUB8(a, b, n) \
++  RESULT(((uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b)) >> 1, n, 8)
++#define PFX uh
++
++#include "op_addsub.c.inc"
++
++static inline uint8_t do_usad(uint8_t a, uint8_t b)
++{
++    if (a > b) {
++        return a - b;
++    } else {
++        return b - a;
++    }
++}
++
++/* Unsigned sum of absolute byte differences.  */
++uint32_t HELPER(usad8)(uint32_t a, uint32_t b)
++{
++    uint32_t sum;
++    sum = do_usad(a, b);
++    sum += do_usad(a >> 8, b >> 8);
++    sum += do_usad(a >> 16, b >> 16);
++    sum += do_usad(a >> 24, b >> 24);
++    return sum;
++}
++
++/* For ARMv6 SEL instruction.  */
++uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
++{
++    uint32_t mask;
++
++    mask = 0;
++    if (flags & 1) {
++        mask |= 0xff;
++    }
++    if (flags & 2) {
++        mask |= 0xff00;
++    }
++    if (flags & 4) {
++        mask |= 0xff0000;
++    }
++    if (flags & 8) {
++        mask |= 0xff000000;
++    }
++    return (a & mask) | (b & ~mask);
++}
++
++/*
++ * CRC helpers.
++ * The upper bytes of val (above the number specified by 'bytes') must have
++ * been zeroed out by the caller.
++ */
++uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
++{
++    uint8_t buf[4];
++
++    stl_le_p(buf, val);
++
++    /* zlib crc32 converts the accumulator and output to one's complement.  */
++    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
++}
++
++uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
++{
++    uint8_t buf[4];
++
++    stl_le_p(buf, val);
++
++    /* Linux crc32c converts the output to one's complement.  */
++    return crc32c(acc, buf, bytes) ^ 0xffffffff;
++}
+diff --git a/target/arm/op_addsub.h b/target/arm/tcg/op_addsub.c.inc
+similarity index 100%
+rename from target/arm/op_addsub.h
+rename to target/arm/tcg/op_addsub.c.inc
+diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/tcg/meson.build
++++ b/target/arm/tcg/meson.build
+@@ -XXX,XX +XXX,XX @@ arm_ss.add(files(
+   'tlb_helper.c',
+   'vec_helper.c',
+   'tlb-insns.c',
++  'arith_helper.c',
+ ))
+ arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
+--
+.34.1

-New patch
+[PULL 08/11] target/arm: add new property to select pauth-qarma5
+From: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Before changing the default pauth algorithm, we need to make sure the
+current default (QARMA5) can still be selected.
+$ qemu-system-aarch64 -cpu max,pauth-qarma5=on ...
+Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20241219183211.3493974-2-pierrick.bouvier@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/cpu-features.rst |  5 ++++-
+ target/arm/cpu.h                 |  1 +
+ target/arm/arm-qmp-cmds.c        |  2 +-
+ target/arm/cpu64.c               | 20 ++++++++++++++------
+ tests/qtest/arm-cpu-features.c   | 15 +++++++++++----
+files changed, 31 insertions(+), 12 deletions(-)
+diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/cpu-features.rst
++++ b/docs/system/arm/cpu-features.rst
+@@ -XXX,XX +XXX,XX @@ Below is the list of TCG VCPU features and their descriptions.
+ ``pauth-qarma3``
+   When ``pauth`` is enabled, select the architected QARMA3 algorithm.
+-Without either ``pauth-impdef`` or ``pauth-qarma3`` enabled,
++``pauth-qarma5``
++  When ``pauth`` is enabled, select the architected QARMA5 algorithm.
++
++Without ``pauth-impdef``, ``pauth-qarma3`` or ``pauth-qarma5`` enabled,
+ the architected QARMA5 algorithm is used.  The architected QARMA5
+ and QARMA3 algorithms have good cryptographic properties, but can
+ be quite slow to emulate.  The impdef algorithm used by QEMU is
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
+     bool prop_pauth;
+     bool prop_pauth_impdef;
+     bool prop_pauth_qarma3;
++    bool prop_pauth_qarma5;
+     bool prop_lpa2;
+     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
+diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/arm-qmp-cmds.c
++++ b/target/arm/arm-qmp-cmds.c
+@@ -XXX,XX +XXX,XX @@ static const char *cpu_model_advertised_features[] = {
+     "sve640", "sve768", "sve896", "sve1024", "sve1152", "sve1280",
+     "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
+     "kvm-no-adjvtime", "kvm-steal-time",
+-    "pauth", "pauth-impdef", "pauth-qarma3",
++    "pauth", "pauth-impdef", "pauth-qarma3", "pauth-qarma5",
+     NULL
+ };
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu64.c
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
+         }
+         if (cpu->prop_pauth) {
+-            if (cpu->prop_pauth_impdef && cpu->prop_pauth_qarma3) {
++            if ((cpu->prop_pauth_impdef && cpu->prop_pauth_qarma3) ||
++                (cpu->prop_pauth_impdef && cpu->prop_pauth_qarma5) ||
++                (cpu->prop_pauth_qarma3 && cpu->prop_pauth_qarma5)) {
+                 error_setg(errp,
+-                           "cannot enable both pauth-impdef and pauth-qarma3");
++                           "cannot enable pauth-impdef, pauth-qarma3 and "
++                           "pauth-qarma5 at the same time");
+                 return;
+             }
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
+             } else if (cpu->prop_pauth_qarma3) {
+                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, APA3, features);
+                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, GPA3, 1);
+-            } else {
++            } else { /* default is pauth-qarma5 */
+                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
+                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
+             }
+-        } else if (cpu->prop_pauth_impdef || cpu->prop_pauth_qarma3) {
+-            error_setg(errp, "cannot enable pauth-impdef or "
+-                       "pauth-qarma3 without pauth");
++        } else if (cpu->prop_pauth_impdef ||
++                   cpu->prop_pauth_qarma3 ||
++                   cpu->prop_pauth_qarma5) {
++            error_setg(errp, "cannot enable pauth-impdef, pauth-qarma3 or "
++                       "pauth-qarma5 without pauth");
+             error_append_hint(errp, "Add pauth=on to the CPU property list.\n");
+         }
+     }
+@@ -XXX,XX +XXX,XX @@ static const Property arm_cpu_pauth_impdef_property =
+     DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
+ static const Property arm_cpu_pauth_qarma3_property =
+     DEFINE_PROP_BOOL("pauth-qarma3", ARMCPU, prop_pauth_qarma3, false);
++static Property arm_cpu_pauth_qarma5_property =
++    DEFINE_PROP_BOOL("pauth-qarma5", ARMCPU, prop_pauth_qarma5, false);
+ void aarch64_add_pauth_properties(Object *obj)
+ {
+@@ -XXX,XX +XXX,XX @@ void aarch64_add_pauth_properties(Object *obj)
+     } else {
+         qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_impdef_property);
+         qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma3_property);
++        qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma5_property);
+     }
+ }
+diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/qtest/arm-cpu-features.c
++++ b/tests/qtest/arm-cpu-features.c
+@@ -XXX,XX +XXX,XX @@ static void pauth_tests_default(QTestState *qts, const char *cpu_type)
+     assert_has_feature_enabled(qts, cpu_type, "pauth");
+     assert_has_feature_disabled(qts, cpu_type, "pauth-impdef");
+     assert_has_feature_disabled(qts, cpu_type, "pauth-qarma3");
++    assert_has_feature_disabled(qts, cpu_type, "pauth-qarma5");
+     assert_set_feature(qts, cpu_type, "pauth", false);
+     assert_set_feature(qts, cpu_type, "pauth", true);
+     assert_set_feature(qts, cpu_type, "pauth-impdef", true);
+     assert_set_feature(qts, cpu_type, "pauth-impdef", false);
+     assert_set_feature(qts, cpu_type, "pauth-qarma3", true);
+     assert_set_feature(qts, cpu_type, "pauth-qarma3", false);
++    assert_set_feature(qts, cpu_type, "pauth-qarma5", true);
++    assert_set_feature(qts, cpu_type, "pauth-qarma5", false);
+     assert_error(qts, cpu_type,
+-                 "cannot enable pauth-impdef or pauth-qarma3 without pauth",
++                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
+                  "{ 'pauth': false, 'pauth-impdef': true }");
+     assert_error(qts, cpu_type,
+-                 "cannot enable pauth-impdef or pauth-qarma3 without pauth",
++                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
+                  "{ 'pauth': false, 'pauth-qarma3': true }");
+     assert_error(qts, cpu_type,
+-                 "cannot enable both pauth-impdef and pauth-qarma3",
+-                 "{ 'pauth': true, 'pauth-impdef': true, 'pauth-qarma3': true }");
++                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
++                 "{ 'pauth': false, 'pauth-qarma5': true }");
++    assert_error(qts, cpu_type,
++                 "cannot enable pauth-impdef, pauth-qarma3 and pauth-qarma5 at the same time",
++                 "{ 'pauth': true, 'pauth-impdef': true, 'pauth-qarma3': true,"
++                 "  'pauth-qarma5': true }");
+ }
+ static void test_query_cpu_model_expansion(const void *data)
+--
+.34.1

-New patch
+[PULL 09/11] tests/tcg/aarch64: force qarma5 for pauth-3 test
+The pauth-3 test explicitly tests that a computation of the
+pointer-authentication produces the expected result.  This means that
+it must be run with the QARMA5 algorithm.
+Explicitly set the pauth algorithm when running this test, so that it
+doesn't break when we change the default algorithm the 'max' CPU
+uses.
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ tests/tcg/aarch64/Makefile.softmmu-target | 3 +++
+file changed, 3 insertions(+)
+diff --git a/tests/tcg/aarch64/Makefile.softmmu-target b/tests/tcg/aarch64/Makefile.softmmu-target
+index XXXXXXX..XXXXXXX 100644
+--- a/tests/tcg/aarch64/Makefile.softmmu-target
++++ b/tests/tcg/aarch64/Makefile.softmmu-target
+@@ -XXX,XX +XXX,XX @@ EXTRA_RUNS+=run-memory-replay
+ ifneq ($(CROSS_CC_HAS_ARMV8_3),)
+ pauth-3: CFLAGS += $(CROSS_CC_HAS_ARMV8_3)
++# This test explicitly checks the output of the pauth operation so we
++# must force the use of the QARMA5 algorithm for it.
++run-pauth-3: QEMU_BASE_MACHINE=-M virt -cpu max,pauth-qarma5=on -display none
+ else
+ pauth-3:
+     $(call skip-test, "BUILD of $@", "missing compiler support")
+--
+.34.1

-New patch
+[PULL 10/11] target/arm: change default pauth algorithm to impdef
+From: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Pointer authentication on aarch64 is pretty expensive (up to 50% of
+execution time) when running a virtual machine with tcg and -cpu max
+(which enables pauth=on).
+The advice is always: use pauth-impdef=on.
+Our documentation even mentions it "by default" in
+docs/system/introduction.rst.
+Thus, we change the default to impdef. This does not
+affect kvm or hvf acceleration, since the pauth algorithm used there is
+the one from the host cpu.
+This change is backward compatible, in terms of cli, with previous
+versions, as the semantics of -cpu max,pauth-impdef=on and -cpu
+max,pauth-qarma3=on are preserved.
+The new option introduced in the previous patch, which matches the old
+default, is -cpu max,pauth-qarma5=on.
+It is backward compatible with migration as well, by defining a backcompat
+property that will use qarma5 by default for virt machines <= 9.2.
+Tested by saving and restoring a vm from qemu 9.2.0 into qemu-master
+(10.0) for cpus neoverse-n2 and max.
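
The selection rule described above can be sketched as a small standalone
function. This is only an illustration of the decision order, not the
QEMU code itself: the enum, the function name and the plain bool
parameters are hypothetical stand-ins for the CPU properties, and the
sketch assumes the caller has already rejected combinations where more
than one algorithm property is set.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch only. */
enum sketch_pauth_alg {
    SKETCH_PAUTH_QARMA5,
    SKETCH_PAUTH_QARMA3,
    SKETCH_PAUTH_IMPDEF,
};

/*
 * Pick the pauth algorithm.  Assumes at most one of qarma5, qarma3 and
 * impdef is true.  backcompat_qarma5 models the back-compat property
 * that older versioned machine types set to keep the old default.
 */
static enum sketch_pauth_alg sketch_select_pauth(bool qarma5, bool qarma3,
                                                 bool impdef,
                                                 bool backcompat_qarma5)
{
    bool use_default = !qarma5 && !qarma3 && !impdef;

    if (qarma5 || (use_default && backcompat_qarma5)) {
        return SKETCH_PAUTH_QARMA5;
    }
    if (qarma3) {
        return SKETCH_PAUTH_QARMA3;
    }
    /* Explicit impdef, or the new default without back-compat. */
    return SKETCH_PAUTH_IMPDEF;
}
```

With no algorithm property set, the result depends only on the
back-compat flag: QARMA5 when it is set, impdef otherwise.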
+Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
+Message-id: 20241219183211.3493974-3-pierrick.bouvier@linaro.org
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/cpu-features.rst |  2 +-
+ docs/system/introduction.rst     |  2 +-
+ target/arm/cpu.h                 |  3 +++
+ hw/core/machine.c                |  4 +++-
+ target/arm/cpu.c                 |  2 ++
+ target/arm/cpu64.c               | 22 ++++++++++++++++------
+files changed, 26 insertions(+), 9 deletions(-)
+diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/cpu-features.rst
++++ b/docs/system/arm/cpu-features.rst
+@@ -XXX,XX +XXX,XX @@ Below is the list of TCG VCPU features and their descriptions.
+   When ``pauth`` is enabled, select the architected QARMA5 algorithm.
+ Without ``pauth-impdef``, ``pauth-qarma3`` or ``pauth-qarma5`` enabled,
+-the architected QARMA5 algorithm is used.  The architected QARMA5
++the QEMU impdef algorithm is used.  The architected QARMA5
+ and QARMA3 algorithms have good cryptographic properties, but can
+ be quite slow to emulate.  The impdef algorithm used by QEMU is
+ non-cryptographic but significantly faster.
+diff --git a/docs/system/introduction.rst b/docs/system/introduction.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/introduction.rst
++++ b/docs/system/introduction.rst
+@@ -XXX,XX +XXX,XX @@ would default to it anyway.
+ .. code::
+- -cpu max,pauth-impdef=on \
++ -cpu max \
+  -smp 4 \
+  -accel tcg \
+diff --git a/target/arm/cpu.h b/target/arm/cpu.h
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.h
++++ b/target/arm/cpu.h
+@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
+     /* QOM property to indicate we should use the back-compat CNTFRQ default */
+     bool backcompat_cntfrq;
++    /* QOM property to indicate we should use the back-compat QARMA5 default */
++    bool backcompat_pauth_default_use_qarma5;
++
+     /* Specify the number of cores in this CPU cluster. Used for the L2CTLR
+      * register.
+      */
+diff --git a/hw/core/machine.c b/hw/core/machine.c
+index XXXXXXX..XXXXXXX 100644
+--- a/hw/core/machine.c
++++ b/hw/core/machine.c
+@@ -XXX,XX +XXX,XX @@
+ #include "hw/virtio/virtio-iommu.h"
+ #include "audio/audio.h"
+-GlobalProperty hw_compat_9_2[] = {};
++GlobalProperty hw_compat_9_2[] = {
++    {"arm-cpu", "backcompat-pauth-default-use-qarma5", "true"},
++};
+ const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
+ GlobalProperty hw_compat_9_1[] = {
+diff --git a/target/arm/cpu.c b/target/arm/cpu.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu.c
++++ b/target/arm/cpu.c
+@@ -XXX,XX +XXX,XX @@ static const Property arm_cpu_properties[] = {
+     DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
+     /* True to default to the backward-compat old CNTFRQ rather than 1Ghz */
+     DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU, backcompat_cntfrq, false),
++    DEFINE_PROP_BOOL("backcompat-pauth-default-use-qarma5", ARMCPU,
++                      backcompat_pauth_default_use_qarma5, false),
+ };
+ static const gchar *arm_gdb_arch_name(CPUState *cs)
+diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
+index XXXXXXX..XXXXXXX 100644
+--- a/target/arm/cpu64.c
++++ b/target/arm/cpu64.c
+@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
+                 return;
+             }
+-            if (cpu->prop_pauth_impdef) {
+-                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, features);
+-                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 1);
++            bool use_default = !cpu->prop_pauth_qarma5 &&
++                               !cpu->prop_pauth_qarma3 &&
++                               !cpu->prop_pauth_impdef;
++
++            if (cpu->prop_pauth_qarma5 ||
++                (use_default &&
++                 cpu->backcompat_pauth_default_use_qarma5)) {
++                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
++                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
+             } else if (cpu->prop_pauth_qarma3) {
+                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, APA3, features);
+                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, GPA3, 1);
+-            } else { /* default is pauth-qarma5 */
+-                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
+-                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
++            } else if (cpu->prop_pauth_impdef ||
++                       (use_default &&
++                        !cpu->backcompat_pauth_default_use_qarma5)) {
++                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, API, features);
++                isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPI, 1);
++            } else {
++                g_assert_not_reached();
+             }
+         } else if (cpu->prop_pauth_impdef ||
+                    cpu->prop_pauth_qarma3 ||
+--
+.34.1

-New patch
+[PULL 11/11] docs/system/arm/virt: mention specific migration information
+From: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+Message-id: 20241219183211.3493974-4-pierrick.bouvier@linaro.org
+[PMM: Removed a paragraph about using non-versioned models.]
+Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
+---
+ docs/system/arm/virt.rst | 4 ++++
+file changed, 4 insertions(+)
+diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
+index XXXXXXX..XXXXXXX 100644
+--- a/docs/system/arm/virt.rst
++++ b/docs/system/arm/virt.rst
+@@ -XXX,XX +XXX,XX @@ of the 5.0 release and ``virt-5.0`` of the 5.1 release. Migration
+ is not guaranteed to work between different QEMU releases for
+ the non-versioned ``virt`` machine type.
++VM migration is not guaranteed when using ``-cpu max``, as features
++supported may change between QEMU versions.  To ensure your VM can be
++migrated, it is recommended to use another cpu model instead.
++
+ Supported devices
+ """""""""""""""""
+--
+.34.1

Last minute pullreq with one patch, fixing the GICv3 ICH_MISR_EL2.LRENP
calculation. I went back-and-forth on whether to put this in, but:
 * it's an effective regression from 6.1 (the bug itself has been
   present since before then, but it was previously masked by the
   other bug which we fixed in 9cee1efe92)
 * I just realized it could cause a screaming maintenance interrupt
   even for hypervisors like KVM that don't set LRENPIE

On the other hand this is very late and we haven't seen it be a
problem with any guest except Qualcomm's hypervisor. So if you want
to decide it's better not going in that's OK too.

Tested on the gitlab CI and with a local test of nested KVM.

-- PMM

The following changes since commit 7635eff97104242d618400e4b6746d0a5c97af82:

Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2021-12-06 11:18:06 -0800)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20211207

for you to fetch changes up to 2958e5150dfa297dd5a51fe57a29156b8744f07f:

gicv3: fix ICH_MISR's LRENP computation (2021-12-07 15:30:08 +0000)

----------------------------------------------------------------
target-arm queue:
 * Fix calculation of ICH_MISR_EL2.LRENP to avoid incorrect generation
   of maintenance interrupts

----------------------------------------------------------------
Damien Hedde (1):
      gicv3: fix ICH_MISR's LRENP computation

hw/intc/arm_gicv3_cpuif.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

From: Damien Hedde <damien.hedde@greensocs.com>

According to the "Arm Generic Interrupt Controller Architecture
Specification GIC architecture version 3 and 4" (version G: page 345
for aarch64 or 509 for aarch32):
LRENP bit of ICH_MISR is set when ICH_HCR.LRENPIE==1 and
ICH_HCR.EOIcount is non-zero.

When only LRENPIE was set (and EOI count was zero), the LRENP bit was
wrongly set and MISR value was wrong.
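
As a rough sketch of the difference, the two predicates can be compared
side by side. The macro and function names here are hypothetical, and the
bit positions (LRENPIE at bit 2, EOIcount in bits [31:27]) are an
assumption of this sketch rather than something taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define SKETCH_HCR_LRENPIE        (1ULL << 2)
#define SKETCH_HCR_EOICOUNT_MASK  (0x1fULL << 27)

/* Old test: true when either LRENPIE or EOIcount is non-zero. */
static bool sketch_lrenp_old(uint64_t hcr)
{
    return (hcr & (SKETCH_HCR_LRENPIE | SKETCH_HCR_EOICOUNT_MASK)) != 0;
}

/* Fixed test: true only when LRENPIE is set and EOIcount is non-zero. */
static bool sketch_lrenp_fixed(uint64_t hcr)
{
    return (hcr & SKETCH_HCR_LRENPIE) != 0 &&
           (hcr & SKETCH_HCR_EOICOUNT_MASK) != 0;
}
```

With only LRENPIE set, the old test reports LRENP while the fixed one
does not.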

As an additional consequence, if a hypervisor set ICH_HCR.LRENPIE,
the maintenance interrupt was constantly fired. This has happened since
commit 9cee1efe92 ("hw/intc: Set GIC maintenance interrupt level to only 0 or 1"),
which fixed another bug in the maintenance interrupt (the most significant
bits of misr, including this one, were ignored in the interrupt trigger).

Fixes: 83f036fe3d ("hw/intc/arm_gicv3: Add accessors for ICH_ system registers")
Signed-off-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20211207094427.3473-1-damien.hedde@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gicv3_cpuif.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -XXX,XX +XXX,XX @@ static uint32_t maintenance_interrupt_state(GICv3CPUState *cs)
     /* Scan list registers and fill in the U, NP and EOI bits */
     eoi_maintenance_interrupt_state(cs, &value);
 
-    if (cs->ich_hcr_el2 & (ICH_HCR_EL2_LRENPIE | ICH_HCR_EL2_EOICOUNT_MASK)) {
+    if ((cs->ich_hcr_el2 & ICH_HCR_EL2_LRENPIE) &&
+        (cs->ich_hcr_el2 & ICH_HCR_EL2_EOICOUNT_MASK)) {
         value |= ICH_MISR_EL2_LRENP;
     }
 
-- 
2.25.1

The following changes since commit 3214bec13d8d4c40f707d21d8350d04e4123ae97:

Merge tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu into staging (2025-01-10 13:39:19 -0500)

are available in the Git repository at:

https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20250113

for you to fetch changes up to 435d260e7ec5ff9c79e3e62f1d66ec82d2d691ae:

docs/system/arm/virt: mention specific migration information (2025-01-13 12:35:35 +0000)

----------------------------------------------------------------
target-arm queue:
 * hw/arm_sysctl: fix extracting 31th bit of val
 * hw/misc: cast rpm to uint64_t
 * tests/qtest/boot-serial-test: Improve ASM
 * target/arm: Move minor arithmetic helpers out of helper.c
 * target/arm: change default pauth algorithm to impdef

----------------------------------------------------------------
Anastasia Belova (1):
      hw/arm_sysctl: fix extracting 31th bit of val

Peter Maydell (2):
      target/arm: Move minor arithmetic helpers out of helper.c
      tests/tcg/aarch64: force qarma5 for pauth-3 test

Philippe Mathieu-Daudé (4):
      tests/qtest/boot-serial-test: Improve ASM comments of PL011 tests
      tests/qtest/boot-serial-test: Reduce for() loop in PL011 tests
      tests/qtest/boot-serial-test: Reorder pair of instructions in PL011 test
      tests/qtest/boot-serial-test: Initialize PL011 Control register

Pierrick Bouvier (3):
      target/arm: add new property to select pauth-qarma5
      target/arm: change default pauth algorithm to impdef
      docs/system/arm/virt: mention specific migration information

Tigran Sogomonian (1):
      hw/misc: cast rpm to uint64_t

docs/system/arm/cpu-features.rst                |   7 +-
 docs/system/arm/virt.rst                        |   4 +
 docs/system/introduction.rst                    |   2 +-
 target/arm/cpu.h                                |   4 +
 hw/core/machine.c                               |   4 +-
 hw/misc/arm_sysctl.c                            |   2 +-
 hw/misc/npcm7xx_mft.c                           |   5 +-
 target/arm/arm-qmp-cmds.c                       |   2 +-
 target/arm/cpu.c                                |   2 +
 target/arm/cpu64.c                              |  38 ++-
 target/arm/helper.c                             | 285 -----------------------
 target/arm/tcg/arith_helper.c                   | 296 ++++++++++++++++++++++++
 tests/qtest/arm-cpu-features.c                  |  15 +-
 tests/qtest/boot-serial-test.c                  |  23 +-
 target/arm/{op_addsub.h => tcg/op_addsub.c.inc} |   0
 target/arm/tcg/meson.build                      |   1 +
 tests/tcg/aarch64/Makefile.softmmu-target       |   3 +
 17 files changed, 377 insertions(+), 316 deletions(-)
 create mode 100644 target/arm/tcg/arith_helper.c
 rename target/arm/{op_addsub.h => tcg/op_addsub.c.inc} (100%)

From: Anastasia Belova <abelova@astralinux.ru>

1 << 31 is cast to uint64_t when it is bitwise ANDed with val.
So the mask may become 0xffffffff80000000, but only
bit 31 (the "start" bit) is required.

This is not possible in practice because the MemoryRegionOps
uses the default max access size of 4 bytes and so none
of the upper bytes of val will be set, but the bitfield
extract API is clearer anyway.

Use the bitfield extract() API instead.
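
The widening effect can be shown in isolation. This is a minimal sketch,
not the QEMU code: sketch_extract64() is a hypothetical stand-in written
for this example, and the mask is spelled INT32_MIN (the value 1 << 31
produces on the two's complement targets in question) so that the sketch
itself avoids a signed-shift overflow.

```c
#include <stdint.h>

/* The int mask, widened to 64 bits: the sign bit is copied upwards. */
static uint64_t sketch_widened_mask(void)
{
    int32_t mask = INT32_MIN;   /* bit pattern 0x80000000 */
    return (uint64_t)mask;      /* 0xffffffff80000000 after conversion */
}

/*
 * Stand-in for a bitfield extract helper: return 'length' bits of
 * 'value' starting at bit 'start'.  Requires 1 <= length <= 64 - start.
 */
static uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}
```

A value with only bit 40 set passes the "val & mask" test against the
widened mask, whereas extracting bit 31 alone yields 0 for it.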

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Message-id: 20241220125429.7552-1-abelova@astralinux.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: add clarification to commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/arm_sysctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/arm_sysctl.c b/hw/misc/arm_sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/arm_sysctl.c
+++ b/hw/misc/arm_sysctl.c
@@ -XXX,XX +XXX,XX @@ static void arm_sysctl_write(void *opaque, hwaddr offset,
          * as zero.
          */
         s->sys_cfgctrl = val & ~((3 << 18) | (1 << 31));
-        if (val & (1 << 31)) {
+        if (extract64(val, 31, 1)) {
             /* Start bit set -- actually do something */
             unsigned int dcc = extract32(s->sys_cfgctrl, 26, 4);
             unsigned int function = extract32(s->sys_cfgctrl, 20, 6);
-- 
2.34.1

From: Tigran Sogomonian <tsogomonian@astralinux.ru>

The value of the arithmetic expression
'rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION' is subject
to overflow because its operands are not cast to
a larger data type before the arithmetic is performed. Thus, we need
to cast rpm to uint64_t.
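
A small standalone sketch of the wrap-around follows. It assumes rpm is a
32-bit unsigned value and uses a hypothetical pulses-per-revolution
constant of 2. Both are assumptions made for this illustration.

```c
#include <stdint.h>

/* Hypothetical constant for this sketch. */
#define SKETCH_PULSES_PER_REV 2u

/* Product computed in 32 bits: wraps modulo 2^32 before being widened. */
static uint64_t sketch_pulses_narrow(uint32_t rpm)
{
    return rpm * SKETCH_PULSES_PER_REV;
}

/* Product computed in 64 bits: the cast widens rpm before multiplying. */
static uint64_t sketch_pulses_wide(uint32_t rpm)
{
    return (uint64_t)rpm * SKETCH_PULSES_PER_REV;
}
```

For rpm = 0x80000000 the narrow product wraps to 0, which would then be
used as a divisor, while the widened product is 0x100000000.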

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Tigran Sogomonian <tsogomonian@astralinux.ru>
Reviewed-by: Patrick Leis <venture@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Message-id: 20241226130311.1349-1-tsogomonian@astralinux.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/misc/npcm7xx_mft.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/misc/npcm7xx_mft.c b/hw/misc/npcm7xx_mft.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/misc/npcm7xx_mft.c
+++ b/hw/misc/npcm7xx_mft.c
@@ -XXX,XX +XXX,XX @@ static NPCM7xxMFTCaptureState npcm7xx_mft_compute_cnt(
          * RPM = revolution/min. The time for one revlution (in ns) is
          * MINUTE_TO_NANOSECOND / RPM.
          */
-        count = clock_ns_to_ticks(clock, (60 * NANOSECONDS_PER_SECOND) /
-            (rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION));
+        count = clock_ns_to_ticks(clock,
+            (uint64_t)(60 * NANOSECONDS_PER_SECOND) /
+            ((uint64_t)rpm * NPCM7XX_MFT_PULSE_PER_REVOLUTION));
     }
 
     if (count > NPCM7XX_MFT_MAX_CNT) {
-- 
2.34.1

From: Philippe Mathieu-Daudé <philmd@linaro.org>

Re-indent the ASM comments, adding the 'loop:' label.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/qtest/boot-serial-test.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/qtest/boot-serial-test.c
+++ b/tests/qtest/boot-serial-test.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
 };
 
 static const uint8_t bios_raspi2[] = {
-    0x08, 0x30, 0x9f, 0xe5,                 /* ldr   r3,[pc,#8]    Get base */
-    0x54, 0x20, 0xa0, 0xe3,                 /* mov     r2,#'T' */
-    0x00, 0x20, 0xc3, 0xe5,                 /* strb    r2,[r3] */
-    0xfb, 0xff, 0xff, 0xea,                 /* b       loop */
-    0x00, 0x10, 0x20, 0x3f,                 /* 0x3f201000 = UART0 base addr */
+    0x08, 0x30, 0x9f, 0xe5,                 /* loop:  ldr     r3, [pc, #8]   Get &UART0 */
+    0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
+    0x00, 0x20, 0xc3, 0xe5,                 /*        strb    r2, [r3]       *TXDAT = 'T' */
+    0xfb, 0xff, 0xff, 0xea,                 /*        b       -12            (loop) */
+    0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
 };
 
 static const uint8_t kernel_aarch64[] = {
-    0x81, 0x0a, 0x80, 0x52,                 /* mov     w1, #0x54 */
-    0x02, 0x20, 0xa1, 0xd2,                 /* mov     x2, #0x9000000 */
-    0x41, 0x00, 0x00, 0x39,                 /* strb    w1, [x2] */
-    0xfd, 0xff, 0xff, 0x17,                 /* b       -12 (loop) */
+    0x81, 0x0a, 0x80, 0x52,                 /* loop:  mov    w1, #'T' */
+    0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
+    0x41, 0x00, 0x00, 0x39,                 /*        strb   w1, [x2]        *TXDAT = 'T' */
+    0xfd, 0xff, 0xff, 0x17,                 /*        b      -12             (loop) */
 };
 
 static const uint8_t kernel_nrf51[] = {
-- 
2.34.1

From: Philippe Mathieu-Daudé <philmd@linaro.org>

Since registers are not modified, we don't need
to refill their values. Directly jump to the previous
store instruction to keep filling the TXDAT register.

The equivalent C code remains:

    while (true) {
        *UART_DATA = 'T';
    }

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/qtest/boot-serial-test.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/qtest/boot-serial-test.c
+++ b/tests/qtest/boot-serial-test.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
 };
 
 static const uint8_t bios_raspi2[] = {
-    0x08, 0x30, 0x9f, 0xe5,                 /* loop:  ldr     r3, [pc, #8]   Get &UART0 */
+    0x08, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #8]   Get &UART0 */
     0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
-    0x00, 0x20, 0xc3, 0xe5,                 /*        strb    r2, [r3]       *TXDAT = 'T' */
-    0xfb, 0xff, 0xff, 0xea,                 /*        b       -12            (loop) */
+    0x00, 0x20, 0xc3, 0xe5,                 /* loop:  strb    r2, [r3]       *TXDAT = 'T' */
+    0xff, 0xff, 0xff, 0xea,                 /*        b       -4             (loop) */
     0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
 };
 
 static const uint8_t kernel_aarch64[] = {
-    0x81, 0x0a, 0x80, 0x52,                 /* loop:  mov    w1, #'T' */
+    0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
     0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
-    0x41, 0x00, 0x00, 0x39,                 /*        strb   w1, [x2]        *TXDAT = 'T' */
-    0xfd, 0xff, 0xff, 0x17,                 /*        b      -12             (loop) */
+    0x41, 0x00, 0x00, 0x39,                 /* loop:  strb   w1, [x2]        *TXDAT = 'T' */
+    0xff, 0xff, 0xff, 0x17,                 /*        b      -4              (loop) */
 };
 
 static const uint8_t kernel_nrf51[] = {
-- 
2.34.1

From: Philippe Mathieu-Daudé <philmd@linaro.org>

In the next commit we are going to use a different value
for the $w1 register, maintaining the same $x2 value. In
order to keep the next commit trivial to review, set $x2
before $w1.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/qtest/boot-serial-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

From: Philippe Mathieu-Daudé <philmd@linaro.org>

The tests using the PL011 UART of the virt and raspi machines
weren't properly enabling the UART and its transmitter before
sending characters. Follow the initialization sequence recommended
by the PL011 manual by setting the proper bits of the control register.

Update the ASM code so that:

  *UART_CTRL = UART_ENABLE | TX_ENABLE;

is executed before:

  while (true) {
      *UART_DATA = 'T';
  }

Note that since commit 51b61dd4d56 ("hw/char/pl011: Warn when using
disabled transmitter"), incomplete PL011 initialization can be
logged using the '-d guest_errors' command line option.
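
For readers who prefer C to hand-assembled bytes, the sequence the
blobs now perform can be sketched as below. This is only an
illustration under stated assumptions: the register offsets and bit
values (CR at offset 48, 0x101 = UARTEN|TXE) are taken from the
comments in the diff, the function names are invented, a plain array
stands in for the MMIO window, 32-bit accesses replace the strh/strb
stores, and the loop is bounded so that the sketch terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PL011 register offsets and control bits, as used by the test blobs. */
#define PL011_DR        0x00        /* data register (TXDAT) */
#define PL011_CR        0x30        /* control register, offset 48 */
#define PL011_CR_UARTEN (1u << 0)   /* UART enable */
#define PL011_CR_TXE    (1u << 8)   /* transmit enable */

/* *UART_CTRL = UART_ENABLE | TX_ENABLE; */
void pl011_enable_tx(volatile uint32_t *base)
{
    assert(base != NULL);
    base[PL011_CR / 4] = PL011_CR_UARTEN | PL011_CR_TXE;
}

/* Bounded version of: while (true) { *UART_DATA = 'T'; } */
void pl011_send_repeated(volatile uint32_t *base, char c, unsigned count)
{
    assert(base != NULL);
    for (unsigned i = 0; i < count; i++) {
        base[PL011_DR / 4] = (uint8_t)c;
    }
}

/* Hypothetical self-check using a fake register window. */
void pl011_sketch_selfcheck(void)
{
    uint32_t fake_regs[16] = { 0 };   /* stand-in for the MMIO window */

    pl011_enable_tx(fake_regs);
    assert(fake_regs[PL011_CR / 4] == 0x101);

    pl011_send_repeated(fake_regs, 'T', 3);
    assert(fake_regs[PL011_DR / 4] == 'T');
}
```

On the real machines the base would be the UART0 address from the
blobs (0x9000000 on virt, 0x3f201000 on raspi2) instead of an array.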

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/qtest/boot-serial-test.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/boot-serial-test.c b/tests/qtest/boot-serial-test.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/qtest/boot-serial-test.c
+++ b/tests/qtest/boot-serial-test.c
@@ -XXX,XX +XXX,XX @@ static const uint8_t kernel_plml605[] = {
 };
 
 static const uint8_t bios_raspi2[] = {
-    0x08, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #8]   Get &UART0 */
+    0x10, 0x30, 0x9f, 0xe5,                 /*        ldr     r3, [pc, #16]  Get &UART0 */
+    0x10, 0x20, 0x9f, 0xe5,                 /*        ldr     r2, [pc, #16]  Get &CR */
+    0xb0, 0x23, 0xc3, 0xe1,                 /*        strh    r2, [r3, #48]  Set CR */
     0x54, 0x20, 0xa0, 0xe3,                 /*        mov     r2, #'T' */
     0x00, 0x20, 0xc3, 0xe5,                 /* loop:  strb    r2, [r3]       *TXDAT = 'T' */
     0xff, 0xff, 0xff, 0xea,                 /*        b       -4             (loop) */
     0x00, 0x10, 0x20, 0x3f,                 /* UART0: 0x3f201000 */
+    0x01, 0x01, 0x00, 0x00,                 /* CR:    0x101 = UARTEN|TXE */
 };
 
 static const uint8_t kernel_aarch64[] = {
     0x02, 0x20, 0xa1, 0xd2,                 /*        mov    x2, #0x9000000  Load UART0 */
+    0x21, 0x20, 0x80, 0x52,                 /*        mov    w1, #0x101      CR = UARTEN|TXE */
+    0x41, 0x60, 0x00, 0x79,                 /*        strh   w1, [x2, #48]   Set CR */
     0x81, 0x0a, 0x80, 0x52,                 /*        mov    w1, #'T' */
     0x41, 0x00, 0x00, 0x39,                 /* loop:  strb   w1, [x2]        *TXDAT = 'T' */
     0xff, 0xff, 0xff, 0x17,                 /*        b      -4              (loop) */
-- 
2.34.1

helper.c includes some small TCG helper functions used mostly for
arithmetic instructions.  These are TCG-only and there's no need for
them to be in the large and unwieldy helper.c.  Move them out to
their own source file in the tcg/ subdirectory, together with the
op_addsub.h multiply-included template header that they use.

Since we are moving op_addsub.h, we take the opportunity to
give it a name which matches our convention for files which
are not true header files but which are #included from other
C files: op_addsub.c.inc.

(Ironically, this means that helper.c no longer contains
any TCG helper function definitions at all.)
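
To give a feel for what the moved helpers do, here is a standalone
sketch of the 16-bit signed saturating addition. The body of
add16_sat() follows the code in the diff (minus 'static inline', and
with the inner if/else folded into a conditional expression); the
self-check function and its test values are additions made for this
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * 16-bit signed saturating addition on values held in unsigned types,
 * so that the wrap-around itself is well defined in C.
 */
uint16_t add16_sat(uint16_t a, uint16_t b)
{
    uint16_t res;

    res = a + b;
    /* Operands share a sign but the result does not: overflow. */
    if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
        res = (a & 0x8000) ? 0x8000 : 0x7fff;
    }
    return res;
}

/* Hypothetical self-check covering both saturation directions. */
void add16_sat_selfcheck(void)
{
    assert(add16_sat(1, 2) == 3);                /* no overflow */
    assert(add16_sat(0xffff, 1) == 0);           /* -1 + 1 */
    assert(add16_sat(0x7fff, 1) == 0x7fff);      /* clamps at INT16_MAX */
    assert(add16_sat(0x8000, 0xffff) == 0x8000); /* clamps at INT16_MIN */
}
```

The 8-bit and subtraction variants in the diff follow the same
sign-bit test with 0x80 or with the (a ^ b) condition inverted.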

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250110131211.2546314-1-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/arm/helper.c                           | 285 -----------------
 target/arm/tcg/arith_helper.c                 | 296 ++++++++++++++++++
 .../arm/{op_addsub.h => tcg/op_addsub.c.inc}  |   0
 target/arm/tcg/meson.build                    |   1 +
 4 files changed, 297 insertions(+), 285 deletions(-)
 create mode 100644 target/arm/tcg/arith_helper.c
 rename target/arm/{op_addsub.h => tcg/op_addsub.c.inc} (100%)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
 #include "qemu/bitops.h"
-#include "qemu/crc32c.h"
 #include "qemu/qemu-print.h"
 #include "exec/exec-all.h"
 #include "exec/translation-block.h"
-#include <zlib.h> /* for crc32 */
 #include "hw/irq.h"
 #include "system/cpu-timers.h"
 #include "system/kvm.h"
@@ -XXX,XX +XXX,XX @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
     };
 }
 
-/*
- * Note that signed overflow is undefined in C.  The following routines are
- * careful to use unsigned types where modulo arithmetic is required.
- * Failure to do so _will_ break on newer gcc.
- */
-
-/* Signed saturating arithmetic.  */
-
-/* Perform 16-bit signed saturating addition.  */
-static inline uint16_t add16_sat(uint16_t a, uint16_t b)
-{
-    uint16_t res;
-
-    res = a + b;
-    if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
-        if (a & 0x8000) {
-            res = 0x8000;
-        } else {
-            res = 0x7fff;
-        }
-    }
-    return res;
-}
-
-/* Perform 8-bit signed saturating addition.  */
-static inline uint8_t add8_sat(uint8_t a, uint8_t b)
-{
-    uint8_t res;
-
-    res = a + b;
-    if (((res ^ a) & 0x80) && !((a ^ b) & 0x80)) {
-        if (a & 0x80) {
-            res = 0x80;
-        } else {
-            res = 0x7f;
-        }
-    }
-    return res;
-}
-
-/* Perform 16-bit signed saturating subtraction.  */
-static inline uint16_t sub16_sat(uint16_t a, uint16_t b)
-{
-    uint16_t res;
-
-    res = a - b;
-    if (((res ^ a) & 0x8000) && ((a ^ b) & 0x8000)) {
-        if (a & 0x8000) {
-            res = 0x8000;
-        } else {
-            res = 0x7fff;
-        }
-    }
-    return res;
-}
-
-/* Perform 8-bit signed saturating subtraction.  */
-static inline uint8_t sub8_sat(uint8_t a, uint8_t b)
-{
-    uint8_t res;
-
-    res = a - b;
-    if (((res ^ a) & 0x80) && ((a ^ b) & 0x80)) {
-        if (a & 0x80) {
-            res = 0x80;
-        } else {
-            res = 0x7f;
-        }
-    }
-    return res;
-}
-
-#define ADD16(a, b, n) RESULT(add16_sat(a, b), n, 16);
-#define SUB16(a, b, n) RESULT(sub16_sat(a, b), n, 16);
-#define ADD8(a, b, n)  RESULT(add8_sat(a, b), n, 8);
-#define SUB8(a, b, n)  RESULT(sub8_sat(a, b), n, 8);
-#define PFX q
-
-#include "op_addsub.h"
-
-/* Unsigned saturating arithmetic.  */
-static inline uint16_t add16_usat(uint16_t a, uint16_t b)
-{
-    uint16_t res;
-    res = a + b;
-    if (res < a) {
-        res = 0xffff;
-    }
-    return res;
-}
-
-static inline uint16_t sub16_usat(uint16_t a, uint16_t b)
-{
-    if (a > b) {
-        return a - b;
-    } else {
-        return 0;
-    }
-}
-
-static inline uint8_t add8_usat(uint8_t a, uint8_t b)
-{
-    uint8_t res;
-    res = a + b;
-    if (res < a) {
-        res = 0xff;
-    }
-    return res;
-}
-
-static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
-{
-    if (a > b) {
-        return a - b;
-    } else {
-        return 0;
-    }
-}
-
-#define ADD16(a, b, n) RESULT(add16_usat(a, b), n, 16);
-#define SUB16(a, b, n) RESULT(sub16_usat(a, b), n, 16);
-#define ADD8(a, b, n)  RESULT(add8_usat(a, b), n, 8);
-#define SUB8(a, b, n)  RESULT(sub8_usat(a, b), n, 8);
-#define PFX uq
-
-#include "op_addsub.h"
-
-/* Signed modulo arithmetic.  */
-#define SARITH16(a, b, n, op) do { \
-    int32_t sum; \
-    sum = (int32_t)(int16_t)(a) op (int32_t)(int16_t)(b); \
-    RESULT(sum, n, 16); \
-    if (sum >= 0) \
-        ge |= 3 << (n * 2); \
-    } while (0)
-
-#define SARITH8(a, b, n, op) do { \
-    int32_t sum; \
-    sum = (int32_t)(int8_t)(a) op (int32_t)(int8_t)(b); \
-    RESULT(sum, n, 8); \
-    if (sum >= 0) \
-        ge |= 1 << n; \
-    } while (0)
-
-
-#define ADD16(a, b, n) SARITH16(a, b, n, +)
-#define SUB16(a, b, n) SARITH16(a, b, n, -)
-#define ADD8(a, b, n)  SARITH8(a, b, n, +)
-#define SUB8(a, b, n)  SARITH8(a, b, n, -)
-#define PFX s
-#define ARITH_GE
-
-#include "op_addsub.h"
-
-/* Unsigned modulo arithmetic.  */
-#define ADD16(a, b, n) do { \
-    uint32_t sum; \
-    sum = (uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b); \
-    RESULT(sum, n, 16); \
-    if ((sum >> 16) == 1) \
-        ge |= 3 << (n * 2); \
-    } while (0)
-
-#define ADD8(a, b, n) do { \
-    uint32_t sum; \
-    sum = (uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b); \
-    RESULT(sum, n, 8); \
-    if ((sum >> 8) == 1) \
-        ge |= 1 << n; \
-    } while (0)
-
-#define SUB16(a, b, n) do { \
-    uint32_t sum; \
-    sum = (uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b); \
-    RESULT(sum, n, 16); \
-    if ((sum >> 16) == 0) \
-        ge |= 3 << (n * 2); \
-    } while (0)
-
-#define SUB8(a, b, n) do { \
-    uint32_t sum; \
-    sum = (uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b); \
-    RESULT(sum, n, 8); \
-    if ((sum >> 8) == 0) \
-        ge |= 1 << n; \
-    } while (0)
-
-#define PFX u
-#define ARITH_GE
-
-#include "op_addsub.h"
-
-/* Halved signed arithmetic.  */
-#define ADD16(a, b, n) \
-  RESULT(((int32_t)(int16_t)(a) + (int32_t)(int16_t)(b)) >> 1, n, 16)
-#define SUB16(a, b, n) \
-  RESULT(((int32_t)(int16_t)(a) - (int32_t)(int16_t)(b)) >> 1, n, 16)
-#define ADD8(a, b, n) \
-  RESULT(((int32_t)(int8_t)(a) + (int32_t)(int8_t)(b)) >> 1, n, 8)
-#define SUB8(a, b, n) \
-  RESULT(((int32_t)(int8_t)(a) - (int32_t)(int8_t)(b)) >> 1, n, 8)
-#define PFX sh
-
-#include "op_addsub.h"
-
-/* Halved unsigned arithmetic.  */
-#define ADD16(a, b, n) \
-  RESULT(((uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b)) >> 1, n, 16)
-#define SUB16(a, b, n) \
-  RESULT(((uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b)) >> 1, n, 16)
-#define ADD8(a, b, n) \
-  RESULT(((uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b)) >> 1, n, 8)
-#define SUB8(a, b, n) \
-  RESULT(((uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b)) >> 1, n, 8)
-#define PFX uh
-
-#include "op_addsub.h"
-
-static inline uint8_t do_usad(uint8_t a, uint8_t b)
-{
-    if (a > b) {
-        return a - b;
-    } else {
-        return b - a;
-    }
-}
-
-/* Unsigned sum of absolute byte differences.  */
-uint32_t HELPER(usad8)(uint32_t a, uint32_t b)
-{
-    uint32_t sum;
-    sum = do_usad(a, b);
-    sum += do_usad(a >> 8, b >> 8);
-    sum += do_usad(a >> 16, b >> 16);
-    sum += do_usad(a >> 24, b >> 24);
-    return sum;
-}
-
-/* For ARMv6 SEL instruction.  */
-uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
-{
-    uint32_t mask;
-
-    mask = 0;
-    if (flags & 1) {
-        mask |= 0xff;
-    }
-    if (flags & 2) {
-        mask |= 0xff00;
-    }
-    if (flags & 4) {
-        mask |= 0xff0000;
-    }
-    if (flags & 8) {
-        mask |= 0xff000000;
-    }
-    return (a & mask) | (b & ~mask);
-}
-
-/*
- * CRC helpers.
- * The upper bytes of val (above the number specified by 'bytes') must have
- * been zeroed out by the caller.
- */
-uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
-{
-    uint8_t buf[4];
-
-    stl_le_p(buf, val);
-
-    /* zlib crc32 converts the accumulator and output to one's complement.  */
-    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
-}
-
-uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
-{
-    uint8_t buf[4];
-
-    stl_le_p(buf, val);
-
-    /* Linux crc32c converts the output to one's complement.  */
-    return crc32c(acc, buf, bytes) ^ 0xffffffff;
-}
 
 /*
  * Return the exception level to which FP-disabled exceptions should
diff --git a/target/arm/tcg/arith_helper.c b/target/arm/tcg/arith_helper.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/target/arm/tcg/arith_helper.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * ARM generic helpers for various arithmetical operations.
+ *
+ * This code is licensed under the GNU GPL v2 or later.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/helper-proto.h"
+#include "qemu/crc32c.h"
+#include <zlib.h> /* for crc32 */
+
+/*
+ * Note that signed overflow is undefined in C.  The following routines are
+ * careful to use unsigned types where modulo arithmetic is required.
+ * Failure to do so _will_ break on newer gcc.
+ */
+
+/* Signed saturating arithmetic.  */
+
+/* Perform 16-bit signed saturating addition.  */
+static inline uint16_t add16_sat(uint16_t a, uint16_t b)
+{
+    uint16_t res;
+
+    res = a + b;
+    if (((res ^ a) & 0x8000) && !((a ^ b) & 0x8000)) {
+        if (a & 0x8000) {
+            res = 0x8000;
+        } else {
+            res = 0x7fff;
+        }
+    }
+    return res;
+}
+
+/* Perform 8-bit signed saturating addition.  */
+static inline uint8_t add8_sat(uint8_t a, uint8_t b)
+{
+    uint8_t res;
+
+    res = a + b;
+    if (((res ^ a) & 0x80) && !((a ^ b) & 0x80)) {
+        if (a & 0x80) {
+            res = 0x80;
+        } else {
+            res = 0x7f;
+        }
+    }
+    return res;
+}
+
+/* Perform 16-bit signed saturating subtraction.  */
+static inline uint16_t sub16_sat(uint16_t a, uint16_t b)
+{
+    uint16_t res;
+
+    res = a - b;
+    if (((res ^ a) & 0x8000) && ((a ^ b) & 0x8000)) {
+        if (a & 0x8000) {
+            res = 0x8000;
+        } else {
+            res = 0x7fff;
+        }
+    }
+    return res;
+}
+
+/* Perform 8-bit signed saturating subtraction.  */
+static inline uint8_t sub8_sat(uint8_t a, uint8_t b)
+{
+    uint8_t res;
+
+    res = a - b;
+    if (((res ^ a) & 0x80) && ((a ^ b) & 0x80)) {
+        if (a & 0x80) {
+            res = 0x80;
+        } else {
+            res = 0x7f;
+        }
+    }
+    return res;
+}
+
+#define ADD16(a, b, n) RESULT(add16_sat(a, b), n, 16);
+#define SUB16(a, b, n) RESULT(sub16_sat(a, b), n, 16);
+#define ADD8(a, b, n)  RESULT(add8_sat(a, b), n, 8);
+#define SUB8(a, b, n)  RESULT(sub8_sat(a, b), n, 8);
+#define PFX q
+
+#include "op_addsub.c.inc"
+
+/* Unsigned saturating arithmetic.  */
+static inline uint16_t add16_usat(uint16_t a, uint16_t b)
+{
+    uint16_t res;
+    res = a + b;
+    if (res < a) {
+        res = 0xffff;
+    }
+    return res;
+}
+
+static inline uint16_t sub16_usat(uint16_t a, uint16_t b)
+{
+    if (a > b) {
+        return a - b;
+    } else {
+        return 0;
+    }
+}
+
+static inline uint8_t add8_usat(uint8_t a, uint8_t b)
+{
+    uint8_t res;
+    res = a + b;
+    if (res < a) {
+        res = 0xff;
+    }
+    return res;
+}
+
+static inline uint8_t sub8_usat(uint8_t a, uint8_t b)
+{
+    if (a > b) {
+        return a - b;
+    } else {
+        return 0;
+    }
+}
+
+#define ADD16(a, b, n) RESULT(add16_usat(a, b), n, 16);
+#define SUB16(a, b, n) RESULT(sub16_usat(a, b), n, 16);
+#define ADD8(a, b, n)  RESULT(add8_usat(a, b), n, 8);
+#define SUB8(a, b, n)  RESULT(sub8_usat(a, b), n, 8);
+#define PFX uq
+
+#include "op_addsub.c.inc"
+
+/* Signed modulo arithmetic.  */
+#define SARITH16(a, b, n, op) do { \
+    int32_t sum; \
+    sum = (int32_t)(int16_t)(a) op (int32_t)(int16_t)(b); \
+    RESULT(sum, n, 16); \
+    if (sum >= 0) \
+        ge |= 3 << (n * 2); \
+    } while (0)
+
+#define SARITH8(a, b, n, op) do { \
+    int32_t sum; \
+    sum = (int32_t)(int8_t)(a) op (int32_t)(int8_t)(b); \
+    RESULT(sum, n, 8); \
+    if (sum >= 0) \
+        ge |= 1 << n; \
+    } while (0)
+
+
+#define ADD16(a, b, n) SARITH16(a, b, n, +)
+#define SUB16(a, b, n) SARITH16(a, b, n, -)
+#define ADD8(a, b, n)  SARITH8(a, b, n, +)
+#define SUB8(a, b, n)  SARITH8(a, b, n, -)
+#define PFX s
+#define ARITH_GE
+
+#include "op_addsub.c.inc"
+
+/* Unsigned modulo arithmetic.  */
+#define ADD16(a, b, n) do { \
+    uint32_t sum; \
+    sum = (uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b); \
+    RESULT(sum, n, 16); \
+    if ((sum >> 16) == 1) \
+        ge |= 3 << (n * 2); \
+    } while (0)
+
+#define ADD8(a, b, n) do { \
+    uint32_t sum; \
+    sum = (uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b); \
+    RESULT(sum, n, 8); \
+    if ((sum >> 8) == 1) \
+        ge |= 1 << n; \
+    } while (0)
+
+#define SUB16(a, b, n) do { \
+    uint32_t sum; \
+    sum = (uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b); \
+    RESULT(sum, n, 16); \
+    if ((sum >> 16) == 0) \
+        ge |= 3 << (n * 2); \
+    } while (0)
+
+#define SUB8(a, b, n) do { \
+    uint32_t sum; \
+    sum = (uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b); \
+    RESULT(sum, n, 8); \
+    if ((sum >> 8) == 0) \
+        ge |= 1 << n; \
+    } while (0)
+
+#define PFX u
+#define ARITH_GE
+
+#include "op_addsub.c.inc"
+
+/* Halved signed arithmetic.  */
+#define ADD16(a, b, n) \
+  RESULT(((int32_t)(int16_t)(a) + (int32_t)(int16_t)(b)) >> 1, n, 16)
+#define SUB16(a, b, n) \
+  RESULT(((int32_t)(int16_t)(a) - (int32_t)(int16_t)(b)) >> 1, n, 16)
+#define ADD8(a, b, n) \
+  RESULT(((int32_t)(int8_t)(a) + (int32_t)(int8_t)(b)) >> 1, n, 8)
+#define SUB8(a, b, n) \
+  RESULT(((int32_t)(int8_t)(a) - (int32_t)(int8_t)(b)) >> 1, n, 8)
+#define PFX sh
+
+#include "op_addsub.c.inc"
+
+/* Halved unsigned arithmetic.  */
+#define ADD16(a, b, n) \
+  RESULT(((uint32_t)(uint16_t)(a) + (uint32_t)(uint16_t)(b)) >> 1, n, 16)
+#define SUB16(a, b, n) \
+  RESULT(((uint32_t)(uint16_t)(a) - (uint32_t)(uint16_t)(b)) >> 1, n, 16)
+#define ADD8(a, b, n) \
+  RESULT(((uint32_t)(uint8_t)(a) + (uint32_t)(uint8_t)(b)) >> 1, n, 8)
+#define SUB8(a, b, n) \
+  RESULT(((uint32_t)(uint8_t)(a) - (uint32_t)(uint8_t)(b)) >> 1, n, 8)
+#define PFX uh
+
+#include "op_addsub.c.inc"
+
+static inline uint8_t do_usad(uint8_t a, uint8_t b)
+{
+    if (a > b) {
+        return a - b;
+    } else {
+        return b - a;
+    }
+}
+
+/* Unsigned sum of absolute byte differences.  */
+uint32_t HELPER(usad8)(uint32_t a, uint32_t b)
+{
+    uint32_t sum;
+    sum = do_usad(a, b);
+    sum += do_usad(a >> 8, b >> 8);
+    sum += do_usad(a >> 16, b >> 16);
+    sum += do_usad(a >> 24, b >> 24);
+    return sum;
+}
+
+/* For ARMv6 SEL instruction.  */
+uint32_t HELPER(sel_flags)(uint32_t flags, uint32_t a, uint32_t b)
+{
+    uint32_t mask;
+
+    mask = 0;
+    if (flags & 1) {
+        mask |= 0xff;
+    }
+    if (flags & 2) {
+        mask |= 0xff00;
+    }
+    if (flags & 4) {
+        mask |= 0xff0000;
+    }
+    if (flags & 8) {
+        mask |= 0xff000000;
+    }
+    return (a & mask) | (b & ~mask);
+}
+
+/*
+ * CRC helpers.
+ * The upper bytes of val (above the number specified by 'bytes') must have
+ * been zeroed out by the caller.
+ */
+uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    stl_le_p(buf, val);
+
+    /* zlib crc32 converts the accumulator and output to one's complement.  */
+    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
+}
+
+uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    stl_le_p(buf, val);
+
+    /* Linux crc32c converts the output to one's complement.  */
+    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+}
diff --git a/target/arm/op_addsub.h b/target/arm/tcg/op_addsub.c.inc
similarity index 100%
rename from target/arm/op_addsub.h
rename to target/arm/tcg/op_addsub.c.inc
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -XXX,XX +XXX,XX @@ arm_ss.add(files(
   'tlb_helper.c',
   'vec_helper.c',
   'tlb-insns.c',
+  'arith_helper.c',
 ))
 
 arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
-- 
2.34.1

From: Pierrick Bouvier <pierrick.bouvier@linaro.org>

Before changing the default pauth algorithm, we need to make sure the
current default (QARMA5) can still be selected explicitly:

$ qemu-system-aarch64 -cpu max,pauth-qarma5=on ...
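
With three mutually exclusive algorithm properties, the conflict
check in arm_cpu_pauth_finalize() becomes a pairwise test. The sketch
below restates that test outside of QEMU and compares it with an
equivalent "count the selected algorithms" formulation. The struct
and function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three boolean CPU properties. */
struct pauth_props {
    bool impdef;
    bool qarma3;
    bool qarma5;
};

/* Pairwise form, mirroring the condition added by this patch. */
bool pauth_props_conflict(const struct pauth_props *p)
{
    return (p->impdef && p->qarma3) ||
           (p->impdef && p->qarma5) ||
           (p->qarma3 && p->qarma5);
}

/* Equivalent form: more than one algorithm selected. */
bool pauth_props_conflict_count(const struct pauth_props *p)
{
    return (p->impdef + p->qarma3 + p->qarma5) > 1;
}

/* Check that both forms agree for all eight combinations. */
void pauth_props_selfcheck(void)
{
    for (int bits = 0; bits < 8; bits++) {
        struct pauth_props p = { bits & 1, bits & 2, bits & 4 };

        assert(pauth_props_conflict(&p) == pauth_props_conflict_count(&p));
    }
}
```

The counting form scales better if further algorithms are ever
added, while the pairwise form is what the patch itself uses.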

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241219183211.3493974-2-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/cpu-features.rst |  5 ++++-
 target/arm/cpu.h                 |  1 +
 target/arm/arm-qmp-cmds.c        |  2 +-
 target/arm/cpu64.c               | 20 ++++++++++++++------
 tests/qtest/arm-cpu-features.c   | 15 +++++++++++----
 5 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/system/arm/cpu-features.rst
+++ b/docs/system/arm/cpu-features.rst
@@ -XXX,XX +XXX,XX @@ Below is the list of TCG VCPU features and their descriptions.
 ``pauth-qarma3``
   When ``pauth`` is enabled, select the architected QARMA3 algorithm.
 
-Without either ``pauth-impdef`` or ``pauth-qarma3`` enabled,
+``pauth-qarma5``
+  When ``pauth`` is enabled, select the architected QARMA5 algorithm.
+
+Without ``pauth-impdef``, ``pauth-qarma3`` or ``pauth-qarma5`` enabled,
 the architected QARMA5 algorithm is used.  The architected QARMA5
 and QARMA3 algorithms have good cryptographic properties, but can
 be quite slow to emulate.  The impdef algorithm used by QEMU is
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -XXX,XX +XXX,XX @@ struct ArchCPU {
     bool prop_pauth;
     bool prop_pauth_impdef;
     bool prop_pauth_qarma3;
+    bool prop_pauth_qarma5;
     bool prop_lpa2;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -XXX,XX +XXX,XX @@ static const char *cpu_model_advertised_features[] = {
     "sve640", "sve768", "sve896", "sve1024", "sve1152", "sve1280",
     "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
     "kvm-no-adjvtime", "kvm-steal-time",
-    "pauth", "pauth-impdef", "pauth-qarma3",
+    "pauth", "pauth-impdef", "pauth-qarma3", "pauth-qarma5",
     NULL
 };
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index XXXXXXX..XXXXXXX 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
         }
 
         if (cpu->prop_pauth) {
-            if (cpu->prop_pauth_impdef && cpu->prop_pauth_qarma3) {
+            if ((cpu->prop_pauth_impdef && cpu->prop_pauth_qarma3) ||
+                (cpu->prop_pauth_impdef && cpu->prop_pauth_qarma5) ||
+                (cpu->prop_pauth_qarma3 && cpu->prop_pauth_qarma5)) {
                 error_setg(errp,
-                           "cannot enable both pauth-impdef and pauth-qarma3");
+                           "cannot enable pauth-impdef, pauth-qarma3 and "
+                           "pauth-qarma5 at the same time");
                 return;
             }
 
@@ -XXX,XX +XXX,XX @@ void arm_cpu_pauth_finalize(ARMCPU *cpu, Error **errp)
             } else if (cpu->prop_pauth_qarma3) {
                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, APA3, features);
                 isar2 = FIELD_DP64(isar2, ID_AA64ISAR2, GPA3, 1);
-            } else {
+            } else { /* default is pauth-qarma5 */
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, APA, features);
                 isar1 = FIELD_DP64(isar1, ID_AA64ISAR1, GPA, 1);
             }
-        } else if (cpu->prop_pauth_impdef || cpu->prop_pauth_qarma3) {
-            error_setg(errp, "cannot enable pauth-impdef or "
-                       "pauth-qarma3 without pauth");
+        } else if (cpu->prop_pauth_impdef ||
+                   cpu->prop_pauth_qarma3 ||
+                   cpu->prop_pauth_qarma5) {
+            error_setg(errp, "cannot enable pauth-impdef, pauth-qarma3 or "
+                       "pauth-qarma5 without pauth");
             error_append_hint(errp, "Add pauth=on to the CPU property list.\n");
         }
     }
@@ -XXX,XX +XXX,XX @@ static const Property arm_cpu_pauth_impdef_property =
     DEFINE_PROP_BOOL("pauth-impdef", ARMCPU, prop_pauth_impdef, false);
 static const Property arm_cpu_pauth_qarma3_property =
     DEFINE_PROP_BOOL("pauth-qarma3", ARMCPU, prop_pauth_qarma3, false);
+static const Property arm_cpu_pauth_qarma5_property =
+    DEFINE_PROP_BOOL("pauth-qarma5", ARMCPU, prop_pauth_qarma5, false);
 
 void aarch64_add_pauth_properties(Object *obj)
 {
@@ -XXX,XX +XXX,XX @@ void aarch64_add_pauth_properties(Object *obj)
     } else {
         qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_impdef_property);
         qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma3_property);
+        qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma5_property);
     }
 }
 
diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/qtest/arm-cpu-features.c
+++ b/tests/qtest/arm-cpu-features.c
@@ -XXX,XX +XXX,XX @@ static void pauth_tests_default(QTestState *qts, const char *cpu_type)
     assert_has_feature_enabled(qts, cpu_type, "pauth");
     assert_has_feature_disabled(qts, cpu_type, "pauth-impdef");
     assert_has_feature_disabled(qts, cpu_type, "pauth-qarma3");
+    assert_has_feature_disabled(qts, cpu_type, "pauth-qarma5");
     assert_set_feature(qts, cpu_type, "pauth", false);
     assert_set_feature(qts, cpu_type, "pauth", true);
     assert_set_feature(qts, cpu_type, "pauth-impdef", true);
     assert_set_feature(qts, cpu_type, "pauth-impdef", false);
     assert_set_feature(qts, cpu_type, "pauth-qarma3", true);
     assert_set_feature(qts, cpu_type, "pauth-qarma3", false);
+    assert_set_feature(qts, cpu_type, "pauth-qarma5", true);
+    assert_set_feature(qts, cpu_type, "pauth-qarma5", false);
     assert_error(qts, cpu_type,
-                 "cannot enable pauth-impdef or pauth-qarma3 without pauth",
+                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
                  "{ 'pauth': false, 'pauth-impdef': true }");
     assert_error(qts, cpu_type,
-                 "cannot enable pauth-impdef or pauth-qarma3 without pauth",
+                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
                  "{ 'pauth': false, 'pauth-qarma3': true }");
     assert_error(qts, cpu_type,
-                 "cannot enable both pauth-impdef and pauth-qarma3",
-                 "{ 'pauth': true, 'pauth-impdef': true, 'pauth-qarma3': true }");
+                 "cannot enable pauth-impdef, pauth-qarma3 or pauth-qarma5 without pauth",
+                 "{ 'pauth': false, 'pauth-qarma5': true }");
+    assert_error(qts, cpu_type,
+                 "cannot enable pauth-impdef, pauth-qarma3 and pauth-qarma5 at the same time",
+                 "{ 'pauth': true, 'pauth-impdef': true, 'pauth-qarma3': true,"
+                 "  'pauth-qarma5': true }");
 }
 
 static void test_query_cpu_model_expansion(const void *data)
-- 
2.34.1

The pauth-3 test explicitly checks that a pointer-authentication
computation produces the expected result.  This means that
it must be run with the QARMA5 algorithm.

Explicitly set the pauth algorithm when running this test, so that it
doesn't break when we change the default algorithm the 'max' CPU
uses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 tests/tcg/aarch64/Makefile.softmmu-target | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/tcg/aarch64/Makefile.softmmu-target b/tests/tcg/aarch64/Makefile.softmmu-target
index XXXXXXX..XXXXXXX 100644
--- a/tests/tcg/aarch64/Makefile.softmmu-target
+++ b/tests/tcg/aarch64/Makefile.softmmu-target
@@ -XXX,XX +XXX,XX @@ EXTRA_RUNS+=run-memory-replay
 
 ifneq ($(CROSS_CC_HAS_ARMV8_3),)
 pauth-3: CFLAGS += $(CROSS_CC_HAS_ARMV8_3)
+# This test explicitly checks the output of the pauth operation so we
+# must force the use of the QARMA5 algorithm for it.
+run-pauth-3: QEMU_BASE_MACHINE=-M virt -cpu max,pauth-qarma5=on -display none
 else
 pauth-3:
 	$(call skip-test, "BUILD of $@", "missing compiler support")
-- 
2.34.1

From: Pierrick Bouvier <pierrick.bouvier@linaro.org>

Pointer authentication on aarch64 is pretty expensive (up to 50% of
execution time) when running a virtual machine with tcg and -cpu max
(which enables pauth=on).

The advice is always: use pauth-impdef=on.
Our documentation even mentions it "by default" in
docs/system/introduction.rst.

Thus, we change the default algorithm to impdef. This does not
affect kvm or hvf acceleration, since the pauth algorithm used there
is the one provided by the host cpu.

This change is backward compatible with previous versions in terms of
the command line, as the semantics of -cpu max,pauth-impdef=on and
-cpu max,pauth-qarma3=on are preserved.
The new option introduced in the previous patch, which matches the old
default, is -cpu max,pauth-qarma5=on.
It is backward compatible with migration as well: a backcompat
property makes virt machines <= 9.2 use qarma5 by default.
Tested by saving a vm on qemu 9.2.0 and restoring it on qemu-master
(10.0), for cpus neoverse-n2 and max.
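
The selection rule described above can be sketched as follows. This
is not the code from the patch: the enum, the function and the
compat_default_qarma5 parameter are hypothetical names chosen for
illustration, and the sketch assumes the "only one algorithm"
conflict check has already been done.

```c
#include <assert.h>
#include <stdbool.h>

enum pauth_algo { PAUTH_IMPDEF, PAUTH_QARMA3, PAUTH_QARMA5 };

/*
 * Pick the TCG pauth algorithm.  An explicit property always wins;
 * otherwise the default is impdef, except on old machine types where
 * the (hypothetical) compat flag keeps the previous QARMA5 default.
 */
enum pauth_algo pauth_select(bool impdef, bool qarma3, bool qarma5,
                             bool compat_default_qarma5)
{
    assert(impdef + qarma3 + qarma5 <= 1);

    if (impdef) {
        return PAUTH_IMPDEF;
    }
    if (qarma3) {
        return PAUTH_QARMA3;
    }
    if (qarma5) {
        return PAUTH_QARMA5;
    }
    return compat_default_qarma5 ? PAUTH_QARMA5 : PAUTH_IMPDEF;
}

/* Hypothetical self-check of the cases named in the commit message. */
void pauth_select_selfcheck(void)
{
    /* New default, e.g. plain -cpu max on a new machine type. */
    assert(pauth_select(false, false, false, false) == PAUTH_IMPDEF);
    /* Old machine types keep the old default. */
    assert(pauth_select(false, false, false, true) == PAUTH_QARMA5);
    /* Explicit choices keep their meaning. */
    assert(pauth_select(false, true, false, false) == PAUTH_QARMA3);
    assert(pauth_select(false, false, true, false) == PAUTH_QARMA5);
}
```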

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241219183211.3493974-3-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/system/arm/cpu-features.rst |  2 +-
 docs/system/introduction.rst     |  2 +-
 target/arm/cpu.h                 |  3 +++
 hw/core/machine.c                |  4 +++-
 target/arm/cpu.c                 |  2 ++
 target/arm/cpu64.c               | 22 ++++++++++++++++------
 6 files changed, 26 insertions(+), 9 deletions(-)