Series comparison

-[PULL 0/3] tcg patch queue
+[PULL v2 00/21] tcg patch queue
-The following changes since commit e18e5501d8ac692d32657a3e1ef545b14e72b730:
+Version 2: Drop signed 32-bit guest patches while CI failure examined.
-  Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200210' into staging (2020-02-10 18:09:14 +0000)
 The following changes since commit 3d1fbc59665ff8a5d74b0fd30583044fe99e1117:
   Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging (2022-03-04 15:31:23 +0000)
 are available in the Git repository at:
-  https://github.com/rth7680/qemu.git tags/pull-tcg-20200212
+  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20220304
-for you to fetch changes up to 2445971604c1cfd3ec484457159f4ac300fb04d2:
+for you to fetch changes up to cf320769476c3e2820be2a6280bfa1e15baf396f:
-  tcg: Add tcg_gen_gvec_5_ptr (2020-02-12 14:58:36 -0800)
+  tcg/i386: Implement bitsel for avx512 (2022-03-04 08:50:41 -1000)
 ----------------------------------------------------------------
-Fix breakpoint invalidation.
+Reorder do_constant_folding_cond test to satisfy valgrind.
-Add support for tcg helpers with 7 arguments.
+Fix value of MAX_OPC_PARAM_IARGS.
-Add support for gvec helpers with 5 arguments.
+Add opcodes for vector nand, nor, eqv.
 Support vector nand, nor, eqv on PPC and S390X hosts.
 Support AVX512VL, AVX512BW, AVX512DQ, and AVX512VBMI2.
 ----------------------------------------------------------------
-Max Filippov (1):
+Alex Bennée (1):
-      exec: flush CPU TB cache in breakpoint_invalidate
+      tcg/optimize: only read val after const check
-Richard Henderson (1):
+Richard Henderson (19):
-      tcg: Add tcg_gen_gvec_5_ptr
+      tcg: Add opcodes for vector nand, nor, eqv
       tcg/ppc: Implement vector NAND, NOR, EQV
       tcg/s390x: Implement vector NAND, NOR, EQV
       tcg/i386: Detect AVX512
       tcg/i386: Add tcg_out_evex_opc
       tcg/i386: Use tcg_can_emit_vec_op in expand_vec_cmp_noinv
       tcg/i386: Implement avx512 variable shifts
       tcg/i386: Implement avx512 scalar shift
       tcg/i386: Implement avx512 immediate sari shift
       tcg/i386: Implement avx512 immediate rotate
       tcg/i386: Implement avx512 variable rotate
       tcg/i386: Support avx512vbmi2 vector shift-double instructions
       tcg/i386: Expand vector word rotate as avx512vbmi2 shift-double
       tcg/i386: Remove rotls_vec from tcg_target_op_def
       tcg/i386: Expand scalar rotate with avx512 insns
       tcg/i386: Implement avx512 min/max/abs
       tcg/i386: Implement avx512 multiply
       tcg/i386: Implement more logical operations for avx512
       tcg/i386: Implement bitsel for avx512
-Taylor Simpson (1):
+Ziqiao Kong (1):
-      tcg: Add support for a helper with 7 arguments
+      tcg: Set MAX_OPC_PARAM_IARGS to 7
- include/exec/helper-gen.h   | 13 +++++++++++++
+ include/qemu/cpuid.h          |  20 ++-
- include/exec/helper-head.h  |  2 ++
+ include/tcg/tcg-opc.h         |   3 +
- include/exec/helper-proto.h |  6 ++++++
+ include/tcg/tcg.h             |   5 +-
- include/exec/helper-tcg.h   |  7 +++++++
+ tcg/aarch64/tcg-target.h      |   3 +
- include/tcg/tcg-op-gvec.h   |  7 +++++++
+ tcg/arm/tcg-target.h          |   3 +
- exec.c                      | 15 +++++++--------
+ tcg/i386/tcg-target-con-set.h |   1 +
- tcg/tcg-op-gvec.c           | 32 ++++++++++++++++++++++++++++++++
+ tcg/i386/tcg-target.h         |  17 +-
- 7 files changed, 74 insertions(+), 8 deletions(-)
+ tcg/i386/tcg-target.opc.h     |   3 +
  tcg/ppc/tcg-target.h          |   3 +
  tcg/s390x/tcg-target.h        |   3 +
  tcg/optimize.c                |  20 +--
  tcg/tcg-op-vec.c              |  27 ++-
  tcg/tcg.c                     |   6 +
  tcg/i386/tcg-target.c.inc     | 387 +++++++++++++++++++++++++++++++++++-------
  tcg/ppc/tcg-target.c.inc      |  15 ++
  tcg/s390x/tcg-target.c.inc    |  17 ++
  tcg/tci/tcg-target.c.inc      |   2 +-
  17 files changed, 441 insertions(+), 94 deletions(-)

-[PULL 1/3] exec: flush CPU TB cache in breakpoint_invalidate
+Deleted patch
-From: Max Filippov <jcmvbkbc@gmail.com>
-When a breakpoint is inserted at a location for which there is currently no
-virtual-to-physical translation, no action is taken on the CPU TB cache. If a
-TB for that virtual address already exists but is not visible at the moment,
-the breakpoint won't be hit the next time an instruction at that address is
-executed.
-Flush entire CPU TB cache in breakpoint_invalidate to force
-re-translation of all TBs for the breakpoint address.
-This change fixes the following scenario:
-- linux user application is running
-- a breakpoint is inserted from QEMU gdbstub for a user address that is
-  not currently present in the target CPU TLB
-- an instruction at that address is executed, but the external debugger
-  doesn't get control.
-Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
-Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
-Message-Id: <20191127220602.10827-2-jcmvbkbc@gmail.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
----
- exec.c | 15 +++++++--------
- 1 file changed, 7 insertions(+), 8 deletions(-)
-diff --git a/exec.c b/exec.c
-index XXXXXXX..XXXXXXX 100644
---- a/exec.c
-+++ b/exec.c
-@@ -XXX,XX +XXX,XX @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
- static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
- {
--    MemTxAttrs attrs;
--    hwaddr phys = cpu_get_phys_page_attrs_debug(cpu, pc, &attrs);
--    int asidx = cpu_asidx_from_attrs(cpu, attrs);
--    if (phys != -1) {
--        /* Locks grabbed by tb_invalidate_phys_addr */
--        tb_invalidate_phys_addr(cpu->cpu_ases[asidx].as,
--                                phys | (pc & ~TARGET_PAGE_MASK), attrs);
--    }
-+    /*
-+     * There may not be a virtual to physical translation for the pc
-+     * right now, but there may exist cached TB for this pc.
-+     * Flush the whole TB cache to force re-translation of such TBs.
-+     * This is heavyweight, but we're debugging anyway.
-+     */
-+    tb_flush(cpu);
- }
- #endif
---
-2.20.1

-[PULL 2/3] tcg: Add support for a helper with 7 arguments
+Deleted patch
-From: Taylor Simpson <tsimpson@quicinc.com>
-Currently, helpers can only take up to 6 arguments.  This patch adds the
-capability for up to 7 arguments.  I have tested it with the Hexagon port
-that I am preparing for submission.
-Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
-Message-Id: <1580942510-2820-1-git-send-email-tsimpson@quicinc.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
----
- include/exec/helper-gen.h   | 13 +++++++++++++
- include/exec/helper-head.h  |  2 ++
- include/exec/helper-proto.h |  6 ++++++
- include/exec/helper-tcg.h   |  7 +++++++
- 4 files changed, 28 insertions(+)
-diff --git a/include/exec/helper-gen.h b/include/exec/helper-gen.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/exec/helper-gen.h
-+++ b/include/exec/helper-gen.h
-@@ -XXX,XX +XXX,XX @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
-   tcg_gen_callN(HELPER(name), dh_retvar(ret), 6, args);                 \
- }
-+#define DEF_HELPER_FLAGS_7(name, flags, ret, t1, t2, t3, t4, t5, t6, t7)\
-+static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
-+    dh_arg_decl(t1, 1),  dh_arg_decl(t2, 2), dh_arg_decl(t3, 3),        \
-+    dh_arg_decl(t4, 4), dh_arg_decl(t5, 5), dh_arg_decl(t6, 6),         \
-+    dh_arg_decl(t7, 7))                                                 \
-+{                                                                       \
-+  TCGTemp *args[7] = { dh_arg(t1, 1), dh_arg(t2, 2), dh_arg(t3, 3),     \
-+                     dh_arg(t4, 4), dh_arg(t5, 5), dh_arg(t6, 6),       \
-+                     dh_arg(t7, 7) };                                   \
-+  tcg_gen_callN(HELPER(name), dh_retvar(ret), 7, args);                 \
-+}
-+
- #include "helper.h"
- #include "trace/generated-helpers.h"
- #include "trace/generated-helpers-wrappers.h"
-@@ -XXX,XX +XXX,XX @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
- #undef DEF_HELPER_FLAGS_4
- #undef DEF_HELPER_FLAGS_5
- #undef DEF_HELPER_FLAGS_6
-+#undef DEF_HELPER_FLAGS_7
- #undef GEN_HELPER
- #endif /* HELPER_GEN_H */
-diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/exec/helper-head.h
-+++ b/include/exec/helper-head.h
-@@ -XXX,XX +XXX,XX @@
-     DEF_HELPER_FLAGS_5(name, 0, ret, t1, t2, t3, t4, t5)
- #define DEF_HELPER_6(name, ret, t1, t2, t3, t4, t5, t6) \
-     DEF_HELPER_FLAGS_6(name, 0, ret, t1, t2, t3, t4, t5, t6)
-+#define DEF_HELPER_7(name, ret, t1, t2, t3, t4, t5, t6, t7) \
-+    DEF_HELPER_FLAGS_7(name, 0, ret, t1, t2, t3, t4, t5, t6, t7)
- /* MAX_OPC_PARAM_IARGS must be set to n if last entry is DEF_HELPER_FLAGS_n. */
-diff --git a/include/exec/helper-proto.h b/include/exec/helper-proto.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/exec/helper-proto.h
-+++ b/include/exec/helper-proto.h
-@@ -XXX,XX +XXX,XX @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
- dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
-                             dh_ctype(t4), dh_ctype(t5), dh_ctype(t6));
-+#define DEF_HELPER_FLAGS_7(name, flags, ret, t1, t2, t3, t4, t5, t6, t7) \
-+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
-+                            dh_ctype(t4), dh_ctype(t5), dh_ctype(t6), \
-+                            dh_ctype(t7));
-+
- #include "helper.h"
- #include "trace/generated-helpers.h"
- #include "tcg-runtime.h"
-@@ -XXX,XX +XXX,XX @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
- #undef DEF_HELPER_FLAGS_4
- #undef DEF_HELPER_FLAGS_5
- #undef DEF_HELPER_FLAGS_6
-+#undef DEF_HELPER_FLAGS_7
- #endif /* HELPER_PROTO_H */
-diff --git a/include/exec/helper-tcg.h b/include/exec/helper-tcg.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/exec/helper-tcg.h
-+++ b/include/exec/helper-tcg.h
-@@ -XXX,XX +XXX,XX @@
-     | dh_sizemask(t2, 2) | dh_sizemask(t3, 3) | dh_sizemask(t4, 4) \
-     | dh_sizemask(t5, 5) | dh_sizemask(t6, 6) },
-+#define DEF_HELPER_FLAGS_7(NAME, FLAGS, ret, t1, t2, t3, t4, t5, t6, t7) \
-+  { .func = HELPER(NAME), .name = str(NAME), .flags = FLAGS, \
-+    .sizemask = dh_sizemask(ret, 0) | dh_sizemask(t1, 1) \
-+    | dh_sizemask(t2, 2) | dh_sizemask(t3, 3) | dh_sizemask(t4, 4) \
-+    | dh_sizemask(t5, 5) | dh_sizemask(t6, 6) | dh_sizemask(t7, 7) },
-+
- #include "helper.h"
- #include "trace/generated-helpers.h"
- #include "tcg-runtime.h"
-@@ -XXX,XX +XXX,XX @@
- #undef DEF_HELPER_FLAGS_4
- #undef DEF_HELPER_FLAGS_5
- #undef DEF_HELPER_FLAGS_6
-+#undef DEF_HELPER_FLAGS_7
- #endif /* HELPER_TCG_H */
---
-2.20.1

-[PULL 3/3] tcg: Add tcg_gen_gvec_5_ptr
+Deleted patch
-Extend the vector generator infrastructure to handle
-5 vector arguments.
-Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
-Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
----
- include/tcg/tcg-op-gvec.h |  7 +++++++
- tcg/tcg-op-gvec.c         | 32 ++++++++++++++++++++++++++++++++
- 2 files changed, 39 insertions(+)
-diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h
-index XXXXXXX..XXXXXXX 100644
---- a/include/tcg/tcg-op-gvec.h
-+++ b/include/tcg/tcg-op-gvec.h
-@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-                         uint32_t maxsz, int32_t data,
-                         gen_helper_gvec_4_ptr *fn);
-+typedef void gen_helper_gvec_5_ptr(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr,
-+                                   TCGv_ptr, TCGv_ptr, TCGv_i32);
-+void tcg_gen_gvec_5_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-+                        uint32_t cofs, uint32_t eofs, TCGv_ptr ptr,
-+                        uint32_t oprsz, uint32_t maxsz, int32_t data,
-+                        gen_helper_gvec_5_ptr *fn);
-+
- /* Expand a gvec operation.  Either inline or out-of-line depending on
-    the actual vector size and the operations supported by the host.  */
- typedef struct {
-diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
-index XXXXXXX..XXXXXXX 100644
---- a/tcg/tcg-op-gvec.c
-+++ b/tcg/tcg-op-gvec.c
-@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-     tcg_temp_free_i32(desc);
- }
-+/* Generate a call to a gvec-style helper with five vector operands
-+   and an extra pointer operand.  */
-+void tcg_gen_gvec_5_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-+                        uint32_t cofs, uint32_t eofs, TCGv_ptr ptr,
-+                        uint32_t oprsz, uint32_t maxsz, int32_t data,
-+                        gen_helper_gvec_5_ptr *fn)
-+{
-+    TCGv_ptr a0, a1, a2, a3, a4;
-+    TCGv_i32 desc = tcg_const_i32(simd_desc(oprsz, maxsz, data));
-+
-+    a0 = tcg_temp_new_ptr();
-+    a1 = tcg_temp_new_ptr();
-+    a2 = tcg_temp_new_ptr();
-+    a3 = tcg_temp_new_ptr();
-+    a4 = tcg_temp_new_ptr();
-+
-+    tcg_gen_addi_ptr(a0, cpu_env, dofs);
-+    tcg_gen_addi_ptr(a1, cpu_env, aofs);
-+    tcg_gen_addi_ptr(a2, cpu_env, bofs);
-+    tcg_gen_addi_ptr(a3, cpu_env, cofs);
-+    tcg_gen_addi_ptr(a4, cpu_env, eofs);
-+
-+    fn(a0, a1, a2, a3, a4, ptr, desc);
-+
-+    tcg_temp_free_ptr(a0);
-+    tcg_temp_free_ptr(a1);
-+    tcg_temp_free_ptr(a2);
-+    tcg_temp_free_ptr(a3);
-+    tcg_temp_free_ptr(a4);
-+    tcg_temp_free_i32(desc);
-+}
-+
- /* Return true if we want to implement something of OPRSZ bytes
-    in units of LNSZ.  This limits the expansion of inline code.  */
- static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
---
-2.20.1

The following changes since commit e18e5501d8ac692d32657a3e1ef545b14e72b730:

Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20200210' into staging (2020-02-10 18:09:14 +0000)

are available in the Git repository at:

https://github.com/rth7680/qemu.git tags/pull-tcg-20200212

for you to fetch changes up to 2445971604c1cfd3ec484457159f4ac300fb04d2:

tcg: Add tcg_gen_gvec_5_ptr (2020-02-12 14:58:36 -0800)

----------------------------------------------------------------
Fix breakpoint invalidation.
Add support for tcg helpers with 7 arguments.
Add support for gvec helpers with 5 arguments.

----------------------------------------------------------------
Max Filippov (1):
      exec: flush CPU TB cache in breakpoint_invalidate

Richard Henderson (1):
      tcg: Add tcg_gen_gvec_5_ptr

Taylor Simpson (1):
      tcg: Add support for a helper with 7 arguments

From: Max Filippov <jcmvbkbc@gmail.com>

When a breakpoint is inserted at a location for which there is currently no
virtual-to-physical translation, no action is taken on the CPU TB cache. If a
TB for that virtual address already exists but is not visible at the moment,
the breakpoint won't be hit the next time an instruction at that address is
executed.

Flush entire CPU TB cache in breakpoint_invalidate to force
re-translation of all TBs for the breakpoint address.

This change fixes the following scenario:
- linux user application is running
- a breakpoint is inserted from QEMU gdbstub for a user address that is
  not currently present in the target CPU TLB
- an instruction at that address is executed, but the external debugger
  doesn't get control.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20191127220602.10827-2-jcmvbkbc@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 exec.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/exec.c b/exec.c
index XXXXXXX..XXXXXXX 100644
--- a/exec.c
+++ b/exec.c
@@ -XXX,XX +XXX,XX @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
 
 static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
 {
-    MemTxAttrs attrs;
-    hwaddr phys = cpu_get_phys_page_attrs_debug(cpu, pc, &attrs);
-    int asidx = cpu_asidx_from_attrs(cpu, attrs);
-    if (phys != -1) {
-        /* Locks grabbed by tb_invalidate_phys_addr */
-        tb_invalidate_phys_addr(cpu->cpu_ases[asidx].as,
-                                phys | (pc & ~TARGET_PAGE_MASK), attrs);
-    }
+    /*
+     * There may not be a virtual to physical translation for the pc
+     * right now, but there may exist cached TB for this pc.
+     * Flush the whole TB cache to force re-translation of such TBs.
+     * This is heavyweight, but we're debugging anyway.
+     */
+    tb_flush(cpu);
 }
 #endif
 
-- 
2.20.1
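
The failure mode described in the patch above can be illustrated with a small standalone model. This is only a sketch, not QEMU code: ToyTB, ToyCache, and all toy_* functions are hypothetical names, and TOY_NO_PHYS stands in for the "-1" result of a failed virtual-to-physical lookup. It shows why invalidating by physical address leaves a stale block behind when no translation is currently available, while a full flush does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TB_SLOTS 4
#define TOY_NO_PHYS  UINT64_MAX  /* hypothetical stand-in for "no translation" */

/* Toy translated block: remembers where it was translated from. */
typedef struct {
    bool valid;
    uint64_t pc;    /* guest virtual address */
    uint64_t phys;  /* guest physical address at translation time */
} ToyTB;

typedef struct {
    ToyTB tb[TOY_TB_SLOTS];
} ToyCache;

/* Old approach: invalidate by physical address; does nothing when unmapped. */
static size_t toy_invalidate_phys(ToyCache *c, uint64_t phys)
{
    size_t dropped = 0;

    assert(c != NULL);
    if (phys == TOY_NO_PHYS) {
        return 0;  /* no address to match against, so stale blocks survive */
    }
    for (size_t i = 0; i < TOY_TB_SLOTS; i++) {
        if (c->tb[i].valid && c->tb[i].phys == phys) {
            c->tb[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* New approach: drop every cached block, whatever the current mappings are. */
static size_t toy_flush(ToyCache *c)
{
    size_t dropped = 0;

    assert(c != NULL);
    for (size_t i = 0; i < TOY_TB_SLOTS; i++) {
        if (c->tb[i].valid) {
            c->tb[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* Count cached blocks that would still execute for a given virtual pc. */
static size_t toy_cached_for_pc(const ToyCache *c, uint64_t pc)
{
    size_t n = 0;

    assert(c != NULL);
    for (size_t i = 0; i < TOY_TB_SLOTS; i++) {
        if (c->tb[i].valid && c->tb[i].pc == pc) {
            n++;
        }
    }
    return n;
}
```

In this model, calling toy_invalidate_phys() with TOY_NO_PHYS leaves a block cached for the breakpoint pc, which corresponds to the missed breakpoint; toy_flush() removes it at the cost of discarding every other block as well.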

From: Taylor Simpson <tsimpson@quicinc.com>

Currently, helpers can only take up to 6 arguments.  This patch adds the
capability for up to 7 arguments.  I have tested it with the Hexagon port
that I am preparing for submission.

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1580942510-2820-1-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/helper-gen.h   | 13 +++++++++++++
 include/exec/helper-head.h  |  2 ++
 include/exec/helper-proto.h |  6 ++++++
 include/exec/helper-tcg.h   |  7 +++++++
 4 files changed, 28 insertions(+)

diff --git a/include/exec/helper-gen.h b/include/exec/helper-gen.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/helper-gen.h
+++ b/include/exec/helper-gen.h
@@ -XXX,XX +XXX,XX @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
   tcg_gen_callN(HELPER(name), dh_retvar(ret), 6, args);                 \
 }
 
+#define DEF_HELPER_FLAGS_7(name, flags, ret, t1, t2, t3, t4, t5, t6, t7)\
+static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
+    dh_arg_decl(t1, 1),  dh_arg_decl(t2, 2), dh_arg_decl(t3, 3),        \
+    dh_arg_decl(t4, 4), dh_arg_decl(t5, 5), dh_arg_decl(t6, 6),         \
+    dh_arg_decl(t7, 7))                                                 \
+{                                                                       \
+  TCGTemp *args[7] = { dh_arg(t1, 1), dh_arg(t2, 2), dh_arg(t3, 3),     \
+                     dh_arg(t4, 4), dh_arg(t5, 5), dh_arg(t6, 6),       \
+                     dh_arg(t7, 7) };                                   \
+  tcg_gen_callN(HELPER(name), dh_retvar(ret), 7, args);                 \
+}
+
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "trace/generated-helpers-wrappers.h"
@@ -XXX,XX +XXX,XX @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)          \
 #undef DEF_HELPER_FLAGS_4
 #undef DEF_HELPER_FLAGS_5
 #undef DEF_HELPER_FLAGS_6
+#undef DEF_HELPER_FLAGS_7
 #undef GEN_HELPER
 
 #endif /* HELPER_GEN_H */
diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/helper-head.h
+++ b/include/exec/helper-head.h
@@ -XXX,XX +XXX,XX @@
     DEF_HELPER_FLAGS_5(name, 0, ret, t1, t2, t3, t4, t5)
 #define DEF_HELPER_6(name, ret, t1, t2, t3, t4, t5, t6) \
     DEF_HELPER_FLAGS_6(name, 0, ret, t1, t2, t3, t4, t5, t6)
+#define DEF_HELPER_7(name, ret, t1, t2, t3, t4, t5, t6, t7) \
+    DEF_HELPER_FLAGS_7(name, 0, ret, t1, t2, t3, t4, t5, t6, t7)
 
 /* MAX_OPC_PARAM_IARGS must be set to n if last entry is DEF_HELPER_FLAGS_n. */
 
diff --git a/include/exec/helper-proto.h b/include/exec/helper-proto.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/helper-proto.h
+++ b/include/exec/helper-proto.h
@@ -XXX,XX +XXX,XX @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
 dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
                             dh_ctype(t4), dh_ctype(t5), dh_ctype(t6));
 
+#define DEF_HELPER_FLAGS_7(name, flags, ret, t1, t2, t3, t4, t5, t6, t7) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+                            dh_ctype(t4), dh_ctype(t5), dh_ctype(t6), \
+                            dh_ctype(t7));
+
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
@@ -XXX,XX +XXX,XX @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
 #undef DEF_HELPER_FLAGS_4
 #undef DEF_HELPER_FLAGS_5
 #undef DEF_HELPER_FLAGS_6
+#undef DEF_HELPER_FLAGS_7
 
 #endif /* HELPER_PROTO_H */
diff --git a/include/exec/helper-tcg.h b/include/exec/helper-tcg.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/helper-tcg.h
+++ b/include/exec/helper-tcg.h
@@ -XXX,XX +XXX,XX @@
     | dh_sizemask(t2, 2) | dh_sizemask(t3, 3) | dh_sizemask(t4, 4) \
     | dh_sizemask(t5, 5) | dh_sizemask(t6, 6) },
 
+#define DEF_HELPER_FLAGS_7(NAME, FLAGS, ret, t1, t2, t3, t4, t5, t6, t7) \
+  { .func = HELPER(NAME), .name = str(NAME), .flags = FLAGS, \
+    .sizemask = dh_sizemask(ret, 0) | dh_sizemask(t1, 1) \
+    | dh_sizemask(t2, 2) | dh_sizemask(t3, 3) | dh_sizemask(t4, 4) \
+    | dh_sizemask(t5, 5) | dh_sizemask(t6, 6) | dh_sizemask(t7, 7) },
+
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
@@ -XXX,XX +XXX,XX @@
 #undef DEF_HELPER_FLAGS_4
 #undef DEF_HELPER_FLAGS_5
 #undef DEF_HELPER_FLAGS_6
+#undef DEF_HELPER_FLAGS_7
 
 #endif /* HELPER_TCG_H */
-- 
2.20.1
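
The macro pattern added by the patch above (declare N typed parameters, pack them into an array, hand the array to a generic N-argument call) can be sketched outside QEMU as follows. Everything here is hypothetical and simplified: arguments are plain long values rather than TCG temporaries, and toy_call_n, TOY_MAX_IARGS, and TOY_DEF_HELPER_7 merely mimic the roles of tcg_gen_callN, MAX_OPC_PARAM_IARGS, and DEF_HELPER_FLAGS_7.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound, playing the role of MAX_OPC_PARAM_IARGS. */
#define TOY_MAX_IARGS 7

typedef long (*toy_helper_fn)(const long *args, int nargs);

/* Generic call path: every helper, whatever its arity, goes through here. */
static long toy_call_n(toy_helper_fn fn, int nargs, const long *args)
{
    assert(fn != NULL);
    assert(nargs >= 0 && nargs <= TOY_MAX_IARGS);
    return fn(args, nargs);
}

/* Generate a typed 7-argument wrapper that packs its arguments into an array. */
#define TOY_DEF_HELPER_7(name)                                          \
static inline long toy_gen_##name(long a1, long a2, long a3, long a4,   \
                                  long a5, long a6, long a7)            \
{                                                                       \
    const long args[7] = { a1, a2, a3, a4, a5, a6, a7 };                \
    return toy_call_n(toy_helper_##name, 7, args);                      \
}

/* Example helper body: sums whatever arguments it receives. */
static long toy_helper_sum(const long *args, int nargs)
{
    long total = 0;

    for (int i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

TOY_DEF_HELPER_7(sum)
```

The bound check in toy_call_n() is the point of the sketch: if the largest generated wrapper takes 7 arguments, the limit enforced by the generic call path has to be at least 7 as well, which is what the comment about MAX_OPC_PARAM_IARGS in helper-head.h asks for.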

Extend the vector generator infrastructure to handle
5 vector arguments.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg-op-gvec.h |  7 +++++++
 tcg/tcg-op-gvec.c         | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h
index XXXXXXX..XXXXXXX 100644
--- a/include/tcg/tcg-op-gvec.h
+++ b/include/tcg/tcg-op-gvec.h
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
                         uint32_t maxsz, int32_t data,
                         gen_helper_gvec_4_ptr *fn);
 
+typedef void gen_helper_gvec_5_ptr(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr,
+                                   TCGv_ptr, TCGv_ptr, TCGv_i32);
+void tcg_gen_gvec_5_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t cofs, uint32_t eofs, TCGv_ptr ptr,
+                        uint32_t oprsz, uint32_t maxsz, int32_t data,
+                        gen_helper_gvec_5_ptr *fn);
+
 /* Expand a gvec operation.  Either inline or out-of-line depending on
    the actual vector size and the operations supported by the host.  */
 typedef struct {
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -XXX,XX +XXX,XX @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
     tcg_temp_free_i32(desc);
 }
 
+/* Generate a call to a gvec-style helper with five vector operands
+   and an extra pointer operand.  */
+void tcg_gen_gvec_5_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t cofs, uint32_t eofs, TCGv_ptr ptr,
+                        uint32_t oprsz, uint32_t maxsz, int32_t data,
+                        gen_helper_gvec_5_ptr *fn)
+{
+    TCGv_ptr a0, a1, a2, a3, a4;
+    TCGv_i32 desc = tcg_const_i32(simd_desc(oprsz, maxsz, data));
+
+    a0 = tcg_temp_new_ptr();
+    a1 = tcg_temp_new_ptr();
+    a2 = tcg_temp_new_ptr();
+    a3 = tcg_temp_new_ptr();
+    a4 = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(a0, cpu_env, dofs);
+    tcg_gen_addi_ptr(a1, cpu_env, aofs);
+    tcg_gen_addi_ptr(a2, cpu_env, bofs);
+    tcg_gen_addi_ptr(a3, cpu_env, cofs);
+    tcg_gen_addi_ptr(a4, cpu_env, eofs);
+
+    fn(a0, a1, a2, a3, a4, ptr, desc);
+
+    tcg_temp_free_ptr(a0);
+    tcg_temp_free_ptr(a1);
+    tcg_temp_free_ptr(a2);
+    tcg_temp_free_ptr(a3);
+    tcg_temp_free_ptr(a4);
+    tcg_temp_free_i32(desc);
+}
+
 /* Return true if we want to implement something of OPRSZ bytes
    in units of LNSZ.  This limits the expansion of inline code.  */
 static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
-- 
2.20.1
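
The calling convention of the function added above (five offsets into the CPU state, one extra pointer, and a descriptor carrying oprsz, maxsz, and data) can be modelled in plain C. This is a sketch under assumptions, not the QEMU API: it runs immediately instead of emitting TCG ops, ToyDesc is a hypothetical struct rather than the packed value produced by simd_desc(), and the example helper's choice to zero the bytes between oprsz and maxsz is an assumption of this toy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor; QEMU packs these fields into one 32-bit value. */
typedef struct {
    uint32_t oprsz;  /* bytes actually operated on */
    uint32_t maxsz;  /* total bytes of the destination vector */
    int32_t data;    /* helper-specific immediate */
} ToyDesc;

typedef void toy_helper_5_ptr(void *d, void *a, void *b, void *c, void *e,
                              void *ptr, ToyDesc desc);

/* Turn five offsets into pointers relative to env and call the helper. */
static void toy_gvec_5_ptr(uint8_t *env, uint32_t dofs, uint32_t aofs,
                           uint32_t bofs, uint32_t cofs, uint32_t eofs,
                           void *ptr, uint32_t oprsz, uint32_t maxsz,
                           int32_t data, toy_helper_5_ptr *fn)
{
    ToyDesc desc = { oprsz, maxsz, data };

    assert(env != NULL && fn != NULL);
    assert(oprsz <= maxsz);
    fn(env + dofs, env + aofs, env + bofs, env + cofs, env + eofs, ptr, desc);
}

/* Example helper: d = a + b + c + e + bias, bytewise; bias is read via ptr. */
static void toy_helper_add4(void *vd, void *va, void *vb, void *vc, void *ve,
                            void *ptr, ToyDesc desc)
{
    uint8_t *d = vd;
    const uint8_t *a = va, *b = vb, *c = vc, *e = ve;
    uint8_t bias = *(const uint8_t *)ptr;

    for (uint32_t i = 0; i < desc.oprsz; i++) {
        d[i] = (uint8_t)(a[i] + b[i] + c[i] + e[i] + bias);
    }
    /* Assumption of this toy: clear the tail between oprsz and maxsz. */
    memset(d + desc.oprsz, 0, desc.maxsz - desc.oprsz);
    (void)desc.data;  /* unused by this particular helper */
}
```

Passing offsets rather than pointers keeps the front end independent of where the CPU state lives at run time; the wrapper resolves them against the base pointer just before the call.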

Version 2: Drop signed 32-bit guest patches while CI failure examined.

The following changes since commit 3d1fbc59665ff8a5d74b0fd30583044fe99e1117:

Merge remote-tracking branch 'remotes/nvme/tags/nvme-next-pull-request' into staging (2022-03-04 15:31:23 +0000)

are available in the Git repository at:

https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20220304

for you to fetch changes up to cf320769476c3e2820be2a6280bfa1e15baf396f:

tcg/i386: Implement bitsel for avx512 (2022-03-04 08:50:41 -1000)

----------------------------------------------------------------
Reorder do_constant_folding_cond test to satisfy valgrind.
Fix value of MAX_OPC_PARAM_IARGS.
Add opcodes for vector nand, nor, eqv.
Support vector nand, nor, eqv on PPC and S390X hosts.
Support AVX512VL, AVX512BW, AVX512DQ, and AVX512VBMI2.

----------------------------------------------------------------
Alex Bennée (1):
      tcg/optimize: only read val after const check

Richard Henderson (19):
      tcg: Add opcodes for vector nand, nor, eqv
      tcg/ppc: Implement vector NAND, NOR, EQV
      tcg/s390x: Implement vector NAND, NOR, EQV
      tcg/i386: Detect AVX512
      tcg/i386: Add tcg_out_evex_opc
      tcg/i386: Use tcg_can_emit_vec_op in expand_vec_cmp_noinv
      tcg/i386: Implement avx512 variable shifts
      tcg/i386: Implement avx512 scalar shift
      tcg/i386: Implement avx512 immediate sari shift
      tcg/i386: Implement avx512 immediate rotate
      tcg/i386: Implement avx512 variable rotate
      tcg/i386: Support avx512vbmi2 vector shift-double instructions
      tcg/i386: Expand vector word rotate as avx512vbmi2 shift-double
      tcg/i386: Remove rotls_vec from tcg_target_op_def
      tcg/i386: Expand scalar rotate with avx512 insns
      tcg/i386: Implement avx512 min/max/abs
      tcg/i386: Implement avx512 multiply
      tcg/i386: Implement more logical operations for avx512
      tcg/i386: Implement bitsel for avx512

Ziqiao Kong (1):
      tcg: Set MAX_OPC_PARAM_IARGS to 7
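
For reference, the three logical operations named in the summary above (nand, nor, eqv) have the following bitwise meaning. This is a minimal scalar sketch on 64-bit lanes, not the TCG opcode implementation; the toy_* names and the lane-loop helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* nand: complement of AND. */
static inline uint64_t toy_nand64(uint64_t a, uint64_t b) { return ~(a & b); }

/* nor: complement of OR. */
static inline uint64_t toy_nor64(uint64_t a, uint64_t b) { return ~(a | b); }

/* eqv: complement of XOR, so a bit is set where the inputs agree. */
static inline uint64_t toy_eqv64(uint64_t a, uint64_t b) { return ~(a ^ b); }

typedef uint64_t (*toy_binop)(uint64_t, uint64_t);

/* Apply one of the operations lane by lane, as a vector op would. */
static void toy_vec_apply(uint64_t *d, const uint64_t *a, const uint64_t *b,
                          size_t lanes, toy_binop op)
{
    assert(d != NULL && a != NULL && b != NULL && op != NULL);
    for (size_t i = 0; i < lanes; i++) {
        d[i] = op(a[i], b[i]);
    }
}
```

A host without a native instruction for one of these can still produce the result by combining the plain and, or, or xor with a not, at the cost of an extra operation per vector.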