From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385443602741.9160355318493; Wed, 30 Aug 2023 01:50:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt8-0008Tn-9Z; Wed, 30 Aug 2023 04:49:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt5-0008Ql-NZ for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:19 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt0-0007SZ-LW for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:19 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxbeuDAu9kXwgdAA--.57106S3; Wed, 30 Aug 2023 16:49:07 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S3; Wed, 30 Aug 2023 16:49:07 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 01/48] target/loongarch: Add LASX data support Date: Wed, 30 Aug 2023 16:48:15 +0800 Message-Id: <20230830084902.2113960-2-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S3 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385444148100001 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/cpu.h | 24 ++++++++++++---------- target/loongarch/internals.h | 22 -------------------- target/loongarch/vec.h | 33 ++++++++++++++++++++++++++++++ linux-user/loongarch64/signal.c | 1 + target/loongarch/cpu.c | 1 + target/loongarch/gdbstub.c | 1 + target/loongarch/lsx_helper.c | 1 + target/loongarch/machine.c | 36 ++++++++++++++++++++++++++++++++- 8 files changed, 85 insertions(+), 34 deletions(-) create mode 100644 target/loongarch/vec.h diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h index 4d7201995a..347ad1c8a9 100644 --- a/target/loongarch/cpu.h +++ b/target/loongarch/cpu.h @@ -251,18 +251,20 @@ FIELD(TLB_MISC, ASID, 1, 10) FIELD(TLB_MISC, VPPN, 13, 35) FIELD(TLB_MISC, PS, 48, 6) =20 -#define LSX_LEN (128) +#define LSX_LEN (128) +#define LASX_LEN (256) + typedef union VReg { - int8_t B[LSX_LEN / 8]; - int16_t H[LSX_LEN / 16]; - int32_t W[LSX_LEN / 32]; - int64_t D[LSX_LEN / 64]; - uint8_t UB[LSX_LEN / 8]; - uint16_t 
UH[LSX_LEN / 16]; - uint32_t UW[LSX_LEN / 32]; - uint64_t UD[LSX_LEN / 64]; - Int128 Q[LSX_LEN / 128]; -}VReg; + int8_t B[LASX_LEN / 8]; + int16_t H[LASX_LEN / 16]; + int32_t W[LASX_LEN / 32]; + int64_t D[LASX_LEN / 64]; + uint8_t UB[LASX_LEN / 8]; + uint16_t UH[LASX_LEN / 16]; + uint32_t UW[LASX_LEN / 32]; + uint64_t UD[LASX_LEN / 64]; + Int128 Q[LASX_LEN / 128]; +} VReg; =20 typedef union fpr_t fpr_t; union fpr_t { diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h index 7b0f29c942..c492863cc5 100644 --- a/target/loongarch/internals.h +++ b/target/loongarch/internals.h @@ -21,28 +21,6 @@ /* Global bit for huge page */ #define LOONGARCH_HGLOBAL_SHIFT 12 =20 -#if HOST_BIG_ENDIAN -#define B(x) B[15 - (x)] -#define H(x) H[7 - (x)] -#define W(x) W[3 - (x)] -#define D(x) D[1 - (x)] -#define UB(x) UB[15 - (x)] -#define UH(x) UH[7 - (x)] -#define UW(x) UW[3 - (x)] -#define UD(x) UD[1 -(x)] -#define Q(x) Q[x] -#else -#define B(x) B[x] -#define H(x) H[x] -#define W(x) W[x] -#define D(x) D[x] -#define UB(x) UB[x] -#define UH(x) UH[x] -#define UW(x) UW[x] -#define UD(x) UD[x] -#define Q(x) Q[x] -#endif - void loongarch_translate_init(void); =20 void loongarch_cpu_dump_state(CPUState *cpu, FILE *f, int flags); diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h new file mode 100644 index 0000000000..2f23cae7d7 --- /dev/null +++ b/target/loongarch/vec.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * QEMU LoongArch vector utilities + * + * Copyright (c) 2023 Loongson Technology Corporation Limited + */ + +#ifndef LOONGARCH_VEC_H +#define LOONGARCH_VEC_H + +#if HOST_BIG_ENDIAN +#define B(x) B[(x) ^ 15] +#define H(x) H[(x) ^ 7] +#define W(x) W[(x) ^ 3] +#define D(x) D[(x) ^ 1] +#define UB(x) UB[(x) ^ 15] +#define UH(x) UH[(x) ^ 7] +#define UW(x) UW[(x) ^ 3] +#define UD(x) UD[(x) ^ 1] +#define Q(x) Q[x] +#else +#define B(x) B[x] +#define H(x) H[x] +#define W(x) W[x] +#define D(x) D[x] +#define UB(x) UB[x] +#define UH(x)
UH[x] +#define UW(x) UW[x] +#define UD(x) UD[x] +#define Q(x) Q[x] +#endif /* HOST_BIG_ENDIAN */ + +#endif /* LOONGARCH_VEC_H */ diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signa= l.c index bb8efb1172..39572c1190 100644 --- a/linux-user/loongarch64/signal.c +++ b/linux-user/loongarch64/signal.c @@ -12,6 +12,7 @@ #include "linux-user/trace.h" =20 #include "target/loongarch/internals.h" +#include "target/loongarch/vec.h" =20 /* FP context was used */ #define SC_USED_FP (1 << 0) diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c index 27fc6e1f33..923e4b30cf 100644 --- a/target/loongarch/cpu.c +++ b/target/loongarch/cpu.c @@ -18,6 +18,7 @@ #include "cpu-csr.h" #include "sysemu/reset.h" #include "tcg/tcg.h" +#include "vec.h" =20 const char * const regnames[32] =3D { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", diff --git a/target/loongarch/gdbstub.c b/target/loongarch/gdbstub.c index b09804b62f..5fc2f19e96 100644 --- a/target/loongarch/gdbstub.c +++ b/target/loongarch/gdbstub.c @@ -11,6 +11,7 @@ #include "internals.h" #include "exec/gdbstub.h" #include "gdbstub/helpers.h" +#include "vec.h" =20 uint64_t read_fcc(CPULoongArchState *env) { diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index 9571f0aef0..b231a2798b 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -12,6 +12,7 @@ #include "fpu/softfloat.h" #include "internals.h" #include "tcg/tcg.h" +#include "vec.h" =20 #define DO_ADD(a, b) (a + b) #define DO_SUB(a, b) (a - b) diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c index d8ac99c9a4..1c4e01d076 100644 --- a/target/loongarch/machine.c +++ b/target/loongarch/machine.c @@ -8,7 +8,7 @@ #include "qemu/osdep.h" #include "cpu.h" #include "migration/cpu.h" -#include "internals.h" +#include "vec.h" =20 static const VMStateDescription vmstate_fpu_reg =3D { .name =3D "fpu_reg", @@ -76,6 +76,39 @@ static const VMStateDescription vmstate_lsx =3D { }, }; =20 
+static const VMStateDescription vmstate_lasxh_reg =3D { + .name =3D "lasxh_reg", + .version_id =3D 1, + .minimum_version_id =3D 1, + .fields =3D (VMStateField[]) { + VMSTATE_UINT64(UD(2), VReg), + VMSTATE_UINT64(UD(3), VReg), + VMSTATE_END_OF_LIST() + } +}; + +#define VMSTATE_LASXH_REGS(_field, _state, _start) \ + VMSTATE_STRUCT_SUB_ARRAY(_field, _state, _start, 32, 0, \ + vmstate_lasxh_reg, fpr_t) + +static bool lasx_needed(void *opaque) +{ + LoongArchCPU *cpu =3D opaque; + + return FIELD_EX64(cpu->env.cpucfg[2], CPUCFG2, LASX); +} + +static const VMStateDescription vmstate_lasx =3D { + .name =3D "cpu/lasx", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D lasx_needed, + .fields =3D (VMStateField[]) { + VMSTATE_LASXH_REGS(env.fpr, LoongArchCPU, 0), + VMSTATE_END_OF_LIST() + }, +}; + /* TLB state */ const VMStateDescription vmstate_tlb =3D { .name =3D "cpu/tlb", @@ -163,6 +196,7 @@ const VMStateDescription vmstate_loongarch_cpu =3D { .subsections =3D (const VMStateDescription*[]) { &vmstate_fpu, &vmstate_lsx, + &vmstate_lasx, NULL } }; --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385512708509.0497557745681; Wed, 30 Aug 2023 01:51:52 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtA-00005W-23; Wed, 30 Aug 2023 04:49:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt7-0008S4-0l for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:21 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with 
esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt2-0007Se-O6 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:20 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_7uEAu9kYAgdAA--.262S3; Wed, 30 Aug 2023 16:49:08 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S4; Wed, 30 Aug 2023 16:49:07 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 02/48] target/loongarch: meson.build support build LASX Date: Wed, 30 Aug 2023 16:48:16 +0800 Message-Id: <20230830084902.2113960-3-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S4 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385514469100003 
Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/translate.c | 1 + target/loongarch/insn_trans/trans_lasx.c.inc | 6 ++++++ 2 files changed, 7 insertions(+) create mode 100644 target/loongarch/insn_trans/trans_lasx.c.inc diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index fd393ed76d..1f91afee81 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -262,6 +262,7 @@ static uint64_t make_address_pc(DisasContext *ctx, uint= 64_t addr) #include "insn_trans/trans_branch.c.inc" #include "insn_trans/trans_privileged.c.inc" #include "insn_trans/trans_lsx.c.inc" +#include "insn_trans/trans_lasx.c.inc" =20 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState= *cs) { diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc new file mode 100644 index 0000000000..56a9839255 --- /dev/null +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -0,0 +1,6 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * LASX translate functions + * Copyright (c) 2023 Loongson Technology Corporation Limited + */ + --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385463605188.02380257915445; Wed, 30 Aug 2023 01:51:03 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt7-0008TK-PO; Wed, 30 Aug 2023 04:49:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt5-0008Py-3O for 
qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:19 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt0-0007Sg-ST for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:18 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxNuiEAu9kYggdAA--.5920S3; Wed, 30 Aug 2023 16:49:08 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S5; Wed, 30 Aug 2023 16:49:08 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 03/48] target/loongarch: Add CHECK_ASXE macro to check LASX enable Date: Wed, 30 Aug 2023 16:48:17 +0800 Message-Id: <20230830084902.2113960-4-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S5 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To:
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385464276100007 Content-Type: text/plain; charset="utf-8" Reviewed-by: Richard Henderson Signed-off-by: Song Gao --- target/loongarch/cpu.h | 2 ++ target/loongarch/cpu.c | 2 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 10 ++++++++++ 3 files changed, 14 insertions(+) diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h index 347ad1c8a9..f125a8e49b 100644 --- a/target/loongarch/cpu.h +++ b/target/loongarch/cpu.h @@ -462,6 +462,7 @@ static inline void set_pc(CPULoongArchState *env, uint6= 4_t value) #define HW_FLAGS_CRMD_PG R_CSR_CRMD_PG_MASK /* 0x10 */ #define HW_FLAGS_EUEN_FPE 0x04 #define HW_FLAGS_EUEN_SXE 0x08 +#define HW_FLAGS_EUEN_ASXE 0x10 #define HW_FLAGS_VA32 0x20 =20 static inline void cpu_get_tb_cpu_state(CPULoongArchState *env, vaddr *pc, @@ -472,6 +473,7 @@ static inline void cpu_get_tb_cpu_state(CPULoongArchSta= te *env, vaddr *pc, *flags =3D env->CSR_CRMD & (R_CSR_CRMD_PLV_MASK | R_CSR_CRMD_PG_MASK); *flags |=3D FIELD_EX64(env->CSR_EUEN, CSR_EUEN, FPE) * HW_FLAGS_EUEN_F= PE; *flags |=3D FIELD_EX64(env->CSR_EUEN, CSR_EUEN, SXE) * HW_FLAGS_EUEN_S= XE; + *flags |=3D FIELD_EX64(env->CSR_EUEN, CSR_EUEN, ASXE) * HW_FLAGS_EUEN_= ASXE; *flags |=3D is_va32(env) * HW_FLAGS_VA32; } =20 diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c index 923e4b30cf..4deae22104 100644 --- a/target/loongarch/cpu.c +++ b/target/loongarch/cpu.c @@ -54,6 +54,7 @@ static const char * const excp_names[] =3D { [EXCCODE_DBP] =3D "Debug breakpoint", [EXCCODE_BCE] =3D "Bound Check Exception", [EXCCODE_SXD] =3D "128 bit vector instructions Disable exception", + [EXCCODE_ASXD] =3D "256 bit vector instructions Disable exception", }; =20 const char *loongarch_exception_name(int32_t exception) @@ -189,6 +190,7 @@ static void loongarch_cpu_do_interrupt(CPUState *cs) case EXCCODE_FPD: case EXCCODE_FPE: case EXCCODE_SXD: + case EXCCODE_ASXD: 
env->CSR_BADV =3D env->pc; QEMU_FALLTHROUGH; case EXCCODE_BCE: diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 56a9839255..75a77f5dce 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -4,3 +4,13 @@ * Copyright (c) 2023 Loongson Technology Corporation Limited */ =20 +#ifndef CONFIG_USER_ONLY +#define CHECK_ASXE do { \ + if ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) =3D=3D 0) { \ + generate_exception(ctx, EXCCODE_ASXD); \ + return true; \ + } \ +} while (0) +#else +#define CHECK_ASXE +#endif --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385439753770.241877805245; Wed, 30 Aug 2023 01:50:39 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt7-0008RX-6M; Wed, 30 Aug 2023 04:49:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt4-0008Pm-Uo for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:18 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt0-0007Sl-En for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:18 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEvCFAu9kZQgdAA--.58538S3; Wed, 30 Aug 2023 16:49:09 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S6; Wed, 30 Aug 2023 16:49:08 +0800 
(CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 04/48] target/loongarch: Add avail_LASX to check LASX instructions Date: Wed, 30 Aug 2023 16:48:18 +0800 Message-Id: <20230830084902.2113960-5-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S6 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385440137100001 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/translate.h | 1 + 1 file changed, 1 insertion(+) diff --git a/target/loongarch/translate.h b/target/loongarch/translate.h index 89b49a859e..195f53573a 100644 --- a/target/loongarch/translate.h +++ b/target/loongarch/translate.h @@ -23,6 +23,7 @@ #define avail_LSPW(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LSPW)) #define 
avail_LAM(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LAM)) #define avail_LSX(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LSX)) +#define avail_LASX(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LASX)) #define avail_IOCSR(C) (FIELD_EX32((C)->cpucfg1, CPUCFG1, IOCSR)) =20 /* --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385387305893.2775255324339; Wed, 30 Aug 2023 01:49:47 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtB-0000D6-QW; Wed, 30 Aug 2023 04:49:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt9-00008A-OF for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:23 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt2-0007TE-Jb for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:23 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxV_GFAu9kZwgdAA--.59814S3; Wed, 30 Aug 2023 16:49:09 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S7; Wed, 30 Aug 2023 16:49:09 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 05/48] target/loongarch: Implement xvadd/xvsub Date: Wed, 30 Aug 2023 16:48:19 +0800 Message-Id: <20230830084902.2113960-6-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: 
<20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S7 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385388811100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVADD.{B/H/W/D/Q}; - XVSUB.{B/H/W/D/Q}. 
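
Illustration only, not code from this series: the .Q forms treat each 256-bit register as two independent 128-bit lanes. Each lane is added or subtracted as a pair of 64-bit halves, with the carry or borrow passed from the low half to the high half, which is what the add2/sub2 calls in XVADDSUB_Q express. The plain-C sketch below models that behaviour under stated assumptions: the vreg256 type and the *_sketch function names are hypothetical stand-ins, not QEMU APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for one 256-bit LASX register: four 64-bit
 * elements, element 0 being the least significant.
 */
typedef struct {
    uint64_t d[4];
} vreg256;

/* Per-lane 128-bit add: (d[1]:d[0]) and (d[3]:d[2]) are separate lanes. */
void xvadd_q_sketch(vreg256 *vd, const vreg256 *vj, const vreg256 *vk)
{
    for (int i = 0; i < 2; i++) {
        /* Read all inputs first so that vd may alias vj or vk. */
        uint64_t al = vj->d[0 + i * 2], ah = vj->d[1 + i * 2];
        uint64_t bl = vk->d[0 + i * 2], bh = vk->d[1 + i * 2];
        uint64_t rl = al + bl;      /* wraps modulo 2^64 */
        uint64_t carry = rl < al;   /* 1 if the low half overflowed */

        vd->d[0 + i * 2] = rl;
        vd->d[1 + i * 2] = ah + bh + carry;
    }
}

/* Per-lane 128-bit subtract, with the borrow taken from the high half. */
void xvsub_q_sketch(vreg256 *vd, const vreg256 *vj, const vreg256 *vk)
{
    for (int i = 0; i < 2; i++) {
        uint64_t al = vj->d[0 + i * 2], ah = vj->d[1 + i * 2];
        uint64_t bl = vk->d[0 + i * 2], bh = vk->d[1 + i * 2];
        uint64_t borrow = al < bl;  /* 1 if the low half underflowed */

        vd->d[0 + i * 2] = al - bl;
        vd->d[1 + i * 2] = ah - bh - borrow;
    }
}

/* Small self-check: add then subtract must round-trip in both lanes. */
void xv_q_sketch_selfcheck(void)
{
    vreg256 a = { { UINT64_MAX, 0, 1, 5 } };
    vreg256 b = { { 1, 0, 2, 7 } };
    vreg256 r;

    xvadd_q_sketch(&r, &a, &b);
    /* Lane 0: the low half wraps and carries into the high half. */
    assert(r.d[0] == 0 && r.d[1] == 1);
    /* Lane 1: no carry crosses over from lane 0. */
    assert(r.d[2] == 3 && r.d[3] == 12);

    /* (a + b) - a == b, with the destination aliasing the first source. */
    xvsub_q_sketch(&r, &r, &a);
    assert(r.d[0] == 1 && r.d[1] == 0 && r.d[2] == 2 && r.d[3] == 7);
}
```

The loop bound of 2 mirrors the two 128-bit lanes of a 256-bit register; under the same model, the LSX .Q forms would be the single-lane case.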
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 17 + target/loongarch/insns.decode | 14 + target/loongarch/disas.c | 23 + target/loongarch/translate.c | 4 + target/loongarch/insn_trans/trans_lasx.c.inc | 56 +- target/loongarch/insn_trans/trans_lsx.c.inc | 513 +++++++++---------- 6 files changed, 355 insertions(+), 272 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 2f23cae7d7..512f2fd83f 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -8,6 +8,23 @@ #ifndef LOONGARCH_VEC_H #define LOONGARCH_VEC_H =20 +#ifndef CONFIG_USER_ONLY + #define CHECK_VEC do { \ + if ((ctx->vl =3D=3D LSX_LEN) && \ + (ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) =3D=3D 0) { \ + generate_exception(ctx, EXCCODE_SXD); \ + return true; \ + } \ + if ((ctx->vl =3D=3D LASX_LEN) && \ + (ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) =3D=3D 0) { \ + generate_exception(ctx, EXCCODE_ASXD); \ + return true; \ + } \ + } while (0) +#else + #define CHECK_VEC +#endif /*!CONFIG_USER_ONLY */ + #if HOST_BIG_ENDIAN #define B(x) B[(x) ^ 15] #define H(x) H[(x) ^ 7] diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index c9c3bc2c73..bcc18fb6c5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1296,3 +1296,17 @@ vstelm_d 0011 00010001 0 . ........ ..... ..= ... @vr_i8i1 vstelm_w 0011 00010010 .. ........ ..... ..... @vr_i8i2 vstelm_h 0011 0001010 ... ........ ..... ..... @vr_i8i3 vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 + +# +# LoongArch LASX instructions +# +xvadd_b 0111 01000000 10100 ..... ..... ..... @vvv +xvadd_h 0111 01000000 10101 ..... ..... ..... @vvv +xvadd_w 0111 01000000 10110 ..... ..... ..... @vvv +xvadd_d 0111 01000000 10111 ..... ..... ..... @vvv +xvadd_q 0111 01010010 11010 ..... ..... ..... @vvv +xvsub_b 0111 01000000 11000 ..... ..... ..... @vvv +xvsub_h 0111 01000000 11001 ..... ..... ..... @vvv +xvsub_w 0111 01000000 11010 ..... ..... ..... 
@vvv +xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv +xvsub_q 0111 01010010 11011 ..... ..... ..... @vvv diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5c402d944d..d8b62ba532 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1695,3 +1695,26 @@ INSN_LSX(vstelm_d, vr_ii) INSN_LSX(vstelm_w, vr_ii) INSN_LSX(vstelm_h, vr_ii) INSN_LSX(vstelm_b, vr_ii) + +#define INSN_LASX(insn, type) \ +static bool trans_##insn(DisasContext *ctx, arg_##type * a) \ +{ \ + output_##type ## _x(ctx, a, #insn); \ + return true; \ +} + +static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemo= nic) +{ + output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); +} + +INSN_LASX(xvadd_b, vvv) +INSN_LASX(xvadd_h, vvv) +INSN_LASX(xvadd_w, vvv) +INSN_LASX(xvadd_d, vvv) +INSN_LASX(xvadd_q, vvv) +INSN_LASX(xvsub_b, vvv) +INSN_LASX(xvsub_h, vvv) +INSN_LASX(xvsub_w, vvv) +INSN_LASX(xvsub_d, vvv) +INSN_LASX(xvsub_q, vvv) diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index 1f91afee81..36039dfeef 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -18,6 +18,7 @@ #include "fpu/softfloat.h" #include "translate.h" #include "internals.h" +#include "vec.h" =20 /* Global register indices */ TCGv cpu_gpr[32], cpu_pc; @@ -122,6 +123,9 @@ static void loongarch_tr_init_disas_context(DisasContex= tBase *dcbase, if (FIELD_EX64(env->cpucfg[2], CPUCFG2, LSX)) { ctx->vl =3D LSX_LEN; } + if (FIELD_EX64(env->cpucfg[2], CPUCFG2, LASX)) { + ctx->vl =3D LASX_LEN; + } =20 ctx->la64 =3D is_la64(env); ctx->va32 =3D (ctx->base.tb->flags & HW_FLAGS_VA32) !=3D 0; diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 75a77f5dce..218b8dc648 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -4,13 +4,49 @@ * Copyright (c) 2023 Loongson Technology Corporation Limited */ =20 -#ifndef 
CONFIG_USER_ONLY -#define CHECK_ASXE do { \ - if ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) =3D=3D 0) { \ - generate_exception(ctx, EXCCODE_ASXD); \ - return true; \ - } \ -} while (0) -#else -#define CHECK_ASXE -#endif +TRANS(xvadd_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_add) +TRANS(xvadd_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_add) +TRANS(xvadd_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_add) +TRANS(xvadd_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_add) + +#define XVADDSUB_Q(NAME) \ +static bool trans_xv## NAME ##_q(DisasContext *ctx, arg_vvv * a) \ +{ \ + TCGv_i64 rh, rl, ah, al, bh, bl; \ + int i; \ + \ + if (!avail_LASX(ctx)) { \ + return false; \ + } \ + \ + CHECK_VEC; \ + \ + rh =3D tcg_temp_new_i64(); \ + rl =3D tcg_temp_new_i64(); \ + ah =3D tcg_temp_new_i64(); \ + al =3D tcg_temp_new_i64(); \ + bh =3D tcg_temp_new_i64(); \ + bl =3D tcg_temp_new_i64(); \ + \ + for (i =3D 0; i < 2; i++) { \ + get_vreg64(ah, a->vj, 1 + i * 2); \ + get_vreg64(al, a->vj, 0 + i * 2); \ + get_vreg64(bh, a->vk, 1 + i * 2); \ + get_vreg64(bl, a->vk, 0 + i * 2); \ + \ + tcg_gen_## NAME ##2_i64(rl, rh, al, ah, bl, bh); \ + \ + set_vreg64(rh, a->vd, 1 + i * 2); \ + set_vreg64(rl, a->vd, 0 + i * 2); \ + } \ + \ + return true; \ +} + +XVADDSUB_Q(add) +XVADDSUB_Q(sub) + +TRANS(xvsub_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_sub) +TRANS(xvsub_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub) +TRANS(xvsub_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub) +TRANS(xvsub_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index 5fbf2718f7..0e12213e8b 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -4,17 +4,6 @@ * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ =20 -#ifndef CONFIG_USER_ONLY -#define CHECK_SXE do { \ - if ((ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) =3D=3D 0) { \ - generate_exception(ctx, 
EXCCODE_SXD); \ - return true; \ - } \ -} while (0) -#else -#define CHECK_SXE -#endif - static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) @@ -24,7 +13,7 @@ static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, TCGv_i32 vk =3D tcg_constant_i32(a->vk); TCGv_i32 va =3D tcg_constant_i32(a->va); =20 - CHECK_SXE; + CHECK_VEC; func(cpu_env, vd, vj, vk, va); return true; } @@ -36,7 +25,7 @@ static bool gen_vvv(DisasContext *ctx, arg_vvv *a, TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 vk =3D tcg_constant_i32(a->vk); =20 - CHECK_SXE; + CHECK_VEC; =20 func(cpu_env, vd, vj, vk); return true; @@ -48,7 +37,7 @@ static bool gen_vv(DisasContext *ctx, arg_vv *a, TCGv_i32 vd =3D tcg_constant_i32(a->vd); TCGv_i32 vj =3D tcg_constant_i32(a->vj); =20 - CHECK_SXE; + CHECK_VEC; func(cpu_env, vd, vj); return true; } @@ -60,7 +49,7 @@ static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 imm =3D tcg_constant_i32(a->imm); =20 - CHECK_SXE; + CHECK_VEC; func(cpu_env, vd, vj, imm); return true; } @@ -71,24 +60,24 @@ static bool gen_cv(DisasContext *ctx, arg_cv *a, TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 cd =3D tcg_constant_i32(a->cd); =20 - CHECK_SXE; + CHECK_VEC; func(cpu_env, cd, vj); return true; } =20 -static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, +static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, MemOp = mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 - CHECK_SXE; + CHECK_VEC; =20 vd_ofs =3D vec_full_offset(a->vd); vj_ofs =3D vec_full_offset(a->vj); vk_ofs =3D vec_full_offset(a->vk); =20 - func(mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8); + func(mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8); return true; } =20 @@ -98,7 +87,7 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp m= op, { uint32_t vd_ofs, vj_ofs; =20 - CHECK_SXE; + CHECK_VEC; =20 
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -113,7 +102,7 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop,
 {
     uint32_t vd_ofs, vj_ofs;
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -126,7 +115,7 @@ static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
 {
     uint32_t vd_ofs, vj_ofs;
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -135,10 +124,10 @@ static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
     return true;
 }
 
-TRANS(vadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_add)
-TRANS(vadd_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_add)
-TRANS(vadd_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_add)
-TRANS(vadd_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_add)
+TRANS(vadd_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_add)
+TRANS(vadd_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_add)
+TRANS(vadd_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_add)
+TRANS(vadd_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_add)
 
 #define VADDSUB_Q(NAME) \
 static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \
@@ -149,7 +138,7 @@ static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \
         return false; \
     } \
     \
-    CHECK_SXE; \
+    CHECK_VEC; \
     \
     rh = tcg_temp_new_i64(); \
     rl = tcg_temp_new_i64(); \
@@ -174,10 +163,10 @@ static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \
 VADDSUB_Q(add)
 VADDSUB_Q(sub)
 
-TRANS(vsub_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_sub)
-TRANS(vsub_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_sub)
-TRANS(vsub_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_sub)
-TRANS(vsub_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_sub)
+TRANS(vsub_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_sub)
+TRANS(vsub_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_sub)
+TRANS(vsub_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_sub)
+TRANS(vsub_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_sub)
 
 TRANS(vaddi_bu, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_addi)
 TRANS(vaddi_hu, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_addi)
@@ -193,22 +182,22 @@ TRANS(vneg_h, LSX, gvec_vv, MO_16, tcg_gen_gvec_neg)
 TRANS(vneg_w, LSX, gvec_vv, MO_32, tcg_gen_gvec_neg)
 TRANS(vneg_d, LSX, gvec_vv, MO_64, tcg_gen_gvec_neg)
 
-TRANS(vsadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_ssadd)
-TRANS(vsadd_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_ssadd)
-TRANS(vsadd_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_ssadd)
-TRANS(vsadd_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_ssadd)
-TRANS(vsadd_bu, LSX, gvec_vvv, MO_8, tcg_gen_gvec_usadd)
-TRANS(vsadd_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_usadd)
-TRANS(vsadd_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_usadd)
-TRANS(vsadd_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_usadd)
-TRANS(vssub_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_sssub)
-TRANS(vssub_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_sssub)
-TRANS(vssub_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_sssub)
-TRANS(vssub_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_sssub)
-TRANS(vssub_bu, LSX, gvec_vvv, MO_8, tcg_gen_gvec_ussub)
-TRANS(vssub_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_ussub)
-TRANS(vssub_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_ussub)
-TRANS(vssub_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_ussub)
+TRANS(vsadd_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_ssadd)
+TRANS(vsadd_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_ssadd)
+TRANS(vsadd_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_ssadd)
+TRANS(vsadd_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_ssadd)
+TRANS(vsadd_bu, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_usadd)
+TRANS(vsadd_hu, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_usadd)
+TRANS(vsadd_wu, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_usadd)
+TRANS(vsadd_du, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_usadd)
+TRANS(vssub_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_sssub)
+TRANS(vssub_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_sssub)
+TRANS(vssub_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_sssub)
+TRANS(vssub_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_sssub)
+TRANS(vssub_bu, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_ussub)
+TRANS(vssub_hu,
LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_ussub)
+TRANS(vssub_wu, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_ussub)
+TRANS(vssub_du, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_ussub)
 
 TRANS(vhaddw_h_b, LSX, gen_vvv, gen_helper_vhaddw_h_b)
 TRANS(vhaddw_w_h, LSX, gen_vvv, gen_helper_vhaddw_w_h)
@@ -305,10 +294,10 @@ static void do_vaddwev_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwev_h_b, LSX, gvec_vvv, MO_8, do_vaddwev_s)
-TRANS(vaddwev_w_h, LSX, gvec_vvv, MO_16, do_vaddwev_s)
-TRANS(vaddwev_d_w, LSX, gvec_vvv, MO_32, do_vaddwev_s)
-TRANS(vaddwev_q_d, LSX, gvec_vvv, MO_64, do_vaddwev_s)
+TRANS(vaddwev_h_b, LSX, gvec_vvv, 16, MO_8, do_vaddwev_s)
+TRANS(vaddwev_w_h, LSX, gvec_vvv, 16, MO_16, do_vaddwev_s)
+TRANS(vaddwev_d_w, LSX, gvec_vvv, 16, MO_32, do_vaddwev_s)
+TRANS(vaddwev_q_d, LSX, gvec_vvv, 16, MO_64, do_vaddwev_s)
 
 static void gen_vaddwod_w_h(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
 {
@@ -384,10 +373,10 @@ static void do_vaddwod_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwod_h_b, LSX, gvec_vvv, MO_8, do_vaddwod_s)
-TRANS(vaddwod_w_h, LSX, gvec_vvv, MO_16, do_vaddwod_s)
-TRANS(vaddwod_d_w, LSX, gvec_vvv, MO_32, do_vaddwod_s)
-TRANS(vaddwod_q_d, LSX, gvec_vvv, MO_64, do_vaddwod_s)
+TRANS(vaddwod_h_b, LSX, gvec_vvv, 16, MO_8, do_vaddwod_s)
+TRANS(vaddwod_w_h, LSX, gvec_vvv, 16, MO_16, do_vaddwod_s)
+TRANS(vaddwod_d_w, LSX, gvec_vvv, 16, MO_32, do_vaddwod_s)
+TRANS(vaddwod_q_d, LSX, gvec_vvv, 16, MO_64, do_vaddwod_s)
 
 static void gen_vsubwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -467,10 +456,10 @@ static void do_vsubwev_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vsubwev_h_b, LSX, gvec_vvv, MO_8, do_vsubwev_s)
-TRANS(vsubwev_w_h, LSX, gvec_vvv, MO_16, do_vsubwev_s)
-TRANS(vsubwev_d_w, LSX, gvec_vvv, MO_32, do_vsubwev_s)
-TRANS(vsubwev_q_d, LSX, gvec_vvv, MO_64, do_vsubwev_s)
+TRANS(vsubwev_h_b, LSX, gvec_vvv, 16, MO_8, do_vsubwev_s)
+TRANS(vsubwev_w_h, LSX, gvec_vvv, 16, MO_16, do_vsubwev_s)
+TRANS(vsubwev_d_w, LSX, gvec_vvv, 16, MO_32, do_vsubwev_s)
+TRANS(vsubwev_q_d, LSX, gvec_vvv, 16, MO_64, do_vsubwev_s)
 
 static void gen_vsubwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -546,10 +535,10 @@ static void do_vsubwod_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vsubwod_h_b, LSX, gvec_vvv, MO_8, do_vsubwod_s)
-TRANS(vsubwod_w_h, LSX, gvec_vvv, MO_16, do_vsubwod_s)
-TRANS(vsubwod_d_w, LSX, gvec_vvv, MO_32, do_vsubwod_s)
-TRANS(vsubwod_q_d, LSX, gvec_vvv, MO_64, do_vsubwod_s)
+TRANS(vsubwod_h_b, LSX, gvec_vvv, 16, MO_8, do_vsubwod_s)
+TRANS(vsubwod_w_h, LSX, gvec_vvv, 16, MO_16, do_vsubwod_s)
+TRANS(vsubwod_d_w, LSX, gvec_vvv, 16, MO_32, do_vsubwod_s)
+TRANS(vsubwod_q_d, LSX, gvec_vvv, 16, MO_64, do_vsubwod_s)
 
 static void gen_vaddwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -621,10 +610,10 @@ static void do_vaddwev_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwev_h_bu, LSX, gvec_vvv, MO_8, do_vaddwev_u)
-TRANS(vaddwev_w_hu, LSX, gvec_vvv, MO_16, do_vaddwev_u)
-TRANS(vaddwev_d_wu, LSX, gvec_vvv, MO_32, do_vaddwev_u)
-TRANS(vaddwev_q_du, LSX, gvec_vvv, MO_64, do_vaddwev_u)
+TRANS(vaddwev_h_bu, LSX, gvec_vvv, 16, MO_8, do_vaddwev_u)
+TRANS(vaddwev_w_hu, LSX, gvec_vvv, 16, MO_16, do_vaddwev_u)
+TRANS(vaddwev_d_wu, LSX, gvec_vvv, 16, MO_32, do_vaddwev_u)
+TRANS(vaddwev_q_du, LSX, gvec_vvv, 16, MO_64, do_vaddwev_u)
 
 static void gen_vaddwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -700,10 +689,10 @@ static void do_vaddwod_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs,
vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwod_h_bu, LSX, gvec_vvv, MO_8, do_vaddwod_u)
-TRANS(vaddwod_w_hu, LSX, gvec_vvv, MO_16, do_vaddwod_u)
-TRANS(vaddwod_d_wu, LSX, gvec_vvv, MO_32, do_vaddwod_u)
-TRANS(vaddwod_q_du, LSX, gvec_vvv, MO_64, do_vaddwod_u)
+TRANS(vaddwod_h_bu, LSX, gvec_vvv, 16, MO_8, do_vaddwod_u)
+TRANS(vaddwod_w_hu, LSX, gvec_vvv, 16, MO_16, do_vaddwod_u)
+TRANS(vaddwod_d_wu, LSX, gvec_vvv, 16, MO_32, do_vaddwod_u)
+TRANS(vaddwod_q_du, LSX, gvec_vvv, 16, MO_64, do_vaddwod_u)
 
 static void gen_vsubwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -775,10 +764,10 @@ static void do_vsubwev_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vsubwev_h_bu, LSX, gvec_vvv, MO_8, do_vsubwev_u)
-TRANS(vsubwev_w_hu, LSX, gvec_vvv, MO_16, do_vsubwev_u)
-TRANS(vsubwev_d_wu, LSX, gvec_vvv, MO_32, do_vsubwev_u)
-TRANS(vsubwev_q_du, LSX, gvec_vvv, MO_64, do_vsubwev_u)
+TRANS(vsubwev_h_bu, LSX, gvec_vvv, 16, MO_8, do_vsubwev_u)
+TRANS(vsubwev_w_hu, LSX, gvec_vvv, 16, MO_16, do_vsubwev_u)
+TRANS(vsubwev_d_wu, LSX, gvec_vvv, 16, MO_32, do_vsubwev_u)
+TRANS(vsubwev_q_du, LSX, gvec_vvv, 16, MO_64, do_vsubwev_u)
 
 static void gen_vsubwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -854,10 +843,10 @@ static void do_vsubwod_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vsubwod_h_bu, LSX, gvec_vvv, MO_8, do_vsubwod_u)
-TRANS(vsubwod_w_hu, LSX, gvec_vvv, MO_16, do_vsubwod_u)
-TRANS(vsubwod_d_wu, LSX, gvec_vvv, MO_32, do_vsubwod_u)
-TRANS(vsubwod_q_du, LSX, gvec_vvv, MO_64, do_vsubwod_u)
+TRANS(vsubwod_h_bu, LSX, gvec_vvv, 16, MO_8, do_vsubwod_u)
+TRANS(vsubwod_w_hu, LSX, gvec_vvv, 16, MO_16, do_vsubwod_u)
+TRANS(vsubwod_d_wu, LSX, gvec_vvv, 16, MO_32, do_vsubwod_u)
+TRANS(vsubwod_q_du, LSX, gvec_vvv, 16, MO_64, do_vsubwod_u)
 
 static void
gen_vaddwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -937,10 +926,10 @@ static void do_vaddwev_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vaddwev_u_s)
-TRANS(vaddwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vaddwev_u_s)
-TRANS(vaddwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vaddwev_u_s)
-TRANS(vaddwev_q_du_d, LSX, gvec_vvv, MO_64, do_vaddwev_u_s)
+TRANS(vaddwev_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vaddwev_u_s)
+TRANS(vaddwev_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vaddwev_u_s)
+TRANS(vaddwev_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vaddwev_u_s)
+TRANS(vaddwev_q_du_d, LSX, gvec_vvv, 16, MO_64, do_vaddwev_u_s)
 
 static void gen_vaddwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1017,10 +1006,10 @@ static void do_vaddwod_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vaddwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vaddwod_u_s)
-TRANS(vaddwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vaddwod_u_s)
-TRANS(vaddwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vaddwod_u_s)
-TRANS(vaddwod_q_du_d, LSX, gvec_vvv, MO_64, do_vaddwod_u_s)
+TRANS(vaddwod_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vaddwod_u_s)
+TRANS(vaddwod_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vaddwod_u_s)
+TRANS(vaddwod_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vaddwod_u_s)
+TRANS(vaddwod_q_du_d, LSX, gvec_vvv, 16, MO_64, do_vaddwod_u_s)
 
 static void do_vavg(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
                     void (*gen_shr_vec)(unsigned, TCGv_vec,
@@ -1129,14 +1118,14 @@ static void do_vavg_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vavg_b, LSX, gvec_vvv, MO_8, do_vavg_s)
-TRANS(vavg_h, LSX, gvec_vvv, MO_16, do_vavg_s)
-TRANS(vavg_w, LSX, gvec_vvv, MO_32, do_vavg_s)
-TRANS(vavg_d, LSX, gvec_vvv, MO_64, do_vavg_s)
-TRANS(vavg_bu,
LSX, gvec_vvv, MO_8, do_vavg_u)
-TRANS(vavg_hu, LSX, gvec_vvv, MO_16, do_vavg_u)
-TRANS(vavg_wu, LSX, gvec_vvv, MO_32, do_vavg_u)
-TRANS(vavg_du, LSX, gvec_vvv, MO_64, do_vavg_u)
+TRANS(vavg_b, LSX, gvec_vvv, 16, MO_8, do_vavg_s)
+TRANS(vavg_h, LSX, gvec_vvv, 16, MO_16, do_vavg_s)
+TRANS(vavg_w, LSX, gvec_vvv, 16, MO_32, do_vavg_s)
+TRANS(vavg_d, LSX, gvec_vvv, 16, MO_64, do_vavg_s)
+TRANS(vavg_bu, LSX, gvec_vvv, 16, MO_8, do_vavg_u)
+TRANS(vavg_hu, LSX, gvec_vvv, 16, MO_16, do_vavg_u)
+TRANS(vavg_wu, LSX, gvec_vvv, 16, MO_32, do_vavg_u)
+TRANS(vavg_du, LSX, gvec_vvv, 16, MO_64, do_vavg_u)
 
 static void do_vavgr_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
                        uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz)
@@ -1210,14 +1199,14 @@ static void do_vavgr_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vavgr_b, LSX, gvec_vvv, MO_8, do_vavgr_s)
-TRANS(vavgr_h, LSX, gvec_vvv, MO_16, do_vavgr_s)
-TRANS(vavgr_w, LSX, gvec_vvv, MO_32, do_vavgr_s)
-TRANS(vavgr_d, LSX, gvec_vvv, MO_64, do_vavgr_s)
-TRANS(vavgr_bu, LSX, gvec_vvv, MO_8, do_vavgr_u)
-TRANS(vavgr_hu, LSX, gvec_vvv, MO_16, do_vavgr_u)
-TRANS(vavgr_wu, LSX, gvec_vvv, MO_32, do_vavgr_u)
-TRANS(vavgr_du, LSX, gvec_vvv, MO_64, do_vavgr_u)
+TRANS(vavgr_b, LSX, gvec_vvv, 16, MO_8, do_vavgr_s)
+TRANS(vavgr_h, LSX, gvec_vvv, 16, MO_16, do_vavgr_s)
+TRANS(vavgr_w, LSX, gvec_vvv, 16, MO_32, do_vavgr_s)
+TRANS(vavgr_d, LSX, gvec_vvv, 16, MO_64, do_vavgr_s)
+TRANS(vavgr_bu, LSX, gvec_vvv, 16, MO_8, do_vavgr_u)
+TRANS(vavgr_hu, LSX, gvec_vvv, 16, MO_16, do_vavgr_u)
+TRANS(vavgr_wu, LSX, gvec_vvv, 16, MO_32, do_vavgr_u)
+TRANS(vavgr_du, LSX, gvec_vvv, 16, MO_64, do_vavgr_u)
 
 static void gen_vabsd_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1305,14 +1294,14 @@ static void do_vabsd_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vabsd_b, LSX,
gvec_vvv, MO_8, do_vabsd_s)
-TRANS(vabsd_h, LSX, gvec_vvv, MO_16, do_vabsd_s)
-TRANS(vabsd_w, LSX, gvec_vvv, MO_32, do_vabsd_s)
-TRANS(vabsd_d, LSX, gvec_vvv, MO_64, do_vabsd_s)
-TRANS(vabsd_bu, LSX, gvec_vvv, MO_8, do_vabsd_u)
-TRANS(vabsd_hu, LSX, gvec_vvv, MO_16, do_vabsd_u)
-TRANS(vabsd_wu, LSX, gvec_vvv, MO_32, do_vabsd_u)
-TRANS(vabsd_du, LSX, gvec_vvv, MO_64, do_vabsd_u)
+TRANS(vabsd_b, LSX, gvec_vvv, 16, MO_8, do_vabsd_s)
+TRANS(vabsd_h, LSX, gvec_vvv, 16, MO_16, do_vabsd_s)
+TRANS(vabsd_w, LSX, gvec_vvv, 16, MO_32, do_vabsd_s)
+TRANS(vabsd_d, LSX, gvec_vvv, 16, MO_64, do_vabsd_s)
+TRANS(vabsd_bu, LSX, gvec_vvv, 16, MO_8, do_vabsd_u)
+TRANS(vabsd_hu, LSX, gvec_vvv, 16, MO_16, do_vabsd_u)
+TRANS(vabsd_wu, LSX, gvec_vvv, 16, MO_32, do_vabsd_u)
+TRANS(vabsd_du, LSX, gvec_vvv, 16, MO_64, do_vabsd_u)
 
 static void gen_vadda(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1362,28 +1351,28 @@ static void do_vadda(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vadda_b, LSX, gvec_vvv, MO_8, do_vadda)
-TRANS(vadda_h, LSX, gvec_vvv, MO_16, do_vadda)
-TRANS(vadda_w, LSX, gvec_vvv, MO_32, do_vadda)
-TRANS(vadda_d, LSX, gvec_vvv, MO_64, do_vadda)
-
-TRANS(vmax_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_smax)
-TRANS(vmax_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_smax)
-TRANS(vmax_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_smax)
-TRANS(vmax_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_smax)
-TRANS(vmax_bu, LSX, gvec_vvv, MO_8, tcg_gen_gvec_umax)
-TRANS(vmax_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_umax)
-TRANS(vmax_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_umax)
-TRANS(vmax_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_umax)
-
-TRANS(vmin_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_smin)
-TRANS(vmin_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_smin)
-TRANS(vmin_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_smin)
-TRANS(vmin_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_smin)
-TRANS(vmin_bu, LSX, gvec_vvv, MO_8, tcg_gen_gvec_umin)
-TRANS(vmin_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_umin)
-TRANS(vmin_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_umin)
-TRANS(vmin_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_umin)
+TRANS(vadda_b, LSX, gvec_vvv, 16, MO_8, do_vadda)
+TRANS(vadda_h, LSX, gvec_vvv, 16, MO_16, do_vadda)
+TRANS(vadda_w, LSX, gvec_vvv, 16, MO_32, do_vadda)
+TRANS(vadda_d, LSX, gvec_vvv, 16, MO_64, do_vadda)
+
+TRANS(vmax_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_smax)
+TRANS(vmax_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_smax)
+TRANS(vmax_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_smax)
+TRANS(vmax_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_smax)
+TRANS(vmax_bu, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_umax)
+TRANS(vmax_hu, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_umax)
+TRANS(vmax_wu, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_umax)
+TRANS(vmax_du, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_umax)
+
+TRANS(vmin_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_smin)
+TRANS(vmin_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_smin)
+TRANS(vmin_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_smin)
+TRANS(vmin_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_smin)
+TRANS(vmin_bu, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_umin)
+TRANS(vmin_hu, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_umin)
+TRANS(vmin_wu, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_umin)
+TRANS(vmin_du, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_umin)
 
 static void gen_vmini_s(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm)
 {
@@ -1567,10 +1556,10 @@ TRANS(vmaxi_hu, LSX, gvec_vv_i, MO_16, do_vmaxi_u)
 TRANS(vmaxi_wu, LSX, gvec_vv_i, MO_32, do_vmaxi_u)
 TRANS(vmaxi_du, LSX, gvec_vv_i, MO_64, do_vmaxi_u)
 
-TRANS(vmul_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_mul)
-TRANS(vmul_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_mul)
-TRANS(vmul_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_mul)
-TRANS(vmul_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_mul)
+TRANS(vmul_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_mul)
+TRANS(vmul_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_mul)
+TRANS(vmul_w, LSX, gvec_vvv, 16, MO_32,
tcg_gen_gvec_mul)
+TRANS(vmul_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_mul)
 
 static void gen_vmuh_w(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
 {
@@ -1611,10 +1600,10 @@ static void do_vmuh_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmuh_b, LSX, gvec_vvv, MO_8, do_vmuh_s)
-TRANS(vmuh_h, LSX, gvec_vvv, MO_16, do_vmuh_s)
-TRANS(vmuh_w, LSX, gvec_vvv, MO_32, do_vmuh_s)
-TRANS(vmuh_d, LSX, gvec_vvv, MO_64, do_vmuh_s)
+TRANS(vmuh_b, LSX, gvec_vvv, 16, MO_8, do_vmuh_s)
+TRANS(vmuh_h, LSX, gvec_vvv, 16, MO_16, do_vmuh_s)
+TRANS(vmuh_w, LSX, gvec_vvv, 16, MO_32, do_vmuh_s)
+TRANS(vmuh_d, LSX, gvec_vvv, 16, MO_64, do_vmuh_s)
 
 static void gen_vmuh_wu(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
 {
@@ -1655,10 +1644,10 @@ static void do_vmuh_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmuh_bu, LSX, gvec_vvv, MO_8, do_vmuh_u)
-TRANS(vmuh_hu, LSX, gvec_vvv, MO_16, do_vmuh_u)
-TRANS(vmuh_wu, LSX, gvec_vvv, MO_32, do_vmuh_u)
-TRANS(vmuh_du, LSX, gvec_vvv, MO_64, do_vmuh_u)
+TRANS(vmuh_bu, LSX, gvec_vvv, 16, MO_8, do_vmuh_u)
+TRANS(vmuh_hu, LSX, gvec_vvv, 16, MO_16, do_vmuh_u)
+TRANS(vmuh_wu, LSX, gvec_vvv, 16, MO_32, do_vmuh_u)
+TRANS(vmuh_du, LSX, gvec_vvv, 16, MO_64, do_vmuh_u)
 
 static void gen_vmulwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1728,9 +1717,9 @@ static void do_vmulwev_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwev_h_b, LSX, gvec_vvv, MO_8, do_vmulwev_s)
-TRANS(vmulwev_w_h, LSX, gvec_vvv, MO_16, do_vmulwev_s)
-TRANS(vmulwev_d_w, LSX, gvec_vvv, MO_32, do_vmulwev_s)
+TRANS(vmulwev_h_b, LSX, gvec_vvv, 16, MO_8, do_vmulwev_s)
+TRANS(vmulwev_w_h, LSX, gvec_vvv, 16, MO_16, do_vmulwev_s)
+TRANS(vmulwev_d_w, LSX, gvec_vvv, 16, MO_32, do_vmulwev_s)
 
 static void tcg_gen_mulus2_i64(TCGv_i64 rl,
TCGv_i64 rh,
                                TCGv_i64 arg1, TCGv_i64 arg2)
@@ -1836,9 +1825,9 @@ static void do_vmulwod_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwod_h_b, LSX, gvec_vvv, MO_8, do_vmulwod_s)
-TRANS(vmulwod_w_h, LSX, gvec_vvv, MO_16, do_vmulwod_s)
-TRANS(vmulwod_d_w, LSX, gvec_vvv, MO_32, do_vmulwod_s)
+TRANS(vmulwod_h_b, LSX, gvec_vvv, 16, MO_8, do_vmulwod_s)
+TRANS(vmulwod_w_h, LSX, gvec_vvv, 16, MO_16, do_vmulwod_s)
+TRANS(vmulwod_d_w, LSX, gvec_vvv, 16, MO_32, do_vmulwod_s)
 
 static void gen_vmulwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1906,9 +1895,9 @@ static void do_vmulwev_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwev_h_bu, LSX, gvec_vvv, MO_8, do_vmulwev_u)
-TRANS(vmulwev_w_hu, LSX, gvec_vvv, MO_16, do_vmulwev_u)
-TRANS(vmulwev_d_wu, LSX, gvec_vvv, MO_32, do_vmulwev_u)
+TRANS(vmulwev_h_bu, LSX, gvec_vvv, 16, MO_8, do_vmulwev_u)
+TRANS(vmulwev_w_hu, LSX, gvec_vvv, 16, MO_16, do_vmulwev_u)
+TRANS(vmulwev_d_wu, LSX, gvec_vvv, 16, MO_32, do_vmulwev_u)
 
 static void gen_vmulwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -1976,9 +1965,9 @@ static void do_vmulwod_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwod_h_bu, LSX, gvec_vvv, MO_8, do_vmulwod_u)
-TRANS(vmulwod_w_hu, LSX, gvec_vvv, MO_16, do_vmulwod_u)
-TRANS(vmulwod_d_wu, LSX, gvec_vvv, MO_32, do_vmulwod_u)
+TRANS(vmulwod_h_bu, LSX, gvec_vvv, 16, MO_8, do_vmulwod_u)
+TRANS(vmulwod_w_hu, LSX, gvec_vvv, 16, MO_16, do_vmulwod_u)
+TRANS(vmulwod_d_wu, LSX, gvec_vvv, 16, MO_32, do_vmulwod_u)
 
 static void gen_vmulwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2048,9 +2037,9 @@ static void do_vmulwev_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs,
oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vmulwev_u_s)
-TRANS(vmulwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vmulwev_u_s)
-TRANS(vmulwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vmulwev_u_s)
+TRANS(vmulwev_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vmulwev_u_s)
+TRANS(vmulwev_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vmulwev_u_s)
+TRANS(vmulwev_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vmulwev_u_s)
 
 static void gen_vmulwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2117,9 +2106,9 @@ static void do_vmulwod_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmulwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vmulwod_u_s)
-TRANS(vmulwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vmulwod_u_s)
-TRANS(vmulwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vmulwod_u_s)
+TRANS(vmulwod_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vmulwod_u_s)
+TRANS(vmulwod_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vmulwod_u_s)
+TRANS(vmulwod_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vmulwod_u_s)
 
 static void gen_vmadd(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2190,10 +2179,10 @@ static void do_vmadd(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmadd_b, LSX, gvec_vvv, MO_8, do_vmadd)
-TRANS(vmadd_h, LSX, gvec_vvv, MO_16, do_vmadd)
-TRANS(vmadd_w, LSX, gvec_vvv, MO_32, do_vmadd)
-TRANS(vmadd_d, LSX, gvec_vvv, MO_64, do_vmadd)
+TRANS(vmadd_b, LSX, gvec_vvv, 16, MO_8, do_vmadd)
+TRANS(vmadd_h, LSX, gvec_vvv, 16, MO_16, do_vmadd)
+TRANS(vmadd_w, LSX, gvec_vvv, 16, MO_32, do_vmadd)
+TRANS(vmadd_d, LSX, gvec_vvv, 16, MO_64, do_vmadd)
 
 static void gen_vmsub(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2264,10 +2253,10 @@ static void do_vmsub(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmsub_b, LSX, gvec_vvv, MO_8, do_vmsub)
-TRANS(vmsub_h, LSX,
gvec_vvv, MO_16, do_vmsub)
-TRANS(vmsub_w, LSX, gvec_vvv, MO_32, do_vmsub)
-TRANS(vmsub_d, LSX, gvec_vvv, MO_64, do_vmsub)
+TRANS(vmsub_b, LSX, gvec_vvv, 16, MO_8, do_vmsub)
+TRANS(vmsub_h, LSX, gvec_vvv, 16, MO_16, do_vmsub)
+TRANS(vmsub_w, LSX, gvec_vvv, 16, MO_32, do_vmsub)
+TRANS(vmsub_d, LSX, gvec_vvv, 16, MO_64, do_vmsub)
 
 static void gen_vmaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2339,9 +2328,9 @@ static void do_vmaddwev_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwev_h_b, LSX, gvec_vvv, MO_8, do_vmaddwev_s)
-TRANS(vmaddwev_w_h, LSX, gvec_vvv, MO_16, do_vmaddwev_s)
-TRANS(vmaddwev_d_w, LSX, gvec_vvv, MO_32, do_vmaddwev_s)
+TRANS(vmaddwev_h_b, LSX, gvec_vvv, 16, MO_8, do_vmaddwev_s)
+TRANS(vmaddwev_w_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwev_s)
+TRANS(vmaddwev_d_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwev_s)
 
 #define VMADD_Q(NAME, FN, idx1, idx2) \
 static bool trans_## NAME (DisasContext *ctx, arg_vvv *a) \
@@ -2447,9 +2436,9 @@ static void do_vmaddwod_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwod_h_b, LSX, gvec_vvv, MO_8, do_vmaddwod_s)
-TRANS(vmaddwod_w_h, LSX, gvec_vvv, MO_16, do_vmaddwod_s)
-TRANS(vmaddwod_d_w, LSX, gvec_vvv, MO_32, do_vmaddwod_s)
+TRANS(vmaddwod_h_b, LSX, gvec_vvv, 16, MO_8, do_vmaddwod_s)
+TRANS(vmaddwod_w_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwod_s)
+TRANS(vmaddwod_d_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwod_s)
 
 static void gen_vmaddwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2517,9 +2506,9 @@ static void do_vmaddwev_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwev_h_bu, LSX, gvec_vvv, MO_8, do_vmaddwev_u)
-TRANS(vmaddwev_w_hu, LSX, gvec_vvv, MO_16, do_vmaddwev_u)
-TRANS(vmaddwev_d_wu, LSX, gvec_vvv, MO_32,
do_vmaddwev_u)
+TRANS(vmaddwev_h_bu, LSX, gvec_vvv, 16, MO_8, do_vmaddwev_u)
+TRANS(vmaddwev_w_hu, LSX, gvec_vvv, 16, MO_16, do_vmaddwev_u)
+TRANS(vmaddwev_d_wu, LSX, gvec_vvv, 16, MO_32, do_vmaddwev_u)
 
 static void gen_vmaddwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2588,9 +2577,9 @@ static void do_vmaddwod_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwod_h_bu, LSX, gvec_vvv, MO_8, do_vmaddwod_u)
-TRANS(vmaddwod_w_hu, LSX, gvec_vvv, MO_16, do_vmaddwod_u)
-TRANS(vmaddwod_d_wu, LSX, gvec_vvv, MO_32, do_vmaddwod_u)
+TRANS(vmaddwod_h_bu, LSX, gvec_vvv, 16, MO_8, do_vmaddwod_u)
+TRANS(vmaddwod_w_hu, LSX, gvec_vvv, 16, MO_16, do_vmaddwod_u)
+TRANS(vmaddwod_d_wu, LSX, gvec_vvv, 16, MO_32, do_vmaddwod_u)
 
 static void gen_vmaddwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2661,9 +2650,9 @@ static void do_vmaddwev_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vmaddwev_u_s)
-TRANS(vmaddwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vmaddwev_u_s)
-TRANS(vmaddwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vmaddwev_u_s)
+TRANS(vmaddwev_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vmaddwev_u_s)
+TRANS(vmaddwev_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwev_u_s)
+TRANS(vmaddwev_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwev_u_s)
 
 static void gen_vmaddwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2733,9 +2722,9 @@ static void do_vmaddwod_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vmaddwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vmaddwod_u_s)
-TRANS(vmaddwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vmaddwod_u_s)
-TRANS(vmaddwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vmaddwod_u_s)
+TRANS(vmaddwod_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vmaddwod_u_s)
+TRANS(vmaddwod_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwod_u_s)
+TRANS(vmaddwod_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwod_u_s)
 
 TRANS(vdiv_b, LSX, gen_vvv, gen_helper_vdiv_b)
 TRANS(vdiv_h, LSX, gen_vvv, gen_helper_vdiv_h)
@@ -2912,10 +2901,10 @@ static void do_vsigncov(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vsigncov_b, LSX, gvec_vvv, MO_8, do_vsigncov)
-TRANS(vsigncov_h, LSX, gvec_vvv, MO_16, do_vsigncov)
-TRANS(vsigncov_w, LSX, gvec_vvv, MO_32, do_vsigncov)
-TRANS(vsigncov_d, LSX, gvec_vvv, MO_64, do_vsigncov)
+TRANS(vsigncov_b, LSX, gvec_vvv, 16, MO_8, do_vsigncov)
+TRANS(vsigncov_h, LSX, gvec_vvv, 16, MO_16, do_vsigncov)
+TRANS(vsigncov_w, LSX, gvec_vvv, 16, MO_32, do_vsigncov)
+TRANS(vsigncov_d, LSX, gvec_vvv, 16, MO_64, do_vsigncov)
 
 TRANS(vmskltz_b, LSX, gen_vv, gen_helper_vmskltz_b)
 TRANS(vmskltz_h, LSX, gen_vv, gen_helper_vmskltz_h)
@@ -3049,7 +3038,7 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     sel = (a->imm >> 12) & 0x1;
 
@@ -3066,10 +3055,10 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi *a)
     return true;
 }
 
-TRANS(vand_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_and)
-TRANS(vor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_or)
-TRANS(vxor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_xor)
-TRANS(vnor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_nor)
+TRANS(vand_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_and)
+TRANS(vor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_or)
+TRANS(vxor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_xor)
+TRANS(vnor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_nor)
 
 static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a)
 {
@@ -3079,7 +3068,7 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -3088,7 +3077,7 @@ static bool trans_vandn_v(DisasContext *ctx,
arg_vvv *a)
     tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, 16, ctx->vl/8);
     return true;
 }
-TRANS(vorn_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_orc)
+TRANS(vorn_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_orc)
 TRANS(vandi_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_andi)
 TRANS(vori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_ori)
 TRANS(vxori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_xori)
@@ -3126,37 +3115,37 @@ static void do_vnori_b(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
 
 TRANS(vnori_b, LSX, gvec_vv_i, MO_8, do_vnori_b)
 
-TRANS(vsll_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_shlv)
-TRANS(vsll_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_shlv)
-TRANS(vsll_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_shlv)
-TRANS(vsll_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_shlv)
+TRANS(vsll_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_shlv)
+TRANS(vsll_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_shlv)
+TRANS(vsll_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_shlv)
+TRANS(vsll_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_shlv)
 TRANS(vslli_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_shli)
 TRANS(vslli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shli)
 TRANS(vslli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shli)
 TRANS(vslli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shli)
 
-TRANS(vsrl_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_shrv)
-TRANS(vsrl_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_shrv)
-TRANS(vsrl_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_shrv)
-TRANS(vsrl_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_shrv)
+TRANS(vsrl_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_shrv)
+TRANS(vsrl_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_shrv)
+TRANS(vsrl_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_shrv)
+TRANS(vsrl_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_shrv)
 TRANS(vsrli_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_shri)
 TRANS(vsrli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shri)
 TRANS(vsrli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shri)
 TRANS(vsrli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shri)
 
-TRANS(vsra_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_sarv)
-TRANS(vsra_h, LSX, gvec_vvv,
MO_16, tcg_gen_gvec_sarv)
-TRANS(vsra_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_sarv)
-TRANS(vsra_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_sarv)
+TRANS(vsra_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_sarv)
+TRANS(vsra_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_sarv)
+TRANS(vsra_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_sarv)
+TRANS(vsra_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_sarv)
 TRANS(vsrai_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_sari)
 TRANS(vsrai_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_sari)
 TRANS(vsrai_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_sari)
 TRANS(vsrai_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_sari)
 
-TRANS(vrotr_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_rotrv)
-TRANS(vrotr_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_rotrv)
-TRANS(vrotr_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_rotrv)
-TRANS(vrotr_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_rotrv)
+TRANS(vrotr_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_rotrv)
+TRANS(vrotr_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_rotrv)
+TRANS(vrotr_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_rotrv)
+TRANS(vrotr_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_rotrv)
 TRANS(vrotri_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_rotri)
 TRANS(vrotri_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_rotri)
 TRANS(vrotri_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_rotri)
@@ -3361,10 +3350,10 @@ static void do_vbitclr(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
 }
 
-TRANS(vbitclr_b, LSX, gvec_vvv, MO_8, do_vbitclr)
-TRANS(vbitclr_h, LSX, gvec_vvv, MO_16, do_vbitclr)
-TRANS(vbitclr_w, LSX, gvec_vvv, MO_32, do_vbitclr)
-TRANS(vbitclr_d, LSX, gvec_vvv, MO_64, do_vbitclr)
+TRANS(vbitclr_b, LSX, gvec_vvv, 16, MO_8, do_vbitclr)
+TRANS(vbitclr_h, LSX, gvec_vvv, 16, MO_16, do_vbitclr)
+TRANS(vbitclr_w, LSX, gvec_vvv, 16, MO_32, do_vbitclr)
+TRANS(vbitclr_d, LSX, gvec_vvv, 16, MO_64, do_vbitclr)
 
 static void do_vbiti(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm,
                      void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec))
@@
-3472,10 +3461,10 @@ static void do_vbitset(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]); } =20 -TRANS(vbitset_b, LSX, gvec_vvv, MO_8, do_vbitset) -TRANS(vbitset_h, LSX, gvec_vvv, MO_16, do_vbitset) -TRANS(vbitset_w, LSX, gvec_vvv, MO_32, do_vbitset) -TRANS(vbitset_d, LSX, gvec_vvv, MO_64, do_vbitset) +TRANS(vbitset_b, LSX, gvec_vvv, 16, MO_8, do_vbitset) +TRANS(vbitset_h, LSX, gvec_vvv, 16, MO_16, do_vbitset) +TRANS(vbitset_w, LSX, gvec_vvv, 16, MO_32, do_vbitset) +TRANS(vbitset_d, LSX, gvec_vvv, 16, MO_64, do_vbitset) =20 static void do_vbitseti(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, int64_t imm, uint32_t oprsz, uint32_t maxsz) @@ -3554,10 +3543,10 @@ static void do_vbitrev(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]); } =20 -TRANS(vbitrev_b, LSX, gvec_vvv, MO_8, do_vbitrev) -TRANS(vbitrev_h, LSX, gvec_vvv, MO_16, do_vbitrev) -TRANS(vbitrev_w, LSX, gvec_vvv, MO_32, do_vbitrev) -TRANS(vbitrev_d, LSX, gvec_vvv, MO_64, do_vbitrev) +TRANS(vbitrev_b, LSX, gvec_vvv, 16, MO_8, do_vbitrev) +TRANS(vbitrev_h, LSX, gvec_vvv, 16, MO_16, do_vbitrev) +TRANS(vbitrev_w, LSX, gvec_vvv, 16, MO_32, do_vbitrev) +TRANS(vbitrev_d, LSX, gvec_vvv, 16, MO_64, do_vbitrev) =20 static void do_vbitrevi(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, int64_t imm, uint32_t oprsz, uint32_t maxsz) @@ -3706,7 +3695,7 @@ static bool do_cmp(DisasContext *ctx, arg_vvv *a, Mem= Op mop, TCGCond cond) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 - CHECK_SXE; + CHECK_VEC; =20 vd_ofs =3D vec_full_offset(a->vd); vj_ofs =3D vec_full_offset(a->vj); @@ -3752,7 +3741,7 @@ static bool do_## NAME ##_s(DisasContext *ctx, arg_vv= _i *a, MemOp mop) \ { \ uint32_t vd_ofs, vj_ofs; \ \ - CHECK_SXE; \ + CHECK_VEC; \ \ static const TCGOpcode vecop_list[] =3D { \ INDEX_op_cmp_vec, 0 \ @@ -3801,7 +3790,7 @@ static bool do_## NAME ##_u(DisasContext *ctx, arg_vv= _i *a, MemOp mop) 
 { \
     uint32_t vd_ofs, vj_ofs; \
 \
-    CHECK_SXE; \
+    CHECK_VEC; \
 \
     static const TCGOpcode vecop_list[] = { \
         INDEX_op_cmp_vec, 0 \
@@ -3899,7 +3888,7 @@ static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     fn = (a->fcond & 1 ? gen_helper_vfcmp_s_s : gen_helper_vfcmp_c_s);
     flags = get_fcmp_flags(a->fcond >> 1);
@@ -3920,7 +3909,7 @@ static bool trans_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     fn = (a->fcond & 1 ? gen_helper_vfcmp_s_d : gen_helper_vfcmp_c_d);
     flags = get_fcmp_flags(a->fcond >> 1);
@@ -3935,7 +3924,7 @@ static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->vd), vec_full_offset(a->va),
                         vec_full_offset(a->vk), vec_full_offset(a->vj),
@@ -3961,7 +3950,7 @@ static bool trans_vbitseli_b(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     tcg_gen_gvec_2i(vec_full_offset(a->vd), vec_full_offset(a->vj),
                     16, ctx->vl/8, a->imm, &op);
@@ -3984,7 +3973,7 @@ static bool trans_## NAME (DisasContext *ctx, arg_cv *a) \
         return false; \
     } \
 \
-    CHECK_SXE; \
+    CHECK_VEC; \
     tcg_gen_or_i64(t1, al, ah); \
     tcg_gen_setcondi_i64(COND, t1, t1, 0); \
     tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7])); \
@@ -4012,7 +4001,7 @@ static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_st8_i64(src, cpu_env,
                     offsetof(CPULoongArchState, fpr[a->vd].vreg.B(a->imm)));
     return true;
@@ -4026,7 +4015,7 @@ static bool trans_vinsgr2vr_h(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_st16_i64(src, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vd].vreg.H(a->imm)));
     return true;
@@ -4040,7 +4029,7 @@ static bool trans_vinsgr2vr_w(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_st32_i64(src, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vd].vreg.W(a->imm)));
     return true;
@@ -4054,7 +4043,7 @@ static bool trans_vinsgr2vr_d(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_st_i64(src, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vd].vreg.D(a->imm)));
     return true;
@@ -4068,7 +4057,7 @@ static bool trans_vpickve2gr_b(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld8s_i64(dst, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)));
     return true;
@@ -4082,7 +4071,7 @@ static bool trans_vpickve2gr_h(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld16s_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm)));
     return true;
@@ -4096,7 +4085,7 @@ static bool trans_vpickve2gr_w(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld32s_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm)));
     return true;
@@ -4110,7 +4099,7 @@ static bool trans_vpickve2gr_d(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld_i64(dst, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm)));
     return true;
@@ -4124,7 +4113,7 @@ static bool trans_vpickve2gr_bu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld8u_i64(dst, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)));
     return true;
@@ -4138,7 +4127,7 @@ static bool trans_vpickve2gr_hu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld16u_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm)));
     return true;
@@ -4152,7 +4141,7 @@ static bool trans_vpickve2gr_wu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld32u_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm)));
     return true;
@@ -4166,7 +4155,7 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_ld_i64(dst, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm)));
     return true;
@@ -4180,7 +4169,7 @@ static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd),
                          16, ctx->vl/8, src);
@@ -4198,7 +4187,7 @@ static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_gvec_dup_mem(MO_8,vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.B((a->imm))),
@@ -4212,7 +4201,7 @@ static bool trans_vreplvei_h(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_gvec_dup_mem(MO_16, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.H((a->imm))),
@@ -4225,7 +4214,7 @@ static bool trans_vreplvei_w(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_gvec_dup_mem(MO_32, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.W((a->imm))),
@@ -4238,7 +4227,7 @@ static bool trans_vreplvei_d(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
     tcg_gen_gvec_dup_mem(MO_64, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.D((a->imm))),
@@ -4257,7 +4246,7 @@ static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit,
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN/bit) -1);
     tcg_gen_shli_i64(t0, t0, vece);
@@ -4287,7 +4276,7 @@ static bool trans_vbsll_v(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     desthigh = tcg_temp_new_i64();
     destlow = tcg_temp_new_i64();
@@ -4321,7 +4310,7 @@ static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     desthigh = tcg_temp_new_i64();
     destlow = tcg_temp_new_i64();
@@ -4399,7 +4388,7 @@ static bool trans_vld(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     addr = gpr_src(ctx, a->rj, EXT_NONE);
     val = tcg_temp_new_i128();
@@ -4426,7 +4415,7 @@ static bool trans_vst(DisasContext *ctx, arg_vr_i *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     addr = gpr_src(ctx, a->rj, EXT_NONE);
     val = tcg_temp_new_i128();
@@ -4453,7 +4442,7 @@ static bool trans_vldx(DisasContext *ctx, arg_vrr *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     src1 = gpr_src(ctx, a->rj, EXT_NONE);
     src2 = gpr_src(ctx, a->rk, EXT_NONE);
@@ -4480,7 +4469,7 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr *a)
         return false;
     }
 
-    CHECK_SXE;
+    CHECK_VEC;
 
     src1 = gpr_src(ctx, a->rj, EXT_NONE);
     src2 = gpr_src(ctx, a->rk, EXT_NONE);
@@ -4507,7 +4496,7 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a) \
         return false; \
     } \
 \
-    CHECK_SXE; \
+    CHECK_VEC; \
 \
     addr = gpr_src(ctx, a->rj, EXT_NONE); \
     val = tcg_temp_new_i64(); \
@@ -4535,7 +4524,7 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_ii *a) \
         return false; \
     } \
 \
-    CHECK_SXE; \
+    CHECK_VEC; \
 \
     addr = gpr_src(ctx, a->rj, EXT_NONE); \
     val = tcg_temp_new_i64(); \
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 06/48] target/loongarch: Implement xvreplgr2vr
Date: Wed, 30 Aug 2023 16:48:20 +0800
Message-Id: <20230830084902.2113960-7-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>

This patch includes:
- XVREPLGR2VR.{B/H/W/D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode                |  5 +++++
 target/loongarch/disas.c                     | 10 ++++++++++
 target/loongarch/insn_trans/trans_lasx.c.inc |  5 +++++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 12 ++++++------
 4 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index bcc18fb6c5..04bd238995 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1310,3 +1310,8 @@ xvsub_h 0111 01000000 11001 ..... ..... ..... @vvv
 xvsub_w 0111 01000000 11010 ..... ..... ..... @vvv
 xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv
 xvsub_q 0111 01010010 11011 ..... ..... ..... @vvv
+
+xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
+xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
+xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
+xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index d8b62ba532..c47f455ed0 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
 }
 
+static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
+}
+
 INSN_LASX(xvadd_b, vvv)
 INSN_LASX(xvadd_h, vvv)
 INSN_LASX(xvadd_w, vvv)
@@ -1718,3 +1723,8 @@ INSN_LASX(xvsub_h, vvv)
 INSN_LASX(xvsub_w, vvv)
 INSN_LASX(xvsub_d, vvv)
 INSN_LASX(xvsub_q, vvv)
+
+INSN_LASX(xvreplgr2vr_b, vr)
+INSN_LASX(xvreplgr2vr_h, vr)
+INSN_LASX(xvreplgr2vr_w, vr)
+INSN_LASX(xvreplgr2vr_d, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 218b8dc648..66b5abc790 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -50,3 +50,8 @@ TRANS(xvsub_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_sub)
 TRANS(xvsub_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub)
 TRANS(xvsub_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub)
 TRANS(xvsub_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub)
+
+TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8)
+TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16)
+TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32)
+TRANS(xvreplgr2vr_d, LASX, gvec_dup, 32, MO_64)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 0e12213e8b..c0e7a9a372 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -4161,7 +4161,7 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a)
     return true;
 }
 
-static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop)
+static bool gvec_dup(DisasContext *ctx, arg_vr *a, uint32_t oprsz, MemOp mop)
 {
     TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
 
@@ -4172,14 +4172,14 @@ static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop)
     CHECK_VEC;
 
     tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd),
-                         16, ctx->vl/8, src);
+                         oprsz, ctx->vl / 8, src);
     return true;
 }
 
-TRANS(vreplgr2vr_b, LSX, gvec_dup, MO_8)
-TRANS(vreplgr2vr_h, LSX, gvec_dup, MO_16)
-TRANS(vreplgr2vr_w, LSX, gvec_dup, MO_32)
-TRANS(vreplgr2vr_d, LSX, gvec_dup, MO_64)
+TRANS(vreplgr2vr_b, LSX, gvec_dup, 16, MO_8)
+TRANS(vreplgr2vr_h, LSX, gvec_dup, 16, MO_16)
+TRANS(vreplgr2vr_w, LSX, gvec_dup, 16, MO_32)
+TRANS(vreplgr2vr_d, LSX, gvec_dup, 16, MO_64)
 
 static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a)
 {
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 07/48] target/loongarch: Implement xvaddi/xvsubi
Date: Wed, 30 Aug 2023 16:48:21 +0800
Message-Id: <20230830084902.2113960-8-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>

This patch includes:
- XVADDI.{B/H/W/D}U;
- XVSUBI.{B/H/W/D}U.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode                |   9 ++
 target/loongarch/disas.c                     |  14 ++
 target/loongarch/insn_trans/trans_lasx.c.inc |   9 ++
 target/loongarch/insn_trans/trans_lsx.c.inc  | 136 +++++++++----------
 4 files changed, 100 insertions(+), 68 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 04bd238995..c48dca70b8 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1311,6 +1311,15 @@ xvsub_w 0111 01000000 11010 ..... ..... ..... @vvv
 xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv
 xvsub_q 0111 01010010 11011 ..... ..... ..... @vvv
 
+xvaddi_bu 0111 01101000 10100 ..... ..... ..... @vv_ui5
+xvaddi_hu 0111 01101000 10101 ..... ..... ..... @vv_ui5
+xvaddi_wu 0111 01101000 10110 ..... ..... ..... @vv_ui5
+xvaddi_du 0111 01101000 10111 ..... ..... ..... @vv_ui5
+xvsubi_bu 0111 01101000 11000 ..... ..... ..... @vv_ui5
+xvsubi_hu 0111 01101000 11001 ..... ..... ..... @vv_ui5
+xvsubi_wu 0111 01101000 11010 ..... ..... ..... @vv_ui5
+xvsubi_du 0111 01101000 11011 ..... ..... ..... @vv_ui5
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index c47f455ed0..f59e3cebf0 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk);
 }
 
+static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm);
+}
+
 static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic)
 {
     output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj);
@@ -1724,6 +1729,15 @@ INSN_LASX(xvsub_w, vvv)
 INSN_LASX(xvsub_d, vvv)
 INSN_LASX(xvsub_q, vvv)
 
+INSN_LASX(xvaddi_bu, vv_i)
+INSN_LASX(xvaddi_hu, vv_i)
+INSN_LASX(xvaddi_wu, vv_i)
+INSN_LASX(xvaddi_du, vv_i)
+INSN_LASX(xvsubi_bu, vv_i)
+INSN_LASX(xvsubi_hu, vv_i)
+INSN_LASX(xvsubi_wu, vv_i)
+INSN_LASX(xvsubi_du, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 66b5abc790..0e8a711fde 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -51,6 +51,15 @@ TRANS(xvsub_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_sub)
 TRANS(xvsub_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_sub)
 TRANS(xvsub_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_sub)
 
+TRANS(xvaddi_bu, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_addi)
+TRANS(xvaddi_hu, LASX, gvec_vv_i, 32, MO_16, tcg_gen_gvec_addi)
+TRANS(xvaddi_wu, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_addi)
+TRANS(xvaddi_du, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_addi)
+TRANS(xvsubi_bu, LASX, gvec_subi, 32, MO_8)
+TRANS(xvsubi_hu, LASX, gvec_subi, 32, MO_16)
+TRANS(xvsubi_wu, LASX, gvec_subi, 32, MO_32)
+TRANS(xvsubi_du, LASX, gvec_subi, 32, MO_64)
+
 TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index c0e7a9a372..00f134a0b1 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -96,7 +96,7 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop,
     return true;
 }
 
-static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop,
+static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop,
                       void (*func)(unsigned, uint32_t, uint32_t,
                                    int64_t, uint32_t, uint32_t))
 {
@@ -107,11 +107,11 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop,
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
 
-    func(mop, vd_ofs, vj_ofs, a->imm , 16, ctx->vl/8);
+    func(mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / 8);
     return true;
 }
 
-static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
+static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop)
 {
     uint32_t vd_ofs, vj_ofs;
 
@@ -120,7 +120,7 @@ static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop)
     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
 
-    tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, 16, ctx->vl/8);
+    tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, oprsz, ctx->vl / 8);
     return true;
 }
 
@@ -168,14 +168,14 @@ TRANS(vsub_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_sub)
 TRANS(vsub_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_sub)
 TRANS(vsub_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_sub)
 
-TRANS(vaddi_bu, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_addi)
-TRANS(vaddi_hu, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_addi)
-TRANS(vaddi_wu, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_addi)
-TRANS(vaddi_du, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_addi)
-TRANS(vsubi_bu, LSX, gvec_subi, MO_8)
-TRANS(vsubi_hu, LSX, gvec_subi, MO_16)
-TRANS(vsubi_wu, LSX, gvec_subi, MO_32)
-TRANS(vsubi_du, LSX, gvec_subi, MO_64)
+TRANS(vaddi_bu, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_addi)
+TRANS(vaddi_hu, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_addi)
+TRANS(vaddi_wu, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_addi)
+TRANS(vaddi_du, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_addi)
+TRANS(vsubi_bu, LSX, gvec_subi, 16, MO_8)
+TRANS(vsubi_hu, LSX, gvec_subi, 16, MO_16)
+TRANS(vsubi_wu, LSX, gvec_subi, 16, MO_32)
+TRANS(vsubi_du, LSX, gvec_subi, 16, MO_64)
 
 TRANS(vneg_b, LSX, gvec_vv, MO_8, tcg_gen_gvec_neg)
 TRANS(vneg_h, LSX, gvec_vv, MO_16, tcg_gen_gvec_neg)
@@ -1466,14 +1466,14 @@ static void do_vmini_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]);
 }
 
-TRANS(vmini_b, LSX, gvec_vv_i, MO_8, do_vmini_s)
-TRANS(vmini_h, LSX, gvec_vv_i, MO_16, do_vmini_s)
-TRANS(vmini_w, LSX, gvec_vv_i, MO_32, do_vmini_s)
-TRANS(vmini_d, LSX, gvec_vv_i, MO_64, do_vmini_s)
-TRANS(vmini_bu, LSX, gvec_vv_i, MO_8, do_vmini_u)
-TRANS(vmini_hu, LSX, gvec_vv_i, MO_16, do_vmini_u)
-TRANS(vmini_wu, LSX, gvec_vv_i, MO_32, do_vmini_u)
-TRANS(vmini_du, LSX, gvec_vv_i, MO_64, do_vmini_u)
+TRANS(vmini_b, LSX, gvec_vv_i, 16, MO_8, do_vmini_s)
+TRANS(vmini_h, LSX, gvec_vv_i, 16, MO_16, do_vmini_s)
+TRANS(vmini_w, LSX, gvec_vv_i, 16, MO_32, do_vmini_s)
+TRANS(vmini_d, LSX, gvec_vv_i, 16, MO_64, do_vmini_s)
+TRANS(vmini_bu, LSX, gvec_vv_i, 16, MO_8, do_vmini_u)
+TRANS(vmini_hu, LSX, gvec_vv_i, 16, MO_16, do_vmini_u)
+TRANS(vmini_wu, LSX, gvec_vv_i, 16, MO_32, do_vmini_u)
+TRANS(vmini_du, LSX, gvec_vv_i, 16, MO_64, do_vmini_u)
 
 static void do_vmaxi_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
                        int64_t imm, uint32_t oprsz, uint32_t maxsz)
@@ -1547,14 +1547,14 @@ static void do_vmaxi_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
     tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]);
 }
 
-TRANS(vmaxi_b, LSX, gvec_vv_i, MO_8, do_vmaxi_s)
-TRANS(vmaxi_h, LSX, gvec_vv_i, MO_16, do_vmaxi_s)
-TRANS(vmaxi_w, LSX, gvec_vv_i, MO_32, do_vmaxi_s)
-TRANS(vmaxi_d, LSX, gvec_vv_i, MO_64, do_vmaxi_s)
-TRANS(vmaxi_bu, LSX, gvec_vv_i, MO_8, do_vmaxi_u)
-TRANS(vmaxi_hu, LSX, gvec_vv_i, MO_16, do_vmaxi_u)
-TRANS(vmaxi_wu, LSX, gvec_vv_i, MO_32, do_vmaxi_u)
-TRANS(vmaxi_du, LSX, gvec_vv_i, MO_64, do_vmaxi_u)
+TRANS(vmaxi_b, LSX, gvec_vv_i, 16, MO_8, do_vmaxi_s)
+TRANS(vmaxi_h, LSX, gvec_vv_i, 16, MO_16, do_vmaxi_s)
+TRANS(vmaxi_w, LSX, gvec_vv_i, 16, MO_32, do_vmaxi_s)
+TRANS(vmaxi_d, LSX, gvec_vv_i, 16, MO_64, do_vmaxi_s)
+TRANS(vmaxi_bu, LSX, gvec_vv_i, 16, MO_8, do_vmaxi_u)
+TRANS(vmaxi_hu, LSX, gvec_vv_i, 16, MO_16, do_vmaxi_u)
+TRANS(vmaxi_wu, LSX, gvec_vv_i, 16, MO_32, do_vmaxi_u)
+TRANS(vmaxi_du, LSX, gvec_vv_i, 16, MO_64, do_vmaxi_u)
 
 TRANS(vmul_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_mul)
 TRANS(vmul_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_mul)
@@ -2790,10 +2790,10 @@ static void do_vsat_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
                     tcg_constant_i64((1ll<< imm) -1), &op[vece]);
 }
 
-TRANS(vsat_b, LSX, gvec_vv_i, MO_8, do_vsat_s)
-TRANS(vsat_h, LSX, gvec_vv_i, MO_16, do_vsat_s)
-TRANS(vsat_w, LSX, gvec_vv_i, MO_32, do_vsat_s)
-TRANS(vsat_d, LSX, gvec_vv_i, MO_64, do_vsat_s)
+TRANS(vsat_b, LSX, gvec_vv_i, 16, MO_8, do_vsat_s)
+TRANS(vsat_h, LSX, gvec_vv_i, 16, MO_16, do_vsat_s)
+TRANS(vsat_w, LSX, gvec_vv_i, 16, MO_32, do_vsat_s)
+TRANS(vsat_d, LSX, gvec_vv_i, 16, MO_64, do_vsat_s)
 
 static void gen_vsat_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec max)
 {
@@ -2839,10 +2839,10 @@ static void do_vsat_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
                     tcg_constant_i64(max), &op[vece]);
 }
 
-TRANS(vsat_bu, LSX, gvec_vv_i, MO_8, do_vsat_u)
-TRANS(vsat_hu, LSX, gvec_vv_i, MO_16, do_vsat_u)
-TRANS(vsat_wu, LSX, gvec_vv_i, MO_32, do_vsat_u)
-TRANS(vsat_du, LSX, gvec_vv_i, MO_64, do_vsat_u)
+TRANS(vsat_bu, LSX, gvec_vv_i, 16, MO_8, do_vsat_u)
+TRANS(vsat_hu, LSX, gvec_vv_i, 16, MO_16, do_vsat_u)
+TRANS(vsat_wu, LSX, gvec_vv_i, 16, MO_32, do_vsat_u)
+TRANS(vsat_du, LSX, gvec_vv_i, 16, MO_64, do_vsat_u) =20 TRANS(vexth_h_b, LSX, gen_vv, gen_helper_vexth_h_b) TRANS(vexth_w_h, LSX, gen_vv, gen_helper_vexth_w_h) @@ -3078,9 +3078,9 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv = *a) return true; } TRANS(vorn_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_orc) -TRANS(vandi_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_andi) -TRANS(vori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_ori) -TRANS(vxori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_xori) +TRANS(vandi_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_andi) +TRANS(vori_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_ori) +TRANS(vxori_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_xori) =20 static void gen_vnori(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm) { @@ -3113,43 +3113,43 @@ static void do_vnori_b(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op); } =20 -TRANS(vnori_b, LSX, gvec_vv_i, MO_8, do_vnori_b) +TRANS(vnori_b, LSX, gvec_vv_i, 16, MO_8, do_vnori_b) =20 TRANS(vsll_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_shlv) TRANS(vsll_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_shlv) TRANS(vsll_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_shlv) TRANS(vsll_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_shlv) -TRANS(vslli_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_shli) -TRANS(vslli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shli) -TRANS(vslli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shli) -TRANS(vslli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shli) +TRANS(vslli_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_shli) +TRANS(vslli_h, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_shli) +TRANS(vslli_w, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_shli) +TRANS(vslli_d, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_shli) =20 TRANS(vsrl_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_shrv) TRANS(vsrl_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_shrv) TRANS(vsrl_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_shrv) TRANS(vsrl_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_shrv) -TRANS(vsrli_b, LSX, gvec_vv_i, 
MO_8, tcg_gen_gvec_shri) -TRANS(vsrli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shri) -TRANS(vsrli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shri) -TRANS(vsrli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shri) +TRANS(vsrli_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_shri) +TRANS(vsrli_h, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_shri) +TRANS(vsrli_w, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_shri) +TRANS(vsrli_d, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_shri) =20 TRANS(vsra_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_sarv) TRANS(vsra_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_sarv) TRANS(vsra_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_sarv) TRANS(vsra_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_sarv) -TRANS(vsrai_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_sari) -TRANS(vsrai_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_sari) -TRANS(vsrai_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_sari) -TRANS(vsrai_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_sari) +TRANS(vsrai_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_sari) +TRANS(vsrai_h, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_sari) +TRANS(vsrai_w, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_sari) +TRANS(vsrai_d, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_sari) =20 TRANS(vrotr_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_rotrv) TRANS(vrotr_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_rotrv) TRANS(vrotr_w, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_rotrv) TRANS(vrotr_d, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_rotrv) -TRANS(vrotri_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_rotri) -TRANS(vrotri_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_rotri) -TRANS(vrotri_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_rotri) -TRANS(vrotri_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_rotri) +TRANS(vrotri_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_rotri) +TRANS(vrotri_h, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_rotri) +TRANS(vrotri_w, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_rotri) +TRANS(vrotri_d, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_rotri) =20 TRANS(vsllwil_h_b, LSX, gen_vv_i, gen_helper_vsllwil_h_b) TRANS(vsllwil_w_h, LSX, gen_vv_i, 
gen_helper_vsllwil_w_h) @@ -3420,10 +3420,10 @@ static void do_vbitclri(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]); } -TRANS(vbitclri_b, LSX, gvec_vv_i, MO_8, do_vbitclri) -TRANS(vbitclri_h, LSX, gvec_vv_i, MO_16, do_vbitclri) -TRANS(vbitclri_w, LSX, gvec_vv_i, MO_32, do_vbitclri) -TRANS(vbitclri_d, LSX, gvec_vv_i, MO_64, do_vbitclri) +TRANS(vbitclri_b, LSX, gvec_vv_i, 16, MO_8, do_vbitclri) +TRANS(vbitclri_h, LSX, gvec_vv_i, 16, MO_16, do_vbitclri) +TRANS(vbitclri_w, LSX, gvec_vv_i, 16, MO_32, do_vbitclri) +TRANS(vbitclri_d, LSX, gvec_vv_i, 16, MO_64, do_vbitclri) static void do_vbitset(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz) @@ -3502,10 +3502,10 @@ static void do_vbitseti(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]); } -TRANS(vbitseti_b, LSX, gvec_vv_i, MO_8, do_vbitseti) -TRANS(vbitseti_h, LSX, gvec_vv_i, MO_16, do_vbitseti) -TRANS(vbitseti_w, LSX, gvec_vv_i, MO_32, do_vbitseti) -TRANS(vbitseti_d, LSX, gvec_vv_i, MO_64, do_vbitseti) +TRANS(vbitseti_b, LSX, gvec_vv_i, 16, MO_8, do_vbitseti) +TRANS(vbitseti_h, LSX, gvec_vv_i, 16, MO_16, do_vbitseti) +TRANS(vbitseti_w, LSX, gvec_vv_i, 16, MO_32, do_vbitseti) +TRANS(vbitseti_d, LSX, gvec_vv_i, 16, MO_64, do_vbitseti) static void do_vbitrev(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz) @@ -3584,10 +3584,10 @@ static void do_vbitrevi(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]); } -TRANS(vbitrevi_b, LSX, gvec_vv_i, MO_8, do_vbitrevi) -TRANS(vbitrevi_h, LSX, gvec_vv_i, MO_16, do_vbitrevi) -TRANS(vbitrevi_w, LSX, gvec_vv_i, MO_32, do_vbitrevi) -TRANS(vbitrevi_d, LSX, gvec_vv_i, MO_64, do_vbitrevi) +TRANS(vbitrevi_b, LSX, gvec_vv_i, 16, MO_8, do_vbitrevi) +TRANS(vbitrevi_h, LSX, gvec_vv_i, 
16, MO_16, do_vbitrevi) +TRANS(vbitrevi_w, LSX, gvec_vv_i, 16, MO_32, do_vbitrevi) +TRANS(vbitrevi_d, LSX, gvec_vv_i, 16, MO_64, do_vbitrevi) =20 TRANS(vfrstp_b, LSX, gen_vvv, gen_helper_vfrstp_b) TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385397320120.02136351316938; Wed, 30 Aug 2023 01:49:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt8-0008Tp-CF; Wed, 30 Aug 2023 04:49:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt5-0008QA-7c for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:19 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt0-0007TO-Li for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:18 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx7+uHAu9kbggdAA--.58100S3; Wed, 30 Aug 2023 16:49:11 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S10; Wed, 30 Aug 2023 16:49:10 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 08/48] target/loongarch: Implement xvneg Date: Wed, 30 Aug 2023 16:48:22 +0800 Message-Id: <20230830084902.2113960-9-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: 
<20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S10 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385397988100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVNEG.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 10 ++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 5 +++++ target/loongarch/insn_trans/trans_lsx.c.inc | 12 ++++++------ 4 files changed, 26 insertions(+), 6 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index c48dca70b8..759172628f 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1320,6 +1320,11 @@ xvsubi_hu 0111 01101000 11001 ..... ..... ...= .. @vv_ui5 xvsubi_wu 0111 01101000 11010 ..... ..... ..... @vv_ui5 xvsubi_du 0111 01101000 11011 ..... ..... ..... 
@vv_ui5 +xvneg_b 0111 01101001 11000 01100 ..... ..... @vv +xvneg_h 0111 01101001 11000 01101 ..... ..... @vv +xvneg_w 0111 01101001 11000 01110 ..... ..... @vv +xvneg_d 0111 01101001 11000 01111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index f59e3cebf0..4e26d49acc 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1713,6 +1713,11 @@ static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm); } +static void output_vv_x(DisasContext *ctx, arg_vv *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d", a->vd, a->vj); +} + static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic) { output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj); @@ -1738,6 +1743,11 @@ INSN_LASX(xvsubi_hu, vv_i) INSN_LASX(xvsubi_wu, vv_i) INSN_LASX(xvsubi_du, vv_i) +INSN_LASX(xvneg_b, vv) +INSN_LASX(xvneg_h, vv) +INSN_LASX(xvneg_w, vv) +INSN_LASX(xvneg_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 0e8a711fde..29eefe6934 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -60,6 +60,11 @@ TRANS(xvsubi_hu, LASX, gvec_subi, 32, MO_16) TRANS(xvsubi_wu, LASX, gvec_subi, 32, MO_32) TRANS(xvsubi_du, LASX, gvec_subi, 32, MO_64) +TRANS(xvneg_b, LASX, gvec_vv, 32, MO_8, tcg_gen_gvec_neg) +TRANS(xvneg_h, LASX, gvec_vv, 32, MO_16, tcg_gen_gvec_neg) +TRANS(xvneg_w, LASX, gvec_vv, 32, MO_32, tcg_gen_gvec_neg) +TRANS(xvneg_d, LASX, gvec_vv, 32, MO_64, tcg_gen_gvec_neg) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, 
gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc index 00f134a0b1..86a0d4d6b9 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -81,7 +81,7 @@ static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, MemOp mop, return true; } -static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, +static bool gvec_vv(DisasContext *ctx, arg_vv *a, uint32_t oprsz, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { @@ -92,7 +92,7 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, vd_ofs = vec_full_offset(a->vd); vj_ofs = vec_full_offset(a->vj); - func(mop, vd_ofs, vj_ofs, 16, ctx->vl/8); + func(mop, vd_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } @@ -177,10 +177,10 @@ TRANS(vsubi_hu, LSX, gvec_subi, 16, MO_16) TRANS(vsubi_wu, LSX, gvec_subi, 16, MO_32) TRANS(vsubi_du, LSX, gvec_subi, 16, MO_64) -TRANS(vneg_b, LSX, gvec_vv, MO_8, tcg_gen_gvec_neg) -TRANS(vneg_h, LSX, gvec_vv, MO_16, tcg_gen_gvec_neg) -TRANS(vneg_w, LSX, gvec_vv, MO_32, tcg_gen_gvec_neg) -TRANS(vneg_d, LSX, gvec_vv, MO_64, tcg_gen_gvec_neg) +TRANS(vneg_b, LSX, gvec_vv, 16, MO_8, tcg_gen_gvec_neg) +TRANS(vneg_h, LSX, gvec_vv, 16, MO_16, tcg_gen_gvec_neg) +TRANS(vneg_w, LSX, gvec_vv, 16, MO_32, tcg_gen_gvec_neg) +TRANS(vneg_d, LSX, gvec_vv, 16, MO_64, tcg_gen_gvec_neg) TRANS(vsadd_b, LSX, gvec_vvv, 16, MO_8, tcg_gen_gvec_ssadd) TRANS(vsadd_h, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_ssadd) -- 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by 
mx.zohomail.com with SMTPS id 1693385641785528.405396122365; Wed, 30 Aug 2023 01:54:01 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtA-0000Bo-QP; Wed, 30 Aug 2023 04:49:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGt6-0008Ru-TN for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:20 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGt1-0007TU-Ek for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:20 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dxg_CIAu9kdAgdAA--.59638S3; Wed, 30 Aug 2023 16:49:12 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S11; Wed, 30 Aug 2023 16:49:11 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 09/48] target/loongarch: Implement xvsadd/xvssub Date: Wed, 30 Aug 2023 16:48:23 +0800 Message-Id: <20230830084902.2113960-10-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S11 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; 
envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385643880100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSADD.{B/H/W/D}[U]; - XVSSUB.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 18 ++++++++++++++++++ target/loongarch/disas.c | 17 +++++++++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 759172628f..32f857ff7c 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1325,6 +1325,24 @@ xvneg_h 0111 01101001 11000 01101 ..... ..... @vv xvneg_w 0111 01101001 11000 01110 ..... ..... @vv xvneg_d 0111 01101001 11000 01111 ..... ..... @vv +xvsadd_b 0111 01000100 01100 ..... ..... ..... @vvv +xvsadd_h 0111 01000100 01101 ..... ..... ..... @vvv +xvsadd_w 0111 01000100 01110 ..... ..... ..... @vvv +xvsadd_d 0111 01000100 01111 ..... ..... ..... @vvv +xvsadd_bu 0111 01000100 10100 ..... ..... ..... @vvv +xvsadd_hu 0111 01000100 10101 ..... ..... ..... @vvv +xvsadd_wu 0111 01000100 10110 ..... ..... ..... @vvv +xvsadd_du 0111 01000100 10111 ..... ..... ..... @vvv + +xvssub_b 0111 01000100 10000 ..... ..... ..... @vvv +xvssub_h 0111 01000100 10001 ..... ..... ..... @vvv +xvssub_w 0111 01000100 10010 ..... ..... ..... @vvv +xvssub_d 0111 01000100 10011 ..... ..... ..... @vvv +xvssub_bu 0111 01000100 11000 ..... ..... ..... 
@vvv +xvssub_hu 0111 01000100 11001 ..... ..... ..... @vvv +xvssub_wu 0111 01000100 11010 ..... ..... ..... @vvv +xvssub_du 0111 01000100 11011 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 4e26d49acc..0fd88a56c1 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1748,6 +1748,23 @@ INSN_LASX(xvneg_h, vv) INSN_LASX(xvneg_w, vv) INSN_LASX(xvneg_d, vv) +INSN_LASX(xvsadd_b, vvv) +INSN_LASX(xvsadd_h, vvv) +INSN_LASX(xvsadd_w, vvv) +INSN_LASX(xvsadd_d, vvv) +INSN_LASX(xvsadd_bu, vvv) +INSN_LASX(xvsadd_hu, vvv) +INSN_LASX(xvsadd_wu, vvv) +INSN_LASX(xvsadd_du, vvv) +INSN_LASX(xvssub_b, vvv) +INSN_LASX(xvssub_h, vvv) +INSN_LASX(xvssub_w, vvv) +INSN_LASX(xvssub_d, vvv) +INSN_LASX(xvssub_bu, vvv) +INSN_LASX(xvssub_hu, vvv) +INSN_LASX(xvssub_wu, vvv) +INSN_LASX(xvssub_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 29eefe6934..c818a09312 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -65,6 +65,23 @@ TRANS(xvneg_h, LASX, gvec_vv, 32, MO_16, tcg_gen_gvec_neg) TRANS(xvneg_w, LASX, gvec_vv, 32, MO_32, tcg_gen_gvec_neg) TRANS(xvneg_d, LASX, gvec_vv, 32, MO_64, tcg_gen_gvec_neg) +TRANS(xvsadd_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_ssadd) +TRANS(xvsadd_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_ssadd) +TRANS(xvsadd_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_ssadd) +TRANS(xvsadd_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_ssadd) +TRANS(xvsadd_bu, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_usadd) +TRANS(xvsadd_hu, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_usadd) +TRANS(xvsadd_wu, LASX, gvec_vvv, 32, MO_32, 
tcg_gen_gvec_usadd) +TRANS(xvsadd_du, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_usadd) +TRANS(xvssub_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_sssub) +TRANS(xvssub_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_sssub) +TRANS(xvssub_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_sssub) +TRANS(xvssub_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_sssub) +TRANS(xvssub_bu, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_ussub) +TRANS(xvssub_hu, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_ussub) +TRANS(xvssub_wu, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_ussub) +TRANS(xvssub_du, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_ussub) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385659479131.51046511944207; Wed, 30 Aug 2023 01:54:19 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGub-0004NJ-ST; Wed, 30 Aug 2023 04:50:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGuS-000322-3e for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:44 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuI-0007u6-Oq for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:43 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx77uJAu9kfAgdAA--.264S3; Wed, 30 Aug 2023 16:49:13 +0800 (CST) Received: from localhost.localdomain (unknown 
[10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S12; Wed, 30 Aug 2023 16:49:11 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 10/48] target/loongarch: rename lsx_helper.c to vec_helper.c Date: Wed, 30 Aug 2023 16:48:24 +0800 Message-Id: <20230830084902.2113960-11-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S12 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385661197100007 Content-Type: text/plain; charset="utf-8" Use the gen_helper_gvec_* series of functions, and rename lsx_helper.c to vec_helper.c. 
Signed-off-by: Song Gao --- target/loongarch/helper.h | 642 ++++---- .../loongarch/{lsx_helper.c => vec_helper.c} | 1297 ++++++++--------- target/loongarch/insn_trans/trans_lsx.c.inc | 731 +++++----- target/loongarch/meson.build | 2 +- 4 files changed, 1329 insertions(+), 1343 deletions(-) rename target/loongarch/{lsx_helper.c => vec_helper.c} (71%) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index ffb1e0b0bf..1abd9e1410 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -133,22 +133,22 @@ DEF_HELPER_1(idle, void, env) #endif /* LoongArch LSX */ -DEF_HELPER_4(vhaddw_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_d_w, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_q_d, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_du_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_qu_du, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_d_w, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_q_d, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_du_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_qu_du, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vhaddw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(vhaddw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vaddwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vaddwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -305,22 +305,22 @@ DEF_HELPER_FLAGS_4(vmaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vmaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vmaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vdiv_b, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_h, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_w, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_d, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_du, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_d, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_du, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vdiv_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_h, TCG_CALL_NO_RWG, 
void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsat_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) @@ -331,161 +331,161 @@ DEF_HELPER_FLAGS_4(vsat_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_3(vexth_h_b, void, env, i32, i32) -DEF_HELPER_3(vexth_w_h, void, env, i32, i32) -DEF_HELPER_3(vexth_d_w, void, env, i32, i32) -DEF_HELPER_3(vexth_q_d, void, env, i32, i32) -DEF_HELPER_3(vexth_hu_bu, void, env, i32, i32) -DEF_HELPER_3(vexth_wu_hu, void, env, i32, i32) -DEF_HELPER_3(vexth_du_wu, void, env, i32, i32) -DEF_HELPER_3(vexth_qu_du, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vexth_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_3(vexth_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_3(vmskltz_b, void, env, i32, i32) -DEF_HELPER_3(vmskltz_h, void, env, i32, i32) -DEF_HELPER_3(vmskltz_w, void, env, i32, i32) -DEF_HELPER_3(vmskltz_d, void, env, i32, i32) -DEF_HELPER_3(vmskgez_b, void, env, i32, i32) -DEF_HELPER_3(vmsknz_b, void, env, i32,i32) +DEF_HELPER_FLAGS_3(vmskltz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskgez_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmsknz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_4(vsllwil_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_d_w, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_q_d, void, env, i32, i32) -DEF_HELPER_4(vsllwil_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_qu_du, void, env, i32, i32) - -DEF_HELPER_4(vsrlr_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlr_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlr_w, void, env, i32, i32, i32) 
-DEF_HELPER_4(vsrlr_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vsrar_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vsrln_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrln_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrln_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_w_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vsrlni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_d_q, void, env, i32, i32, i32) - -DEF_HELPER_4(vsrlrn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_w_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vsrlrni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_h_w, void, env, i32, 
i32, i32) -DEF_HELPER_4(vsrarni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_d_q, void, env, i32, i32, i32) - -DEF_HELPER_4(vssrln_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_wu_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vssrlni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_du_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_du_q, void, env, i32, i32, i32) - -DEF_HELPER_4(vssrlrn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_bu_h, void, env, i32, i32, i32) 
-DEF_HELPER_4(vssrlrn_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_wu_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vssrlrni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_du_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_du_q, void, env, i32, i32, i32) - -DEF_HELPER_3(vclo_b, void, env, i32, i32) -DEF_HELPER_3(vclo_h, void, env, i32, i32) -DEF_HELPER_3(vclo_w, void, env, i32, i32) -DEF_HELPER_3(vclo_d, void, env, i32, i32) -DEF_HELPER_3(vclz_b, void, env, i32, i32) -DEF_HELPER_3(vclz_h, void, env, i32, i32) -DEF_HELPER_3(vclz_w, void, env, i32, i32) -DEF_HELPER_3(vclz_d, void, env, i32, i32) - -DEF_HELPER_3(vpcnt_b, void, env, i32, i32) -DEF_HELPER_3(vpcnt_h, void, env, i32, i32) -DEF_HELPER_3(vpcnt_w, void, env, i32, i32) -DEF_HELPER_3(vpcnt_d, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vsllwil_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_3(vextl_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsllwil_hu_bu, 
TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vsllwil_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vsllwil_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_3(vextl_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vsrar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrari_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vsrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vsrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_h_w, 
TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vsrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vsrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vssrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, 
ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vssrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vssrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(vssrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vssrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrlrni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrlrni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrlrni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrarni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrarni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrarni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vssrarni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) + +DEF_HELPER_FLAGS_3(vclo_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_h, TCG_CALL_NO_RWG, void, ptr, 
ptr, i32) +DEF_HELPER_FLAGS_3(vclo_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(vpcnt_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vbitclr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vbitclr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -514,107 +514,107 @@ DEF_HELPER_FLAGS_4(vbitrevi_h, TCG_CALL_NO_RWG, voi= d, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vbitrevi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_4(vfrstp_b, void, env, i32, i32, i32) -DEF_HELPER_4(vfrstp_h, void, env, i32, i32, i32) -DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32) -DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32) - -DEF_HELPER_4(vfadd_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfadd_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfsub_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfsub_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfmul_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmul_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfdiv_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfdiv_d, void, env, i32, i32, i32) - -DEF_HELPER_5(vfmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmadd_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_d, void, env, i32, i32, 
i32, i32) -DEF_HELPER_5(vfnmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmsub_d, void, env, i32, i32, i32, i32) - -DEF_HELPER_4(vfmax_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmax_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfmin_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmin_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vfmaxa_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmaxa_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfmina_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmina_d, void, env, i32, i32, i32) - -DEF_HELPER_3(vflogb_s, void, env, i32, i32) -DEF_HELPER_3(vflogb_d, void, env, i32, i32) - -DEF_HELPER_3(vfclass_s, void, env, i32, i32) -DEF_HELPER_3(vfclass_d, void, env, i32, i32) - -DEF_HELPER_3(vfsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfsqrt_d, void, env, i32, i32) -DEF_HELPER_3(vfrecip_s, void, env, i32, i32) -DEF_HELPER_3(vfrecip_d, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_d, void, env, i32, i32) - -DEF_HELPER_3(vfcvtl_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvth_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvtl_d_s, void, env, i32, i32) -DEF_HELPER_3(vfcvth_d_s, void, env, i32, i32) -DEF_HELPER_4(vfcvt_h_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfcvt_s_d, void, env, i32, i32, i32) - -DEF_HELPER_3(vfrintrne_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrne_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_d, void, env, i32, i32) -DEF_HELPER_3(vfrint_s, void, env, i32, i32) -DEF_HELPER_3(vfrint_d, void, env, i32, i32) - -DEF_HELPER_3(vftintrne_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrne_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrz_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrp_w_s, void, env, i32, i32) 
-DEF_HELPER_3(vftintrp_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrm_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrm_l_d, void, env, i32, i32) -DEF_HELPER_3(vftint_w_s, void, env, i32, i32) -DEF_HELPER_3(vftint_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_wu_s, void, env, i32, i32) -DEF_HELPER_3(vftintrz_lu_d, void, env, i32, i32) -DEF_HELPER_3(vftint_wu_s, void, env, i32, i32) -DEF_HELPER_3(vftint_lu_d, void, env, i32, i32) -DEF_HELPER_4(vftintrne_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrz_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrp_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrm_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftint_w_d, void, env, i32, i32, i32) -DEF_HELPER_3(vftintrnel_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrneh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrpl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrph_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrml_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrmh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftinth_l_s, void, env, i32, i32) - -DEF_HELPER_3(vffint_s_w, void, env, i32, i32) -DEF_HELPER_3(vffint_d_l, void, env, i32, i32) -DEF_HELPER_3(vffint_s_wu, void, env, i32, i32) -DEF_HELPER_3(vffint_d_lu, void, env, i32, i32) -DEF_HELPER_3(vffintl_d_w, void, env, i32, i32) -DEF_HELPER_3(vffinth_d_w, void, env, i32, i32) -DEF_HELPER_4(vffint_s_l, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vfrstp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vfrstp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vfrstpi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vfrstpi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_5(vfadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) 
+DEF_HELPER_FLAGS_5(vfsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_6(vfmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfnmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) + +DEF_HELPER_FLAGS_5(vfmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_5(vfmaxa_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_5(vfmaxa_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_5(vfmina_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_5(vfmina_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) + +DEF_HELPER_FLAGS_4(vflogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vflogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfclass_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, 
i32) +DEF_HELPER_FLAGS_4(vfclass_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfcvtl_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvtl_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfcvt_h_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i= 32) +DEF_HELPER_FLAGS_5(vfcvt_s_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i= 32) + +DEF_HELPER_FLAGS_4(vfrintrne_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrne_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vftintrne_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrne_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrz_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) 
+DEF_HELPER_FLAGS_4(vftintrz_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrz_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftint_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vftintrne_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_5(vftintrz_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) +DEF_HELPER_FLAGS_5(vftintrp_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) +DEF_HELPER_FLAGS_5(vftintrm_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) +DEF_HELPER_FLAGS_5(vftint_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, = i32) +DEF_HELPER_FLAGS_4(vftintrnel_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i= 32) +DEF_HELPER_FLAGS_4(vftintrneh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i= 32) +DEF_HELPER_FLAGS_4(vftintrzl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrzh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrpl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrph_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrml_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrmh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) 
+DEF_HELPER_FLAGS_4(vftinth_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vffint_s_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_l, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_s_wu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_lu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffintl_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffinth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vffint_s_l, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, = i32) =20 DEF_HELPER_FLAGS_4(vseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vseqi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) @@ -655,45 +655,45 @@ DEF_HELPER_3(vsetallnez_h, void, env, i32, i32) DEF_HELPER_3(vsetallnez_w, void, env, i32, i32) DEF_HELPER_3(vsetallnez_d, void, env, i32, i32) =20 -DEF_HELPER_4(vpackev_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_d, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vpickev_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_d, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vilvl_b, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_h, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_w, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_d, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_b, void, env, i32, i32, i32) 
-DEF_HELPER_4(vilvh_h, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_w, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_d, void, env, i32, i32, i32) - -DEF_HELPER_5(vshuf_b, void, env, i32, i32, i32, i32) -DEF_HELPER_4(vshuf_h, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf_w, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf_d, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_b, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_h, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_w, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vpermi_w, void, env, i32, i32, i32) - -DEF_HELPER_4(vextrins_b, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_h, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_w, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vpickev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, 
i32) + +DEF_HELPER_FLAGS_4(vilvl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vshuf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vshuf_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vshuf_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vshuf4i_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/vec_helper.c similarity index 71% rename from target/loongarch/lsx_helper.c rename to target/loongarch/vec_helper.c index b231a2798b..d01903018a 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/vec_helper.c @@ -18,13 +18,12 @@ #define DO_SUB(a, b) (a - b) =20 #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, 
uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ @@ -36,12 +35,11 @@ DO_ODD_EVEN(vhaddw_h_b, 16, H, B, DO_ADD) DO_ODD_EVEN(vhaddw_w_h, 32, W, H, DO_ADD) DO_ODD_EVEN(vhaddw_d_w, 64, D, W, DO_ADD) =20 -void HELPER(vhaddw_q_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhaddw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); - VReg *Vk =3D &(env->fpr[vk].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; =20 Vd->Q(0) =3D int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (0))); } @@ -50,12 +48,11 @@ DO_ODD_EVEN(vhsubw_h_b, 16, H, B, DO_SUB) DO_ODD_EVEN(vhsubw_w_h, 32, W, H, DO_SUB) DO_ODD_EVEN(vhsubw_d_w, 64, D, W, DO_SUB) =20 -void HELPER(vhsubw_q_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhsubw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); - VReg *Vk =3D &(env->fpr[vk].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; =20 Vd->Q(0) =3D int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (0))); } @@ -64,12 +61,11 @@ DO_ODD_EVEN(vhaddw_hu_bu, 16, UH, UB, DO_ADD) DO_ODD_EVEN(vhaddw_wu_hu, 32, UW, UH, DO_ADD) DO_ODD_EVEN(vhaddw_du_wu, 64, UD, UW, DO_ADD) =20 -void HELPER(vhaddw_qu_du)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhaddw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); - VReg *Vk =3D 
&(env->fpr[vk].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; =20 Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(1)), int128_make64((uint64_t)Vk->D(0))); @@ -79,12 +75,11 @@ DO_ODD_EVEN(vhsubw_hu_bu, 16, UH, UB, DO_SUB) DO_ODD_EVEN(vhsubw_wu_hu, 32, UW, UH, DO_SUB) DO_ODD_EVEN(vhsubw_du_wu, 64, UD, UW, DO_SUB) =20 -void HELPER(vhsubw_qu_du)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); - VReg *Vk =3D &(env->fpr[vk].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; =20 Vd->Q(0) =3D int128_sub(int128_make64((uint64_t)Vj->D(1)), int128_make64((uint64_t)Vk->D(0))); @@ -539,7 +534,7 @@ VMADDWEV_U_S(vmaddwev_w_hu_h, 32, W, UW, H, UH, DO_MUL) VMADDWEV_U_S(vmaddwev_d_wu_w, 64, D, UD, W, UW, DO_MUL) =20 #define VMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; \ @@ -549,8 +544,8 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_= t v) \ typedef __typeof(Vd->EU1(0)) TU1; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ - (TS1)Vk->ES2(2 * i + 1)); \ + Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ + (TS1)Vk->ES2(2 * i + 1)); \ } \ } =20 @@ -565,17 +560,17 @@ VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MU= L) #define DO_REM(N, M) (unlikely(M =3D=3D 0) ? 0 :\ unlikely((N =3D=3D -N) && (M =3D=3D (__typeof(N))(-1))) ? 
0 : N % = M) =20 -#define VDIV(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ - } \ +#define VDIV(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ + } \ } =20 VDIV(vdiv_b, 8, B, DO_DIV) @@ -632,30 +627,30 @@ VSAT_U(vsat_hu, 16, UH) VSAT_U(vsat_wu, 32, UW) VSAT_U(vsat_du, 64, UD) =20 -#define VEXTH(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ - } \ +#define VEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ + } \ } =20 -void HELPER(vexth_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_makes64(Vj->D(1)); } =20 -void HELPER(vexth_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_make64((uint64_t)Vj->D(1)); } @@ 
-684,11 +679,11 @@ static uint64_t do_vmskltz_b(int64_t val) return c >> 56; } =20 -void HELPER(vmskltz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_b(Vj->D(0)); temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); @@ -705,11 +700,11 @@ static uint64_t do_vmskltz_h(int64_t val) return c >> 60; } =20 -void HELPER(vmskltz_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_h(Vj->D(0)); temp |=3D (do_vmskltz_h(Vj->D(1)) << 4); @@ -725,11 +720,11 @@ static uint64_t do_vmskltz_w(int64_t val) return c >> 62; } =20 -void HELPER(vmskltz_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_w(Vj->D(0)); temp |=3D (do_vmskltz_w(Vj->D(1)) << 2); @@ -741,11 +736,11 @@ static uint64_t do_vmskltz_d(int64_t val) { return (uint64_t)val >> 63; } -void HELPER(vmskltz_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_d)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_d(Vj->D(0)); temp |=3D (do_vmskltz_d(Vj->D(1)) << 1); @@ -753,11 +748,11 @@ void HELPER(vmskltz_d)(CPULoongArchState *env, uint32= _t vd, uint32_t vj) Vd->D(1) =3D 0; } =20 -void HELPER(vmskgez_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void 
HELPER(vmskgez_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_b(Vj->D(0)); temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); @@ -775,11 +770,11 @@ static uint64_t do_vmskez_b(uint64_t a) return c >> 56; } =20 -void HELPER(vmsknz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskez_b(Vj->D(0)); temp |=3D (do_vmskez_b(Vj->D(1)) << 8); @@ -798,36 +793,35 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm= , uint32_t v) } } =20 -#define VSLLWIL(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - typedef __typeof(temp.E1(0)) TD; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E1(i) =3D (TD)Vj->E2(i) << (imm % BIT); \ - } \ - *Vd =3D temp; \ +#define VSLLWIL(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(temp.E1(0)) TD; \ + \ + temp.D(0) =3D 0; \ + temp.D(1) =3D 0; \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E1(i) =3D (TD)Vj->E2(i) << (imm % BIT); \ + } \ + *Vd =3D temp; \ } =20 -void HELPER(vextl_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_makes64(Vj->D(0)); } =20 -void
HELPER(vextl_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_make64(Vj->D(0)); } @@ -855,13 +849,12 @@ do_vsrlr(W, uint32_t) do_vsrlr(D, uint64_t) =20 #define VSRLR(NAME, BIT, T, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ @@ -873,17 +866,16 @@ VSRLR(vsrlr_h, 16, uint16_t, H) VSRLR(vsrlr_w, 32, uint32_t, W) VSRLR(vsrlr_d, 64, uint64_t, D) =20 -#define VSRLRI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), imm); \ - } \ +#define VSRLRI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), imm); \ + } \ } =20 VSRLRI(vsrlri_b, 8, B) @@ -907,13 +899,12 @@ do_vsrar(W, int32_t) do_vsrar(D, int64_t) =20 #define VSRAR(NAME, BIT, T, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D 
&(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ @@ -925,17 +916,16 @@ VSRAR(vsrar_h, 16, uint16_t, H) VSRAR(vsrar_w, 32, uint32_t, W) VSRAR(vsrar_d, 64, uint64_t, D) =20 -#define VSRARI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), imm); \ - } \ +#define VSRARI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), imm); \ + } \ } =20 VSRARI(vsrari_b, 8, B) @@ -946,13 +936,12 @@ VSRARI(vsrari_d, 64, D) #define R_SHIFT(a, b) (a >> b) =20 #define VSRLN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) =3D R_SHIFT((T)Vj->E2(i),((T)Vk->E2(i)) % BIT); \ @@ -964,50 +953,47 @@ VSRLN(vsrln_b_h, 16, uint16_t, B, H) VSRLN(vsrln_h_w, 32, uint32_t, H, W) VSRLN(vsrln_w_d, 64, uint64_t, W, D) =20 -#define VSRAN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ -
Vd->E1(i) =3D R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRAN(NAME, BIT, T, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E1(i) =3D R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ + } \ + Vd->D(1) =3D 0; \ } =20 VSRAN(vsran_b_h, 16, uint16_t, B, H) VSRAN(vsran_h_w, 32, uint32_t, H, W) VSRAN(vsran_w_d, 64, uint64_t, W, D) =20 -#define VSRLNI(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT((T)Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT((T)Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ -} - -void HELPER(vsrlni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRLNI(NAME, BIT, T, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + temp.D(0) =3D 0; \ + temp.D(1) =3D 0; \ + max =3D LSX_LEN/BIT; \ + for (i =3D 0; i < max; i++) { \ + temp.E1(i) =3D R_SHIFT((T)Vj->E2(i), imm); \ + temp.E1(i + max) =3D R_SHIFT((T)Vd->E2(i), imm); \ + } \ + *Vd =3D temp; \ +} + +void HELPER(vsrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp.D(0) =3D 0; temp.D(1) =3D 0; @@ -1020,31 +1006,29 @@ VSRLNI(vsrlni_b_h, 16, uint16_t, B, H) VSRLNI(vsrlni_h_w, 32, uint32_t, H, W) VSRLNI(vsrlni_w_d, 64, uint64_t, W, D) =20 -#define VSRANI(NAME, BIT, E1, E2) \ -void 
HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT(Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ -} - -void HELPER(vsrani_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + temp.D(0) =3D 0; \ + temp.D(1) =3D 0; \ + max =3D LSX_LEN/BIT; \ + for (i =3D 0; i < max; i++) { \ + temp.E1(i) =3D R_SHIFT(Vj->E2(i), imm); \ + temp.E1(i + max) =3D R_SHIFT(Vd->E2(i), imm); \ + } \ + *Vd =3D temp; \ +} + +void HELPER(vsrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp.D(0) =3D 0; temp.D(1) =3D 0; @@ -1058,13 +1042,12 @@ VSRANI(vsrani_h_w, 32, H, W) VSRANI(vsrani_w_d, 64, W, D) =20 #define VSRLRN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ @@ -1077,13 +1060,12 @@ VSRLRN(vsrlrn_h_w, 32, uint32_t, H, W) VSRLRN(vsrlrn_w_d, 64, uint64_t, W, D) =20 #define VSRARN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - 
uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ @@ -1095,31 +1077,29 @@ VSRARN(vsrarn_b_h, 16, uint8_t, B, H) VSRARN(vsrarn_h_w, 32, uint16_t, H, W) VSRARN(vsrarn_w_d, 64, uint32_t, W, D) =20 -#define VSRLRNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrlr_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ -} - -void HELPER(vsrlrni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRLRNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + temp.D(0) =3D 0; \ + temp.D(1) =3D 0; \ + max =3D LSX_LEN/BIT; \ + for (i =3D 0; i < max; i++) { \ + temp.E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), imm); \ + temp.E1(i + max) =3D do_vsrlr_ ## E2(Vd->E2(i), imm); \ + } \ + *Vd =3D temp; \ +} + +void HELPER(vsrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; Int128 r1, r2; =20 if (imm =3D=3D 0) { @@ -1139,31 +1119,29 @@ VSRLRNI(vsrlrni_b_h, 16, B, H) VSRLRNI(vsrlrni_h_w, 32, H, W) VSRLRNI(vsrlrni_w_d, 64, W, D) =20 -#define VSRARNI(NAME, 
BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrar_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ -} - -void HELPER(vsrarni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRARNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + temp.D(0) =3D 0; \ + temp.D(1) =3D 0; \ + max =3D LSX_LEN/BIT; \ + for (i =3D 0; i < max; i++) { \ + temp.E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), imm); \ + temp.E1(i + max) =3D do_vsrar_ ## E2(Vd->E2(i), imm); \ + } \ + *Vd =3D temp; \ +} + +void HELPER(vsrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; Int128 r1, r2; =20 if (imm =3D=3D 0) { @@ -1206,13 +1184,12 @@ SSRLNS(H, uint32_t, int32_t, uint16_t) SSRLNS(W, uint64_t, int64_t, uint32_t) =20 #define VSSRLN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), (T)Vk->E2(i)% BIT, BIT/2= -1); \ @@ -1249,13 +1226,12 @@ SSRANS(H, int32_t, int16_t) SSRANS(W, int64_t, int32_t) =20 
#define VSSRAN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 = -1); \ @@ -1290,13 +1266,12 @@ SSRLNU(H, uint32_t, uint16_t, int32_t) SSRLNU(W, uint64_t, uint32_t, int64_t) =20 #define VSSRLNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ @@ -1334,13 +1309,12 @@ SSRANU(H, uint32_t, uint16_t, int32_t) SSRANU(W, uint64_t, uint32_t, int64_t) =20 #define VSSRANU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ @@ -1353,13 +1327,12 @@ VSSRANU(vssran_hu_w, 32, uint32_t, H, W) VSSRANU(vssran_wu_d, 64, uint64_t, W, D) =20 #define VSSRLNI(NAME, BIT, E1, E2) = \ -void 
HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ @@ -1368,12 +1341,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; = \ } =20 -void HELPER(vssrlni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1402,13 +1374,12 @@ VSSRLNI(vssrlni_h_w, 32, H, W) VSSRLNI(vssrlni_w_d, 64, W, D) =20 #define VSSRANI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ @@ -1417,12 +1388,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; = \ } =20 -void HELPER(vssrani_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask, min; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd;=20 + VReg *Vj =3D (VReg *)vj;=20 =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1456,13 
+1426,12 @@ VSSRANI(vssrani_h_w, 32, H, W) VSSRANI(vssrani_w_d, 64, W, D) =20 #define VSSRLNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ @@ -1471,12 +1440,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; = \ } =20 -void HELPER(vssrlni_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrlni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1505,13 +1473,12 @@ VSSRLNUI(vssrlni_hu_w, 32, H, W) VSSRLNUI(vssrlni_wu_d, 64, W, D) =20 #define VSSRANUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), imm, BIT/2); = \ @@ -1520,12 +1487,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; = \ } =20 -void HELPER(vssrani_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + 
VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1582,13 +1548,12 @@ SSRLRNS(H, W, uint32_t, int32_t, uint16_t) SSRLRNS(W, D, uint64_t, int64_t, uint32_t) =20 #define VSSRLRN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ @@ -1622,13 +1587,12 @@ SSRARNS(H, W, int32_t, int16_t) SSRARNS(W, D, int64_t, int32_t) =20 #define VSSRARN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ @@ -1661,13 +1625,12 @@ SSRLRNU(H, W, uint32_t, uint16_t, int32_t) SSRLRNU(W, D, uint64_t, uint32_t, int64_t) =20 #define VSSRLRNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < 
LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ @@ -1703,13 +1666,12 @@ SSRARNU(H, W, uint32_t, uint16_t, int32_t) SSRARNU(W, D, uint64_t, uint32_t, int64_t) =20 #define VSSRARNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ @@ -1722,13 +1684,12 @@ VSSRARNU(vssrarn_hu_w, 32, uint32_t, H, W) VSSRARNU(vssrarn_wu_d, 64, uint64_t, W, D) =20 #define VSSRLRNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ @@ -1738,12 +1699,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ } =20 #define VSSRLRNI_Q(NAME, sh) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ Int128 shft_res1, shft_res2, mask, r1, r2; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ if (imm =3D=3D 0) { = \ shft_res1 =3D Vj->Q(0); = \ @@ -1777,13 +1737,12 @@ VSSRLRNI(vssrlrni_w_d, 64, W, D) VSSRLRNI_Q(vssrlrni_d_q, 63) =20 #define 
VSSRARNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ @@ -1792,12 +1751,11 @@ void HELPER(NAME)(CPULoongArchState *env, *Vd =3D temp; = \ } =20 -void HELPER(vssrarni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1835,13 +1793,12 @@ VSSRARNI(vssrarni_h_w, 32, H, W) VSSRARNI(vssrarni_w_d, 64, W, D) =20 #define VSSRLRNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ @@ -1856,13 +1813,12 @@ VSSRLRNUI(vssrlrni_wu_d, 64, W, D) VSSRLRNI_Q(vssrlrni_du_q, 64) =20 #define VSSRARNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t imm) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ { = \ int i; = \ VReg temp; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + 
VReg *Vj =3D (VReg *)vj; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ temp.E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ @@ -1871,12 +1827,11 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; = \ } =20 -void HELPER(vssrarni_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 if (imm =3D=3D 0) { shft_res1 =3D Vj->Q(0); @@ -1920,17 +1875,17 @@ VSSRARNUI(vssrarni_bu_h, 16, B, H) VSSRARNUI(vssrarni_hu_w, 32, H, W) VSSRARNUI(vssrarni_wu_d, 64, W, D) =20 -#define DO_2OP(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) =3D DO_OP(Vj->E(i)); \ - } \ +#define DO_2OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) \ + { \ + Vd->E(i) =3D DO_OP(Vj->E(i)); \ + } \ } =20 #define DO_CLO_B(N) (clz32(~N & 0xff) - 24) @@ -1951,17 +1906,17 @@ DO_2OP(vclz_h, 16, UH, DO_CLZ_H) DO_2OP(vclz_w, 32, UW, DO_CLZ_W) DO_2OP(vclz_d, 64, UD, DO_CLZ_D) =20 -#define VPCNT(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) =3D FN(Vj->E(i)); \ - } \ +#define VPCNT(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) \ + { \ + 
Vd->E(i) =3D FN(Vj->E(i)); \ + } \ } =20 VPCNT(vpcnt_b, 8, UB, ctpop8) @@ -2024,42 +1979,40 @@ DO_BITI(vbitrevi_h, 16, UH, DO_BITREV) DO_BITI(vbitrevi_w, 32, UW, DO_BITREV) DO_BITI(vbitrevi_d, 64, UD, DO_BITREV) =20 -#define VFRSTP(NAME, BIT, MASK, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m =3D Vk->E(0) & MASK; \ - Vd->E(m) =3D i; \ +#define VFRSTP(NAME, BIT, MASK, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + if (Vj->E(i) < 0) { \ + break; \ + } \ + } \ + m =3D Vk->E(0) & MASK; \ + Vd->E(m) =3D i; \ } =20 VFRSTP(vfrstp_b, 8, 0xf, B) VFRSTP(vfrstp_h, 16, 0x7, H) =20 -#define VFRSTPI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, m; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m =3D imm % (LSX_LEN/BIT); \ - Vd->E(m) =3D i; \ +#define VFRSTPI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, m; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + if (Vj->E(i) < 0) { \ + break; \ + } \ + } \ + m =3D imm % (LSX_LEN/BIT); \ + Vd->E(m) =3D i; \ } =20 VFRSTPI(vfrstpi_b, 8, B) @@ -2097,13 +2050,13 @@ static inline void vec_clear_cause(CPULoongArchStat= e *env) } =20 #define DO_3OP_F(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd,
void *vj, void *vk, \ + CPULoongArchState *env, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ \ vec_clear_cause(env); \ for (i =3D 0; i < LSX_LEN/BIT; i++) { \ @@ -2130,14 +2083,14 @@ DO_3OP_F(vfmina_s, 32, UW, float32_minnummag) DO_3OP_F(vfmina_d, 64, UD, float64_minnummag) =20 #define DO_4OP_F(NAME, BIT, E, FN, flags) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, void *va, = \ + CPULoongArchState *env, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ - VReg *Va =3D &(env->fpr[va].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + VReg *Va =3D (VReg *)va; = \ = \ vec_clear_cause(env); = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ @@ -2157,17 +2110,17 @@ DO_4OP_F(vfnmsub_s, 32, UW, float32_muladd, DO_4OP_F(vfnmsub_d, 64, UD, float64_muladd, float_muladd_negate_c | float_muladd_negate_result) =20 -#define DO_2OP_F(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D FN(env, Vj->E(i)); \ - } \ +#define DO_2OP_F(NAME, BIT, E, FN) = \ +void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ +{ = \ + int i; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + = \ + vec_clear_cause(env); = \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + Vd->E(i) =3D FN(env, Vj->E(i)); = \ + } = \ } =20 #define FLOGB(BIT, T) \ @@ -2188,16 +2141,16 @@ static T do_flogb_## 
BIT(CPULoongArchState *env, T = fj) \ FLOGB(32, uint32_t) FLOGB(64, uint64_t) =20 -#define FCLASS(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D FN(env, Vj->E(i)); \ - } \ +#define FCLASS(NAME, BIT, E, FN) = \ +void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ +{ = \ + int i; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + = \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + Vd->E(i) =3D FN(env, Vj->E(i)); = \ + } = \ } =20 FCLASS(vfclass_s, 32, UW, helper_fclass_s) @@ -2267,12 +2220,13 @@ static uint32_t float64_cvt_float32(uint64_t d, flo= at_status *status) return float64_to_float32(d, status); } =20 -void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < LSX_LEN/32; i++) { @@ -2282,12 +2236,13 @@ void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uin= t32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_d_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < LSX_LEN/64; i++) { @@ -2297,12 +2252,13 @@ void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uin= t32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vfcvth_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t 
desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/32; i++) { @@ -2312,12 +2268,13 @@ void HELPER(vfcvth_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_d_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/64; i++) { @@ -2327,14 +2284,14 @@ void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvt_h_s)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for(i = 0; i < LSX_LEN/32; i++) { @@ -2345,14 +2302,14 @@ void HELPER(vfcvt_h_s)(CPULoongArchState *env, *Vd = temp; } -void HELPER(vfcvt_s_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for(i = 0; i < LSX_LEN/64; i++) { @@ -2363,24 +2320,26 @@ void HELPER(vfcvt_s_d)(CPULoongArchState *env, *Vd = temp; } -void HELPER(vfrint_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj)
+void HELPER(vfrint_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 4; i++) { Vd->W(i) = float32_round_to_int(Vj->UW(i), &env->fp_status); vec_update_fcsr0(env, GETPC()); } -} -void HELPER(vfrint_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) + +void HELPER(vfrint_d)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2389,21 +2348,21 @@ void HELPER(vfrint_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) } } -#define FCVT_2OP(NAME, BIT, E, MODE) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - FloatRoundMode old_mode = get_float_rounding_mode(&env->fp_status); \ - set_float_rounding_mode(MODE, &env->fp_status); \ - Vd->E(i) = float## BIT ## _round_to_int(Vj->E(i), &env->fp_status); \ - set_float_rounding_mode(old_mode, &env->fp_status); \ - vec_update_fcsr0(env, GETPC()); \ - } \ +#define FCVT_2OP(NAME, BIT, E, MODE) \ +void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + FloatRoundMode old_mode = get_float_rounding_mode(&env->fp_status); \ + set_float_rounding_mode(MODE, &env->fp_status); \ + Vd->E(i) = float## BIT ## _round_to_int(Vj->E(i), &env->fp_status); \ + set_float_rounding_mode(old_mode, &env->fp_status); \ +
vec_update_fcsr0(env, GETPC()); \ + } \ } FCVT_2OP(vfrintrne_s, 32, UW, float_round_nearest_even) @@ -2482,22 +2441,22 @@ FTINT(rp_w_d, float64, int32, uint64_t, uint32_t, float_round_up) FTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero) FTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_even) -#define FTINT_W_D(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < 2; i++) { \ - temp.W(i + 2) = FN(env, Vj->UD(i)); \ - temp.W(i) = FN(env, Vk->UD(i)); \ - } \ - *Vd = temp; \ +#define FTINT_W_D(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, \ + CPULoongArchState *env,uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 2; i++) { \ + temp.W(i + 2) = FN(env, Vj->UD(i)); \ + temp.W(i) = FN(env, Vk->UD(i)); \ + } \ + *Vd = temp; \ } FTINT_W_D(vftint_w_d, do_float64_to_int32) @@ -2515,19 +2474,19 @@ FTINT(rph_l_s, float32, int64, uint32_t, uint64_t, float_round_up) FTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) FTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_even) -#define FTINTL_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < 2; i++) { \ - temp.D(i) = FN(env, Vj->UW(i)); \ - } \ - *Vd = temp; \ +#define FTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =
(VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 2; i++) { \ + temp.D(i) = FN(env, Vj->UW(i)); \ + } \ + *Vd = temp; \ } FTINTL_L_S(vftintl_l_s, do_float32_to_int64) @@ -2536,19 +2495,19 @@ FTINTL_L_S(vftintrpl_l_s, do_ftintrpl_l_s) FTINTL_L_S(vftintrzl_l_s, do_ftintrzl_l_s) FTINTL_L_S(vftintrnel_l_s, do_ftintrnel_l_s) -#define FTINTH_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < 2; i++) { \ - temp.D(i) = FN(env, Vj->UW(i + 2)); \ - } \ - *Vd = temp; \ +#define FTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 2; i++) { \ + temp.D(i) = FN(env, Vj->UW(i + 2)); \ + } \ + *Vd = temp; \ } FTINTH_L_S(vftinth_l_s, do_float32_to_int64) @@ -2577,12 +2536,13 @@ DO_2OP_F(vffint_d_l, 64, D, do_ffint_d_l) DO_2OP_F(vffint_s_wu, 32, UW, do_ffint_s_wu) DO_2OP_F(vffint_d_lu, 64, UD, do_ffint_d_lu) -void HELPER(vffintl_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffintl_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2592,12 +2552,13 @@ void HELPER(vffintl_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffinth_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =
&(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2607,14 +2568,14 @@ void HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vffint_s_l)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vffint_s_l)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2768,21 +2729,20 @@ SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) SETALLNEZ(vsetallnez_d, MO_64) -#define VPACKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(2 * i); \ - temp.E(2 *i) = Vk->E(2 * i); \ - } \ - *Vd = temp; \ +#define VPACKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(2 * i); \ + temp.E(2 *i) = Vk->E(2 * i); \ + } \ + *Vd = temp; \ } VPACKEV(vpackev_b, 16, B) @@ -2790,21 +2750,20 @@ VPACKEV(vpackev_h, 32, H) VPACKEV(vpackev_w, 64, W) VPACKEV(vpackev_d, 128, D) -#define VPACKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj =
&(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(2 * i + 1); \ - temp.E(2 * i) = Vk->E(2 * i + 1); \ - } \ - *Vd = temp; \ +#define VPACKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(2 * i + 1); \ + temp.E(2 * i) = Vk->E(2 * i + 1); \ + } \ + *Vd = temp; \ } VPACKOD(vpackod_b, 16, B) @@ -2812,21 +2771,20 @@ VPACKOD(vpackod_h, 32, H) VPACKOD(vpackod_w, 64, W) VPACKOD(vpackod_d, 128, D) -#define VPICKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i); \ - temp.E(i) = Vk->E(2 * i); \ - } \ - *Vd = temp; \ +#define VPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i); \ + temp.E(i) = Vk->E(2 * i); \ + } \ + *Vd = temp; \ } VPICKEV(vpickev_b, 16, B) @@ -2834,21 +2792,20 @@ VPICKEV(vpickev_h, 32, H) VPICKEV(vpickev_w, 64, W) VPICKEV(vpickev_d, 128, D) -#define VPICKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =
Vj->E(2 * i + 1); \ - temp.E(i) = Vk->E(2 * i + 1); \ - } \ - *Vd = temp; \ +#define VPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i + 1); \ + temp.E(i) = Vk->E(2 * i + 1); \ + } \ + *Vd = temp; \ } VPICKOD(vpickod_b, 16, B) @@ -2856,21 +2813,20 @@ VPICKOD(vpickod_h, 32, H) VPICKOD(vpickod_w, 64, W) VPICKOD(vpickod_d, 128, D) -#define VILVL(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(i); \ - temp.E(2 * i) = Vk->E(i); \ - } \ - *Vd = temp; \ +#define VILVL(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(i); \ + temp.E(2 * i) = Vk->E(i); \ + } \ + *Vd = temp; \ } VILVL(vilvl_b, 16, B) @@ -2878,21 +2834,20 @@ VILVL(vilvl_h, 32, H) VILVL(vilvl_w, 64, W) VILVL(vilvl_d, 128, D) -#define VILVH(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(i + LSX_LEN/BIT); \ - temp.E(2 * i) = Vk->E(i + LSX_LEN/BIT); \ - } \ - *Vd = temp; \ +#define VILVH(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t
desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(i + LSX_LEN/BIT); \ + temp.E(2 * i) = Vk->E(i + LSX_LEN/BIT); \ + } \ + *Vd = temp; \ } VILVH(vilvh_b, 16, B) @@ -2900,15 +2855,14 @@ VILVH(vilvh_h, 32, H) VILVH(vilvh_w, 64, W) VILVH(vilvh_d, 128, D) -void HELPER(vshuf_b)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) +void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) { int i, m; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); - VReg *Va = &(env->fpr[va].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; + VReg *Va = (VReg *)va; m = LSX_LEN/8; for (i = 0; i < m ; i++) { @@ -2918,53 +2872,50 @@ void HELPER(vshuf_b)(CPULoongArchState *env, *Vd = temp; } -#define VSHUF(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - m = LSX_LEN/BIT; \ - for (i = 0; i < m; i++) { \ - uint64_t k = ((uint8_t) Vd->E(i)) % (2 * m); \ - temp.E(i) = k < m ? Vk->E(k) : Vj->E(k - m); \ - } \ - *Vd = temp; \ +#define VSHUF(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + m = LSX_LEN/BIT; \ + for (i = 0; i < m; i++) { \ + uint64_t k = ((uint8_t) Vd->E(i)) % (2 * m); \ + temp.E(i) = k < m ?
Vk->E(k) : Vj->E(k - m); \ + } \ + *Vd = temp; \ } VSHUF(vshuf_h, 16, H) VSHUF(vshuf_w, 32, W) VSHUF(vshuf_d, 64, D) -#define VSHUF4I(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i) = Vj->E(((i) & 0xfc) + (((imm) >> \ - (2 * ((i) & 0x03))) & 0x03)); \ - } \ - *Vd = temp; \ +#define VSHUF4I(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i) = Vj->E(((i) & 0xfc) + (((imm) >> \ + (2 * ((i) & 0x03))) & 0x03)); \ + } \ + *Vd = temp; \ } VSHUF4I(vshuf4i_b, 8, B) VSHUF4I(vshuf4i_h, 16, H) VSHUF4I(vshuf4i_w, 32, W) -void HELPER(vshuf4i_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vshuf4i_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; VReg temp; temp.D(0) = (imm & 2 ?
Vj : Vd)->D(imm & 1); @@ -2972,12 +2923,11 @@ void HELPER(vshuf4i_d)(CPULoongArchState *env, *Vd = temp; } -void HELPER(vpermi_w)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vpermi_w)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp.W(0) = Vj->W(imm & 0x3); temp.W(1) = Vj->W((imm >> 2) & 0x3); @@ -2986,17 +2936,16 @@ void HELPER(vpermi_w)(CPULoongArchState *env, *Vd = temp; } -#define VEXTRINS(NAME, BIT, E, MASK) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int ins, extr; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - ins = (imm >> 4) & MASK; \ - extr = imm & MASK; \ - Vd->E(ins) = Vj->E(extr); \ +#define VEXTRINS(NAME, BIT, E, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int ins, extr; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + ins = (imm >> 4) & MASK; \ + extr = imm & MASK; \ + Vd->E(ins) = Vj->E(extr); \ } VEXTRINS(vextrins_b, 8, B, 0xf) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc index 86a0d4d6b9..5653a556bf 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -4,53 +4,90 @@ * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ -static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, - TCGv_i32, TCGv_i32)) +static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, int oprsz, + gen_helper_gvec_4 *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); - TCGv_i32 vk = tcg_constant_i32(a->vk); - TCGv_i32 va = tcg_constant_i32(a->va); - CHECK_VEC; - func(cpu_env, vd, vj, vk, va); + +
tcg_gen_gvec_4_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + oprsz, ctx->vl / 8, oprsz, fn); return true; } -static bool gen_vvv(DisasContext *ctx, arg_vvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vvv(DisasContext *ctx, arg_vvv *a, int oprsz, + gen_helper_gvec_3 * fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); - TCGv_i32 vk = tcg_constant_i32(a->vk); + CHECK_VEC; + + tcg_gen_gvec_3_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vv(DisasContext *ctx, arg_vv *a, int oprsz, + gen_helper_gvec_2 * fn) +{ CHECK_VEC; - func(cpu_env, vd, vj, vk); + tcg_gen_gvec_2_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + oprsz, ctx->vl / 8, oprsz, fn); return true; } -static bool gen_vv(DisasContext *ctx, arg_vv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +static bool gen_vvvv_f(DisasContext *ctx, arg_vvvv *a, int oprsz, + gen_helper_gvec_4_ptr *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); + CHECK_VEC; + + tcg_gen_gvec_4_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vvv_f(DisasContext *ctx, arg_vvv *a, int oprsz, + gen_helper_gvec_3_ptr * fn) +{ CHECK_VEC; - func(cpu_env, vd, vj); + + tcg_gen_gvec_3_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); return true; } -static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vv_f(DisasContext *ctx, arg_vv *a, int oprsz, + gen_helper_gvec_2_ptr * fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj =
tcg_constant_i32(a->vj); - TCGv_i32 imm = tcg_constant_i32(a->imm); + CHECK_VEC; + tcg_gen_gvec_2_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} + +static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, int oprsz, + gen_helper_gvec_2i * fn) +{ CHECK_VEC; - func(cpu_env, vd, vj, imm); + + tcg_gen_gvec_2i_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + tcg_constant_i64(a->imm), + oprsz, ctx->vl / 8, oprsz, fn); return true; } @@ -199,22 +236,22 @@ TRANS(vssub_hu, LSX, gvec_vvv, 16, MO_16, tcg_gen_gvec_ussub) TRANS(vssub_wu, LSX, gvec_vvv, 16, MO_32, tcg_gen_gvec_ussub) TRANS(vssub_du, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_ussub) -TRANS(vhaddw_h_b, LSX, gen_vvv, gen_helper_vhaddw_h_b) -TRANS(vhaddw_w_h, LSX, gen_vvv, gen_helper_vhaddw_w_h) -TRANS(vhaddw_d_w, LSX, gen_vvv, gen_helper_vhaddw_d_w) -TRANS(vhaddw_q_d, LSX, gen_vvv, gen_helper_vhaddw_q_d) -TRANS(vhaddw_hu_bu, LSX, gen_vvv, gen_helper_vhaddw_hu_bu) -TRANS(vhaddw_wu_hu, LSX, gen_vvv, gen_helper_vhaddw_wu_hu) -TRANS(vhaddw_du_wu, LSX, gen_vvv, gen_helper_vhaddw_du_wu) -TRANS(vhaddw_qu_du, LSX, gen_vvv, gen_helper_vhaddw_qu_du) -TRANS(vhsubw_h_b, LSX, gen_vvv, gen_helper_vhsubw_h_b) -TRANS(vhsubw_w_h, LSX, gen_vvv, gen_helper_vhsubw_w_h) -TRANS(vhsubw_d_w, LSX, gen_vvv, gen_helper_vhsubw_d_w) -TRANS(vhsubw_q_d, LSX, gen_vvv, gen_helper_vhsubw_q_d) -TRANS(vhsubw_hu_bu, LSX, gen_vvv, gen_helper_vhsubw_hu_bu) -TRANS(vhsubw_wu_hu, LSX, gen_vvv, gen_helper_vhsubw_wu_hu) -TRANS(vhsubw_du_wu, LSX, gen_vvv, gen_helper_vhsubw_du_wu) -TRANS(vhsubw_qu_du, LSX, gen_vvv, gen_helper_vhsubw_qu_du) +TRANS(vhaddw_h_b, LSX, gen_vvv, 16, gen_helper_vhaddw_h_b) +TRANS(vhaddw_w_h, LSX, gen_vvv, 16, gen_helper_vhaddw_w_h) +TRANS(vhaddw_d_w, LSX, gen_vvv, 16, gen_helper_vhaddw_d_w) +TRANS(vhaddw_q_d, LSX, gen_vvv, 16, gen_helper_vhaddw_q_d) +TRANS(vhaddw_hu_bu, LSX, gen_vvv, 16, gen_helper_vhaddw_hu_bu) +TRANS(vhaddw_wu_hu, LSX, gen_vvv, 16,
gen_helper_vhaddw_wu_hu) +TRANS(vhaddw_du_wu, LSX, gen_vvv, 16, gen_helper_vhaddw_du_wu) +TRANS(vhaddw_qu_du, LSX, gen_vvv, 16, gen_helper_vhaddw_qu_du) +TRANS(vhsubw_h_b, LSX, gen_vvv, 16, gen_helper_vhsubw_h_b) +TRANS(vhsubw_w_h, LSX, gen_vvv, 16, gen_helper_vhsubw_w_h) +TRANS(vhsubw_d_w, LSX, gen_vvv, 16, gen_helper_vhsubw_d_w) +TRANS(vhsubw_q_d, LSX, gen_vvv, 16, gen_helper_vhsubw_q_d) +TRANS(vhsubw_hu_bu, LSX, gen_vvv, 16, gen_helper_vhsubw_hu_bu) +TRANS(vhsubw_wu_hu, LSX, gen_vvv, 16, gen_helper_vhsubw_wu_hu) +TRANS(vhsubw_du_wu, LSX, gen_vvv, 16, gen_helper_vhsubw_du_wu) +TRANS(vhsubw_qu_du, LSX, gen_vvv, 16, gen_helper_vhsubw_qu_du) static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { @@ -2726,22 +2763,22 @@ TRANS(vmaddwod_h_bu_b, LSX, gvec_vvv, 16, MO_8, do_vmaddwod_u_s) TRANS(vmaddwod_w_hu_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwod_u_s) TRANS(vmaddwod_d_wu_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwod_u_s) -TRANS(vdiv_b, LSX, gen_vvv, gen_helper_vdiv_b) -TRANS(vdiv_h, LSX, gen_vvv, gen_helper_vdiv_h) -TRANS(vdiv_w, LSX, gen_vvv, gen_helper_vdiv_w) -TRANS(vdiv_d, LSX, gen_vvv, gen_helper_vdiv_d) -TRANS(vdiv_bu, LSX, gen_vvv, gen_helper_vdiv_bu) -TRANS(vdiv_hu, LSX, gen_vvv, gen_helper_vdiv_hu) -TRANS(vdiv_wu, LSX, gen_vvv, gen_helper_vdiv_wu) -TRANS(vdiv_du, LSX, gen_vvv, gen_helper_vdiv_du) -TRANS(vmod_b, LSX, gen_vvv, gen_helper_vmod_b) -TRANS(vmod_h, LSX, gen_vvv, gen_helper_vmod_h) -TRANS(vmod_w, LSX, gen_vvv, gen_helper_vmod_w) -TRANS(vmod_d, LSX, gen_vvv, gen_helper_vmod_d) -TRANS(vmod_bu, LSX, gen_vvv, gen_helper_vmod_bu) -TRANS(vmod_hu, LSX, gen_vvv, gen_helper_vmod_hu) -TRANS(vmod_wu, LSX, gen_vvv, gen_helper_vmod_wu) -TRANS(vmod_du, LSX, gen_vvv, gen_helper_vmod_du) +TRANS(vdiv_b, LSX, gen_vvv, 16, gen_helper_vdiv_b) +TRANS(vdiv_h, LSX, gen_vvv, 16, gen_helper_vdiv_h) +TRANS(vdiv_w, LSX, gen_vvv, 16, gen_helper_vdiv_w) +TRANS(vdiv_d, LSX, gen_vvv, 16, gen_helper_vdiv_d) +TRANS(vdiv_bu, LSX, gen_vvv, 16,
gen_helper_vdiv_bu) +TRANS(vdiv_hu, LSX, gen_vvv, 16, gen_helper_vdiv_hu) +TRANS(vdiv_wu, LSX, gen_vvv, 16, gen_helper_vdiv_wu) +TRANS(vdiv_du, LSX, gen_vvv, 16, gen_helper_vdiv_du) +TRANS(vmod_b, LSX, gen_vvv, 16, gen_helper_vmod_b) +TRANS(vmod_h, LSX, gen_vvv, 16, gen_helper_vmod_h) +TRANS(vmod_w, LSX, gen_vvv, 16, gen_helper_vmod_w) +TRANS(vmod_d, LSX, gen_vvv, 16, gen_helper_vmod_d) +TRANS(vmod_bu, LSX, gen_vvv, 16, gen_helper_vmod_bu) +TRANS(vmod_hu, LSX, gen_vvv, 16, gen_helper_vmod_hu) +TRANS(vmod_wu, LSX, gen_vvv, 16, gen_helper_vmod_wu) +TRANS(vmod_du, LSX, gen_vvv, 16, gen_helper_vmod_du) static void gen_vsat_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec max) { @@ -2844,14 +2881,14 @@ TRANS(vsat_hu, LSX, gvec_vv_i, 16, MO_16, do_vsat_u) TRANS(vsat_wu, LSX, gvec_vv_i, 16, MO_32, do_vsat_u) TRANS(vsat_du, LSX, gvec_vv_i, 16, MO_64, do_vsat_u) -TRANS(vexth_h_b, LSX, gen_vv, gen_helper_vexth_h_b) -TRANS(vexth_w_h, LSX, gen_vv, gen_helper_vexth_w_h) -TRANS(vexth_d_w, LSX, gen_vv, gen_helper_vexth_d_w) -TRANS(vexth_q_d, LSX, gen_vv, gen_helper_vexth_q_d) -TRANS(vexth_hu_bu, LSX, gen_vv, gen_helper_vexth_hu_bu) -TRANS(vexth_wu_hu, LSX, gen_vv, gen_helper_vexth_wu_hu) -TRANS(vexth_du_wu, LSX, gen_vv, gen_helper_vexth_du_wu) -TRANS(vexth_qu_du, LSX, gen_vv, gen_helper_vexth_qu_du) +TRANS(vexth_h_b, LSX, gen_vv, 16, gen_helper_vexth_h_b) +TRANS(vexth_w_h, LSX, gen_vv, 16, gen_helper_vexth_w_h) +TRANS(vexth_d_w, LSX, gen_vv, 16, gen_helper_vexth_d_w) +TRANS(vexth_q_d, LSX, gen_vv, 16, gen_helper_vexth_q_d) +TRANS(vexth_hu_bu, LSX, gen_vv, 16, gen_helper_vexth_hu_bu) +TRANS(vexth_wu_hu, LSX, gen_vv, 16, gen_helper_vexth_wu_hu) +TRANS(vexth_du_wu, LSX, gen_vv, 16, gen_helper_vexth_du_wu) +TRANS(vexth_qu_du, LSX, gen_vv, 16, gen_helper_vexth_qu_du) static void gen_vsigncov(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { @@ -2906,12 +2943,12 @@ TRANS(vsigncov_h, LSX, gvec_vvv, 16, MO_16, do_vsigncov) TRANS(vsigncov_w, LSX, gvec_vvv, 16, MO_32,
do_vsigncov) TRANS(vsigncov_d, LSX, gvec_vvv, 16, MO_64, do_vsigncov) -TRANS(vmskltz_b, LSX, gen_vv, gen_helper_vmskltz_b) -TRANS(vmskltz_h, LSX, gen_vv, gen_helper_vmskltz_h) -TRANS(vmskltz_w, LSX, gen_vv, gen_helper_vmskltz_w) -TRANS(vmskltz_d, LSX, gen_vv, gen_helper_vmskltz_d) -TRANS(vmskgez_b, LSX, gen_vv, gen_helper_vmskgez_b) -TRANS(vmsknz_b, LSX, gen_vv, gen_helper_vmsknz_b) +TRANS(vmskltz_b, LSX, gen_vv, 16, gen_helper_vmskltz_b) +TRANS(vmskltz_h, LSX, gen_vv, 16, gen_helper_vmskltz_h) +TRANS(vmskltz_w, LSX, gen_vv, 16, gen_helper_vmskltz_w) +TRANS(vmskltz_d, LSX, gen_vv, 16, gen_helper_vmskltz_d) +TRANS(vmskgez_b, LSX, gen_vv, 16, gen_helper_vmskgez_b) +TRANS(vmsknz_b, LSX, gen_vv, 16, gen_helper_vmsknz_b) #define EXPAND_BYTE(bit) ((uint64_t)(bit ? 0xff : 0)) @@ -3151,138 +3188,138 @@ TRANS(vrotri_h, LSX, gvec_vv_i, 16, MO_16, tcg_gen_gvec_rotri) TRANS(vrotri_w, LSX, gvec_vv_i, 16, MO_32, tcg_gen_gvec_rotri) TRANS(vrotri_d, LSX, gvec_vv_i, 16, MO_64, tcg_gen_gvec_rotri) -TRANS(vsllwil_h_b, LSX, gen_vv_i, gen_helper_vsllwil_h_b) -TRANS(vsllwil_w_h, LSX, gen_vv_i, gen_helper_vsllwil_w_h) -TRANS(vsllwil_d_w, LSX, gen_vv_i, gen_helper_vsllwil_d_w) -TRANS(vextl_q_d, LSX, gen_vv, gen_helper_vextl_q_d) -TRANS(vsllwil_hu_bu, LSX, gen_vv_i, gen_helper_vsllwil_hu_bu) -TRANS(vsllwil_wu_hu, LSX, gen_vv_i, gen_helper_vsllwil_wu_hu) -TRANS(vsllwil_du_wu, LSX, gen_vv_i, gen_helper_vsllwil_du_wu) -TRANS(vextl_qu_du, LSX, gen_vv, gen_helper_vextl_qu_du) - -TRANS(vsrlr_b, LSX, gen_vvv, gen_helper_vsrlr_b) -TRANS(vsrlr_h, LSX, gen_vvv, gen_helper_vsrlr_h) -TRANS(vsrlr_w, LSX, gen_vvv, gen_helper_vsrlr_w) -TRANS(vsrlr_d, LSX, gen_vvv, gen_helper_vsrlr_d) -TRANS(vsrlri_b, LSX, gen_vv_i, gen_helper_vsrlri_b) -TRANS(vsrlri_h, LSX, gen_vv_i, gen_helper_vsrlri_h) -TRANS(vsrlri_w, LSX, gen_vv_i, gen_helper_vsrlri_w) -TRANS(vsrlri_d, LSX, gen_vv_i, gen_helper_vsrlri_d) - -TRANS(vsrar_b, LSX, gen_vvv, gen_helper_vsrar_b) -TRANS(vsrar_h, LSX, gen_vvv,
gen_helper_vsrar_h) -TRANS(vsrar_w, LSX, gen_vvv, gen_helper_vsrar_w) -TRANS(vsrar_d, LSX, gen_vvv, gen_helper_vsrar_d) -TRANS(vsrari_b, LSX, gen_vv_i, gen_helper_vsrari_b) -TRANS(vsrari_h, LSX, gen_vv_i, gen_helper_vsrari_h) -TRANS(vsrari_w, LSX, gen_vv_i, gen_helper_vsrari_w) -TRANS(vsrari_d, LSX, gen_vv_i, gen_helper_vsrari_d) - -TRANS(vsrln_b_h, LSX, gen_vvv, gen_helper_vsrln_b_h) -TRANS(vsrln_h_w, LSX, gen_vvv, gen_helper_vsrln_h_w) -TRANS(vsrln_w_d, LSX, gen_vvv, gen_helper_vsrln_w_d) -TRANS(vsran_b_h, LSX, gen_vvv, gen_helper_vsran_b_h) -TRANS(vsran_h_w, LSX, gen_vvv, gen_helper_vsran_h_w) -TRANS(vsran_w_d, LSX, gen_vvv, gen_helper_vsran_w_d) - -TRANS(vsrlni_b_h, LSX, gen_vv_i, gen_helper_vsrlni_b_h) -TRANS(vsrlni_h_w, LSX, gen_vv_i, gen_helper_vsrlni_h_w) -TRANS(vsrlni_w_d, LSX, gen_vv_i, gen_helper_vsrlni_w_d) -TRANS(vsrlni_d_q, LSX, gen_vv_i, gen_helper_vsrlni_d_q) -TRANS(vsrani_b_h, LSX, gen_vv_i, gen_helper_vsrani_b_h) -TRANS(vsrani_h_w, LSX, gen_vv_i, gen_helper_vsrani_h_w) -TRANS(vsrani_w_d, LSX, gen_vv_i, gen_helper_vsrani_w_d) -TRANS(vsrani_d_q, LSX, gen_vv_i, gen_helper_vsrani_d_q) - -TRANS(vsrlrn_b_h, LSX, gen_vvv, gen_helper_vsrlrn_b_h) -TRANS(vsrlrn_h_w, LSX, gen_vvv, gen_helper_vsrlrn_h_w) -TRANS(vsrlrn_w_d, LSX, gen_vvv, gen_helper_vsrlrn_w_d) -TRANS(vsrarn_b_h, LSX, gen_vvv, gen_helper_vsrarn_b_h) -TRANS(vsrarn_h_w, LSX, gen_vvv, gen_helper_vsrarn_h_w) -TRANS(vsrarn_w_d, LSX, gen_vvv, gen_helper_vsrarn_w_d) - -TRANS(vsrlrni_b_h, LSX, gen_vv_i, gen_helper_vsrlrni_b_h) -TRANS(vsrlrni_h_w, LSX, gen_vv_i, gen_helper_vsrlrni_h_w) -TRANS(vsrlrni_w_d, LSX, gen_vv_i, gen_helper_vsrlrni_w_d) -TRANS(vsrlrni_d_q, LSX, gen_vv_i, gen_helper_vsrlrni_d_q) -TRANS(vsrarni_b_h, LSX, gen_vv_i, gen_helper_vsrarni_b_h) -TRANS(vsrarni_h_w, LSX, gen_vv_i, gen_helper_vsrarni_h_w) -TRANS(vsrarni_w_d, LSX, gen_vv_i, gen_helper_vsrarni_w_d) -TRANS(vsrarni_d_q, LSX, gen_vv_i, gen_helper_vsrarni_d_q) - -TRANS(vssrln_b_h, LSX, gen_vvv, gen_helper_vssrln_b_h) 
-TRANS(vssrln_h_w, LSX, gen_vvv, gen_helper_vssrln_h_w) -TRANS(vssrln_w_d, LSX, gen_vvv, gen_helper_vssrln_w_d) -TRANS(vssran_b_h, LSX, gen_vvv, gen_helper_vssran_b_h) -TRANS(vssran_h_w, LSX, gen_vvv, gen_helper_vssran_h_w) -TRANS(vssran_w_d, LSX, gen_vvv, gen_helper_vssran_w_d) -TRANS(vssrln_bu_h, LSX, gen_vvv, gen_helper_vssrln_bu_h) -TRANS(vssrln_hu_w, LSX, gen_vvv, gen_helper_vssrln_hu_w) -TRANS(vssrln_wu_d, LSX, gen_vvv, gen_helper_vssrln_wu_d) -TRANS(vssran_bu_h, LSX, gen_vvv, gen_helper_vssran_bu_h) -TRANS(vssran_hu_w, LSX, gen_vvv, gen_helper_vssran_hu_w) -TRANS(vssran_wu_d, LSX, gen_vvv, gen_helper_vssran_wu_d) - -TRANS(vssrlni_b_h, LSX, gen_vv_i, gen_helper_vssrlni_b_h) -TRANS(vssrlni_h_w, LSX, gen_vv_i, gen_helper_vssrlni_h_w) -TRANS(vssrlni_w_d, LSX, gen_vv_i, gen_helper_vssrlni_w_d) -TRANS(vssrlni_d_q, LSX, gen_vv_i, gen_helper_vssrlni_d_q) -TRANS(vssrani_b_h, LSX, gen_vv_i, gen_helper_vssrani_b_h) -TRANS(vssrani_h_w, LSX, gen_vv_i, gen_helper_vssrani_h_w) -TRANS(vssrani_w_d, LSX, gen_vv_i, gen_helper_vssrani_w_d) -TRANS(vssrani_d_q, LSX, gen_vv_i, gen_helper_vssrani_d_q) -TRANS(vssrlni_bu_h, LSX, gen_vv_i, gen_helper_vssrlni_bu_h) -TRANS(vssrlni_hu_w, LSX, gen_vv_i, gen_helper_vssrlni_hu_w) -TRANS(vssrlni_wu_d, LSX, gen_vv_i, gen_helper_vssrlni_wu_d) -TRANS(vssrlni_du_q, LSX, gen_vv_i, gen_helper_vssrlni_du_q) -TRANS(vssrani_bu_h, LSX, gen_vv_i, gen_helper_vssrani_bu_h) -TRANS(vssrani_hu_w, LSX, gen_vv_i, gen_helper_vssrani_hu_w) -TRANS(vssrani_wu_d, LSX, gen_vv_i, gen_helper_vssrani_wu_d) -TRANS(vssrani_du_q, LSX, gen_vv_i, gen_helper_vssrani_du_q) - -TRANS(vssrlrn_b_h, LSX, gen_vvv, gen_helper_vssrlrn_b_h) -TRANS(vssrlrn_h_w, LSX, gen_vvv, gen_helper_vssrlrn_h_w) -TRANS(vssrlrn_w_d, LSX, gen_vvv, gen_helper_vssrlrn_w_d) -TRANS(vssrarn_b_h, LSX, gen_vvv, gen_helper_vssrarn_b_h) -TRANS(vssrarn_h_w, LSX, gen_vvv, gen_helper_vssrarn_h_w) -TRANS(vssrarn_w_d, LSX, gen_vvv, gen_helper_vssrarn_w_d) -TRANS(vssrlrn_bu_h, LSX, gen_vvv, gen_helper_vssrlrn_bu_h) 
-TRANS(vssrlrn_hu_w, LSX, gen_vvv, gen_helper_vssrlrn_hu_w) -TRANS(vssrlrn_wu_d, LSX, gen_vvv, gen_helper_vssrlrn_wu_d) -TRANS(vssrarn_bu_h, LSX, gen_vvv, gen_helper_vssrarn_bu_h) -TRANS(vssrarn_hu_w, LSX, gen_vvv, gen_helper_vssrarn_hu_w) -TRANS(vssrarn_wu_d, LSX, gen_vvv, gen_helper_vssrarn_wu_d) - -TRANS(vssrlrni_b_h, LSX, gen_vv_i, gen_helper_vssrlrni_b_h) -TRANS(vssrlrni_h_w, LSX, gen_vv_i, gen_helper_vssrlrni_h_w) -TRANS(vssrlrni_w_d, LSX, gen_vv_i, gen_helper_vssrlrni_w_d) -TRANS(vssrlrni_d_q, LSX, gen_vv_i, gen_helper_vssrlrni_d_q) -TRANS(vssrarni_b_h, LSX, gen_vv_i, gen_helper_vssrarni_b_h) -TRANS(vssrarni_h_w, LSX, gen_vv_i, gen_helper_vssrarni_h_w) -TRANS(vssrarni_w_d, LSX, gen_vv_i, gen_helper_vssrarni_w_d) -TRANS(vssrarni_d_q, LSX, gen_vv_i, gen_helper_vssrarni_d_q) -TRANS(vssrlrni_bu_h, LSX, gen_vv_i, gen_helper_vssrlrni_bu_h) -TRANS(vssrlrni_hu_w, LSX, gen_vv_i, gen_helper_vssrlrni_hu_w) -TRANS(vssrlrni_wu_d, LSX, gen_vv_i, gen_helper_vssrlrni_wu_d) -TRANS(vssrlrni_du_q, LSX, gen_vv_i, gen_helper_vssrlrni_du_q) -TRANS(vssrarni_bu_h, LSX, gen_vv_i, gen_helper_vssrarni_bu_h) -TRANS(vssrarni_hu_w, LSX, gen_vv_i, gen_helper_vssrarni_hu_w) -TRANS(vssrarni_wu_d, LSX, gen_vv_i, gen_helper_vssrarni_wu_d) -TRANS(vssrarni_du_q, LSX, gen_vv_i, gen_helper_vssrarni_du_q) - -TRANS(vclo_b, LSX, gen_vv, gen_helper_vclo_b) -TRANS(vclo_h, LSX, gen_vv, gen_helper_vclo_h) -TRANS(vclo_w, LSX, gen_vv, gen_helper_vclo_w) -TRANS(vclo_d, LSX, gen_vv, gen_helper_vclo_d) -TRANS(vclz_b, LSX, gen_vv, gen_helper_vclz_b) -TRANS(vclz_h, LSX, gen_vv, gen_helper_vclz_h) -TRANS(vclz_w, LSX, gen_vv, gen_helper_vclz_w) -TRANS(vclz_d, LSX, gen_vv, gen_helper_vclz_d) - -TRANS(vpcnt_b, LSX, gen_vv, gen_helper_vpcnt_b) -TRANS(vpcnt_h, LSX, gen_vv, gen_helper_vpcnt_h) -TRANS(vpcnt_w, LSX, gen_vv, gen_helper_vpcnt_w) -TRANS(vpcnt_d, LSX, gen_vv, gen_helper_vpcnt_d) +TRANS(vsllwil_h_b, LSX, gen_vv_i, 16, gen_helper_vsllwil_h_b) +TRANS(vsllwil_w_h, LSX, gen_vv_i, 16, gen_helper_vsllwil_w_h) 
+TRANS(vsllwil_d_w, LSX, gen_vv_i, 16, gen_helper_vsllwil_d_w) +TRANS(vextl_q_d, LSX, gen_vv, 16, gen_helper_vextl_q_d) +TRANS(vsllwil_hu_bu, LSX, gen_vv_i, 16, gen_helper_vsllwil_hu_bu) +TRANS(vsllwil_wu_hu, LSX, gen_vv_i, 16, gen_helper_vsllwil_wu_hu) +TRANS(vsllwil_du_wu, LSX, gen_vv_i, 16, gen_helper_vsllwil_du_wu) +TRANS(vextl_qu_du, LSX, gen_vv, 16, gen_helper_vextl_qu_du) + +TRANS(vsrlr_b, LSX, gen_vvv, 16, gen_helper_vsrlr_b) +TRANS(vsrlr_h, LSX, gen_vvv, 16, gen_helper_vsrlr_h) +TRANS(vsrlr_w, LSX, gen_vvv, 16, gen_helper_vsrlr_w) +TRANS(vsrlr_d, LSX, gen_vvv, 16, gen_helper_vsrlr_d) +TRANS(vsrlri_b, LSX, gen_vv_i, 16, gen_helper_vsrlri_b) +TRANS(vsrlri_h, LSX, gen_vv_i, 16, gen_helper_vsrlri_h) +TRANS(vsrlri_w, LSX, gen_vv_i, 16, gen_helper_vsrlri_w) +TRANS(vsrlri_d, LSX, gen_vv_i, 16, gen_helper_vsrlri_d) + +TRANS(vsrar_b, LSX, gen_vvv, 16, gen_helper_vsrar_b) +TRANS(vsrar_h, LSX, gen_vvv, 16, gen_helper_vsrar_h) +TRANS(vsrar_w, LSX, gen_vvv, 16, gen_helper_vsrar_w) +TRANS(vsrar_d, LSX, gen_vvv, 16, gen_helper_vsrar_d) +TRANS(vsrari_b, LSX, gen_vv_i, 16, gen_helper_vsrari_b) +TRANS(vsrari_h, LSX, gen_vv_i, 16, gen_helper_vsrari_h) +TRANS(vsrari_w, LSX, gen_vv_i, 16, gen_helper_vsrari_w) +TRANS(vsrari_d, LSX, gen_vv_i, 16, gen_helper_vsrari_d) + +TRANS(vsrln_b_h, LSX, gen_vvv, 16, gen_helper_vsrln_b_h) +TRANS(vsrln_h_w, LSX, gen_vvv, 16, gen_helper_vsrln_h_w) +TRANS(vsrln_w_d, LSX, gen_vvv, 16, gen_helper_vsrln_w_d) +TRANS(vsran_b_h, LSX, gen_vvv, 16, gen_helper_vsran_b_h) +TRANS(vsran_h_w, LSX, gen_vvv, 16, gen_helper_vsran_h_w) +TRANS(vsran_w_d, LSX, gen_vvv, 16, gen_helper_vsran_w_d) + +TRANS(vsrlni_b_h, LSX, gen_vv_i, 16, gen_helper_vsrlni_b_h) +TRANS(vsrlni_h_w, LSX, gen_vv_i, 16, gen_helper_vsrlni_h_w) +TRANS(vsrlni_w_d, LSX, gen_vv_i, 16, gen_helper_vsrlni_w_d) +TRANS(vsrlni_d_q, LSX, gen_vv_i, 16, gen_helper_vsrlni_d_q) +TRANS(vsrani_b_h, LSX, gen_vv_i, 16, gen_helper_vsrani_b_h) +TRANS(vsrani_h_w, LSX, gen_vv_i, 16, gen_helper_vsrani_h_w) 
+TRANS(vsrani_w_d, LSX, gen_vv_i, 16, gen_helper_vsrani_w_d) +TRANS(vsrani_d_q, LSX, gen_vv_i, 16, gen_helper_vsrani_d_q) + +TRANS(vsrlrn_b_h, LSX, gen_vvv, 16, gen_helper_vsrlrn_b_h) +TRANS(vsrlrn_h_w, LSX, gen_vvv, 16, gen_helper_vsrlrn_h_w) +TRANS(vsrlrn_w_d, LSX, gen_vvv, 16, gen_helper_vsrlrn_w_d) +TRANS(vsrarn_b_h, LSX, gen_vvv, 16, gen_helper_vsrarn_b_h) +TRANS(vsrarn_h_w, LSX, gen_vvv, 16, gen_helper_vsrarn_h_w) +TRANS(vsrarn_w_d, LSX, gen_vvv, 16, gen_helper_vsrarn_w_d) + +TRANS(vsrlrni_b_h, LSX, gen_vv_i, 16, gen_helper_vsrlrni_b_h) +TRANS(vsrlrni_h_w, LSX, gen_vv_i, 16, gen_helper_vsrlrni_h_w) +TRANS(vsrlrni_w_d, LSX, gen_vv_i, 16, gen_helper_vsrlrni_w_d) +TRANS(vsrlrni_d_q, LSX, gen_vv_i, 16, gen_helper_vsrlrni_d_q) +TRANS(vsrarni_b_h, LSX, gen_vv_i, 16, gen_helper_vsrarni_b_h) +TRANS(vsrarni_h_w, LSX, gen_vv_i, 16, gen_helper_vsrarni_h_w) +TRANS(vsrarni_w_d, LSX, gen_vv_i, 16, gen_helper_vsrarni_w_d) +TRANS(vsrarni_d_q, LSX, gen_vv_i, 16, gen_helper_vsrarni_d_q) + +TRANS(vssrln_b_h, LSX, gen_vvv, 16, gen_helper_vssrln_b_h) +TRANS(vssrln_h_w, LSX, gen_vvv, 16, gen_helper_vssrln_h_w) +TRANS(vssrln_w_d, LSX, gen_vvv, 16, gen_helper_vssrln_w_d) +TRANS(vssran_b_h, LSX, gen_vvv, 16, gen_helper_vssran_b_h) +TRANS(vssran_h_w, LSX, gen_vvv, 16, gen_helper_vssran_h_w) +TRANS(vssran_w_d, LSX, gen_vvv, 16, gen_helper_vssran_w_d) +TRANS(vssrln_bu_h, LSX, gen_vvv, 16, gen_helper_vssrln_bu_h) +TRANS(vssrln_hu_w, LSX, gen_vvv, 16, gen_helper_vssrln_hu_w) +TRANS(vssrln_wu_d, LSX, gen_vvv, 16, gen_helper_vssrln_wu_d) +TRANS(vssran_bu_h, LSX, gen_vvv, 16, gen_helper_vssran_bu_h) +TRANS(vssran_hu_w, LSX, gen_vvv, 16, gen_helper_vssran_hu_w) +TRANS(vssran_wu_d, LSX, gen_vvv, 16, gen_helper_vssran_wu_d) + +TRANS(vssrlni_b_h, LSX, gen_vv_i, 16, gen_helper_vssrlni_b_h) +TRANS(vssrlni_h_w, LSX, gen_vv_i, 16, gen_helper_vssrlni_h_w) +TRANS(vssrlni_w_d, LSX, gen_vv_i, 16, gen_helper_vssrlni_w_d) +TRANS(vssrlni_d_q, LSX, gen_vv_i, 16, gen_helper_vssrlni_d_q) +TRANS(vssrani_b_h, 
LSX, gen_vv_i, 16, gen_helper_vssrani_b_h) +TRANS(vssrani_h_w, LSX, gen_vv_i, 16, gen_helper_vssrani_h_w) +TRANS(vssrani_w_d, LSX, gen_vv_i, 16, gen_helper_vssrani_w_d) +TRANS(vssrani_d_q, LSX, gen_vv_i, 16, gen_helper_vssrani_d_q) +TRANS(vssrlni_bu_h, LSX, gen_vv_i, 16, gen_helper_vssrlni_bu_h) +TRANS(vssrlni_hu_w, LSX, gen_vv_i, 16, gen_helper_vssrlni_hu_w) +TRANS(vssrlni_wu_d, LSX, gen_vv_i, 16, gen_helper_vssrlni_wu_d) +TRANS(vssrlni_du_q, LSX, gen_vv_i, 16, gen_helper_vssrlni_du_q) +TRANS(vssrani_bu_h, LSX, gen_vv_i, 16, gen_helper_vssrani_bu_h) +TRANS(vssrani_hu_w, LSX, gen_vv_i, 16, gen_helper_vssrani_hu_w) +TRANS(vssrani_wu_d, LSX, gen_vv_i, 16, gen_helper_vssrani_wu_d) +TRANS(vssrani_du_q, LSX, gen_vv_i, 16, gen_helper_vssrani_du_q) + +TRANS(vssrlrn_b_h, LSX, gen_vvv, 16, gen_helper_vssrlrn_b_h) +TRANS(vssrlrn_h_w, LSX, gen_vvv, 16, gen_helper_vssrlrn_h_w) +TRANS(vssrlrn_w_d, LSX, gen_vvv, 16, gen_helper_vssrlrn_w_d) +TRANS(vssrarn_b_h, LSX, gen_vvv, 16, gen_helper_vssrarn_b_h) +TRANS(vssrarn_h_w, LSX, gen_vvv, 16, gen_helper_vssrarn_h_w) +TRANS(vssrarn_w_d, LSX, gen_vvv, 16, gen_helper_vssrarn_w_d) +TRANS(vssrlrn_bu_h, LSX, gen_vvv, 16, gen_helper_vssrlrn_bu_h) +TRANS(vssrlrn_hu_w, LSX, gen_vvv, 16, gen_helper_vssrlrn_hu_w) +TRANS(vssrlrn_wu_d, LSX, gen_vvv, 16, gen_helper_vssrlrn_wu_d) +TRANS(vssrarn_bu_h, LSX, gen_vvv, 16, gen_helper_vssrarn_bu_h) +TRANS(vssrarn_hu_w, LSX, gen_vvv, 16, gen_helper_vssrarn_hu_w) +TRANS(vssrarn_wu_d, LSX, gen_vvv, 16, gen_helper_vssrarn_wu_d) + +TRANS(vssrlrni_b_h, LSX, gen_vv_i, 16, gen_helper_vssrlrni_b_h) +TRANS(vssrlrni_h_w, LSX, gen_vv_i, 16, gen_helper_vssrlrni_h_w) +TRANS(vssrlrni_w_d, LSX, gen_vv_i, 16, gen_helper_vssrlrni_w_d) +TRANS(vssrlrni_d_q, LSX, gen_vv_i, 16, gen_helper_vssrlrni_d_q) +TRANS(vssrarni_b_h, LSX, gen_vv_i, 16, gen_helper_vssrarni_b_h) +TRANS(vssrarni_h_w, LSX, gen_vv_i, 16, gen_helper_vssrarni_h_w) +TRANS(vssrarni_w_d, LSX, gen_vv_i, 16, gen_helper_vssrarni_w_d) +TRANS(vssrarni_d_q, LSX, 
gen_vv_i, 16, gen_helper_vssrarni_d_q)
+TRANS(vssrlrni_bu_h, LSX, gen_vv_i, 16, gen_helper_vssrlrni_bu_h)
+TRANS(vssrlrni_hu_w, LSX, gen_vv_i, 16, gen_helper_vssrlrni_hu_w)
+TRANS(vssrlrni_wu_d, LSX, gen_vv_i, 16, gen_helper_vssrlrni_wu_d)
+TRANS(vssrlrni_du_q, LSX, gen_vv_i, 16, gen_helper_vssrlrni_du_q)
+TRANS(vssrarni_bu_h, LSX, gen_vv_i, 16, gen_helper_vssrarni_bu_h)
+TRANS(vssrarni_hu_w, LSX, gen_vv_i, 16, gen_helper_vssrarni_hu_w)
+TRANS(vssrarni_wu_d, LSX, gen_vv_i, 16, gen_helper_vssrarni_wu_d)
+TRANS(vssrarni_du_q, LSX, gen_vv_i, 16, gen_helper_vssrarni_du_q)
+
+TRANS(vclo_b, LSX, gen_vv, 16, gen_helper_vclo_b)
+TRANS(vclo_h, LSX, gen_vv, 16, gen_helper_vclo_h)
+TRANS(vclo_w, LSX, gen_vv, 16, gen_helper_vclo_w)
+TRANS(vclo_d, LSX, gen_vv, 16, gen_helper_vclo_d)
+TRANS(vclz_b, LSX, gen_vv, 16, gen_helper_vclz_b)
+TRANS(vclz_h, LSX, gen_vv, 16, gen_helper_vclz_h)
+TRANS(vclz_w, LSX, gen_vv, 16, gen_helper_vclz_w)
+TRANS(vclz_d, LSX, gen_vv, 16, gen_helper_vclz_d)
+
+TRANS(vpcnt_b, LSX, gen_vv, 16, gen_helper_vpcnt_b)
+TRANS(vpcnt_h, LSX, gen_vv, 16, gen_helper_vpcnt_h)
+TRANS(vpcnt_w, LSX, gen_vv, 16, gen_helper_vpcnt_w)
+TRANS(vpcnt_d, LSX, gen_vv, 16, gen_helper_vpcnt_d)
 
 static void do_vbit(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
                     void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec))
@@ -3589,107 +3626,107 @@ TRANS(vbitrevi_h, LSX, gvec_vv_i, 16, MO_16, do_vbitrevi)
 TRANS(vbitrevi_w, LSX, gvec_vv_i, 16, MO_32, do_vbitrevi)
 TRANS(vbitrevi_d, LSX, gvec_vv_i, 16, MO_64, do_vbitrevi)
 
-TRANS(vfrstp_b, LSX, gen_vvv, gen_helper_vfrstp_b)
-TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h)
-TRANS(vfrstpi_b, LSX, gen_vv_i, gen_helper_vfrstpi_b)
-TRANS(vfrstpi_h, LSX, gen_vv_i, gen_helper_vfrstpi_h)
-
-TRANS(vfadd_s, LSX, gen_vvv, gen_helper_vfadd_s)
-TRANS(vfadd_d, LSX, gen_vvv, gen_helper_vfadd_d)
-TRANS(vfsub_s, LSX, gen_vvv, gen_helper_vfsub_s)
-TRANS(vfsub_d, LSX, gen_vvv, gen_helper_vfsub_d)
-TRANS(vfmul_s, LSX, gen_vvv, gen_helper_vfmul_s)
-TRANS(vfmul_d, LSX, gen_vvv, gen_helper_vfmul_d) -TRANS(vfdiv_s, LSX, gen_vvv, gen_helper_vfdiv_s) -TRANS(vfdiv_d, LSX, gen_vvv, gen_helper_vfdiv_d) - -TRANS(vfmadd_s, LSX, gen_vvvv, gen_helper_vfmadd_s) -TRANS(vfmadd_d, LSX, gen_vvvv, gen_helper_vfmadd_d) -TRANS(vfmsub_s, LSX, gen_vvvv, gen_helper_vfmsub_s) -TRANS(vfmsub_d, LSX, gen_vvvv, gen_helper_vfmsub_d) -TRANS(vfnmadd_s, LSX, gen_vvvv, gen_helper_vfnmadd_s) -TRANS(vfnmadd_d, LSX, gen_vvvv, gen_helper_vfnmadd_d) -TRANS(vfnmsub_s, LSX, gen_vvvv, gen_helper_vfnmsub_s) -TRANS(vfnmsub_d, LSX, gen_vvvv, gen_helper_vfnmsub_d) - -TRANS(vfmax_s, LSX, gen_vvv, gen_helper_vfmax_s) -TRANS(vfmax_d, LSX, gen_vvv, gen_helper_vfmax_d) -TRANS(vfmin_s, LSX, gen_vvv, gen_helper_vfmin_s) -TRANS(vfmin_d, LSX, gen_vvv, gen_helper_vfmin_d) - -TRANS(vfmaxa_s, LSX, gen_vvv, gen_helper_vfmaxa_s) -TRANS(vfmaxa_d, LSX, gen_vvv, gen_helper_vfmaxa_d) -TRANS(vfmina_s, LSX, gen_vvv, gen_helper_vfmina_s) -TRANS(vfmina_d, LSX, gen_vvv, gen_helper_vfmina_d) - -TRANS(vflogb_s, LSX, gen_vv, gen_helper_vflogb_s) -TRANS(vflogb_d, LSX, gen_vv, gen_helper_vflogb_d) - -TRANS(vfclass_s, LSX, gen_vv, gen_helper_vfclass_s) -TRANS(vfclass_d, LSX, gen_vv, gen_helper_vfclass_d) - -TRANS(vfsqrt_s, LSX, gen_vv, gen_helper_vfsqrt_s) -TRANS(vfsqrt_d, LSX, gen_vv, gen_helper_vfsqrt_d) -TRANS(vfrecip_s, LSX, gen_vv, gen_helper_vfrecip_s) -TRANS(vfrecip_d, LSX, gen_vv, gen_helper_vfrecip_d) -TRANS(vfrsqrt_s, LSX, gen_vv, gen_helper_vfrsqrt_s) -TRANS(vfrsqrt_d, LSX, gen_vv, gen_helper_vfrsqrt_d) - -TRANS(vfcvtl_s_h, LSX, gen_vv, gen_helper_vfcvtl_s_h) -TRANS(vfcvth_s_h, LSX, gen_vv, gen_helper_vfcvth_s_h) -TRANS(vfcvtl_d_s, LSX, gen_vv, gen_helper_vfcvtl_d_s) -TRANS(vfcvth_d_s, LSX, gen_vv, gen_helper_vfcvth_d_s) -TRANS(vfcvt_h_s, LSX, gen_vvv, gen_helper_vfcvt_h_s) -TRANS(vfcvt_s_d, LSX, gen_vvv, gen_helper_vfcvt_s_d) - -TRANS(vfrintrne_s, LSX, gen_vv, gen_helper_vfrintrne_s) -TRANS(vfrintrne_d, LSX, gen_vv, gen_helper_vfrintrne_d) -TRANS(vfrintrz_s, LSX, 
gen_vv, gen_helper_vfrintrz_s) -TRANS(vfrintrz_d, LSX, gen_vv, gen_helper_vfrintrz_d) -TRANS(vfrintrp_s, LSX, gen_vv, gen_helper_vfrintrp_s) -TRANS(vfrintrp_d, LSX, gen_vv, gen_helper_vfrintrp_d) -TRANS(vfrintrm_s, LSX, gen_vv, gen_helper_vfrintrm_s) -TRANS(vfrintrm_d, LSX, gen_vv, gen_helper_vfrintrm_d) -TRANS(vfrint_s, LSX, gen_vv, gen_helper_vfrint_s) -TRANS(vfrint_d, LSX, gen_vv, gen_helper_vfrint_d) - -TRANS(vftintrne_w_s, LSX, gen_vv, gen_helper_vftintrne_w_s) -TRANS(vftintrne_l_d, LSX, gen_vv, gen_helper_vftintrne_l_d) -TRANS(vftintrz_w_s, LSX, gen_vv, gen_helper_vftintrz_w_s) -TRANS(vftintrz_l_d, LSX, gen_vv, gen_helper_vftintrz_l_d) -TRANS(vftintrp_w_s, LSX, gen_vv, gen_helper_vftintrp_w_s) -TRANS(vftintrp_l_d, LSX, gen_vv, gen_helper_vftintrp_l_d) -TRANS(vftintrm_w_s, LSX, gen_vv, gen_helper_vftintrm_w_s) -TRANS(vftintrm_l_d, LSX, gen_vv, gen_helper_vftintrm_l_d) -TRANS(vftint_w_s, LSX, gen_vv, gen_helper_vftint_w_s) -TRANS(vftint_l_d, LSX, gen_vv, gen_helper_vftint_l_d) -TRANS(vftintrz_wu_s, LSX, gen_vv, gen_helper_vftintrz_wu_s) -TRANS(vftintrz_lu_d, LSX, gen_vv, gen_helper_vftintrz_lu_d) -TRANS(vftint_wu_s, LSX, gen_vv, gen_helper_vftint_wu_s) -TRANS(vftint_lu_d, LSX, gen_vv, gen_helper_vftint_lu_d) -TRANS(vftintrne_w_d, LSX, gen_vvv, gen_helper_vftintrne_w_d) -TRANS(vftintrz_w_d, LSX, gen_vvv, gen_helper_vftintrz_w_d) -TRANS(vftintrp_w_d, LSX, gen_vvv, gen_helper_vftintrp_w_d) -TRANS(vftintrm_w_d, LSX, gen_vvv, gen_helper_vftintrm_w_d) -TRANS(vftint_w_d, LSX, gen_vvv, gen_helper_vftint_w_d) -TRANS(vftintrnel_l_s, LSX, gen_vv, gen_helper_vftintrnel_l_s) -TRANS(vftintrneh_l_s, LSX, gen_vv, gen_helper_vftintrneh_l_s) -TRANS(vftintrzl_l_s, LSX, gen_vv, gen_helper_vftintrzl_l_s) -TRANS(vftintrzh_l_s, LSX, gen_vv, gen_helper_vftintrzh_l_s) -TRANS(vftintrpl_l_s, LSX, gen_vv, gen_helper_vftintrpl_l_s) -TRANS(vftintrph_l_s, LSX, gen_vv, gen_helper_vftintrph_l_s) -TRANS(vftintrml_l_s, LSX, gen_vv, gen_helper_vftintrml_l_s) -TRANS(vftintrmh_l_s, LSX, gen_vv, 
gen_helper_vftintrmh_l_s) -TRANS(vftintl_l_s, LSX, gen_vv, gen_helper_vftintl_l_s) -TRANS(vftinth_l_s, LSX, gen_vv, gen_helper_vftinth_l_s) - -TRANS(vffint_s_w, LSX, gen_vv, gen_helper_vffint_s_w) -TRANS(vffint_d_l, LSX, gen_vv, gen_helper_vffint_d_l) -TRANS(vffint_s_wu, LSX, gen_vv, gen_helper_vffint_s_wu) -TRANS(vffint_d_lu, LSX, gen_vv, gen_helper_vffint_d_lu) -TRANS(vffintl_d_w, LSX, gen_vv, gen_helper_vffintl_d_w) -TRANS(vffinth_d_w, LSX, gen_vv, gen_helper_vffinth_d_w) -TRANS(vffint_s_l, LSX, gen_vvv, gen_helper_vffint_s_l) +TRANS(vfrstp_b, LSX, gen_vvv, 16, gen_helper_vfrstp_b) +TRANS(vfrstp_h, LSX, gen_vvv, 16, gen_helper_vfrstp_h) +TRANS(vfrstpi_b, LSX, gen_vv_i, 16, gen_helper_vfrstpi_b) +TRANS(vfrstpi_h, LSX, gen_vv_i, 16, gen_helper_vfrstpi_h) + +TRANS(vfadd_s, LSX, gen_vvv_f, 16, gen_helper_vfadd_s) +TRANS(vfadd_d, LSX, gen_vvv_f, 16, gen_helper_vfadd_d) +TRANS(vfsub_s, LSX, gen_vvv_f, 16, gen_helper_vfsub_s) +TRANS(vfsub_d, LSX, gen_vvv_f, 16, gen_helper_vfsub_d) +TRANS(vfmul_s, LSX, gen_vvv_f, 16, gen_helper_vfmul_s) +TRANS(vfmul_d, LSX, gen_vvv_f, 16, gen_helper_vfmul_d) +TRANS(vfdiv_s, LSX, gen_vvv_f, 16, gen_helper_vfdiv_s) +TRANS(vfdiv_d, LSX, gen_vvv_f, 16, gen_helper_vfdiv_d) + +TRANS(vfmadd_s, LSX, gen_vvvv_f, 16, gen_helper_vfmadd_s) +TRANS(vfmadd_d, LSX, gen_vvvv_f, 16, gen_helper_vfmadd_d) +TRANS(vfmsub_s, LSX, gen_vvvv_f, 16, gen_helper_vfmsub_s) +TRANS(vfmsub_d, LSX, gen_vvvv_f, 16, gen_helper_vfmsub_d) +TRANS(vfnmadd_s, LSX, gen_vvvv_f, 16, gen_helper_vfnmadd_s) +TRANS(vfnmadd_d, LSX, gen_vvvv_f, 16, gen_helper_vfnmadd_d) +TRANS(vfnmsub_s, LSX, gen_vvvv_f, 16, gen_helper_vfnmsub_s) +TRANS(vfnmsub_d, LSX, gen_vvvv_f, 16, gen_helper_vfnmsub_d) + +TRANS(vfmax_s, LSX, gen_vvv_f, 16, gen_helper_vfmax_s) +TRANS(vfmax_d, LSX, gen_vvv_f, 16, gen_helper_vfmax_d) +TRANS(vfmin_s, LSX, gen_vvv_f, 16, gen_helper_vfmin_s) +TRANS(vfmin_d, LSX, gen_vvv_f, 16, gen_helper_vfmin_d) + +TRANS(vfmaxa_s, LSX, gen_vvv_f, 16, gen_helper_vfmaxa_s) 
+TRANS(vfmaxa_d, LSX, gen_vvv_f, 16, gen_helper_vfmaxa_d) +TRANS(vfmina_s, LSX, gen_vvv_f, 16, gen_helper_vfmina_s) +TRANS(vfmina_d, LSX, gen_vvv_f, 16, gen_helper_vfmina_d) + +TRANS(vflogb_s, LSX, gen_vv_f, 16, gen_helper_vflogb_s) +TRANS(vflogb_d, LSX, gen_vv_f, 16, gen_helper_vflogb_d) + +TRANS(vfclass_s, LSX, gen_vv_f, 16, gen_helper_vfclass_s) +TRANS(vfclass_d, LSX, gen_vv_f, 16, gen_helper_vfclass_d) + +TRANS(vfsqrt_s, LSX, gen_vv_f, 16, gen_helper_vfsqrt_s) +TRANS(vfsqrt_d, LSX, gen_vv_f, 16, gen_helper_vfsqrt_d) +TRANS(vfrecip_s, LSX, gen_vv_f, 16, gen_helper_vfrecip_s) +TRANS(vfrecip_d, LSX, gen_vv_f, 16, gen_helper_vfrecip_d) +TRANS(vfrsqrt_s, LSX, gen_vv_f, 16, gen_helper_vfrsqrt_s) +TRANS(vfrsqrt_d, LSX, gen_vv_f, 16, gen_helper_vfrsqrt_d) + +TRANS(vfcvtl_s_h, LSX, gen_vv_f, 16, gen_helper_vfcvtl_s_h) +TRANS(vfcvth_s_h, LSX, gen_vv_f, 16, gen_helper_vfcvth_s_h) +TRANS(vfcvtl_d_s, LSX, gen_vv_f, 16, gen_helper_vfcvtl_d_s) +TRANS(vfcvth_d_s, LSX, gen_vv_f, 16, gen_helper_vfcvth_d_s) +TRANS(vfcvt_h_s, LSX, gen_vvv_f, 16, gen_helper_vfcvt_h_s) +TRANS(vfcvt_s_d, LSX, gen_vvv_f, 16, gen_helper_vfcvt_s_d) + +TRANS(vfrintrne_s, LSX, gen_vv_f, 16, gen_helper_vfrintrne_s) +TRANS(vfrintrne_d, LSX, gen_vv_f, 16, gen_helper_vfrintrne_d) +TRANS(vfrintrz_s, LSX, gen_vv_f, 16, gen_helper_vfrintrz_s) +TRANS(vfrintrz_d, LSX, gen_vv_f, 16, gen_helper_vfrintrz_d) +TRANS(vfrintrp_s, LSX, gen_vv_f, 16, gen_helper_vfrintrp_s) +TRANS(vfrintrp_d, LSX, gen_vv_f, 16, gen_helper_vfrintrp_d) +TRANS(vfrintrm_s, LSX, gen_vv_f, 16, gen_helper_vfrintrm_s) +TRANS(vfrintrm_d, LSX, gen_vv_f, 16, gen_helper_vfrintrm_d) +TRANS(vfrint_s, LSX, gen_vv_f, 16, gen_helper_vfrint_s) +TRANS(vfrint_d, LSX, gen_vv_f, 16, gen_helper_vfrint_d) + +TRANS(vftintrne_w_s, LSX, gen_vv_f, 16, gen_helper_vftintrne_w_s) +TRANS(vftintrne_l_d, LSX, gen_vv_f, 16, gen_helper_vftintrne_l_d) +TRANS(vftintrz_w_s, LSX, gen_vv_f, 16, gen_helper_vftintrz_w_s) +TRANS(vftintrz_l_d, LSX, gen_vv_f, 16, 
gen_helper_vftintrz_l_d) +TRANS(vftintrp_w_s, LSX, gen_vv_f, 16, gen_helper_vftintrp_w_s) +TRANS(vftintrp_l_d, LSX, gen_vv_f, 16, gen_helper_vftintrp_l_d) +TRANS(vftintrm_w_s, LSX, gen_vv_f, 16, gen_helper_vftintrm_w_s) +TRANS(vftintrm_l_d, LSX, gen_vv_f, 16, gen_helper_vftintrm_l_d) +TRANS(vftint_w_s, LSX, gen_vv_f, 16, gen_helper_vftint_w_s) +TRANS(vftint_l_d, LSX, gen_vv_f, 16, gen_helper_vftint_l_d) +TRANS(vftintrz_wu_s, LSX, gen_vv_f, 16, gen_helper_vftintrz_wu_s) +TRANS(vftintrz_lu_d, LSX, gen_vv_f, 16, gen_helper_vftintrz_lu_d) +TRANS(vftint_wu_s, LSX, gen_vv_f, 16, gen_helper_vftint_wu_s) +TRANS(vftint_lu_d, LSX, gen_vv_f, 16, gen_helper_vftint_lu_d) +TRANS(vftintrne_w_d, LSX, gen_vvv_f, 16, gen_helper_vftintrne_w_d) +TRANS(vftintrz_w_d, LSX, gen_vvv_f, 16, gen_helper_vftintrz_w_d) +TRANS(vftintrp_w_d, LSX, gen_vvv_f, 16, gen_helper_vftintrp_w_d) +TRANS(vftintrm_w_d, LSX, gen_vvv_f, 16, gen_helper_vftintrm_w_d) +TRANS(vftint_w_d, LSX, gen_vvv_f, 16, gen_helper_vftint_w_d) +TRANS(vftintrnel_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrnel_l_s) +TRANS(vftintrneh_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrneh_l_s) +TRANS(vftintrzl_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrzl_l_s) +TRANS(vftintrzh_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrzh_l_s) +TRANS(vftintrpl_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrpl_l_s) +TRANS(vftintrph_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrph_l_s) +TRANS(vftintrml_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrml_l_s) +TRANS(vftintrmh_l_s, LSX, gen_vv_f, 16, gen_helper_vftintrmh_l_s) +TRANS(vftintl_l_s, LSX, gen_vv_f, 16, gen_helper_vftintl_l_s) +TRANS(vftinth_l_s, LSX, gen_vv_f, 16, gen_helper_vftinth_l_s) + +TRANS(vffint_s_w, LSX, gen_vv_f, 16, gen_helper_vffint_s_w) +TRANS(vffint_d_l, LSX, gen_vv_f, 16, gen_helper_vffint_d_l) +TRANS(vffint_s_wu, LSX, gen_vv_f, 16, gen_helper_vffint_s_wu) +TRANS(vffint_d_lu, LSX, gen_vv_f, 16, gen_helper_vffint_d_lu) +TRANS(vffintl_d_w, LSX, gen_vv_f, 16, gen_helper_vffintl_d_w) +TRANS(vffinth_d_w, LSX, 
gen_vv_f, 16, gen_helper_vffinth_d_w)
+TRANS(vffint_s_l, LSX, gen_vvv_f, 16, gen_helper_vffint_s_l)
 
 static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond)
 {
@@ -4335,48 +4372,48 @@ static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a)
     return true;
 }
 
-TRANS(vpackev_b, LSX, gen_vvv, gen_helper_vpackev_b)
-TRANS(vpackev_h, LSX, gen_vvv, gen_helper_vpackev_h)
-TRANS(vpackev_w, LSX, gen_vvv, gen_helper_vpackev_w)
-TRANS(vpackev_d, LSX, gen_vvv, gen_helper_vpackev_d)
-TRANS(vpackod_b, LSX, gen_vvv, gen_helper_vpackod_b)
-TRANS(vpackod_h, LSX, gen_vvv, gen_helper_vpackod_h)
-TRANS(vpackod_w, LSX, gen_vvv, gen_helper_vpackod_w)
-TRANS(vpackod_d, LSX, gen_vvv, gen_helper_vpackod_d)
-
-TRANS(vpickev_b, LSX, gen_vvv, gen_helper_vpickev_b)
-TRANS(vpickev_h, LSX, gen_vvv, gen_helper_vpickev_h)
-TRANS(vpickev_w, LSX, gen_vvv, gen_helper_vpickev_w)
-TRANS(vpickev_d, LSX, gen_vvv, gen_helper_vpickev_d)
-TRANS(vpickod_b, LSX, gen_vvv, gen_helper_vpickod_b)
-TRANS(vpickod_h, LSX, gen_vvv, gen_helper_vpickod_h)
-TRANS(vpickod_w, LSX, gen_vvv, gen_helper_vpickod_w)
-TRANS(vpickod_d, LSX, gen_vvv, gen_helper_vpickod_d)
-
-TRANS(vilvl_b, LSX, gen_vvv, gen_helper_vilvl_b)
-TRANS(vilvl_h, LSX, gen_vvv, gen_helper_vilvl_h)
-TRANS(vilvl_w, LSX, gen_vvv, gen_helper_vilvl_w)
-TRANS(vilvl_d, LSX, gen_vvv, gen_helper_vilvl_d)
-TRANS(vilvh_b, LSX, gen_vvv, gen_helper_vilvh_b)
-TRANS(vilvh_h, LSX, gen_vvv, gen_helper_vilvh_h)
-TRANS(vilvh_w, LSX, gen_vvv, gen_helper_vilvh_w)
-TRANS(vilvh_d, LSX, gen_vvv, gen_helper_vilvh_d)
-
-TRANS(vshuf_b, LSX, gen_vvvv, gen_helper_vshuf_b)
-TRANS(vshuf_h, LSX, gen_vvv, gen_helper_vshuf_h)
-TRANS(vshuf_w, LSX, gen_vvv, gen_helper_vshuf_w)
-TRANS(vshuf_d, LSX, gen_vvv, gen_helper_vshuf_d)
-TRANS(vshuf4i_b, LSX, gen_vv_i, gen_helper_vshuf4i_b)
-TRANS(vshuf4i_h, LSX, gen_vv_i, gen_helper_vshuf4i_h)
-TRANS(vshuf4i_w, LSX, gen_vv_i, gen_helper_vshuf4i_w)
-TRANS(vshuf4i_d, LSX, gen_vv_i, gen_helper_vshuf4i_d)
-
-TRANS(vpermi_w, LSX,
gen_vv_i, gen_helper_vpermi_w) - -TRANS(vextrins_b, LSX, gen_vv_i, gen_helper_vextrins_b) -TRANS(vextrins_h, LSX, gen_vv_i, gen_helper_vextrins_h) -TRANS(vextrins_w, LSX, gen_vv_i, gen_helper_vextrins_w) -TRANS(vextrins_d, LSX, gen_vv_i, gen_helper_vextrins_d) +TRANS(vpackev_b, LSX, gen_vvv, 16, gen_helper_vpackev_b) +TRANS(vpackev_h, LSX, gen_vvv, 16, gen_helper_vpackev_h) +TRANS(vpackev_w, LSX, gen_vvv, 16, gen_helper_vpackev_w) +TRANS(vpackev_d, LSX, gen_vvv, 16, gen_helper_vpackev_d) +TRANS(vpackod_b, LSX, gen_vvv, 16, gen_helper_vpackod_b) +TRANS(vpackod_h, LSX, gen_vvv, 16, gen_helper_vpackod_h) +TRANS(vpackod_w, LSX, gen_vvv, 16, gen_helper_vpackod_w) +TRANS(vpackod_d, LSX, gen_vvv, 16, gen_helper_vpackod_d) + +TRANS(vpickev_b, LSX, gen_vvv, 16, gen_helper_vpickev_b) +TRANS(vpickev_h, LSX, gen_vvv, 16, gen_helper_vpickev_h) +TRANS(vpickev_w, LSX, gen_vvv, 16, gen_helper_vpickev_w) +TRANS(vpickev_d, LSX, gen_vvv, 16, gen_helper_vpickev_d) +TRANS(vpickod_b, LSX, gen_vvv, 16, gen_helper_vpickod_b) +TRANS(vpickod_h, LSX, gen_vvv, 16, gen_helper_vpickod_h) +TRANS(vpickod_w, LSX, gen_vvv, 16, gen_helper_vpickod_w) +TRANS(vpickod_d, LSX, gen_vvv, 16, gen_helper_vpickod_d) + +TRANS(vilvl_b, LSX, gen_vvv, 16, gen_helper_vilvl_b) +TRANS(vilvl_h, LSX, gen_vvv, 16, gen_helper_vilvl_h) +TRANS(vilvl_w, LSX, gen_vvv, 16, gen_helper_vilvl_w) +TRANS(vilvl_d, LSX, gen_vvv, 16, gen_helper_vilvl_d) +TRANS(vilvh_b, LSX, gen_vvv, 16, gen_helper_vilvh_b) +TRANS(vilvh_h, LSX, gen_vvv, 16, gen_helper_vilvh_h) +TRANS(vilvh_w, LSX, gen_vvv, 16, gen_helper_vilvh_w) +TRANS(vilvh_d, LSX, gen_vvv, 16, gen_helper_vilvh_d) + +TRANS(vshuf_b, LSX, gen_vvvv, 16, gen_helper_vshuf_b) +TRANS(vshuf_h, LSX, gen_vvv, 16, gen_helper_vshuf_h) +TRANS(vshuf_w, LSX, gen_vvv, 16, gen_helper_vshuf_w) +TRANS(vshuf_d, LSX, gen_vvv, 16, gen_helper_vshuf_d) +TRANS(vshuf4i_b, LSX, gen_vv_i, 16, gen_helper_vshuf4i_b) +TRANS(vshuf4i_h, LSX, gen_vv_i, 16, gen_helper_vshuf4i_h) +TRANS(vshuf4i_w, LSX, gen_vv_i, 16, 
gen_helper_vshuf4i_w)
+TRANS(vshuf4i_d, LSX, gen_vv_i, 16, gen_helper_vshuf4i_d)
+
+TRANS(vpermi_w, LSX, gen_vv_i, 16, gen_helper_vpermi_w)
+
+TRANS(vextrins_b, LSX, gen_vv_i, 16, gen_helper_vextrins_b)
+TRANS(vextrins_h, LSX, gen_vv_i, 16, gen_helper_vextrins_h)
+TRANS(vextrins_w, LSX, gen_vv_i, 16, gen_helper_vextrins_w)
+TRANS(vextrins_d, LSX, gen_vv_i, 16, gen_helper_vextrins_d)
 
 static bool trans_vld(DisasContext *ctx, arg_vr_i *a)
 {
diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
index b7a27df5a9..7fbf045a5d 100644
--- a/target/loongarch/meson.build
+++ b/target/loongarch/meson.build
@@ -11,7 +11,7 @@ loongarch_tcg_ss.add(files(
   'op_helper.c',
   'translate.c',
   'gdbstub.c',
-  'lsx_helper.c',
+  'vec_helper.c',
 ))
 loongarch_tcg_ss.add(zlib)
 
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385484989332.97941265949623; Wed, 30 Aug 2023 01:51:24 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtK-0000aQ-Rm; Wed, 30 Aug 2023 04:49:34 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtI-0000PY-6e for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:32 -0400
Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtC-0007Te-Ow for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:31 -0400
Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxyeqJAu9keggdAA--.49853S3; Wed, 30 Aug 2023 16:49:13 +0800 (CST)
Received: from
localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S13; Wed, 30 Aug 2023 16:49:12 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 11/48] target/loongarch: Implement xvhaddw/xvhsubw
Date: Wed, 30 Aug 2023 16:48:25 +0800
Message-Id: <20230830084902.2113960-12-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S13
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1693385486348100007
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVHADDW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU};
- XVHSUBW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}.
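
For orientation, the signed H.B form listed above can be sketched in plain C.
This is only an illustration under stated assumptions: DemoVReg and
demo_haddw_h_b are hypothetical names that do not exist in QEMU, and the
sketch ignores the host-endian element macros used by the real VReg
accessors. Each 16-bit result lane is the sum of an odd-numbered byte of
the first source and the matching even-numbered byte of the second source,
both sign-extended. The operation size in bytes (16 for LSX, 32 for LASX)
decides how many lanes are processed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector register; 32 bytes covers 256 bits. */
typedef union {
    int8_t  B[32];
    int16_t H[16];
} DemoVReg;

/*
 * Sketch of the signed H.B widening horizontal add.
 * Result lane i = sign-extended vj byte (2*i + 1) + sign-extended vk byte (2*i).
 * oprsz is the operation size in bytes: 16 (128-bit) or 32 (256-bit).
 */
static void demo_haddw_h_b(DemoVReg *vd, const DemoVReg *vj,
                           const DemoVReg *vk, size_t oprsz)
{
    assert(oprsz == 16 || oprsz == 32);

    /* Each result lane is 2 bytes wide, so there are oprsz / 2 lanes. */
    for (size_t i = 0; i < oprsz / 2; i++) {
        vd->H[i] = (int16_t)((int16_t)vj->B[2 * i + 1] +
                             (int16_t)vk->B[2 * i]);
    }
}
```

The subtracting forms would follow the same lane pairing with '-' in place
of '+'. The widening is what keeps sums such as 100 + 100 from wrapping in
an 8-bit lane.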
Signed-off-by: Song Gao --- target/loongarch/vec.h | 3 ++ target/loongarch/insns.decode | 18 ++++++++++ target/loongarch/disas.c | 17 +++++++++ target/loongarch/vec_helper.c | 36 ++++++++++++++------ target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++ 5 files changed, 81 insertions(+), 10 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 512f2fd83f..5332dff83c 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -47,4 +47,7 @@ #define Q(x) Q[x] #endif /* HOST_BIG_ENDIAN */ =20 +#define DO_ADD(a, b) (a + b) +#define DO_SUB(a, b) (a - b) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 32f857ff7c..ba0b36f4a7 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1343,6 +1343,24 @@ xvssub_hu 0111 01000100 11001 ..... ..... ...= .. @vvv xvssub_wu 0111 01000100 11010 ..... ..... ..... @vvv xvssub_du 0111 01000100 11011 ..... ..... ..... @vvv =20 +xvhaddw_h_b 0111 01000101 01000 ..... ..... ..... @vvv +xvhaddw_w_h 0111 01000101 01001 ..... ..... ..... @vvv +xvhaddw_d_w 0111 01000101 01010 ..... ..... ..... @vvv +xvhaddw_q_d 0111 01000101 01011 ..... ..... ..... @vvv +xvhaddw_hu_bu 0111 01000101 10000 ..... ..... ..... @vvv +xvhaddw_wu_hu 0111 01000101 10001 ..... ..... ..... @vvv +xvhaddw_du_wu 0111 01000101 10010 ..... ..... ..... @vvv +xvhaddw_qu_du 0111 01000101 10011 ..... ..... ..... @vvv + +xvhsubw_h_b 0111 01000101 01100 ..... ..... ..... @vvv +xvhsubw_w_h 0111 01000101 01101 ..... ..... ..... @vvv +xvhsubw_d_w 0111 01000101 01110 ..... ..... ..... @vvv +xvhsubw_q_d 0111 01000101 01111 ..... ..... ..... @vvv +xvhsubw_hu_bu 0111 01000101 10100 ..... ..... ..... @vvv +xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ..... @vvv +xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @vvv +xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... 
@vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 0fd88a56c1..e188220519 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1765,6 +1765,23 @@ INSN_LASX(xvssub_hu, vvv) INSN_LASX(xvssub_wu, vvv) INSN_LASX(xvssub_du, vvv) =20 +INSN_LASX(xvhaddw_h_b, vvv) +INSN_LASX(xvhaddw_w_h, vvv) +INSN_LASX(xvhaddw_d_w, vvv) +INSN_LASX(xvhaddw_q_d, vvv) +INSN_LASX(xvhaddw_hu_bu, vvv) +INSN_LASX(xvhaddw_wu_hu, vvv) +INSN_LASX(xvhaddw_du_wu, vvv) +INSN_LASX(xvhaddw_qu_du, vvv) +INSN_LASX(xvhsubw_h_b, vvv) +INSN_LASX(xvhsubw_w_h, vvv) +INSN_LASX(xvhsubw_d_w, vvv) +INSN_LASX(xvhsubw_q_d, vvv) +INSN_LASX(xvhsubw_hu_bu, vvv) +INSN_LASX(xvhsubw_wu_hu, vvv) +INSN_LASX(xvhsubw_du_wu, vvv) +INSN_LASX(xvhsubw_qu_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index d01903018a..b6c0b3fda8 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -14,9 +14,6 @@ #include "tcg/tcg.h" #include "vec.h" =20 -#define DO_ADD(a, b) (a + b) -#define DO_SUB(a, b) (a - b) - #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ @@ -25,8 +22,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t = desc) \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E1(i) =3D DO_OP((TD)Vj->E2(2 * i + 1), (TD)Vk->E2(2 * i)); \ } \ } @@ -37,11 +35,16 @@ DO_ODD_EVEN(vhaddw_d_w, 64, D, W, DO_ADD) =20 void HELPER(vhaddw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D 
simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (0))); + for (i =3D 0; i < oprsz / 16 ; i++) { + Vd->Q(i) =3D int128_add(int128_makes64(Vj->D(2 * i + 1)), + int128_makes64(Vk->D(2 * i))); + } } =20 DO_ODD_EVEN(vhsubw_h_b, 16, H, B, DO_SUB) @@ -50,11 +53,16 @@ DO_ODD_EVEN(vhsubw_d_w, 64, D, W, DO_SUB) =20 void HELPER(vhsubw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_makes64(Vj->D(2 * i + 1)), + int128_makes64(Vk->D(2 * i))); + } } =20 DO_ODD_EVEN(vhaddw_hu_bu, 16, UH, UB, DO_ADD) @@ -63,12 +71,16 @@ DO_ODD_EVEN(vhaddw_du_wu, 64, UD, UW, DO_ADD) =20 void HELPER(vhaddw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(1)), - int128_make64((uint64_t)Vk->D(0))); + for (i =3D 0; i < oprsz / 16; i ++) { + Vd->Q(i) =3D int128_add(int128_make64(Vj->UD(2 * i + 1)), + int128_make64(Vk->UD(2 * i))); + } } =20 DO_ODD_EVEN(vhsubw_hu_bu, 16, UH, UB, DO_SUB) @@ -77,12 +89,16 @@ DO_ODD_EVEN(vhsubw_du_wu, 64, UD, UW, DO_SUB) =20 void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_make64((uint64_t)Vj->D(1)), - int128_make64((uint64_t)Vk->D(0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_make64(Vj->UD(2 * i + 1)), + int128_make64(Vk->UD(2 * i))); + } } =20 #define DO_EVEN(NAME, BIT, E1, E2, DO_OP) \ diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= 
h/insn_trans/trans_lasx.c.inc index c818a09312..90c9ccce4f 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -82,6 +82,23 @@ TRANS(xvssub_hu, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec= _ussub) TRANS(xvssub_wu, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_ussub) TRANS(xvssub_du, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_ussub) =20 +TRANS(xvhaddw_h_b, LASX, gen_vvv, 32, gen_helper_vhaddw_h_b) +TRANS(xvhaddw_w_h, LASX, gen_vvv, 32, gen_helper_vhaddw_w_h) +TRANS(xvhaddw_d_w, LASX, gen_vvv, 32, gen_helper_vhaddw_d_w) +TRANS(xvhaddw_q_d, LASX, gen_vvv, 32, gen_helper_vhaddw_q_d) +TRANS(xvhaddw_hu_bu, LASX, gen_vvv, 32, gen_helper_vhaddw_hu_bu) +TRANS(xvhaddw_wu_hu, LASX, gen_vvv, 32, gen_helper_vhaddw_wu_hu) +TRANS(xvhaddw_du_wu, LASX, gen_vvv, 32, gen_helper_vhaddw_du_wu) +TRANS(xvhaddw_qu_du, LASX, gen_vvv, 32, gen_helper_vhaddw_qu_du) +TRANS(xvhsubw_h_b, LASX, gen_vvv, 32, gen_helper_vhsubw_h_b) +TRANS(xvhsubw_w_h, LASX, gen_vvv, 32, gen_helper_vhsubw_w_h) +TRANS(xvhsubw_d_w, LASX, gen_vvv, 32, gen_helper_vhsubw_d_w) +TRANS(xvhsubw_q_d, LASX, gen_vvv, 32, gen_helper_vhsubw_q_d) +TRANS(xvhsubw_hu_bu, LASX, gen_vvv, 32, gen_helper_vhsubw_hu_bu) +TRANS(xvhsubw_wu_hu, LASX, gen_vvv, 32, gen_helper_vhsubw_wu_hu) +TRANS(xvhsubw_du_wu, LASX, gen_vvv, 32, gen_helper_vhsubw_du_wu) +TRANS(xvhsubw_qu_du, LASX, gen_vvv, 32, gen_helper_vhsubw_qu_du) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 12/48] target/loongarch: Implement xvaddw/xvsubw Date: Wed, 30 Aug 2023 16:48:26 +0800 Message-Id: <20230830084902.2113960-13-gaosong@loongson.cn> In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
This patch includes: - XVADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U]; - XVSUBW{EV/OD}.{H.B/W.H/D.W/Q.D}[U]; - XVADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}. Signed-off-by: Song Gao --- target/loongarch/insns.decode | 45 +++++++ target/loongarch/disas.c | 43 +++++++ target/loongarch/vec_helper.c | 121 +++++++++++++------ target/loongarch/insn_trans/trans_lasx.c.inc | 45 +++++++ 4 files changed, 220 insertions(+), 34 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ba0b36f4a7..e1d8b30179 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1361,6 +1361,51 @@ xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ...= .. @vvv xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @vvv xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @vvv =20 +xvaddwev_h_b 0111 01000001 11100 ..... ..... ..... @vvv +xvaddwev_w_h 0111 01000001 11101 ..... ..... ..... @vvv +xvaddwev_d_w 0111 01000001 11110 ..... ..... ..... @vvv +xvaddwev_q_d 0111 01000001 11111 ..... ..... ..... @vvv +xvaddwod_h_b 0111 01000010 00100 ..... ..... ..... @vvv +xvaddwod_w_h 0111 01000010 00101 ..... ..... ..... @vvv +xvaddwod_d_w 0111 01000010 00110 ..... ..... ..... @vvv +xvaddwod_q_d 0111 01000010 00111 ..... ..... ..... @vvv + +xvsubwev_h_b 0111 01000010 00000 ..... ..... ..... @vvv +xvsubwev_w_h 0111 01000010 00001 ..... ..... ..... @vvv +xvsubwev_d_w 0111 01000010 00010 ..... ..... ..... @vvv +xvsubwev_q_d 0111 01000010 00011 ..... .....
..... @vvv +xvsubwod_h_b 0111 01000010 01000 ..... ..... ..... @vvv +xvsubwod_w_h 0111 01000010 01001 ..... ..... ..... @vvv +xvsubwod_d_w 0111 01000010 01010 ..... ..... ..... @vvv +xvsubwod_q_d 0111 01000010 01011 ..... ..... ..... @vvv + +xvaddwev_h_bu 0111 01000010 11100 ..... ..... ..... @vvv +xvaddwev_w_hu 0111 01000010 11101 ..... ..... ..... @vvv +xvaddwev_d_wu 0111 01000010 11110 ..... ..... ..... @vvv +xvaddwev_q_du 0111 01000010 11111 ..... ..... ..... @vvv +xvaddwod_h_bu 0111 01000011 00100 ..... ..... ..... @vvv +xvaddwod_w_hu 0111 01000011 00101 ..... ..... ..... @vvv +xvaddwod_d_wu 0111 01000011 00110 ..... ..... ..... @vvv +xvaddwod_q_du 0111 01000011 00111 ..... ..... ..... @vvv + +xvsubwev_h_bu 0111 01000011 00000 ..... ..... ..... @vvv +xvsubwev_w_hu 0111 01000011 00001 ..... ..... ..... @vvv +xvsubwev_d_wu 0111 01000011 00010 ..... ..... ..... @vvv +xvsubwev_q_du 0111 01000011 00011 ..... ..... ..... @vvv +xvsubwod_h_bu 0111 01000011 01000 ..... ..... ..... @vvv +xvsubwod_w_hu 0111 01000011 01001 ..... ..... ..... @vvv +xvsubwod_d_wu 0111 01000011 01010 ..... ..... ..... @vvv +xvsubwod_q_du 0111 01000011 01011 ..... ..... ..... @vvv + +xvaddwev_h_bu_b 0111 01000011 11100 ..... ..... ..... @vvv +xvaddwev_w_hu_h 0111 01000011 11101 ..... ..... ..... @vvv +xvaddwev_d_wu_w 0111 01000011 11110 ..... ..... ..... @vvv +xvaddwev_q_du_d 0111 01000011 11111 ..... ..... ..... @vvv +xvaddwod_h_bu_b 0111 01000100 00000 ..... ..... ..... @vvv +xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ..... @vvv +xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @vvv +xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e188220519..6972e33833 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1782,6 +1782,49 @@ INSN_LASX(xvhsubw_wu_hu, vvv) INSN_LASX(xvhsubw_du_wu, vvv) INSN_LASX(xvhsubw_qu_du, vvv) =20 +INSN_LASX(xvaddwev_h_b, vvv) +INSN_LASX(xvaddwev_w_h, vvv) +INSN_LASX(xvaddwev_d_w, vvv) +INSN_LASX(xvaddwev_q_d, vvv) +INSN_LASX(xvaddwod_h_b, vvv) +INSN_LASX(xvaddwod_w_h, vvv) +INSN_LASX(xvaddwod_d_w, vvv) +INSN_LASX(xvaddwod_q_d, vvv) +INSN_LASX(xvsubwev_h_b, vvv) +INSN_LASX(xvsubwev_w_h, vvv) +INSN_LASX(xvsubwev_d_w, vvv) +INSN_LASX(xvsubwev_q_d, vvv) +INSN_LASX(xvsubwod_h_b, vvv) +INSN_LASX(xvsubwod_w_h, vvv) +INSN_LASX(xvsubwod_d_w, vvv) +INSN_LASX(xvsubwod_q_d, vvv) + +INSN_LASX(xvaddwev_h_bu, vvv) +INSN_LASX(xvaddwev_w_hu, vvv) +INSN_LASX(xvaddwev_d_wu, vvv) +INSN_LASX(xvaddwev_q_du, vvv) +INSN_LASX(xvaddwod_h_bu, vvv) +INSN_LASX(xvaddwod_w_hu, vvv) +INSN_LASX(xvaddwod_d_wu, vvv) +INSN_LASX(xvaddwod_q_du, vvv) +INSN_LASX(xvsubwev_h_bu, vvv) +INSN_LASX(xvsubwev_w_hu, vvv) +INSN_LASX(xvsubwev_d_wu, vvv) +INSN_LASX(xvsubwev_q_du, vvv) +INSN_LASX(xvsubwod_h_bu, vvv) +INSN_LASX(xvsubwod_w_hu, vvv) +INSN_LASX(xvsubwod_d_wu, vvv) +INSN_LASX(xvsubwod_q_du, vvv) + +INSN_LASX(xvaddwev_h_bu_b, vvv) +INSN_LASX(xvaddwev_w_hu_h, vvv) +INSN_LASX(xvaddwev_d_wu_w, vvv) +INSN_LASX(xvaddwev_q_du_d, vvv) +INSN_LASX(xvaddwod_h_bu_b, vvv) +INSN_LASX(xvaddwod_w_hu_h, vvv) +INSN_LASX(xvaddwod_d_wu_w, vvv) +INSN_LASX(xvaddwod_q_du_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index b6c0b3fda8..fffc67ce93 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -13,6 +13,7 @@ #include "internals.h" #include "tcg/tcg.h" #include "vec.h" +#include "tcg/tcg-gvec-desc.h" =20 #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ void HELPER(NAME)(void *vd, void 
*vj, void *vk, uint32_t desc) \ @@ -102,133 +103,173 @@ void HELPER(vhsubw_qu_du)(void *vd, void *vj, void = *vk, uint32_t desc) } =20 #define DO_EVEN(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E1(i) =3D DO_OP((TD)Vj->E2(2 * i) ,(TD)Vk->E2(2 * i)); \ } \ } =20 #define DO_ODD(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; = \ VReg *Vj =3D (VReg *)vj; = \ VReg *Vk =3D (VReg *)vk; = \ typedef __typeof(Vd->E1(0)) TD; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ Vd->E1(i) =3D DO_OP((TD)Vj->E2(2 * i + 1), (TD)Vk->E2(2 * i + 1));= \ } \ } =20 -void HELPER(vaddwev_q_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwev_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_makes64(Vj->D(0)), int128_makes64(Vk->D= (0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_add(int128_makes64(Vj->D(2 * i)), + int128_makes64(Vk->D(2 * i))); + } } =20 DO_EVEN(vaddwev_h_b, 16, H, B, DO_ADD) DO_EVEN(vaddwev_w_h, 32, W, H, DO_ADD) DO_EVEN(vaddwev_d_w, 64, D, W, DO_ADD) =20 -void HELPER(vaddwod_q_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwod_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz 
=3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (1))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_add(int128_makes64(Vj->D(2 * i + 1)), + int128_makes64(Vk->D(2 * i + 1))); + } } =20 DO_ODD(vaddwod_h_b, 16, H, B, DO_ADD) DO_ODD(vaddwod_w_h, 32, W, H, DO_ADD) DO_ODD(vaddwod_d_w, 64, D, W, DO_ADD) =20 -void HELPER(vsubwev_q_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vsubwev_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_makes64(Vj->D(0)), int128_makes64(Vk->D= (0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_makes64(Vj->D(2 * i)), + int128_makes64(Vk->D(2 * i))); + } } =20 DO_EVEN(vsubwev_h_b, 16, H, B, DO_SUB) DO_EVEN(vsubwev_w_h, 32, W, H, DO_SUB) DO_EVEN(vsubwev_d_w, 64, D, W, DO_SUB) =20 -void HELPER(vsubwod_q_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vsubwod_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D= (1))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_makes64(Vj->D(2 * i + 1)), + int128_makes64(Vk->D(2 * i + 1))); + } } =20 DO_ODD(vsubwod_h_b, 16, H, B, DO_SUB) DO_ODD(vsubwod_w_h, 32, W, H, DO_SUB) DO_ODD(vsubwod_d_w, 64, D, W, DO_SUB) =20 -void HELPER(vaddwev_q_du)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwev_q_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(0)), - int128_make64((uint64_t)Vk->D(0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D
int128_add(int128_make64(Vj->UD(2 * i)), + int128_make64(Vk->UD(2 * i))); + } } =20 DO_EVEN(vaddwev_h_bu, 16, UH, UB, DO_ADD) DO_EVEN(vaddwev_w_hu, 32, UW, UH, DO_ADD) DO_EVEN(vaddwev_d_wu, 64, UD, UW, DO_ADD) =20 -void HELPER(vaddwod_q_du)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwod_q_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(1)), - int128_make64((uint64_t)Vk->D(1))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_add(int128_make64(Vj->UD(2 * i + 1)), + int128_make64(Vk->UD(2 * i + 1))); + } } =20 DO_ODD(vaddwod_h_bu, 16, UH, UB, DO_ADD) DO_ODD(vaddwod_w_hu, 32, UW, UH, DO_ADD) DO_ODD(vaddwod_d_wu, 64, UD, UW, DO_ADD) =20 -void HELPER(vsubwev_q_du)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vsubwev_q_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_make64((uint64_t)Vj->D(0)), - int128_make64((uint64_t)Vk->D(0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_make64(Vj->UD(2 * i)), + int128_make64(Vk->UD(2 * i))); + } } =20 DO_EVEN(vsubwev_h_bu, 16, UH, UB, DO_SUB) DO_EVEN(vsubwev_w_hu, 32, UW, UH, DO_SUB) DO_EVEN(vsubwev_d_wu, 64, UD, UW, DO_SUB) =20 -void HELPER(vsubwod_q_du)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vsubwod_q_du)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_sub(int128_make64((uint64_t)Vj->D(1)), - int128_make64((uint64_t)Vk->D(1))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_sub(int128_make64(Vj->UD(2 * i + 1)), + int128_make64(Vk->UD(2 * i + 1))); + } } =20 
DO_ODD(vsubwod_h_bu, 16, UH, UB, DO_SUB) @@ -236,7 +277,7 @@ DO_ODD(vsubwod_w_hu, 32, UW, UH, DO_SUB) DO_ODD(vsubwod_d_wu, 64, UD, UW, DO_SUB) =20 #define DO_EVEN_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; \ @@ -244,13 +285,15 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint3= 2_t v) \ VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->ES1(0)) TDS; \ typedef __typeof(Vd->EU1(0)) TDU; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->ES1(i) =3D DO_OP((TDU)Vj->EU2(2 * i) ,(TDS)Vk->ES2(2 * i)); \ } \ } =20 #define DO_ODD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ VReg *Vd =3D (VReg *)vd; = \ @@ -258,33 +301,43 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint3= 2_t v) \ VReg *Vk =3D (VReg *)vk; = \ typedef __typeof(Vd->ES1(0)) TDS; = \ typedef __typeof(Vd->EU1(0)) TDU; = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ Vd->ES1(i) =3D DO_OP((TDU)Vj->EU2(2 * i + 1), (TDS)Vk->ES2(2 * i += 1)); \ } = \ } =20 -void HELPER(vaddwev_q_du_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwev_q_du_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(0)), - int128_makes64(Vk->D(0))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_add(int128_make64(Vj->UD(2 * i)), + int128_makes64(Vk->D(2 * i))); + } } =20 DO_EVEN_U_S(vaddwev_h_bu_b, 16, H, UH, B, UB, DO_ADD) 
DO_EVEN_U_S(vaddwev_w_hu_h, 32, W, UW, H, UH, DO_ADD) DO_EVEN_U_S(vaddwev_d_wu_w, 64, D, UD, W, UW, DO_ADD) =20 -void HELPER(vaddwod_q_du_d)(void *vd, void *vj, void *vk, uint32_t v) +void HELPER(vaddwod_q_du_d)(void *vd, void *vj, void *vk, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_add(int128_make64((uint64_t)Vj->D(1)), - int128_makes64(Vk->D(1))); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_add(int128_make64(Vj->UD(2 * i + 1)), + int128_makes64(Vk->D(2 * i + 1))); + } } =20 DO_ODD_U_S(vaddwod_h_bu_b, 16, H, UH, B, UB, DO_ADD) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 90c9ccce4f..922222bd78 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -99,6 +99,51 @@ TRANS(xvhsubw_wu_hu, LASX, gen_vvv, 32, gen_helper_vhsub= w_wu_hu) TRANS(xvhsubw_du_wu, LASX, gen_vvv, 32, gen_helper_vhsubw_du_wu) TRANS(xvhsubw_qu_du, LASX, gen_vvv, 32, gen_helper_vhsubw_qu_du) =20 +TRANS(xvaddwev_h_b, LASX, gvec_vvv, 32, MO_8, do_vaddwev_s) +TRANS(xvaddwev_w_h, LASX, gvec_vvv, 32, MO_16, do_vaddwev_s) +TRANS(xvaddwev_d_w, LASX, gvec_vvv, 32, MO_32, do_vaddwev_s) +TRANS(xvaddwev_q_d, LASX, gvec_vvv, 32, MO_64, do_vaddwev_s) +TRANS(xvaddwod_h_b, LASX, gvec_vvv, 32, MO_8, do_vaddwod_s) +TRANS(xvaddwod_w_h, LASX, gvec_vvv, 32, MO_16, do_vaddwod_s) +TRANS(xvaddwod_d_w, LASX, gvec_vvv, 32, MO_32, do_vaddwod_s) +TRANS(xvaddwod_q_d, LASX, gvec_vvv, 32, MO_64, do_vaddwod_s) + +TRANS(xvsubwev_h_b, LASX, gvec_vvv, 32, MO_8, do_vsubwev_s) +TRANS(xvsubwev_w_h, LASX, gvec_vvv, 32, MO_16, do_vsubwev_s) +TRANS(xvsubwev_d_w, LASX, gvec_vvv, 32, MO_32, do_vsubwev_s) +TRANS(xvsubwev_q_d, LASX, gvec_vvv, 32, MO_64, do_vsubwev_s) +TRANS(xvsubwod_h_b, LASX, gvec_vvv, 32, MO_8, do_vsubwod_s) +TRANS(xvsubwod_w_h, LASX, gvec_vvv, 32, MO_16, 
do_vsubwod_s) +TRANS(xvsubwod_d_w, LASX, gvec_vvv, 32, MO_32, do_vsubwod_s) +TRANS(xvsubwod_q_d, LASX, gvec_vvv, 32, MO_64, do_vsubwod_s) + +TRANS(xvaddwev_h_bu, LASX, gvec_vvv, 32, MO_8, do_vaddwev_u) +TRANS(xvaddwev_w_hu, LASX, gvec_vvv, 32, MO_16, do_vaddwev_u) +TRANS(xvaddwev_d_wu, LASX, gvec_vvv, 32, MO_32, do_vaddwev_u) +TRANS(xvaddwev_q_du, LASX, gvec_vvv, 32, MO_64, do_vaddwev_u) +TRANS(xvaddwod_h_bu, LASX, gvec_vvv, 32, MO_8, do_vaddwod_u) +TRANS(xvaddwod_w_hu, LASX, gvec_vvv, 32, MO_16, do_vaddwod_u) +TRANS(xvaddwod_d_wu, LASX, gvec_vvv, 32, MO_32, do_vaddwod_u) +TRANS(xvaddwod_q_du, LASX, gvec_vvv, 32, MO_64, do_vaddwod_u) + +TRANS(xvsubwev_h_bu, LASX, gvec_vvv, 32, MO_8, do_vsubwev_u) +TRANS(xvsubwev_w_hu, LASX, gvec_vvv, 32, MO_16, do_vsubwev_u) +TRANS(xvsubwev_d_wu, LASX, gvec_vvv, 32, MO_32, do_vsubwev_u) +TRANS(xvsubwev_q_du, LASX, gvec_vvv, 32, MO_64, do_vsubwev_u) +TRANS(xvsubwod_h_bu, LASX, gvec_vvv, 32, MO_8, do_vsubwod_u) +TRANS(xvsubwod_w_hu, LASX, gvec_vvv, 32, MO_16, do_vsubwod_u) +TRANS(xvsubwod_d_wu, LASX, gvec_vvv, 32, MO_32, do_vsubwod_u) +TRANS(xvsubwod_q_du, LASX, gvec_vvv, 32, MO_64, do_vsubwod_u) + +TRANS(xvaddwev_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vaddwev_u_s) +TRANS(xvaddwev_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vaddwev_u_s) +TRANS(xvaddwev_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vaddwev_u_s) +TRANS(xvaddwev_q_du_d, LASX, gvec_vvv, 32, MO_64, do_vaddwev_u_s) +TRANS(xvaddwod_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vaddwod_u_s) +TRANS(xvaddwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vaddwod_u_s) +TRANS(xvaddwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vaddwod_u_s) +TRANS(xvaddwod_q_du_d, LASX, gvec_vvv, 32, MO_64, do_vaddwod_u_s) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 13/48] target/loongarch: Implement xvavg/xvavgr Date: Wed, 30 Aug 2023 16:48:27 +0800 Message-Id: <20230830084902.2113960-14-gaosong@loongson.cn> In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
This patch includes: - XVAVG.{B/H/W/D}[U]; - XVAVGR.{B/H/W/D}[U]. Signed-off-by: Song Gao --- target/loongarch/vec.h | 3 +++ target/loongarch/insns.decode | 17 +++++++++++++ target/loongarch/disas.c | 17 +++++++++++++ target/loongarch/vec_helper.c | 25 ++++++++++---------- target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++++++ 5 files changed, 66 insertions(+), 13 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 5332dff83c..6ac6b22f20 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -50,4 +50,7 @@ #define DO_ADD(a, b) (a + b) #define DO_SUB(a, b) (a - b) =20 +#define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1)) +#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1)) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index e1d8b30179..a2cb39750d 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1406,6 +1406,23 @@ xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ...= .. @vvv xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @vvv xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @vvv =20 +xvavg_b 0111 01000110 01000 ..... .....
..... @vvv +xvavg_h 0111 01000110 01001 ..... ..... ..... @vvv +xvavg_w 0111 01000110 01010 ..... ..... ..... @vvv +xvavg_d 0111 01000110 01011 ..... ..... ..... @vvv +xvavg_bu 0111 01000110 01100 ..... ..... ..... @vvv +xvavg_hu 0111 01000110 01101 ..... ..... ..... @vvv +xvavg_wu 0111 01000110 01110 ..... ..... ..... @vvv +xvavg_du 0111 01000110 01111 ..... ..... ..... @vvv +xvavgr_b 0111 01000110 10000 ..... ..... ..... @vvv +xvavgr_h 0111 01000110 10001 ..... ..... ..... @vvv +xvavgr_w 0111 01000110 10010 ..... ..... ..... @vvv +xvavgr_d 0111 01000110 10011 ..... ..... ..... @vvv +xvavgr_bu 0111 01000110 10100 ..... ..... ..... @vvv +xvavgr_hu 0111 01000110 10101 ..... ..... ..... @vvv +xvavgr_wu 0111 01000110 10110 ..... ..... ..... @vvv +xvavgr_du 0111 01000110 10111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 6972e33833..8296aafa98 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1825,6 +1825,23 @@ INSN_LASX(xvaddwod_w_hu_h, vvv) INSN_LASX(xvaddwod_d_wu_w, vvv) INSN_LASX(xvaddwod_q_du_d, vvv) =20 +INSN_LASX(xvavg_b, vvv) +INSN_LASX(xvavg_h, vvv) +INSN_LASX(xvavg_w, vvv) +INSN_LASX(xvavg_d, vvv) +INSN_LASX(xvavg_bu, vvv) +INSN_LASX(xvavg_hu, vvv) +INSN_LASX(xvavg_wu, vvv) +INSN_LASX(xvavg_du, vvv) +INSN_LASX(xvavgr_b, vvv) +INSN_LASX(xvavgr_h, vvv) +INSN_LASX(xvavgr_w, vvv) +INSN_LASX(xvavgr_d, vvv) +INSN_LASX(xvavgr_bu, vvv) +INSN_LASX(xvavgr_hu, vvv) +INSN_LASX(xvavgr_wu, vvv) +INSN_LASX(xvavgr_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index fffc67ce93..a5d425e965 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -344,19 +344,18 @@ DO_ODD_U_S(vaddwod_h_bu_b, 
16, H, UH, B, UB, DO_ADD) DO_ODD_U_S(vaddwod_w_hu_h, 32, W, UW, H, UH, DO_ADD) DO_ODD_U_S(vaddwod_d_wu_w, 64, D, UD, W, UW, DO_ADD) =20 -#define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1)) -#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1)) - -#define DO_3OP(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ - } \ +#define DO_3OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ + } \ } =20 DO_3OP(vavg_b, 8, B, DO_VAVG) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 922222bd78..bcd4b03afc 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -144,6 +144,23 @@ TRANS(xvaddwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_v= addwod_u_s) TRANS(xvaddwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vaddwod_u_s) TRANS(xvaddwod_q_du_d, LASX, gvec_vvv, 32, MO_64, do_vaddwod_u_s) =20 +TRANS(xvavg_b, LASX, gvec_vvv, 32, MO_8, do_vavg_s) +TRANS(xvavg_h, LASX, gvec_vvv, 32, MO_16, do_vavg_s) +TRANS(xvavg_w, LASX, gvec_vvv, 32, MO_32, do_vavg_s) +TRANS(xvavg_d, LASX, gvec_vvv, 32, MO_64, do_vavg_s) +TRANS(xvavg_bu, LASX, gvec_vvv, 32, MO_8, do_vavg_u) +TRANS(xvavg_hu, LASX, gvec_vvv, 32, MO_16, do_vavg_u) +TRANS(xvavg_wu, LASX, gvec_vvv, 32, MO_32, do_vavg_u) +TRANS(xvavg_du, LASX, gvec_vvv, 32, MO_64, do_vavg_u) +TRANS(xvavgr_b, LASX, gvec_vvv, 32, MO_8, do_vavgr_s) +TRANS(xvavgr_h, LASX, gvec_vvv, 32, MO_16, do_vavgr_s) +TRANS(xvavgr_w, LASX, gvec_vvv, 32, MO_32, 
do_vavgr_s) +TRANS(xvavgr_d, LASX, gvec_vvv, 32, MO_64, do_vavgr_s) +TRANS(xvavgr_bu, LASX, gvec_vvv, 32, MO_8, do_vavgr_u) +TRANS(xvavgr_hu, LASX, gvec_vvv, 32, MO_16, do_vavgr_u) +TRANS(xvavgr_wu, LASX, gvec_vvv, 32, MO_32, do_vavgr_u) +TRANS(xvavgr_du, LASX, gvec_vvv, 32, MO_64, do_vavgr_u) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 14/48] target/loongarch: Implement xvabsd Date: Wed, 30 Aug 2023 16:48:28 +0800 Message-Id: <20230830084902.2113960-15-gaosong@loongson.cn> In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
This patch includes: - XVABSD.{B/H/W/D}[U].
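For illustration only (this sketch is not part of the patch): each XVABSD variant stores the element-wise absolute difference of vj and vk. The helper loop in this series runs over oprsz / (BIT / 8) elements, so one body can serve both 16-byte LSX and 32-byte LASX operands. The union SketchVReg and the sketch_vabsd_* functions below are hypothetical stand-ins for QEMU's VReg and its HELPER() functions, and oprsz is passed directly rather than decoded from a descriptor. Only the DO_VABSD expression is taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as DO_VABSD in target/loongarch/vec.h. */
#define DO_VABSD(a, b)  ((a > b) ? (a - b) : (b - a))

/* Hypothetical stand-in for VReg: 256 bits viewed as signed or unsigned bytes. */
typedef union SketchVReg {
    int8_t  B[32];
    uint8_t UB[32];
} SketchVReg;

/* Signed byte variant (BIT = 8); oprsz is 16 for LSX or 32 for LASX. */
static void sketch_vabsd_b(SketchVReg *vd, const SketchVReg *vj,
                           const SketchVReg *vk, int oprsz)
{
    const int bit = 8;

    assert(oprsz == 16 || oprsz == 32);
    for (int i = 0; i < oprsz / (bit / 8); i++) {
        vd->B[i] = DO_VABSD(vj->B[i], vk->B[i]);
    }
}

/* Unsigned byte variant: same loop, different element view. */
static void sketch_vabsd_bu(SketchVReg *vd, const SketchVReg *vj,
                            const SketchVReg *vk, int oprsz)
{
    const int bit = 8;

    assert(oprsz == 16 || oprsz == 32);
    for (int i = 0; i < oprsz / (bit / 8); i++) {
        vd->UB[i] = DO_VABSD(vj->UB[i], vk->UB[i]);
    }
}
```

In this sketch, calling sketch_vabsd_b() with oprsz = 16 writes only bytes 0..15 of the destination, while oprsz = 32 covers all 32 bytes.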

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h | 2 ++
 target/loongarch/insns.decode | 9 +++++++++
 target/loongarch/disas.c | 9 +++++++++
 target/loongarch/vec_helper.c | 2 --
 target/loongarch/insn_trans/trans_lasx.c.inc | 9 +++++++++
 5 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 6ac6b22f20..6767073635 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -53,4 +53,6 @@
 #define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1))
 #define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
 
+#define DO_VABSD(a, b) ((a > b) ? (a - b) : (b - a))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index a2cb39750d..c086ee9b22 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1423,6 +1423,15 @@ xvavgr_hu 0111 01000110 10101 ..... ..... ..... @vvv
 xvavgr_wu 0111 01000110 10110 ..... ..... ..... @vvv
 xvavgr_du 0111 01000110 10111 ..... ..... ..... @vvv
 
+xvabsd_b 0111 01000110 00000 ..... ..... ..... @vvv
+xvabsd_h 0111 01000110 00001 ..... ..... ..... @vvv
+xvabsd_w 0111 01000110 00010 ..... ..... ..... @vvv
+xvabsd_d 0111 01000110 00011 ..... ..... ..... @vvv
+xvabsd_bu 0111 01000110 00100 ..... ..... ..... @vvv
+xvabsd_hu 0111 01000110 00101 ..... ..... ..... @vvv
+xvabsd_wu 0111 01000110 00110 ..... ..... ..... @vvv
+xvabsd_du 0111 01000110 00111 ..... ..... ..... @vvv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 8296aafa98..d0b1de39b8 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1842,6 +1842,15 @@ INSN_LASX(xvavgr_hu, vvv)
 INSN_LASX(xvavgr_wu, vvv)
 INSN_LASX(xvavgr_du, vvv)
 
+INSN_LASX(xvabsd_b, vvv)
+INSN_LASX(xvabsd_h, vvv)
+INSN_LASX(xvabsd_w, vvv)
+INSN_LASX(xvabsd_d, vvv)
+INSN_LASX(xvabsd_bu, vvv)
+INSN_LASX(xvabsd_hu, vvv)
+INSN_LASX(xvabsd_wu, vvv)
+INSN_LASX(xvabsd_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index a5d425e965..939ea11f19 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -375,8 +375,6 @@ DO_3OP(vavgr_hu, 16, UH, DO_VAVGR)
 DO_3OP(vavgr_wu, 32, UW, DO_VAVGR)
 DO_3OP(vavgr_du, 64, UD, DO_VAVGR)
 
-#define DO_VABSD(a, b) ((a > b) ? (a -b) : (b-a))
-
 DO_3OP(vabsd_b, 8, B, DO_VABSD)
 DO_3OP(vabsd_h, 16, H, DO_VABSD)
 DO_3OP(vabsd_w, 32, W, DO_VABSD)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index bcd4b03afc..2be165a839 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -161,6 +161,15 @@ TRANS(xvavgr_hu, LASX, gvec_vvv, 32, MO_16, do_vavgr_u)
 TRANS(xvavgr_wu, LASX, gvec_vvv, 32, MO_32, do_vavgr_u)
 TRANS(xvavgr_du, LASX, gvec_vvv, 32, MO_64, do_vavgr_u)
 
+TRANS(xvabsd_b, LASX, gvec_vvv, 32, MO_8, do_vabsd_s)
+TRANS(xvabsd_h, LASX, gvec_vvv, 32, MO_16, do_vabsd_s)
+TRANS(xvabsd_w, LASX, gvec_vvv, 32, MO_32, do_vabsd_s)
+TRANS(xvabsd_d, LASX, gvec_vvv, 32, MO_64, do_vabsd_s)
+TRANS(xvabsd_bu, LASX, gvec_vvv, 32, MO_8, do_vabsd_u)
+TRANS(xvabsd_hu, LASX, gvec_vvv, 32, MO_16, do_vabsd_u)
+TRANS(xvabsd_wu, LASX, gvec_vvv, 32, MO_32, do_vabsd_u)
+TRANS(xvabsd_du, LASX, gvec_vvv, 32, MO_64, do_vabsd_u)
+
 TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, LASX,
gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32)
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 15/48] target/loongarch: Implement xvadda
Date: Wed, 30 Aug 2023 16:48:29 +0800
Message-Id: <20230830084902.2113960-16-gaosong@loongson.cn>
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>

This patch includes:
- XVADDA.{B/H/W/D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h | 2 ++
 target/loongarch/insns.decode | 5 ++++
 target/loongarch/disas.c | 5 ++++
 target/loongarch/vec_helper.c | 24 ++++++++++----------
 target/loongarch/insn_trans/trans_lasx.c.inc | 5 ++++
 5 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 6767073635..7ccc89c10f 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -55,4 +55,6 @@
 
 #define DO_VABSD(a, b) ((a > b) ? (a - b) : (b - a))
 
+#define DO_VABS(a) ((a < 0) ? (-a) : (a))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index c086ee9b22..f3722e3aa7 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1432,6 +1432,11 @@ xvabsd_hu 0111 01000110 00101 ..... ..... ..... @vvv
 xvabsd_wu 0111 01000110 00110 ..... ..... ..... @vvv
 xvabsd_du 0111 01000110 00111 ..... ..... ..... @vvv
 
+xvadda_b 0111 01000101 11000 ..... ..... ..... @vvv
+xvadda_h 0111 01000101 11001 ..... ..... ..... @vvv
+xvadda_w 0111 01000101 11010 ..... ..... ..... @vvv
+xvadda_d 0111 01000101 11011 ..... ..... ..... @vvv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index d0b1de39b8..b48822e431 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1851,6 +1851,11 @@ INSN_LASX(xvabsd_hu, vvv)
 INSN_LASX(xvabsd_wu, vvv)
 INSN_LASX(xvabsd_du, vvv)
 
+INSN_LASX(xvadda_b, vvv)
+INSN_LASX(xvadda_h, vvv)
+INSN_LASX(xvadda_w, vvv)
+INSN_LASX(xvadda_d, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 939ea11f19..819fa5e033 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -384,18 +384,18 @@ DO_3OP(vabsd_hu, 16, UH, DO_VABSD)
 DO_3OP(vabsd_wu, 32, UW, DO_VABSD)
 DO_3OP(vabsd_du, 64, UD, DO_VABSD)
 
-#define DO_VABS(a) ((a < 0) ?
(-a) : (a))
-
-#define DO_VADDA(NAME, BIT, E, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
-{ \
-    int i; \
-    VReg *Vd = (VReg *)vd; \
-    VReg *Vj = (VReg *)vj; \
-    VReg *Vk = (VReg *)vk; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E(i) = DO_OP(Vj->E(i)) + DO_OP(Vk->E(i)); \
-    } \
+#define DO_VADDA(NAME, BIT, E, DO_OP) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
+        Vd->E(i) = DO_OP(Vj->E(i)) + DO_OP(Vk->E(i)); \
+    } \
 }
 
 DO_VADDA(vadda_b, 8, B, DO_VABS)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 2be165a839..a3f2740f74 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -170,6 +170,11 @@ TRANS(xvabsd_hu, LASX, gvec_vvv, 32, MO_16, do_vabsd_u)
 TRANS(xvabsd_wu, LASX, gvec_vvv, 32, MO_32, do_vabsd_u)
 TRANS(xvabsd_du, LASX, gvec_vvv, 32, MO_64, do_vabsd_u)
 
+TRANS(xvadda_b, LASX, gvec_vvv, 32, MO_8, do_vadda)
+TRANS(xvadda_h, LASX, gvec_vvv, 32, MO_16, do_vadda)
+TRANS(xvadda_w, LASX, gvec_vvv, 32, MO_32, do_vadda)
+TRANS(xvadda_d, LASX, gvec_vvv, 32, MO_64, do_vadda)
+
 TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32)
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 16/48] target/loongarch: Implement xvmax/xvmin
Date: Wed, 30 Aug 2023 16:48:30 +0800
Message-Id: <20230830084902.2113960-17-gaosong@loongson.cn>
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>

This patch includes:
- XVMAX[I].{B/H/W/D}[U];
- XVMIN[I].{B/H/W/D}[U].

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h | 3 ++
 target/loongarch/insns.decode | 36 ++++++++++++++++++++
 target/loongarch/disas.c | 34 ++++++++++++++++++
 target/loongarch/vec_helper.c | 26 +++++++-------
 target/loongarch/insn_trans/trans_lasx.c.inc | 36 ++++++++++++++++++++
 5 files changed, 121 insertions(+), 14 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 7ccc89c10f..cd6f6a72fd 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -57,4 +57,7 @@
 
 #define DO_VABS(a) ((a < 0) ? (-a) : (a))
 
+#define DO_MIN(a, b) (a < b ? a : b)
+#define DO_MAX(a, b) (a > b ? a : b)
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index f3722e3aa7..99aefcb651 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1437,6 +1437,42 @@ xvadda_h 0111 01000101 11001 ..... ..... ..... @vvv
 xvadda_w 0111 01000101 11010 ..... ..... ..... @vvv
 xvadda_d 0111 01000101 11011 ..... ..... ..... @vvv
 
+xvmax_b 0111 01000111 00000 ..... ..... ..... @vvv
+xvmax_h 0111 01000111 00001 ..... ..... ..... @vvv
+xvmax_w 0111 01000111 00010 ..... ..... ..... @vvv
+xvmax_d 0111 01000111 00011 ..... ..... ..... @vvv
+xvmax_bu 0111 01000111 01000 ..... ..... ..... @vvv
+xvmax_hu 0111 01000111 01001 ..... ..... ..... @vvv
+xvmax_wu 0111 01000111 01010 ..... ..... .....
@vvv
+xvmax_du 0111 01000111 01011 ..... ..... ..... @vvv
+
+xvmaxi_b 0111 01101001 00000 ..... ..... ..... @vv_i5
+xvmaxi_h 0111 01101001 00001 ..... ..... ..... @vv_i5
+xvmaxi_w 0111 01101001 00010 ..... ..... ..... @vv_i5
+xvmaxi_d 0111 01101001 00011 ..... ..... ..... @vv_i5
+xvmaxi_bu 0111 01101001 01000 ..... ..... ..... @vv_ui5
+xvmaxi_hu 0111 01101001 01001 ..... ..... ..... @vv_ui5
+xvmaxi_wu 0111 01101001 01010 ..... ..... ..... @vv_ui5
+xvmaxi_du 0111 01101001 01011 ..... ..... ..... @vv_ui5
+
+xvmin_b 0111 01000111 00100 ..... ..... ..... @vvv
+xvmin_h 0111 01000111 00101 ..... ..... ..... @vvv
+xvmin_w 0111 01000111 00110 ..... ..... ..... @vvv
+xvmin_d 0111 01000111 00111 ..... ..... ..... @vvv
+xvmin_bu 0111 01000111 01100 ..... ..... ..... @vvv
+xvmin_hu 0111 01000111 01101 ..... ..... ..... @vvv
+xvmin_wu 0111 01000111 01110 ..... ..... ..... @vvv
+xvmin_du 0111 01000111 01111 ..... ..... ..... @vvv
+
+xvmini_b 0111 01101001 00100 ..... ..... ..... @vv_i5
+xvmini_h 0111 01101001 00101 ..... ..... ..... @vv_i5
+xvmini_w 0111 01101001 00110 ..... ..... ..... @vv_i5
+xvmini_d 0111 01101001 00111 ..... ..... ..... @vv_i5
+xvmini_bu 0111 01101001 01100 ..... ..... ..... @vv_ui5
+xvmini_hu 0111 01101001 01101 ..... ..... ..... @vv_ui5
+xvmini_wu 0111 01101001 01110 ..... ..... ..... @vv_ui5
+xvmini_du 0111 01101001 01111 ..... ..... ..... @vv_ui5
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index b48822e431..63c1dc757f 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1856,6 +1856,40 @@ INSN_LASX(xvadda_h, vvv)
 INSN_LASX(xvadda_w, vvv)
 INSN_LASX(xvadda_d, vvv)
 
+INSN_LASX(xvmax_b, vvv)
+INSN_LASX(xvmax_h, vvv)
+INSN_LASX(xvmax_w, vvv)
+INSN_LASX(xvmax_d, vvv)
+INSN_LASX(xvmin_b, vvv)
+INSN_LASX(xvmin_h, vvv)
+INSN_LASX(xvmin_w, vvv)
+INSN_LASX(xvmin_d, vvv)
+INSN_LASX(xvmax_bu, vvv)
+INSN_LASX(xvmax_hu, vvv)
+INSN_LASX(xvmax_wu, vvv)
+INSN_LASX(xvmax_du, vvv)
+INSN_LASX(xvmin_bu, vvv)
+INSN_LASX(xvmin_hu, vvv)
+INSN_LASX(xvmin_wu, vvv)
+INSN_LASX(xvmin_du, vvv)
+
+INSN_LASX(xvmaxi_b, vv_i)
+INSN_LASX(xvmaxi_h, vv_i)
+INSN_LASX(xvmaxi_w, vv_i)
+INSN_LASX(xvmaxi_d, vv_i)
+INSN_LASX(xvmini_b, vv_i)
+INSN_LASX(xvmini_h, vv_i)
+INSN_LASX(xvmini_w, vv_i)
+INSN_LASX(xvmini_d, vv_i)
+INSN_LASX(xvmaxi_bu, vv_i)
+INSN_LASX(xvmaxi_hu, vv_i)
+INSN_LASX(xvmaxi_wu, vv_i)
+INSN_LASX(xvmaxi_du, vv_i)
+INSN_LASX(xvmini_bu, vv_i)
+INSN_LASX(xvmini_hu, vv_i)
+INSN_LASX(xvmini_wu, vv_i)
+INSN_LASX(xvmini_du, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 819fa5e033..0c641d80c7 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -403,20 +403,18 @@ DO_VADDA(vadda_h, 16, H, DO_VABS)
 DO_VADDA(vadda_w, 32, W, DO_VABS)
 DO_VADDA(vadda_d, 64, D, DO_VABS)
 
-#define DO_MIN(a, b) (a < b ? a : b)
-#define DO_MAX(a, b) (a > b ? a : b)
-
-#define VMINMAXI(NAME, BIT, E, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \
-{ \
-    int i; \
-    VReg *Vd = (VReg *)vd; \
-    VReg *Vj = (VReg *)vj; \
-    typedef __typeof(Vd->E(0)) TD; \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E(i) = DO_OP(Vj->E(i), (TD)imm); \
-    } \
+#define VMINMAXI(NAME, BIT, E, DO_OP) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    typedef __typeof(Vd->E(0)) TD; \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
+        Vd->E(i) = DO_OP(Vj->E(i), (TD)imm); \
+    } \
 }
 
 VMINMAXI(vmini_b, 8, B, DO_MIN)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index a3f2740f74..ba31da6578 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -175,6 +175,42 @@ TRANS(xvadda_h, LASX, gvec_vvv, 32, MO_16, do_vadda)
 TRANS(xvadda_w, LASX, gvec_vvv, 32, MO_32, do_vadda)
 TRANS(xvadda_d, LASX, gvec_vvv, 32, MO_64, do_vadda)
 
+TRANS(xvmax_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_smax)
+TRANS(xvmax_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_smax)
+TRANS(xvmax_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_smax)
+TRANS(xvmax_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_smax)
+TRANS(xvmax_bu, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_umax)
+TRANS(xvmax_hu, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_umax)
+TRANS(xvmax_wu, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_umax)
+TRANS(xvmax_du, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_umax)
+
+TRANS(xvmin_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_smin)
+TRANS(xvmin_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_smin)
+TRANS(xvmin_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_smin)
+TRANS(xvmin_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_smin)
+TRANS(xvmin_bu, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_umin)
+TRANS(xvmin_hu, LASX, gvec_vvv, 32, MO_16,
tcg_gen_gvec_umin)
+TRANS(xvmin_wu, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_umin)
+TRANS(xvmin_du, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_umin)
+
+TRANS(xvmini_b, LASX, gvec_vv_i, 32, MO_8, do_vmini_s)
+TRANS(xvmini_h, LASX, gvec_vv_i, 32, MO_16, do_vmini_s)
+TRANS(xvmini_w, LASX, gvec_vv_i, 32, MO_32, do_vmini_s)
+TRANS(xvmini_d, LASX, gvec_vv_i, 32, MO_64, do_vmini_s)
+TRANS(xvmini_bu, LASX, gvec_vv_i, 32, MO_8, do_vmini_u)
+TRANS(xvmini_hu, LASX, gvec_vv_i, 32, MO_16, do_vmini_u)
+TRANS(xvmini_wu, LASX, gvec_vv_i, 32, MO_32, do_vmini_u)
+TRANS(xvmini_du, LASX, gvec_vv_i, 32, MO_64, do_vmini_u)
+
+TRANS(xvmaxi_b, LASX, gvec_vv_i, 32, MO_8, do_vmaxi_s)
+TRANS(xvmaxi_h, LASX, gvec_vv_i, 32, MO_16, do_vmaxi_s)
+TRANS(xvmaxi_w, LASX, gvec_vv_i, 32, MO_32, do_vmaxi_s)
+TRANS(xvmaxi_d, LASX, gvec_vv_i, 32, MO_64, do_vmaxi_s)
+TRANS(xvmaxi_bu, LASX, gvec_vv_i, 32, MO_8, do_vmaxi_u)
+TRANS(xvmaxi_hu, LASX, gvec_vv_i, 32, MO_16, do_vmaxi_u)
+TRANS(xvmaxi_wu, LASX, gvec_vv_i, 32, MO_32, do_vmaxi_u)
+TRANS(xvmaxi_du, LASX, gvec_vv_i, 32, MO_64, do_vmaxi_u)
+
 TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8)
 TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16)
 TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32)
-- 
2.39.1

From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 17/48] target/loongarch: Implement xvmul/xvmuh/xvmulw{ev/od}
Date: Wed, 30 Aug 2023 16:48:31 +0800
Message-Id: <20230830084902.2113960-18-gaosong@loongson.cn>
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>

This patch includes:
- XVMUL.{B/H/W/D};
- XVMUH.{B/H/W/D}[U];
- XVMULW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMULW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h | 2 +
 target/loongarch/insns.decode | 38 +++++++++++++
 target/loongarch/disas.c | 38 +++++++++++++
 target/loongarch/vec_helper.c | 57 ++++++++++---------
 target/loongarch/insn_trans/trans_lasx.c.inc | 42 ++++++++++++++
 target/loongarch/insn_trans/trans_lsx.c.inc | 60 ++++++++++----------
 6 files changed, 180 insertions(+), 57 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index cd6f6a72fd..6fc84c8c5a 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -60,4 +60,6 @@
 #define DO_MIN(a, b) (a < b ? a : b)
 #define DO_MAX(a, b) (a > b ? a : b)
 
+#define DO_MUL(a, b) (a * b)
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 99aefcb651..0f9ebe641f 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1473,6 +1473,44 @@ xvmini_hu 0111 01101001 01101 ..... ..... ..... @vv_ui5
 xvmini_wu 0111 01101001 01110 ..... ..... ..... @vv_ui5
 xvmini_du 0111 01101001 01111 ..... ..... ..... @vv_ui5
 
+xvmul_b 0111 01001000 01000 ..... ..... ..... @vvv
+xvmul_h 0111 01001000 01001 ..... ..... ..... @vvv
+xvmul_w 0111 01001000 01010 ..... ..... ..... @vvv
+xvmul_d 0111 01001000 01011 ..... ..... ..... @vvv
+xvmuh_b 0111 01001000 01100 ..... ..... ..... @vvv
+xvmuh_h 0111 01001000 01101 ..... ..... ..... @vvv
+xvmuh_w 0111 01001000 01110 ..... ..... ..... @vvv
+xvmuh_d 0111 01001000 01111 ..... ..... ..... @vvv
+xvmuh_bu 0111 01001000 10000 ..... ..... ..... @vvv
+xvmuh_hu 0111 01001000 10001 ..... ..... ..... @vvv
+xvmuh_wu 0111 01001000 10010 ..... ..... ..... @vvv
+xvmuh_du 0111 01001000 10011 ..... ..... ..... @vvv
+
+xvmulwev_h_b 0111 01001001 00000 ..... ..... ..... @vvv
+xvmulwev_w_h 0111 01001001 00001 ..... ..... ..... @vvv
+xvmulwev_d_w 0111 01001001 00010 ..... ..... ..... @vvv
+xvmulwev_q_d 0111 01001001 00011 ..... ..... ..... @vvv
+xvmulwod_h_b 0111 01001001 00100 ..... ..... ..... @vvv
+xvmulwod_w_h 0111 01001001 00101 ..... ..... ..... @vvv
+xvmulwod_d_w 0111 01001001 00110 ..... ..... ..... @vvv
+xvmulwod_q_d 0111 01001001 00111 ..... ..... ..... @vvv
+xvmulwev_h_bu 0111 01001001 10000 ..... ..... ..... @vvv
+xvmulwev_w_hu 0111 01001001 10001 ..... ..... ..... @vvv
+xvmulwev_d_wu 0111 01001001 10010 ..... ..... ..... @vvv
+xvmulwev_q_du 0111 01001001 10011 ..... ..... ..... @vvv
+xvmulwod_h_bu 0111 01001001 10100 ..... ..... ..... @vvv
+xvmulwod_w_hu 0111 01001001 10101 ..... ..... ..... @vvv
+xvmulwod_d_wu 0111 01001001 10110 ..... ..... ..... @vvv
+xvmulwod_q_du 0111 01001001 10111 ..... ..... ..... @vvv
+xvmulwev_h_bu_b 0111 01001010 00000 ..... ..... ..... @vvv
+xvmulwev_w_hu_h 0111 01001010 00001 ..... ..... ..... @vvv
+xvmulwev_d_wu_w 0111 01001010 00010 ..... ..... ..... @vvv
+xvmulwev_q_du_d 0111 01001010 00011 ..... ..... ..... @vvv
+xvmulwod_h_bu_b 0111 01001010 00100 ..... ..... ..... @vvv
+xvmulwod_w_hu_h 0111 01001010 00101 ..... ..... ..... @vvv
+xvmulwod_d_wu_w 0111 01001010 00110 ..... ..... ..... @vvv
+xvmulwod_q_du_d 0111 01001010 00111 ..... ..... ..... @vvv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... .....
@vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 63c1dc757f..e5f9a6bcdf 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1890,6 +1890,44 @@ INSN_LASX(xvmini_hu, vv_i)
 INSN_LASX(xvmini_wu, vv_i)
 INSN_LASX(xvmini_du, vv_i)
 
+INSN_LASX(xvmul_b, vvv)
+INSN_LASX(xvmul_h, vvv)
+INSN_LASX(xvmul_w, vvv)
+INSN_LASX(xvmul_d, vvv)
+INSN_LASX(xvmuh_b, vvv)
+INSN_LASX(xvmuh_h, vvv)
+INSN_LASX(xvmuh_w, vvv)
+INSN_LASX(xvmuh_d, vvv)
+INSN_LASX(xvmuh_bu, vvv)
+INSN_LASX(xvmuh_hu, vvv)
+INSN_LASX(xvmuh_wu, vvv)
+INSN_LASX(xvmuh_du, vvv)
+
+INSN_LASX(xvmulwev_h_b, vvv)
+INSN_LASX(xvmulwev_w_h, vvv)
+INSN_LASX(xvmulwev_d_w, vvv)
+INSN_LASX(xvmulwev_q_d, vvv)
+INSN_LASX(xvmulwod_h_b, vvv)
+INSN_LASX(xvmulwod_w_h, vvv)
+INSN_LASX(xvmulwod_d_w, vvv)
+INSN_LASX(xvmulwod_q_d, vvv)
+INSN_LASX(xvmulwev_h_bu, vvv)
+INSN_LASX(xvmulwev_w_hu, vvv)
+INSN_LASX(xvmulwev_d_wu, vvv)
+INSN_LASX(xvmulwev_q_du, vvv)
+INSN_LASX(xvmulwod_h_bu, vvv)
+INSN_LASX(xvmulwod_w_hu, vvv)
+INSN_LASX(xvmulwod_d_wu, vvv)
+INSN_LASX(xvmulwod_q_du, vvv)
+INSN_LASX(xvmulwev_h_bu_b, vvv)
+INSN_LASX(xvmulwev_w_hu_h, vvv)
+INSN_LASX(xvmulwev_d_wu_w, vvv)
+INSN_LASX(xvmulwev_q_du_d, vvv)
+INSN_LASX(xvmulwod_h_bu_b, vvv)
+INSN_LASX(xvmulwod_w_hu_h, vvv)
+INSN_LASX(xvmulwod_d_wu_w, vvv)
+INSN_LASX(xvmulwod_q_du_d, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 0c641d80c7..f641950cbe 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -434,58 +434,59 @@ VMINMAXI(vmaxi_hu, 16, UH, DO_MAX)
 VMINMAXI(vmaxi_wu, 32, UW, DO_MAX)
 VMINMAXI(vmaxi_du, 64, UD, DO_MAX)
 
-#define DO_VMUH(NAME, BIT, E1, E2, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
-{ \
-    int i; \
-    VReg *Vd = (VReg *)vd; \
-    VReg *Vj = (VReg *)vj; \
-    VReg *Vk = (VReg *)vk; \
-    typedef __typeof(Vd->E1(0)) T; \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E2(i) = ((T)Vj->E2(i)) * ((T)Vk->E2(i)) >> BIT; \
-    } \
+#define DO_VMUH(NAME, BIT, E1, E2, DO_OP) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
+    typedef __typeof(Vd->E1(0)) T; \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
+        Vd->E2(i) = ((T)Vj->E2(i)) * ((T)Vk->E2(i)) >> BIT; \
+    } \
 }
 
-void HELPER(vmuh_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vmuh_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    uint64_t l, h1, h2;
+    int i;
+    uint64_t l, h;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    muls64(&l, &h1, Vj->D(0), Vk->D(0));
-    muls64(&l, &h2, Vj->D(1), Vk->D(1));
-
-    Vd->D(0) = h1;
-    Vd->D(1) = h2;
+    for (i = 0; i < oprsz / 8; i++) {
+        muls64(&l, &h, Vj->D(i), Vk->D(i));
+        Vd->D(i) = h;
+    }
 }
 
 DO_VMUH(vmuh_b, 8, H, B, DO_MUH)
 DO_VMUH(vmuh_h, 16, W, H, DO_MUH)
 DO_VMUH(vmuh_w, 32, D, W, DO_MUH)
 
-void HELPER(vmuh_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vmuh_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    uint64_t l, h1, h2;
+    int i;
+    uint64_t l, h;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    mulu64(&l, &h1, Vj->D(0), Vk->D(0));
-    mulu64(&l, &h2, Vj->D(1), Vk->D(1));
-
-    Vd->D(0) = h1;
-    Vd->D(1) = h2;
+    for (i = 0; i < oprsz / 8; i++) {
+        mulu64(&l, &h, Vj->D(i), Vk->D(i));
+        Vd->D(i) = h;
+    }
 }
 
 DO_VMUH(vmuh_bu, 8, UH, UB, DO_MUH)
 DO_VMUH(vmuh_hu, 16, UW, UH, DO_MUH)
 DO_VMUH(vmuh_wu, 32, UD, UW, DO_MUH)
 
-#define DO_MUL(a, b) (a * b)
-
 DO_EVEN(vmulwev_h_b, 16, H, B, DO_MUL)
 DO_EVEN(vmulwev_w_h, 32, W, H, DO_MUL)
 DO_EVEN(vmulwev_d_w, 64, D, W, DO_MUL)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index ba31da6578..ca9361782e 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -211,6 +211,48 @@ TRANS(xvmaxi_hu, LASX, gvec_vv_i, 32, MO_16, do_vmaxi_u)
 TRANS(xvmaxi_wu, LASX, gvec_vv_i, 32, MO_32, do_vmaxi_u)
 TRANS(xvmaxi_du, LASX, gvec_vv_i, 32, MO_64, do_vmaxi_u)
 
+TRANS(xvmul_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_mul)
+TRANS(xvmul_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_mul)
+TRANS(xvmul_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_mul)
+TRANS(xvmul_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_mul)
+TRANS(xvmuh_b, LASX, gvec_vvv, 32, MO_8, do_vmuh_s)
+TRANS(xvmuh_h, LASX, gvec_vvv, 32, MO_16, do_vmuh_s)
+TRANS(xvmuh_w, LASX, gvec_vvv, 32, MO_32, do_vmuh_s)
+TRANS(xvmuh_d, LASX, gvec_vvv, 32, MO_64, do_vmuh_s)
+TRANS(xvmuh_bu, LASX, gvec_vvv, 32, MO_8, do_vmuh_u)
+TRANS(xvmuh_hu, LASX, gvec_vvv, 32, MO_16, do_vmuh_u)
+TRANS(xvmuh_wu, LASX, gvec_vvv, 32, MO_32, do_vmuh_u)
+TRANS(xvmuh_du, LASX, gvec_vvv, 32, MO_64, do_vmuh_u)
+
+TRANS(xvmulwev_h_b, LASX, gvec_vvv, 32, MO_8, do_vmulwev_s)
+TRANS(xvmulwev_w_h, LASX, gvec_vvv, 32, MO_16, do_vmulwev_s)
+TRANS(xvmulwev_d_w, LASX, gvec_vvv, 32, MO_32, do_vmulwev_s)
+
+TRANS(xvmulwev_q_d, LASX, gen_vmul_q, 32, 0, 0, tcg_gen_muls2_i64)
+TRANS(xvmulwod_q_d, LASX, gen_vmul_q, 32, 1, 1, tcg_gen_muls2_i64)
+TRANS(xvmulwev_q_du, LASX, gen_vmul_q, 32, 0, 0, tcg_gen_mulu2_i64)
+TRANS(xvmulwod_q_du, LASX, gen_vmul_q, 32, 1, 1, tcg_gen_mulu2_i64)
+TRANS(xvmulwev_q_du_d, LASX, gen_vmul_q, 32, 0, 0, tcg_gen_mulus2_i64)
+TRANS(xvmulwod_q_du_d, LASX, gen_vmul_q, 32, 1, 1, tcg_gen_mulus2_i64)
+
+TRANS(xvmulwod_h_b, LASX, gvec_vvv, 32, MO_8, do_vmulwod_s)
+TRANS(xvmulwod_w_h, LASX, gvec_vvv, 32, MO_16, do_vmulwod_s)
+TRANS(xvmulwod_d_w, LASX, gvec_vvv, 32, MO_32, do_vmulwod_s)
+
+TRANS(xvmulwev_h_bu, LASX, gvec_vvv, 32, MO_8, do_vmulwev_u)
+TRANS(xvmulwev_w_hu, LASX, gvec_vvv, 32, MO_16, do_vmulwev_u)
+TRANS(xvmulwev_d_wu, LASX,
gvec_vvv, 32, MO_32, do_vmulwev_u) +TRANS(xvmulwod_h_bu, LASX, gvec_vvv, 32, MO_8, do_vmulwod_u) +TRANS(xvmulwod_w_hu, LASX, gvec_vvv, 32, MO_16, do_vmulwod_u) +TRANS(xvmulwod_d_wu, LASX, gvec_vvv, 32, MO_32, do_vmulwod_u) + +TRANS(xvmulwev_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vmulwev_u_s) +TRANS(xvmulwev_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmulwev_u_s) +TRANS(xvmulwev_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmulwev_u_s) +TRANS(xvmulwod_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vmulwod_u_s) +TRANS(xvmulwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmulwod_u_s) +TRANS(xvmulwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmulwod_u_s) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc index 5653a556bf..d25f89a6a4 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -1764,37 +1764,39 @@ static void tcg_gen_mulus2_i64(TCGv_i64 rl, TCGv_i64 rh, tcg_gen_mulsu2_i64(rl, rh, arg2, arg1); } -#define VMUL_Q(NAME, FN, idx1, idx2) \ -static bool trans_## NAME (DisasContext *ctx, arg_vvv *a) \ -{ \ - TCGv_i64 rh, rl, arg1, arg2; \ - \ - if (!avail_LSX(ctx)) { \ - return false; \ - } \ - \ - rh = tcg_temp_new_i64(); \ - rl = tcg_temp_new_i64(); \ - arg1 = tcg_temp_new_i64(); \ - arg2 = tcg_temp_new_i64(); \ - \ - get_vreg64(arg1, a->vj, idx1); \ - get_vreg64(arg2, a->vk, idx2); \ - \ - tcg_gen_## FN ##_i64(rl, rh, arg1, arg2); \ - \ - set_vreg64(rh, a->vd, 1); \ - set_vreg64(rl, a->vd, 0); \ - \ - return true; \ +static bool gen_vmul_q(DisasContext *ctx, + arg_vvv *a, int oprsz, int idx1, int idx2, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64)) +{ + TCGv_i64 rh, rl, arg1, arg2; + int i; + + CHECK_VEC; + + rh = tcg_temp_new_i64(); + rl = tcg_temp_new_i64(); + arg1 = tcg_temp_new_i64(); + arg2 =
tcg_temp_new_i64(); + + for (i =3D 0; i < oprsz / 16; i++) { + get_vreg64(arg1, a->vj, 2 * i + idx1); + get_vreg64(arg2, a->vk, 2 * i + idx2); + + func(rl, rh, arg1, arg2); + + set_vreg64(rh, a->vd, 2 * i + 1); + set_vreg64(rl, a->vd, 2 * i); + } + + return true; } =20 -VMUL_Q(vmulwev_q_d, muls2, 0, 0) -VMUL_Q(vmulwod_q_d, muls2, 1, 1) -VMUL_Q(vmulwev_q_du, mulu2, 0, 0) -VMUL_Q(vmulwod_q_du, mulu2, 1, 1) -VMUL_Q(vmulwev_q_du_d, mulus2, 0, 0) -VMUL_Q(vmulwod_q_du_d, mulus2, 1, 1) +TRANS(vmulwev_q_d, LSX, gen_vmul_q, 16, 0, 0, tcg_gen_muls2_i64) +TRANS(vmulwod_q_d, LSX, gen_vmul_q, 16, 1, 1, tcg_gen_muls2_i64) +TRANS(vmulwev_q_du, LSX, gen_vmul_q, 16, 0, 0, tcg_gen_mulu2_i64) +TRANS(vmulwod_q_du, LSX, gen_vmul_q, 16, 1, 1, tcg_gen_mulu2_i64) +TRANS(vmulwev_q_du_d, LSX, gen_vmul_q, 16, 0, 0, tcg_gen_mulus2_i64) +TRANS(vmulwod_q_du_d, LSX, gen_vmul_q, 16, 1, 1, tcg_gen_mulus2_i64) =20 static void gen_vmulwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385485591431.28986361402553; Wed, 30 Aug 2023 01:51:25 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtM-0000lj-ED; Wed, 30 Aug 2023 04:49:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtK-0000ZF-I3 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:34 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtH-0007Uv-0t for qemu-devel@nongnu.org; Wed, 30 Aug 
2023 04:49:34 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxlPCOAu9kjQgdAA--.59292S3; Wed, 30 Aug 2023 16:49:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S20; Wed, 30 Aug 2023 16:49:17 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 18/48] target/loongarch: Implement xvmadd/xvmsub/xvmaddw{ev/od} Date: Wed, 30 Aug 2023 16:48:32 +0800 Message-Id: <20230830084902.2113960-19-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S20 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385487190100013 Content-Type: text/plain; charset="utf-8" This patch includes: - XVMADD.{B/H/W/D}; - 
XVMSUB.{B/H/W/D}; - XVMADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U]; - XVMADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 3 + target/loongarch/insns.decode | 34 ++++++ target/loongarch/disas.c | 34 ++++++ target/loongarch/vec_helper.c | 113 ++++++++++--------- target/loongarch/insn_trans/trans_lasx.c.inc | 38 +++++++ target/loongarch/insn_trans/trans_lsx.c.inc | 72 ++++++------ 6 files changed, 203 insertions(+), 91 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 6fc84c8c5a..06c8d7e314 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -62,4 +62,7 @@ =20 #define DO_MUL(a, b) (a * b) =20 +#define DO_MADD(a, b, c) (a + b * c) +#define DO_MSUB(a, b, c) (a - b * c) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0f9ebe641f..d6fb51ae64 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1511,6 +1511,40 @@ xvmulwod_w_hu_h 0111 01001010 00101 ..... ..... ...= .. @vvv xvmulwod_d_wu_w 0111 01001010 00110 ..... ..... ..... @vvv xvmulwod_q_du_d 0111 01001010 00111 ..... ..... ..... @vvv =20 +xvmadd_b 0111 01001010 10000 ..... ..... ..... @vvv +xvmadd_h 0111 01001010 10001 ..... ..... ..... @vvv +xvmadd_w 0111 01001010 10010 ..... ..... ..... @vvv +xvmadd_d 0111 01001010 10011 ..... ..... ..... @vvv +xvmsub_b 0111 01001010 10100 ..... ..... ..... @vvv +xvmsub_h 0111 01001010 10101 ..... ..... ..... @vvv +xvmsub_w 0111 01001010 10110 ..... ..... ..... @vvv +xvmsub_d 0111 01001010 10111 ..... ..... ..... @vvv + +xvmaddwev_h_b 0111 01001010 11000 ..... ..... ..... @vvv +xvmaddwev_w_h 0111 01001010 11001 ..... ..... ..... @vvv +xvmaddwev_d_w 0111 01001010 11010 ..... ..... ..... @vvv +xvmaddwev_q_d 0111 01001010 11011 ..... ..... ..... @vvv +xvmaddwod_h_b 0111 01001010 11100 ..... ..... ..... @vvv +xvmaddwod_w_h 0111 01001010 11101 ..... ..... ..... 
@vvv +xvmaddwod_d_w 0111 01001010 11110 ..... ..... ..... @vvv +xvmaddwod_q_d 0111 01001010 11111 ..... ..... ..... @vvv +xvmaddwev_h_bu 0111 01001011 01000 ..... ..... ..... @vvv +xvmaddwev_w_hu 0111 01001011 01001 ..... ..... ..... @vvv +xvmaddwev_d_wu 0111 01001011 01010 ..... ..... ..... @vvv +xvmaddwev_q_du 0111 01001011 01011 ..... ..... ..... @vvv +xvmaddwod_h_bu 0111 01001011 01100 ..... ..... ..... @vvv +xvmaddwod_w_hu 0111 01001011 01101 ..... ..... ..... @vvv +xvmaddwod_d_wu 0111 01001011 01110 ..... ..... ..... @vvv +xvmaddwod_q_du 0111 01001011 01111 ..... ..... ..... @vvv +xvmaddwev_h_bu_b 0111 01001011 11000 ..... ..... ..... @vvv +xvmaddwev_w_hu_h 0111 01001011 11001 ..... ..... ..... @vvv +xvmaddwev_d_wu_w 0111 01001011 11010 ..... ..... ..... @vvv +xvmaddwev_q_du_d 0111 01001011 11011 ..... ..... ..... @vvv +xvmaddwod_h_bu_b 0111 01001011 11100 ..... ..... ..... @vvv +xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ..... @vvv +xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... @vvv +xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e5f9a6bcdf..b115fe8315 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1928,6 +1928,40 @@ INSN_LASX(xvmulwod_w_hu_h, vvv) INSN_LASX(xvmulwod_d_wu_w, vvv) INSN_LASX(xvmulwod_q_du_d, vvv) =20 +INSN_LASX(xvmadd_b, vvv) +INSN_LASX(xvmadd_h, vvv) +INSN_LASX(xvmadd_w, vvv) +INSN_LASX(xvmadd_d, vvv) +INSN_LASX(xvmsub_b, vvv) +INSN_LASX(xvmsub_h, vvv) +INSN_LASX(xvmsub_w, vvv) +INSN_LASX(xvmsub_d, vvv) + +INSN_LASX(xvmaddwev_h_b, vvv) +INSN_LASX(xvmaddwev_w_h, vvv) +INSN_LASX(xvmaddwev_d_w, vvv) +INSN_LASX(xvmaddwev_q_d, vvv) +INSN_LASX(xvmaddwod_h_b, vvv) +INSN_LASX(xvmaddwod_w_h, vvv) +INSN_LASX(xvmaddwod_d_w, vvv) +INSN_LASX(xvmaddwod_q_d, vvv) +INSN_LASX(xvmaddwev_h_bu, vvv) +INSN_LASX(xvmaddwev_w_hu, vvv) +INSN_LASX(xvmaddwev_d_wu, vvv) +INSN_LASX(xvmaddwev_q_du, vvv) +INSN_LASX(xvmaddwod_h_bu, vvv) +INSN_LASX(xvmaddwod_w_hu, vvv) +INSN_LASX(xvmaddwod_d_wu, vvv) +INSN_LASX(xvmaddwod_q_du, vvv) +INSN_LASX(xvmaddwev_h_bu_b, vvv) +INSN_LASX(xvmaddwev_w_hu_h, vvv) +INSN_LASX(xvmaddwev_d_wu_w, vvv) +INSN_LASX(xvmaddwev_q_du_d, vvv) +INSN_LASX(xvmaddwod_h_bu_b, vvv) +INSN_LASX(xvmaddwod_w_hu_h, vvv) +INSN_LASX(xvmaddwod_d_wu_w, vvv) +INSN_LASX(xvmaddwod_q_du_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index f641950cbe..5a1bff8b04 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -511,19 +511,18 @@ DO_ODD_U_S(vmulwod_h_bu_b, 16, H, UH, B, UB, DO_MUL) DO_ODD_U_S(vmulwod_w_hu_h, 32, W, UW, H, UH, DO_MUL) DO_ODD_U_S(vmulwod_d_wu_w, 64, D, UD, W, UW, DO_MUL) =20 -#define DO_MADD(a, b, c) (a + b * c) -#define DO_MSUB(a, b, c) (a - b * c) - -#define VMADDSUB(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk 
=3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vd->E(i), Vj->E(i) ,Vk->E(i)); \ - } \ +#define VMADDSUB(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vd->E(i), Vj->E(i) ,Vk->E(i)); \ + } \ } =20 VMADDSUB(vmadd_b, 8, B, DO_MADD) @@ -536,15 +535,16 @@ VMADDSUB(vmsub_w, 32, W, DO_MSUB) VMADDSUB(vmsub_d, 64, D, DO_MSUB) =20 #define VMADDWEV(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E1(i) +=3D DO_OP((TD)Vj->E2(2 * i), (TD)Vk->E2(2 * i)); \ } \ } @@ -556,19 +556,20 @@ VMADDWEV(vmaddwev_h_bu, 16, UH, UB, DO_MUL) VMADDWEV(vmaddwev_w_hu, 32, UW, UH, DO_MUL) VMADDWEV(vmaddwev_d_wu, 64, UD, UW, DO_MUL) =20 -#define VMADDWOD(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->E1(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) +=3D DO_OP((TD)Vj->E2(2 * i + 1), \ - (TD)Vk->E2(2 * i + 1)); \ - } \ +#define VMADDWOD(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + typedef __typeof(Vd->E1(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E1(i) +=3D 
DO_OP((TD)Vj->E2(2 * i + 1), \ + (TD)Vk->E2(2 * i + 1)); \ + } \ } =20 VMADDWOD(vmaddwod_h_b, 16, H, B, DO_MUL) @@ -578,40 +579,42 @@ VMADDWOD(vmaddwod_h_bu, 16, UH, UB, DO_MUL) VMADDWOD(vmaddwod_w_hu, 32, UW, UH, DO_MUL) VMADDWOD(vmaddwod_d_wu, 64, UD, UW, DO_MUL) =20 -#define VMADDWEV_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->ES1(0)) TS1; \ - typedef __typeof(Vd->EU1(0)) TU1; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i), \ - (TS1)Vk->ES2(2 * i)); \ - } \ +#define VMADDWEV_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + typedef __typeof(Vd->ES1(0)) TS1; \ + typedef __typeof(Vd->EU1(0)) TU1; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i), \ + (TS1)Vk->ES2(2 * i)); \ + } \ } =20 VMADDWEV_U_S(vmaddwev_h_bu_b, 16, H, UH, B, UB, DO_MUL) VMADDWEV_U_S(vmaddwev_w_hu_h, 32, W, UW, H, UH, DO_MUL) VMADDWEV_U_S(vmaddwev_d_wu_w, 64, D, UD, W, UW, DO_MUL) =20 -#define VMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +#define VMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->ES1(0)) TS1; \ - typedef __typeof(Vd->EU1(0)) TU1; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ - (TS1)Vk->ES2(2 * i + 1)); \ - } \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + 
typedef __typeof(Vd->ES1(0)) TS1; \ + typedef __typeof(Vd->EU1(0)) TU1; \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ + (TS1)Vk->ES2(2 * i + 1)); \ + } \ } =20 VMADDWOD_U_S(vmaddwod_h_bu_b, 16, H, UH, B, UB, DO_MUL) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index ca9361782e..1073118417 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -253,6 +253,44 @@ TRANS(xvmulwod_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vm= ulwod_u_s) TRANS(xvmulwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmulwod_u_s) TRANS(xvmulwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmulwod_u_s) =20 +TRANS(xvmadd_b, LASX, gvec_vvv, 32, MO_8, do_vmadd) +TRANS(xvmadd_h, LASX, gvec_vvv, 32, MO_16, do_vmadd) +TRANS(xvmadd_w, LASX, gvec_vvv, 32, MO_32, do_vmadd) +TRANS(xvmadd_d, LASX, gvec_vvv, 32, MO_64, do_vmadd) +TRANS(xvmsub_b, LASX, gvec_vvv, 32, MO_8, do_vmsub) +TRANS(xvmsub_h, LASX, gvec_vvv, 32, MO_16, do_vmsub) +TRANS(xvmsub_w, LASX, gvec_vvv, 32, MO_32, do_vmsub) +TRANS(xvmsub_d, LASX, gvec_vvv, 32, MO_64, do_vmsub) + +TRANS(xvmaddwev_h_b, LASX, gvec_vvv, 32, MO_8, do_vmaddwev_s) +TRANS(xvmaddwev_w_h, LASX, gvec_vvv, 32, MO_16, do_vmaddwev_s) +TRANS(xvmaddwev_d_w, LASX, gvec_vvv, 32, MO_32, do_vmaddwev_s) + +TRANS(xvmaddwev_q_d, LASX, gen_vmadd_q, 32, 0, 0, tcg_gen_muls2_i64) +TRANS(xvmaddwod_q_d, LASX, gen_vmadd_q, 32, 1, 1, tcg_gen_muls2_i64) +TRANS(xvmaddwev_q_du, LASX, gen_vmadd_q, 32, 0, 0, tcg_gen_mulu2_i64) +TRANS(xvmaddwod_q_du, LASX, gen_vmadd_q, 32, 1, 1, tcg_gen_mulu2_i64) +TRANS(xvmaddwev_q_du_d, LASX, gen_vmadd_q, 32, 0, 0, tcg_gen_mulus2_i64) +TRANS(xvmaddwod_q_du_d, LASX, gen_vmadd_q, 32, 1, 1, tcg_gen_mulus2_i64) + +TRANS(xvmaddwod_h_b, LASX, gvec_vvv, 32, MO_8, do_vmaddwod_s) +TRANS(xvmaddwod_w_h, LASX, gvec_vvv, 32, MO_16, do_vmaddwod_s) +TRANS(xvmaddwod_d_w, LASX, gvec_vvv, 32, MO_32, do_vmaddwod_s) + 
+TRANS(xvmaddwev_h_bu, LASX, gvec_vvv, 32, MO_8, do_vmaddwev_u) +TRANS(xvmaddwev_w_hu, LASX, gvec_vvv, 32, MO_16, do_vmaddwev_u) +TRANS(xvmaddwev_d_wu, LASX, gvec_vvv, 32, MO_32, do_vmaddwev_u) +TRANS(xvmaddwod_h_bu, LASX, gvec_vvv, 32, MO_8, do_vmaddwod_u) +TRANS(xvmaddwod_w_hu, LASX, gvec_vvv, 32, MO_16, do_vmaddwod_u) +TRANS(xvmaddwod_d_wu, LASX, gvec_vvv, 32, MO_32, do_vmaddwod_u) + +TRANS(xvmaddwev_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vmaddwev_u_s) +TRANS(xvmaddwev_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmaddwev_u_s) +TRANS(xvmaddwev_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmaddwev_u_s) +TRANS(xvmaddwod_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vmaddwod_u_s) +TRANS(xvmaddwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmaddwod_u_s) +TRANS(xvmaddwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmaddwod_u_s) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index d25f89a6a4..7e77686bfc 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -2371,42 +2371,42 @@ TRANS(vmaddwev_h_b, LSX, gvec_vvv, 16, MO_8, do_vma= ddwev_s) TRANS(vmaddwev_w_h, LSX, gvec_vvv, 16, MO_16, do_vmaddwev_s) TRANS(vmaddwev_d_w, LSX, gvec_vvv, 16, MO_32, do_vmaddwev_s) =20 -#define VMADD_Q(NAME, FN, idx1, idx2) \ -static bool trans_## NAME (DisasContext *ctx, arg_vvv *a) \ -{ \ - TCGv_i64 rh, rl, arg1, arg2, th, tl; \ - \ - if (!avail_LSX(ctx)) { \ - return false; \ - } \ - \ - rh =3D tcg_temp_new_i64(); \ - rl =3D tcg_temp_new_i64(); \ - arg1 =3D tcg_temp_new_i64(); \ - arg2 =3D tcg_temp_new_i64(); \ - th =3D tcg_temp_new_i64(); \ - tl =3D tcg_temp_new_i64(); \ - \ - get_vreg64(arg1, a->vj, idx1); \ - get_vreg64(arg2, a->vk, idx2); \ - get_vreg64(rh, a->vd, 1); \ - get_vreg64(rl, a->vd, 0); \ - \ - tcg_gen_## FN ##_i64(tl, th, arg1, arg2); \ - 
tcg_gen_add2_i64(rl, rh, rl, rh, tl, th); \ - \ - set_vreg64(rh, a->vd, 1); \ - set_vreg64(rl, a->vd, 0); \ - \ - return true; \ -} - -VMADD_Q(vmaddwev_q_d, muls2, 0, 0) -VMADD_Q(vmaddwod_q_d, muls2, 1, 1) -VMADD_Q(vmaddwev_q_du, mulu2, 0, 0) -VMADD_Q(vmaddwod_q_du, mulu2, 1, 1) -VMADD_Q(vmaddwev_q_du_d, mulus2, 0, 0) -VMADD_Q(vmaddwod_q_du_d, mulus2, 1, 1) +static bool gen_vmadd_q(DisasContext * ctx, + arg_vvv *a, int oprsz, int idx1, int idx2, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64)) +{ + TCGv_i64 rh, rl, arg1, arg2, th, tl; + int i; + + rh = tcg_temp_new_i64(); + rl = tcg_temp_new_i64(); + arg1 = tcg_temp_new_i64(); + arg2 = tcg_temp_new_i64(); + th = tcg_temp_new_i64(); + tl = tcg_temp_new_i64(); + + for (i = 0; i < oprsz / 16; i++) { + get_vreg64(arg1, a->vj, 2 * i + idx1); + get_vreg64(arg2, a->vk, 2 * i + idx2); + get_vreg64(rh, a->vd, 2 * i + 1); + get_vreg64(rl, a->vd, 2 * i); + + func(tl, th, arg1, arg2); + tcg_gen_add2_i64(rl, rh, rl, rh, tl, th); + + set_vreg64(rh, a->vd, 2 * i + 1); + set_vreg64(rl, a->vd, 2 * i); + } + + return true; +} + +TRANS(vmaddwev_q_d, LSX, gen_vmadd_q, 16, 0, 0, tcg_gen_muls2_i64) +TRANS(vmaddwod_q_d, LSX, gen_vmadd_q, 16, 1, 1, tcg_gen_muls2_i64) +TRANS(vmaddwev_q_du, LSX, gen_vmadd_q, 16, 0, 0, tcg_gen_mulu2_i64) +TRANS(vmaddwod_q_du, LSX, gen_vmadd_q, 16, 1, 1, tcg_gen_mulu2_i64) +TRANS(vmaddwev_q_du_d, LSX, gen_vmadd_q, 16, 0, 0, tcg_gen_mulus2_i64) +TRANS(vmaddwod_q_du_d, LSX, gen_vmadd_q, 16, 1, 1, tcg_gen_mulus2_i64) static void gen_vmaddwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { -- 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id
1693385382401840.0636562927298; Wed, 30 Aug 2023 01:49:42 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtL-0000fH-GF; Wed, 30 Aug 2023 04:49:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtK-0000Xz-Ad for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:34 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtH-0007Ux-3L for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:34 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxfeuOAu9kkggdAA--.56989S3; Wed, 30 Aug 2023 16:49:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S21; Wed, 30 Aug 2023 16:49:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 19/48] target/loongarch: Implement xvdiv/xvmod Date: Wed, 30 Aug 2023 16:48:33 +0800 Message-Id: <20230830084902.2113960-20-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S21 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn;
helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385385486100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVDIV.{B/H/W/D}[U]; - XVMOD.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 7 +++++++ target/loongarch/insns.decode | 17 +++++++++++++++++ target/loongarch/disas.c | 17 +++++++++++++++++ target/loongarch/vec_helper.c | 10 ++-------- target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++++++++++ 5 files changed, 60 insertions(+), 8 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 06c8d7e314..ee50d53f4e 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -65,4 +65,11 @@ #define DO_MADD(a, b, c) (a + b * c) #define DO_MSUB(a, b, c) (a - b * c) +#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M) +#define DO_REMU(N, M) (unlikely(M == 0) ? 0 : N % M) +#define DO_DIV(N, M) (unlikely(M == 0) ? 0 :\ + unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M) +#define DO_REM(N, M) (unlikely(M == 0) ? 0 :\ + unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d6fb51ae64..fa25c876b4 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1545,6 +1545,23 @@ xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ..... @vvv xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... 
@vvv xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... @vvv =20 +xvdiv_b 0111 01001110 00000 ..... ..... ..... @vvv +xvdiv_h 0111 01001110 00001 ..... ..... ..... @vvv +xvdiv_w 0111 01001110 00010 ..... ..... ..... @vvv +xvdiv_d 0111 01001110 00011 ..... ..... ..... @vvv +xvmod_b 0111 01001110 00100 ..... ..... ..... @vvv +xvmod_h 0111 01001110 00101 ..... ..... ..... @vvv +xvmod_w 0111 01001110 00110 ..... ..... ..... @vvv +xvmod_d 0111 01001110 00111 ..... ..... ..... @vvv +xvdiv_bu 0111 01001110 01000 ..... ..... ..... @vvv +xvdiv_hu 0111 01001110 01001 ..... ..... ..... @vvv +xvdiv_wu 0111 01001110 01010 ..... ..... ..... @vvv +xvdiv_du 0111 01001110 01011 ..... ..... ..... @vvv +xvmod_bu 0111 01001110 01100 ..... ..... ..... @vvv +xvmod_hu 0111 01001110 01101 ..... ..... ..... @vvv +xvmod_wu 0111 01001110 01110 ..... ..... ..... @vvv +xvmod_du 0111 01001110 01111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index b115fe8315..72df9f0b08 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1962,6 +1962,23 @@ INSN_LASX(xvmaddwod_w_hu_h, vvv) INSN_LASX(xvmaddwod_d_wu_w, vvv) INSN_LASX(xvmaddwod_q_du_d, vvv) =20 +INSN_LASX(xvdiv_b, vvv) +INSN_LASX(xvdiv_h, vvv) +INSN_LASX(xvdiv_w, vvv) +INSN_LASX(xvdiv_d, vvv) +INSN_LASX(xvdiv_bu, vvv) +INSN_LASX(xvdiv_hu, vvv) +INSN_LASX(xvdiv_wu, vvv) +INSN_LASX(xvdiv_du, vvv) +INSN_LASX(xvmod_b, vvv) +INSN_LASX(xvmod_h, vvv) +INSN_LASX(xvmod_w, vvv) +INSN_LASX(xvmod_d, vvv) +INSN_LASX(xvmod_bu, vvv) +INSN_LASX(xvmod_hu, vvv) +INSN_LASX(xvmod_wu, vvv) +INSN_LASX(xvmod_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 5a1bff8b04..d217d76ea7 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -621,13 +621,6 @@ VMADDWOD_U_S(vmaddwod_h_bu_b, 16, H, UH, B, UB, DO_MUL) VMADDWOD_U_S(vmaddwod_w_hu_h, 32, W, UW, H, UH, DO_MUL) VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MUL) =20 -#define DO_DIVU(N, M) (unlikely(M =3D=3D 0) ? 0 : N / M) -#define DO_REMU(N, M) (unlikely(M =3D=3D 0) ? 0 : N % M) -#define DO_DIV(N, M) (unlikely(M =3D=3D 0) ? 0 :\ - unlikely((N =3D=3D -N) && (M =3D=3D (__typeof(N))(-1))) ? N : N / = M) -#define DO_REM(N, M) (unlikely(M =3D=3D 0) ? 0 :\ - unlikely((N =3D=3D -N) && (M =3D=3D (__typeof(N))(-1))) ? 
0 : N % M) - #define VDIV(NAME, BIT, E, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ @@ -635,8 +628,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ VReg *Vd = (VReg *)vd; \ VReg *Vj = (VReg *)vj; \ VReg *Vk = (VReg *)vk; \ + int oprsz = simd_oprsz(desc); \ \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ + for (i = 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)); \ } \ } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 1073118417..fff6ddd3e0 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -291,6 +291,23 @@ TRANS(xvmaddwod_h_bu_b, LASX, gvec_vvv, 32, MO_8, do_vmaddwod_u_s) TRANS(xvmaddwod_w_hu_h, LASX, gvec_vvv, 32, MO_16, do_vmaddwod_u_s) TRANS(xvmaddwod_d_wu_w, LASX, gvec_vvv, 32, MO_32, do_vmaddwod_u_s) +TRANS(xvdiv_b, LASX, gen_vvv, 32, gen_helper_vdiv_b) +TRANS(xvdiv_h, LASX, gen_vvv, 32, gen_helper_vdiv_h) +TRANS(xvdiv_w, LASX, gen_vvv, 32, gen_helper_vdiv_w) +TRANS(xvdiv_d, LASX, gen_vvv, 32, gen_helper_vdiv_d) +TRANS(xvdiv_bu, LASX, gen_vvv, 32, gen_helper_vdiv_bu) +TRANS(xvdiv_hu, LASX, gen_vvv, 32, gen_helper_vdiv_hu) +TRANS(xvdiv_wu, LASX, gen_vvv, 32, gen_helper_vdiv_wu) +TRANS(xvdiv_du, LASX, gen_vvv, 32, gen_helper_vdiv_du) +TRANS(xvmod_b, LASX, gen_vvv, 32, gen_helper_vmod_b) +TRANS(xvmod_h, LASX, gen_vvv, 32, gen_helper_vmod_h) +TRANS(xvmod_w, LASX, gen_vvv, 32, gen_helper_vmod_w) +TRANS(xvmod_d, LASX, gen_vvv, 32, gen_helper_vmod_d) +TRANS(xvmod_bu, LASX, gen_vvv, 32, gen_helper_vmod_bu) +TRANS(xvmod_hu, LASX, gen_vvv, 32, gen_helper_vmod_hu) +TRANS(xvmod_wu, LASX, gen_vvv, 32, gen_helper_vmod_wu) +TRANS(xvmod_du, LASX, gen_vvv, 32, gen_helper_vmod_du) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) -- 2.39.1 From nobody Tue May 14
05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16933855295934.4511100167769655; Wed, 30 Aug 2023 01:52:09 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtN-0000qx-7w; Wed, 30 Aug 2023 04:49:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtM-0000nA-HC for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:36 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtI-0007V3-N3 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:36 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxxPCPAu9klQgdAA--.60061S3; Wed, 30 Aug 2023 16:49:19 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S22; Wed, 30 Aug 2023 16:49:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 20/48] target/loongarch: Implement xvsat Date: Wed, 30 Aug 2023 16:48:34 +0800 Message-Id: <20230830084902.2113960-21-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S22 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385531304100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSAT.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 ++++ target/loongarch/disas.c | 9 ++++ target/loongarch/vec_helper.c | 48 ++++++++++---------- target/loongarch/insn_trans/trans_lasx.c.inc | 9 ++++ 4 files changed, 52 insertions(+), 23 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fa25c876b4..e366cf7615 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1562,6 +1562,15 @@ xvmod_hu 0111 01001110 01101 ..... ..... ...= .. @vvv xvmod_wu 0111 01001110 01110 ..... ..... ..... @vvv xvmod_du 0111 01001110 01111 ..... ..... ..... @vvv =20 +xvsat_b 0111 01110010 01000 01 ... ..... ..... @vv_ui3 +xvsat_h 0111 01110010 01000 1 .... ..... ..... @vv_ui4 +xvsat_w 0111 01110010 01001 ..... ..... ..... @vv_ui5 +xvsat_d 0111 01110010 0101 ...... ..... ..... @vv_ui6 +xvsat_bu 0111 01110010 10000 01 ... ..... ..... @vv_ui3 +xvsat_hu 0111 01110010 10000 1 .... ..... ..... 
@vv_ui4 +xvsat_wu 0111 01110010 10001 ..... ..... ..... @vv_ui5 +xvsat_du 0111 01110010 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 72df9f0b08..09e5981fc3 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1979,6 +1979,15 @@ INSN_LASX(xvmod_hu, vvv) INSN_LASX(xvmod_wu, vvv) INSN_LASX(xvmod_du, vvv) =20 +INSN_LASX(xvsat_b, vv_i) +INSN_LASX(xvsat_h, vv_i) +INSN_LASX(xvsat_w, vv_i) +INSN_LASX(xvsat_d, vv_i) +INSN_LASX(xvsat_bu, vv_i) +INSN_LASX(xvsat_hu, vv_i) +INSN_LASX(xvsat_wu, vv_i) +INSN_LASX(xvsat_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index d217d76ea7..44daf5ee9a 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -652,18 +652,19 @@ VDIV(vmod_hu, 16, UH, DO_REMU) VDIV(vmod_wu, 32, UW, DO_REMU) VDIV(vmod_du, 64, UD, DO_REMU) =20 -#define VSAT_S(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : \ - Vj->E(i) < (TD)~max ? (TD)~max: Vj->E(i); \ - } \ +#define VSAT_S(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : \ + Vj->E(i) < (TD)~max ? 
(TD)~max: Vj->E(i); \ + } \ } =20 VSAT_S(vsat_b, 8, B) @@ -671,17 +672,18 @@ VSAT_S(vsat_h, 16, H) VSAT_S(vsat_w, 32, W) VSAT_S(vsat_d, 64, D) =20 -#define VSAT_U(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : Vj->E(i); \ - } \ +#define VSAT_U(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : Vj->E(i); \ + } \ } =20 VSAT_U(vsat_bu, 8, UB) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index fff6ddd3e0..093cf2a1fa 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -308,6 +308,15 @@ TRANS(xvmod_hu, LASX, gen_vvv, 32, gen_helper_vmod_hu) TRANS(xvmod_wu, LASX, gen_vvv, 32, gen_helper_vmod_wu) TRANS(xvmod_du, LASX, gen_vvv, 32, gen_helper_vmod_du) =20 +TRANS(xvsat_b, LASX, gvec_vv_i, 32, MO_8, do_vsat_s) +TRANS(xvsat_h, LASX, gvec_vv_i, 32, MO_16, do_vsat_s) +TRANS(xvsat_w, LASX, gvec_vv_i, 32, MO_32, do_vsat_s) +TRANS(xvsat_d, LASX, gvec_vv_i, 32, MO_64, do_vsat_s) +TRANS(xvsat_bu, LASX, gvec_vv_i, 32, MO_8, do_vsat_u) +TRANS(xvsat_hu, LASX, gvec_vv_i, 32, MO_16, do_vsat_u) +TRANS(xvsat_wu, LASX, gvec_vv_i, 32, MO_32, do_vsat_u) +TRANS(xvsat_du, LASX, gvec_vv_i, 32, MO_64, do_vsat_u) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: 
mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385618108762.1079227477237; Wed, 30 Aug 2023 01:53:38 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtQ-0000yb-6b; Wed, 30 Aug 2023 04:49:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtO-0000w1-Ja for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:38 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtL-0007Vm-6m for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:38 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dxg_CQAu9kmAgdAA--.59641S3; Wed, 30 Aug 2023 16:49:20 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S23; Wed, 30 Aug 2023 16:49:19 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 21/48] target/loongarch: Implement xvexth Date: Wed, 30 Aug 2023 16:48:35 +0800 Message-Id: <20230830084902.2113960-22-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S23 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385619841100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVEXTH.{H.B/W.H/D.W/Q.D}; - XVEXTH.{HU.BU/WU.HU/DU.WU/QU.DU}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 +++++ target/loongarch/disas.c | 9 +++++ target/loongarch/vec_helper.c | 36 +++++++++++++------- target/loongarch/insn_trans/trans_lasx.c.inc | 9 +++++ 4 files changed, 51 insertions(+), 12 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index e366cf7615..7491f295a5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1571,6 +1571,15 @@ xvsat_hu 0111 01110010 10000 1 .... ..... ..= ... @vv_ui4 xvsat_wu 0111 01110010 10001 ..... ..... ..... @vv_ui5 xvsat_du 0111 01110010 1001 ...... ..... ..... @vv_ui6 =20 +xvexth_h_b 0111 01101001 11101 11000 ..... ..... @vv +xvexth_w_h 0111 01101001 11101 11001 ..... ..... @vv +xvexth_d_w 0111 01101001 11101 11010 ..... ..... @vv +xvexth_q_d 0111 01101001 11101 11011 ..... ..... @vv +xvexth_hu_bu 0111 01101001 11101 11100 ..... ..... @vv +xvexth_wu_hu 0111 01101001 11101 11101 ..... ..... @vv +xvexth_du_wu 0111 01101001 11101 11110 ..... ..... 
@vv +xvexth_qu_du 0111 01101001 11101 11111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 09e5981fc3..6ca545956d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1988,6 +1988,15 @@ INSN_LASX(xvsat_hu, vv_i) INSN_LASX(xvsat_wu, vv_i) INSN_LASX(xvsat_du, vv_i) =20 +INSN_LASX(xvexth_h_b, vv) +INSN_LASX(xvexth_w_h, vv) +INSN_LASX(xvexth_d_w, vv) +INSN_LASX(xvexth_q_d, vv) +INSN_LASX(xvexth_hu_bu, vv) +INSN_LASX(xvexth_wu_hu, vv) +INSN_LASX(xvexth_du_wu, vv) +INSN_LASX(xvexth_qu_du, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 44daf5ee9a..51cc8c4526 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -691,32 +691,44 @@ VSAT_U(vsat_hu, 16, UH) VSAT_U(vsat_wu, 32, UW) VSAT_U(vsat_du, 64, UD) =20 -#define VEXTH(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ - } \ +#define VEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + Vd->E1(j + i * ofs) =3D Vj->E2(j + ofs + ofs * 2 * i); \ + } \ + } \ } =20 void HELPER(vexth_q_d)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_makes64(Vj->D(1)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D 
int128_makes64(Vj->D(2 * i + 1)); + } } =20 void HELPER(vexth_qu_du)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_make64((uint64_t)Vj->D(1)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_make64(Vj->UD(2 * i + 1)); + } } =20 VEXTH(vexth_h_b, 16, H, B) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 093cf2a1fa..3fb86d9a92 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -317,6 +317,15 @@ TRANS(xvsat_hu, LASX, gvec_vv_i, 32, MO_16, do_vsat_u) TRANS(xvsat_wu, LASX, gvec_vv_i, 32, MO_32, do_vsat_u) TRANS(xvsat_du, LASX, gvec_vv_i, 32, MO_64, do_vsat_u) =20 +TRANS(xvexth_h_b, LASX, gen_vv, 32, gen_helper_vexth_h_b) +TRANS(xvexth_w_h, LASX, gen_vv, 32, gen_helper_vexth_w_h) +TRANS(xvexth_d_w, LASX, gen_vv, 32, gen_helper_vexth_d_w) +TRANS(xvexth_q_d, LASX, gen_vv, 32, gen_helper_vexth_q_d) +TRANS(xvexth_hu_bu, LASX, gen_vv, 32, gen_helper_vexth_hu_bu) +TRANS(xvexth_wu_hu, LASX, gen_vv, 32, gen_helper_vexth_wu_hu) +TRANS(xvexth_du_wu, LASX, gen_vv, 32, gen_helper_vexth_du_wu) +TRANS(xvexth_qu_du, LASX, gen_vv, 32, gen_helper_vexth_qu_du) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385388824199.43689040053857; Wed, 30 Aug 2023 01:49:48 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1qbGtO-0000x0-TB; Wed, 30 Aug 2023 04:49:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtN-0000uw-SY for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtK-0007Vo-Nd for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxpPCRAu9kmQgdAA--.59309S3; Wed, 30 Aug 2023 16:49:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S24; Wed, 30 Aug 2023 16:49:20 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 22/48] target/loongarch: Implement vext2xv Date: Wed, 30 Aug 2023 16:48:36 +0800 Message-Id: <20230830084902.2113960-23-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S24 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 
autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385389941100005 Content-Type: text/plain; charset="utf-8" This patch includes: - VEXT2XV.{H/W/D}.B, VEXT2XV.{HU/WU/DU}.BU; - VEXT2XV.{W/D}.H, VEXT2XV.{WU/DU}.HU; - VEXT2XV.D.W, VEXT2XV.DU.WU. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 13 +++++++++ target/loongarch/insns.decode | 13 +++++++++ target/loongarch/disas.c | 13 +++++++++ target/loongarch/vec_helper.c | 28 ++++++++++++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 13 +++++++++ 5 files changed, 80 insertions(+) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 1abd9e1410..e9c5412267 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -340,6 +340,19 @@ DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(vext2xv_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_w_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_wu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_wu_hu, TCG_CALL_NO_RWG, void, ptr,
ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 7491f295a5..db1a6689f0 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1580,6 +1580,19 @@ xvexth_wu_hu 0111 01101001 11101 11101 ..... ...= .. @vv xvexth_du_wu 0111 01101001 11101 11110 ..... ..... @vv xvexth_qu_du 0111 01101001 11101 11111 ..... ..... @vv =20 +vext2xv_h_b 0111 01101001 11110 00100 ..... ..... @vv +vext2xv_w_b 0111 01101001 11110 00101 ..... ..... @vv +vext2xv_d_b 0111 01101001 11110 00110 ..... ..... @vv +vext2xv_w_h 0111 01101001 11110 00111 ..... ..... @vv +vext2xv_d_h 0111 01101001 11110 01000 ..... ..... @vv +vext2xv_d_w 0111 01101001 11110 01001 ..... ..... @vv +vext2xv_hu_bu 0111 01101001 11110 01010 ..... ..... @vv +vext2xv_wu_bu 0111 01101001 11110 01011 ..... ..... @vv +vext2xv_du_bu 0111 01101001 11110 01100 ..... ..... @vv +vext2xv_wu_hu 0111 01101001 11110 01101 ..... ..... @vv +vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @vv +vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 6ca545956d..975ea018da 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1997,6 +1997,19 @@ INSN_LASX(xvexth_wu_hu, vv) INSN_LASX(xvexth_du_wu, vv) INSN_LASX(xvexth_qu_du, vv) =20 +INSN_LASX(vext2xv_h_b, vv) +INSN_LASX(vext2xv_w_b, vv) +INSN_LASX(vext2xv_d_b, vv) +INSN_LASX(vext2xv_w_h, vv) +INSN_LASX(vext2xv_d_h, vv) +INSN_LASX(vext2xv_d_w, vv) +INSN_LASX(vext2xv_hu_bu, vv) +INSN_LASX(vext2xv_wu_bu, vv) +INSN_LASX(vext2xv_du_bu, vv) +INSN_LASX(vext2xv_wu_hu, vv) +INSN_LASX(vext2xv_du_hu, vv) +INSN_LASX(vext2xv_du_wu, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 51cc8c4526..5f78bd076b 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -738,6 +738,34 @@ VEXTH(vexth_hu_bu, 16, UH, UB) VEXTH(vexth_wu_hu, 32, UW, UH) VEXTH(vexth_du_wu, 64, UD, UW) =20 +#define VEXT2XV(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + temp.E1(i) =3D Vj->E2(i); \ + } \ + *Vd =3D temp; \ +} + +VEXT2XV(vext2xv_h_b, 16, H, B) +VEXT2XV(vext2xv_w_b, 32, W, B) +VEXT2XV(vext2xv_d_b, 64, D, B) +VEXT2XV(vext2xv_w_h, 32, W, H) +VEXT2XV(vext2xv_d_h, 64, D, H) +VEXT2XV(vext2xv_d_w, 64, D, W) +VEXT2XV(vext2xv_hu_bu, 16, UH, UB) +VEXT2XV(vext2xv_wu_bu, 32, UW, UB) +VEXT2XV(vext2xv_du_bu, 64, UD, UB) +VEXT2XV(vext2xv_wu_hu, 32, UW, UH) +VEXT2XV(vext2xv_du_hu, 64, UD, UH) +VEXT2XV(vext2xv_du_wu, 64, UD, UW) + #define DO_SIGNCOV(a, b) (a =3D=3D 0 ? 0 : a < 0 ? 
-b : b) =20 DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 3fb86d9a92..1e75815995 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -326,6 +326,19 @@ TRANS(xvexth_wu_hu, LASX, gen_vv, 32, gen_helper_vexth= _wu_hu) TRANS(xvexth_du_wu, LASX, gen_vv, 32, gen_helper_vexth_du_wu) TRANS(xvexth_qu_du, LASX, gen_vv, 32, gen_helper_vexth_qu_du) =20 +TRANS(vext2xv_h_b, LASX, gen_vv, 32, gen_helper_vext2xv_h_b) +TRANS(vext2xv_w_b, LASX, gen_vv, 32, gen_helper_vext2xv_w_b) +TRANS(vext2xv_d_b, LASX, gen_vv, 32, gen_helper_vext2xv_d_b) +TRANS(vext2xv_w_h, LASX, gen_vv, 32, gen_helper_vext2xv_w_h) +TRANS(vext2xv_d_h, LASX, gen_vv, 32, gen_helper_vext2xv_d_h) +TRANS(vext2xv_d_w, LASX, gen_vv, 32, gen_helper_vext2xv_d_w) +TRANS(vext2xv_hu_bu, LASX, gen_vv, 32, gen_helper_vext2xv_hu_bu) +TRANS(vext2xv_wu_bu, LASX, gen_vv, 32, gen_helper_vext2xv_wu_bu) +TRANS(vext2xv_du_bu, LASX, gen_vv, 32, gen_helper_vext2xv_du_bu) +TRANS(vext2xv_wu_hu, LASX, gen_vv, 32, gen_helper_vext2xv_wu_hu) +TRANS(vext2xv_du_hu, LASX, gen_vv, 32, gen_helper_vext2xv_du_hu) +TRANS(vext2xv_du_wu, LASX, gen_vv, 32, gen_helper_vext2xv_du_wu) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385520001260.37123649942; Wed, 30 Aug 2023 01:52:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1qbGtS-00014Z-Gw; Wed, 30 Aug 2023 04:49:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtP-0000yJ-W6 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:40 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtN-0007WJ-4Q for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:39 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxnuuRAu9knAgdAA--.56909S3; Wed, 30 Aug 2023 16:49:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S25; Wed, 30 Aug 2023 16:49:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 23/48] target/loongarch: Implement xvsigncov Date: Wed, 30 Aug 2023 16:48:37 +0800 Message-Id: <20230830084902.2113960-24-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S25 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham 
autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385521397100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSIGNCOV.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 2 ++ target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 5 +++++ target/loongarch/vec_helper.c | 2 -- target/loongarch/insn_trans/trans_lasx.c.inc | 5 +++++ 5 files changed, 17 insertions(+), 2 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index ee50d53f4e..681afd842f 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -72,4 +72,6 @@ #define DO_REM(N, M) (unlikely(M =3D=3D 0) ? 0 :\ unlikely((N =3D=3D -N) && (M =3D=3D (__typeof(N))(-1))) ? 0 : N % = M) =20 +#define DO_SIGNCOV(a, b) (a =3D=3D 0 ? 0 : a < 0 ? -b : b) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index db1a6689f0..7bbda1a142 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1593,6 +1593,11 @@ vext2xv_wu_hu 0111 01101001 11110 01101 ..... ...= .. @vv vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @vv vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @vv =20 +xvsigncov_b 0111 01010010 11100 ..... ..... ..... @vvv +xvsigncov_h 0111 01010010 11101 ..... ..... ..... @vvv +xvsigncov_w 0111 01010010 11110 ..... ..... ..... @vvv +xvsigncov_d 0111 01010010 11111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 975ea018da..85e0cb7c8d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2010,6 +2010,11 @@ INSN_LASX(vext2xv_wu_hu, vv) INSN_LASX(vext2xv_du_hu, vv) INSN_LASX(vext2xv_du_wu, vv) =20 +INSN_LASX(xvsigncov_b, vvv) +INSN_LASX(xvsigncov_h, vvv) +INSN_LASX(xvsigncov_w, vvv) +INSN_LASX(xvsigncov_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 5f78bd076b..0a322b3287 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -766,8 +766,6 @@ VEXT2XV(vext2xv_wu_hu, 32, UW, UH) VEXT2XV(vext2xv_du_hu, 64, UD, UH) VEXT2XV(vext2xv_du_wu, 64, UD, UW) =20 -#define DO_SIGNCOV(a, b) (a =3D=3D 0 ? 0 : a < 0 ? -b : b) - DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV) DO_3OP(vsigncov_h, 16, H, DO_SIGNCOV) DO_3OP(vsigncov_w, 32, W, DO_SIGNCOV) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 1e75815995..93dff7d20a 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -339,6 +339,11 @@ TRANS(vext2xv_wu_hu, LASX, gen_vv, 32, gen_helper_vext= 2xv_wu_hu) TRANS(vext2xv_du_hu, LASX, gen_vv, 32, gen_helper_vext2xv_du_hu) TRANS(vext2xv_du_wu, LASX, gen_vv, 32, gen_helper_vext2xv_du_wu) =20 +TRANS(xvsigncov_b, LASX, gvec_vvv, 32, MO_8, do_vsigncov) +TRANS(xvsigncov_h, LASX, gvec_vvv, 32, MO_16, do_vsigncov) +TRANS(xvsigncov_w, LASX, gvec_vvv, 32, MO_32, do_vsigncov) +TRANS(xvsigncov_d, LASX, gvec_vvv, 32, MO_64, do_vsigncov) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org 
designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385754957670.9424779043534; Wed, 30 Aug 2023 01:55:54 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtP-0000xw-Jx; Wed, 30 Aug 2023 04:49:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtN-0000v1-Tq for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtK-0007Vr-PC for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxIvCSAu9koAgdAA--.58543S3; Wed, 30 Aug 2023 16:49:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S26; Wed, 30 Aug 2023 16:49:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 24/48] target/loongarch: Implement xvmskltz/xvmskgez/xvmsknz Date: Wed, 30 Aug 2023 16:48:38 +0800 Message-Id: <20230830084902.2113960-25-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S26 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 
209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385756681100007 Content-Type: text/plain; charset="utf-8" This patch includes: - XVMSKLTZ.{B/H/W/D}; - XVMSKGEZ.B; - XVMSKNZ.B. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 7 ++ target/loongarch/disas.c | 7 ++ target/loongarch/vec_helper.c | 80 ++++++++++++++------ target/loongarch/insn_trans/trans_lasx.c.inc | 7 ++ 4 files changed, 76 insertions(+), 25 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 7bbda1a142..6a161d6d20 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1598,6 +1598,13 @@ xvsigncov_h 0111 01010010 11101 ..... ..... ...= .. @vvv xvsigncov_w 0111 01010010 11110 ..... ..... ..... @vvv xvsigncov_d 0111 01010010 11111 ..... ..... ..... @vvv =20 +xvmskltz_b 0111 01101001 11000 10000 ..... ..... @vv +xvmskltz_h 0111 01101001 11000 10001 ..... ..... @vv +xvmskltz_w 0111 01101001 11000 10010 ..... ..... @vv +xvmskltz_d 0111 01101001 11000 10011 ..... ..... @vv +xvmskgez_b 0111 01101001 11000 10100 ..... ..... @vv +xvmsknz_b 0111 01101001 11000 11000 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... 
@vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 85e0cb7c8d..1a11153343 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2010,6 +2010,13 @@ INSN_LASX(vext2xv_wu_hu, vv) INSN_LASX(vext2xv_du_hu, vv) INSN_LASX(vext2xv_du_wu, vv) =20 +INSN_LASX(xvmskltz_b, vv) +INSN_LASX(xvmskltz_h, vv) +INSN_LASX(xvmskltz_w, vv) +INSN_LASX(xvmskltz_d, vv) +INSN_LASX(xvmskgez_b, vv) +INSN_LASX(xvmsknz_b, vv) + INSN_LASX(xvsigncov_b, vvv) INSN_LASX(xvsigncov_h, vvv) INSN_LASX(xvsigncov_w, vvv) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 0a322b3287..47837875a8 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -783,14 +783,19 @@ static uint64_t do_vmskltz_b(int64_t val) =20 void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_b(Vj->D(0)); - temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_b(Vj->D(2 * i)); + temp |=3D (do_vmskltz_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_h(int64_t val) @@ -804,14 +809,19 @@ static uint64_t do_vmskltz_h(int64_t val) =20 void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_h(Vj->D(0)); - temp |=3D (do_vmskltz_h(Vj->D(1)) << 4); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_h(Vj->D(2 * i)); + temp |=3D (do_vmskltz_h(Vj->D(2 * i + 1)) << 4); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_w(int64_t val) @@ -824,14 
+834,19 @@ static uint64_t do_vmskltz_w(int64_t val) =20 void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_w(Vj->D(0)); - temp |=3D (do_vmskltz_w(Vj->D(1)) << 2); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_w(Vj->D(2 * i)); + temp |=3D (do_vmskltz_w(Vj->D(2 * i + 1)) << 2); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_d(int64_t val) @@ -840,26 +855,36 @@ static uint64_t do_vmskltz_d(int64_t val) } void HELPER(vmskltz_d)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_d(Vj->D(0)); - temp |=3D (do_vmskltz_d(Vj->D(1)) << 1); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_d(Vj->D(2 * i)); + temp |=3D (do_vmskltz_d(Vj->D(2 * i + 1)) << 1); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 void HELPER(vmskgez_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_b(Vj->D(0)); - temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); - Vd->D(0) =3D (uint16_t)(~temp); - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_b(Vj->D(2 * i)); + temp |=3D (do_vmskltz_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D (uint16_t)(~temp); + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskez_b(uint64_t a) @@ -872,16 +897,21 @@ static uint64_t do_vmskez_b(uint64_t a) return c >> 56; } =20 -void HELPER(vmsknz_b)(void vd, void vj, uint32_t desc) +void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D 
(VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskez_b(Vj->D(0)); - temp |=3D (do_vmskez_b(Vj->D(1)) << 8); - Vd->D(0) =3D (uint16_t)(~temp); - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskez_b(Vj->D(2 * i)); + temp |=3D (do_vmskez_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D (uint16_t)(~temp); + Vd->D(2 * i + 1) =3D 0; + } } =20 void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 93dff7d20a..92fae91900 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -344,6 +344,13 @@ TRANS(xvsigncov_h, LASX, gvec_vvv, 32, MO_16, do_vsign= cov) TRANS(xvsigncov_w, LASX, gvec_vvv, 32, MO_32, do_vsigncov) TRANS(xvsigncov_d, LASX, gvec_vvv, 32, MO_64, do_vsigncov) =20 +TRANS(xvmskltz_b, LASX, gen_vv, 32, gen_helper_vmskltz_b) +TRANS(xvmskltz_h, LASX, gen_vv, 32, gen_helper_vmskltz_h) +TRANS(xvmskltz_w, LASX, gen_vv, 32, gen_helper_vmskltz_w) +TRANS(xvmskltz_d, LASX, gen_vv, 32, gen_helper_vmskltz_d) +TRANS(xvmskgez_b, LASX, gen_vv, 32, gen_helper_vmskgez_b) +TRANS(xvmsknz_b, LASX, gen_vv, 32, gen_helper_vmsknz_b) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385443349407.5876644720564; Wed, 30 Aug 2023 01:50:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtR-00012j-QY; 
Wed, 30 Aug 2023 04:49:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtO-0000wJ-NI for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:38 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtL-0007Vv-3y for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:38 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxJuiSAu9koggdAA--.258S3; Wed, 30 Aug 2023 16:49:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S27; Wed, 30 Aug 2023 16:49:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 25/48] target/loongarch: Implement xvldi Date: Wed, 30 Aug 2023 16:48:39 +0800 Message-Id: <20230830084902.2113960-26-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S27 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no
action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385444150100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVLDI. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 2 ++ target/loongarch/disas.c | 7 +++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 2 ++ target/loongarch/insn_trans/trans_lsx.c.inc | 6 ++++-- 4 files changed, 15 insertions(+), 2 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 6a161d6d20..edaa756395 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1605,6 +1605,8 @@ xvmskltz_d 0111 01101001 11000 10011 ..... ....= . @vv xvmskgez_b 0111 01101001 11000 10100 ..... ..... @vv xvmsknz_b 0111 01101001 11000 11000 ..... ..... @vv =20 +xvldi 0111 01111110 00 ............. ..... @v_i13 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1a11153343..8fa2edf007 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##ty= pe * a) \ return true; \ } =20 +static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemon= ic) +{ + output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); +} + static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemo= nic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); @@ -2022,6 +2027,8 @@ INSN_LASX(xvsigncov_h, vvv) INSN_LASX(xvsigncov_w, vvv) INSN_LASX(xvsigncov_d, vvv) =20 +INSN_LASX(xvldi, v_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 92fae91900..f0e71f5f98 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -351,6 +351,8 @@ TRANS(xvmskltz_d, LASX, gen_vv, 32, gen_helper_vmskltz_= d) TRANS(xvmskgez_b, LASX, gen_vv, 32, gen_helper_vmskgez_b) TRANS(xvmsknz_b, LASX, gen_vv, 32, gen_helper_vmsknz_b) =20 +TRANS(xvldi, LASX, do_vldi, 32) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index 7e77686bfc..f76da508c3 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -3068,7 +3068,7 @@ static uint64_t vldi_get_value(DisasContext *ctx, uin= t32_t imm) return data; } =20 -static bool trans_vldi(DisasContext *ctx, arg_vldi *a) +static bool do_vldi(DisasContext *ctx, arg_vldi *a, uint32_t oprsz) { int sel, vece; uint64_t value; @@ -3089,11 +3089,13 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi = *a) vece =3D 
(a->imm >> 10) & 0x3; } =20 - tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, ctx->vl/8, + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), oprsz, ctx->vl / 8, tcg_constant_i64(value)); return true; } =20 +TRANS(vldi, LSX, do_vldi, 16) + TRANS(vand_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_and) TRANS(vor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_or) TRANS(vxor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_xor) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385485677157.79780299248364; Wed, 30 Aug 2023 01:51:25 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtR-00011C-Ai; Wed, 30 Aug 2023 04:49:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtQ-0000yL-0O for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:40 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtM-0007W8-4n for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:39 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxXOqTAu9kpQgdAA--.32133S3; Wed, 30 Aug 2023 16:49:23 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S28; Wed, 30 Aug 2023 16:49:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 26/48] target/loongarch: Implement LASX logic instructions Date: Wed, 30 Aug 2023 16:48:40 +0800 
Message-Id: <20230830084902.2113960-27-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S28 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385487182100012 Content-Type: text/plain; charset="utf-8" This patch includes: - XV{AND/OR/XOR/NOR/ANDN/ORN}.V; - XV{AND/OR/XOR/NOR}I.B. 
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 12 ++++++++++++ target/loongarch/disas.c | 12 ++++++++++++ target/loongarch/vec_helper.c | 4 ++-- target/loongarch/insn_trans/trans_lasx.c.inc | 11 +++++++++++ target/loongarch/insn_trans/trans_lsx.c.inc | 5 +++-- 5 files changed, 40 insertions(+), 4 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index edaa756395..fb28666577 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1607,6 +1607,18 @@ xvmsknz_b 0111 01101001 11000 11000 ..... ...= .. @vv =20 xvldi 0111 01111110 00 ............. ..... @v_i13 =20 +xvand_v 0111 01010010 01100 ..... ..... ..... @vvv +xvor_v 0111 01010010 01101 ..... ..... ..... @vvv +xvxor_v 0111 01010010 01110 ..... ..... ..... @vvv +xvnor_v 0111 01010010 01111 ..... ..... ..... @vvv +xvandn_v 0111 01010010 10000 ..... ..... ..... @vvv +xvorn_v 0111 01010010 10001 ..... ..... ..... @vvv + +xvandi_b 0111 01111101 00 ........ ..... ..... @vv_ui8 +xvori_b 0111 01111101 01 ........ ..... ..... @vv_ui8 +xvxori_b 0111 01111101 10 ........ ..... ..... @vv_ui8 +xvnori_b 0111 01111101 11 ........ ..... ..... @vv_ui8 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 8fa2edf007..59fa249bae 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2029,6 +2029,18 @@ INSN_LASX(xvsigncov_d, vvv) =20 INSN_LASX(xvldi, v_i) =20 +INSN_LASX(xvand_v, vvv) +INSN_LASX(xvor_v, vvv) +INSN_LASX(xvxor_v, vvv) +INSN_LASX(xvnor_v, vvv) +INSN_LASX(xvandn_v, vvv) +INSN_LASX(xvorn_v, vvv) + +INSN_LASX(xvandi_b, vv_i) +INSN_LASX(xvori_b, vv_i) +INSN_LASX(xvxori_b, vv_i) +INSN_LASX(xvnori_b, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 47837875a8..e33969339f 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -914,13 +914,13 @@ void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t de= sc) } } =20 -void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v) +void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t desc) { int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - for (i =3D 0; i < LSX_LEN/8; i++) { + for (i =3D 0; i < simd_oprsz(desc); i++) { Vd->B(i) =3D ~(Vj->B(i) | (uint8_t)imm); } } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index f0e71f5f98..9a3a504538 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -353,6 +353,17 @@ TRANS(xvmsknz_b, LASX, gen_vv, 32, gen_helper_vmsknz_b) =20 TRANS(xvldi, LASX, do_vldi, 32) =20 +TRANS(xvand_v, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_and) +TRANS(xvor_v, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_or) +TRANS(xvxor_v, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_xor) +TRANS(xvnor_v, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_nor) +TRANS(xvandn_v, LASX, do_vandn_v, 32) +TRANS(xvorn_v, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_orc) +TRANS(xvandi_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_andi) +TRANS(xvori_b, LASX, gvec_vv_i, 
32, MO_8, tcg_gen_gvec_ori) +TRANS(xvxori_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_xori) +TRANS(xvnori_b, LASX, gvec_vv_i, 32, MO_8, do_vnori_b) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index f76da508c3..64de014a58 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -3101,7 +3101,7 @@ TRANS(vor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_o= r) TRANS(vxor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_xor) TRANS(vnor_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_nor) =20 -static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a) +static bool do_vandn_v(DisasContext *ctx, arg_vvv *a, uint32_t oprsz) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 @@ -3115,9 +3115,10 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv= *a) vj_ofs =3D vec_full_offset(a->vj); vk_ofs =3D vec_full_offset(a->vk); =20 - tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, 16, ctx->vl/8); + tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } +TRANS(vandn_v, LSX, do_vandn_v, 16) TRANS(vorn_v, LSX, gvec_vvv, 16, MO_64, tcg_gen_gvec_orc) TRANS(vandi_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_andi) TRANS(vori_b, LSX, gvec_vv_i, 16, MO_8, tcg_gen_gvec_ori) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385646909104.41280781377952; Wed, 30 Aug 2023 01:54:06 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1qbGtU-00015f-4G; Wed, 30 Aug 2023 04:49:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtQ-0000zl-PD for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:40 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtN-0007WQ-PN for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:40 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxtPCUAu9kpwgdAA--.60096S3; Wed, 30 Aug 2023 16:49:24 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S29; Wed, 30 Aug 2023 16:49:23 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 27/48] target/loongarch: Implement xvsll xvsrl xvsra xvrotr Date: Wed, 30 Aug 2023 16:48:41 +0800 Message-Id: <20230830084902.2113960-28-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S29 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham 
autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385647913100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSLL[I].{B/H/W/D}; - XVSRL[I].{B/H/W/D}; - XVSRA[I].{B/H/W/D}; - XVROTR[I].{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 33 ++++++++++++++++++ target/loongarch/disas.c | 36 ++++++++++++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 36 ++++++++++++++++++++ 3 files changed, 105 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fb28666577..fb7bd9fb34 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1619,6 +1619,39 @@ xvori_b 0111 01111101 01 ........ ..... ...= .. @vv_ui8 xvxori_b 0111 01111101 10 ........ ..... ..... @vv_ui8 xvnori_b 0111 01111101 11 ........ ..... ..... @vv_ui8 =20 +xvsll_b 0111 01001110 10000 ..... ..... ..... @vvv +xvsll_h 0111 01001110 10001 ..... ..... ..... @vvv +xvsll_w 0111 01001110 10010 ..... ..... ..... @vvv +xvsll_d 0111 01001110 10011 ..... ..... ..... @vvv +xvslli_b 0111 01110010 11000 01 ... ..... ..... @vv_ui3 +xvslli_h 0111 01110010 11000 1 .... ..... ..... @vv_ui4 +xvslli_w 0111 01110010 11001 ..... ..... ..... @vv_ui5 +xvslli_d 0111 01110010 1101 ...... ..... ..... @vv_ui6 +xvsrl_b 0111 01001110 10100 ..... ..... ..... @vvv +xvsrl_h 0111 01001110 10101 ..... ..... ..... @vvv +xvsrl_w 0111 01001110 10110 ..... ..... ..... @vvv +xvsrl_d 0111 01001110 10111 ..... ..... ..... @vvv +xvsrli_b 0111 01110011 00000 01 ... ..... ..... @vv_ui3 +xvsrli_h 0111 01110011 00000 1 .... ..... ..... @vv_ui4 +xvsrli_w 0111 01110011 00001 ..... ..... ..... 
@vv_ui5 +xvsrli_d 0111 01110011 0001 ...... ..... ..... @vv_ui6 +xvsra_b 0111 01001110 11000 ..... ..... ..... @vvv +xvsra_h 0111 01001110 11001 ..... ..... ..... @vvv +xvsra_w 0111 01001110 11010 ..... ..... ..... @vvv +xvsra_d 0111 01001110 11011 ..... ..... ..... @vvv +xvsrai_b 0111 01110011 01000 01 ... ..... ..... @vv_ui3 +xvsrai_h 0111 01110011 01000 1 .... ..... ..... @vv_ui4 +xvsrai_w 0111 01110011 01001 ..... ..... ..... @vv_ui5 +xvsrai_d 0111 01110011 0101 ...... ..... ..... @vv_ui6 +xvrotr_b 0111 01001110 11100 ..... ..... ..... @vvv +xvrotr_h 0111 01001110 11101 ..... ..... ..... @vvv +xvrotr_w 0111 01001110 11110 ..... ..... ..... @vvv +xvrotr_d 0111 01001110 11111 ..... ..... ..... @vvv +xvrotri_b 0111 01101010 00000 01 ... ..... ..... @vv_ui3 +xvrotri_h 0111 01101010 00000 1 .... ..... ..... @vv_ui4 +xvrotri_w 0111 01101010 00001 ..... ..... ..... @vv_ui5 +xvrotri_d 0111 01101010 0001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 59fa249bae..e081a11aba 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2041,6 +2041,42 @@ INSN_LASX(xvori_b, vv_i) INSN_LASX(xvxori_b, vv_i) INSN_LASX(xvnori_b, vv_i) =20 +INSN_LASX(xvsll_b, vvv) +INSN_LASX(xvsll_h, vvv) +INSN_LASX(xvsll_w, vvv) +INSN_LASX(xvsll_d, vvv) +INSN_LASX(xvslli_b, vv_i) +INSN_LASX(xvslli_h, vv_i) +INSN_LASX(xvslli_w, vv_i) +INSN_LASX(xvslli_d, vv_i) + +INSN_LASX(xvsrl_b, vvv) +INSN_LASX(xvsrl_h, vvv) +INSN_LASX(xvsrl_w, vvv) +INSN_LASX(xvsrl_d, vvv) +INSN_LASX(xvsrli_b, vv_i) +INSN_LASX(xvsrli_h, vv_i) +INSN_LASX(xvsrli_w, vv_i) +INSN_LASX(xvsrli_d, vv_i) + +INSN_LASX(xvsra_b, vvv) +INSN_LASX(xvsra_h, vvv) +INSN_LASX(xvsra_w, vvv) +INSN_LASX(xvsra_d, vvv) +INSN_LASX(xvsrai_b, vv_i) +INSN_LASX(xvsrai_h, vv_i) +INSN_LASX(xvsrai_w, vv_i) +INSN_LASX(xvsrai_d, vv_i) + +INSN_LASX(xvrotr_b, vvv) +INSN_LASX(xvrotr_h, vvv) +INSN_LASX(xvrotr_w, vvv) +INSN_LASX(xvrotr_d, vvv) +INSN_LASX(xvrotri_b, vv_i) +INSN_LASX(xvrotri_h, vv_i) +INSN_LASX(xvrotri_w, vv_i) +INSN_LASX(xvrotri_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 9a3a504538..d13dfacebf 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -364,6 +364,42 @@ TRANS(xvori_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec= _ori) TRANS(xvxori_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_xori) TRANS(xvnori_b, LASX, gvec_vv_i, 32, MO_8, do_vnori_b) =20 +TRANS(xvsll_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_shlv) +TRANS(xvsll_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_shlv) +TRANS(xvsll_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_shlv) +TRANS(xvsll_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_shlv) +TRANS(xvslli_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_shli) +TRANS(xvslli_h, LASX, gvec_vv_i, 
32, MO_16, tcg_gen_gvec_shli) +TRANS(xvslli_w, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_shli) +TRANS(xvslli_d, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_shli) + +TRANS(xvsrl_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_shrv) +TRANS(xvsrl_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_shrv) +TRANS(xvsrl_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_shrv) +TRANS(xvsrl_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_shrv) +TRANS(xvsrli_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_shri) +TRANS(xvsrli_h, LASX, gvec_vv_i, 32, MO_16, tcg_gen_gvec_shri) +TRANS(xvsrli_w, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_shri) +TRANS(xvsrli_d, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_shri) + +TRANS(xvsra_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_sarv) +TRANS(xvsra_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_sarv) +TRANS(xvsra_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_sarv) +TRANS(xvsra_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_sarv) +TRANS(xvsrai_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_sari) +TRANS(xvsrai_h, LASX, gvec_vv_i, 32, MO_16, tcg_gen_gvec_sari) +TRANS(xvsrai_w, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_sari) +TRANS(xvsrai_d, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_sari) + +TRANS(xvrotr_b, LASX, gvec_vvv, 32, MO_8, tcg_gen_gvec_rotrv) +TRANS(xvrotr_h, LASX, gvec_vvv, 32, MO_16, tcg_gen_gvec_rotrv) +TRANS(xvrotr_w, LASX, gvec_vvv, 32, MO_32, tcg_gen_gvec_rotrv) +TRANS(xvrotr_d, LASX, gvec_vvv, 32, MO_64, tcg_gen_gvec_rotrv) +TRANS(xvrotri_b, LASX, gvec_vv_i, 32, MO_8, tcg_gen_gvec_rotri) +TRANS(xvrotri_h, LASX, gvec_vv_i, 32, MO_16, tcg_gen_gvec_rotri) +TRANS(xvrotri_w, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_rotri) +TRANS(xvrotri_d, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_rotri) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org 
designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385610146236.24056930851464; Wed, 30 Aug 2023 01:53:30 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGvL-00023t-VO; Wed, 30 Aug 2023 04:51:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGvJ-0001nI-V5 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:51:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGvH-00087B-4J for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:51:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxlPCWAu9kqQgdAA--.59294S3; Wed, 30 Aug 2023 16:49:26 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S30; Wed, 30 Aug 2023 16:49:23 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 28/48] target/loongarch: Implement xvsllwil xvextl Date: Wed, 30 Aug 2023 16:48:42 +0800 Message-Id: <20230830084902.2113960-29-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S30 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 
209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385611757100007 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSLLWIL.{H.B/W.H/D.W}; - XVSLLWIL.{HU.BU/WU.HU/DU.WU}; - XVEXTL.Q.D, XVEXTL.QU.DU. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 ++++ target/loongarch/disas.c | 9 ++++ target/loongarch/vec_helper.c | 44 ++++++++++++-------- target/loongarch/insn_trans/trans_lasx.c.inc | 9 ++++ 4 files changed, 54 insertions(+), 17 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fb7bd9fb34..8a7933eccc 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1652,6 +1652,15 @@ xvrotri_h 0111 01101010 00000 1 .... ..... ..= ... @vv_ui4 xvrotri_w 0111 01101010 00001 ..... ..... ..... @vv_ui5 xvrotri_d 0111 01101010 0001 ...... ..... ..... @vv_ui6 =20 +xvsllwil_h_b 0111 01110000 10000 01 ... ..... ..... @vv_ui3 +xvsllwil_w_h 0111 01110000 10000 1 .... ..... ..... @vv_ui4 +xvsllwil_d_w 0111 01110000 10001 ..... ..... ..... @vv_ui5 +xvextl_q_d 0111 01110000 10010 00000 ..... ..... @vv +xvsllwil_hu_bu 0111 01110000 11000 01 ... ..... ..... @vv_ui3 +xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..... @vv_ui4 +xvsllwil_du_wu 0111 01110000 11001 ..... .....
..... @vv_ui5 +xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e081a11aba..93c205fa32 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2077,6 +2077,15 @@ INSN_LASX(xvrotri_h, vv_i) INSN_LASX(xvrotri_w, vv_i) INSN_LASX(xvrotri_d, vv_i) =20 +INSN_LASX(xvsllwil_h_b, vv_i) +INSN_LASX(xvsllwil_w_h, vv_i) +INSN_LASX(xvsllwil_d_w, vv_i) +INSN_LASX(xvextl_q_d, vv) +INSN_LASX(xvsllwil_hu_bu, vv_i) +INSN_LASX(xvsllwil_wu_hu, vv_i) +INSN_LASX(xvsllwil_du_wu, vv_i) +INSN_LASX(xvextl_qu_du, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index e33969339f..7fe9f9f34e 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -925,37 +925,47 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm= , uint32_t desc) } } =20 -#define VSLLWIL(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(temp.E1(0)) TD; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E1(i) =3D (TD)Vj->E2(i) << (imm % BIT); \ - } \ - *Vd =3D temp; \ +#define VSLLWIL(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + typedef __typeof(temp.E1(0)) TD; = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * i) =3D (TD)Vj->E2(j + ofs * 2 * i) 
<< (imm %= BIT); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_makes64(Vj->D(0)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_makes64(Vj->D(2 * i)); + } } =20 void HELPER(vextl_qu_du)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_make64(Vj->D(0)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_make64(Vj->UD(2 * i)); + } } =20 VSLLWIL(vsllwil_h_b, 16, H, B) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index d13dfacebf..eef6f28338 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -400,6 +400,15 @@ TRANS(xvrotri_h, LASX, gvec_vv_i, 32, MO_16, tcg_gen_g= vec_rotri) TRANS(xvrotri_w, LASX, gvec_vv_i, 32, MO_32, tcg_gen_gvec_rotri) TRANS(xvrotri_d, LASX, gvec_vv_i, 32, MO_64, tcg_gen_gvec_rotri) =20 +TRANS(xvsllwil_h_b, LASX, gen_vv_i, 32, gen_helper_vsllwil_h_b) +TRANS(xvsllwil_w_h, LASX, gen_vv_i, 32, gen_helper_vsllwil_w_h) +TRANS(xvsllwil_d_w, LASX, gen_vv_i, 32, gen_helper_vsllwil_d_w) +TRANS(xvextl_q_d, LASX, gen_vv, 32, gen_helper_vextl_q_d) +TRANS(xvsllwil_hu_bu, LASX, gen_vv_i, 32, gen_helper_vsllwil_hu_bu) +TRANS(xvsllwil_wu_hu, LASX, gen_vv_i, 32, gen_helper_vsllwil_wu_hu) +TRANS(xvsllwil_du_wu, LASX, gen_vv_i, 32, gen_helper_vsllwil_du_wu) +TRANS(xvextl_qu_du, LASX, gen_vv, 32, gen_helper_vextl_qu_du) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 
209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385657474559.4333514717339; Wed, 30 Aug 2023 01:54:17 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGvA-0000HC-Kn; Wed, 30 Aug 2023 04:51:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGv8-0008Qs-QK for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:51:26 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGv5-000878-Rm for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:51:26 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEvCXAu9kqwgdAA--.58543S3; Wed, 30 Aug 2023 16:49:27 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S31; Wed, 30 Aug 2023 16:49:26 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 29/48] target/loongarch: Implement xvsrlr xvsrar Date: Wed, 30 Aug 2023 16:48:43 +0800 Message-Id: <20230830084902.2113960-30-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S31 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as 
permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385657946100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSRLR[I].{B/H/W/D}; - XVSRAR[I].{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 17 +++++++++++++++++ target/loongarch/disas.c | 18 ++++++++++++++++++ target/loongarch/vec_helper.c | 12 ++++++++---- target/loongarch/insn_trans/trans_lasx.c.inc | 18 ++++++++++++++++++ 4 files changed, 61 insertions(+), 4 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 8a7933eccc..ca0951e1cc 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1661,6 +1661,23 @@ xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..= ... @vv_ui4 xvsllwil_du_wu 0111 01110000 11001 ..... ..... ..... @vv_ui5 xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @vv =20 +xvsrlr_b 0111 01001111 00000 ..... ..... ..... @vvv +xvsrlr_h 0111 01001111 00001 ..... ..... ..... @vvv +xvsrlr_w 0111 01001111 00010 ..... ..... ..... @vvv +xvsrlr_d 0111 01001111 00011 ..... ..... ..... @vvv +xvsrlri_b 0111 01101010 01000 01 ... ..... ..... @vv_ui3 +xvsrlri_h 0111 01101010 01000 1 .... ..... ..... @vv_ui4 +xvsrlri_w 0111 01101010 01001 ..... ..... ..... @vv_ui5 +xvsrlri_d 0111 01101010 0101 ...... 
..... ..... @vv_ui6 +xvsrar_b 0111 01001111 00100 ..... ..... ..... @vvv +xvsrar_h 0111 01001111 00101 ..... ..... ..... @vvv +xvsrar_w 0111 01001111 00110 ..... ..... ..... @vvv +xvsrar_d 0111 01001111 00111 ..... ..... ..... @vvv +xvsrari_b 0111 01101010 10000 01 ... ..... ..... @vv_ui3 +xvsrari_h 0111 01101010 10000 1 .... ..... ..... @vv_ui4 +xvsrari_w 0111 01101010 10001 ..... ..... ..... @vv_ui5 +xvsrari_d 0111 01101010 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 93c205fa32..9109203a05 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2086,6 +2086,24 @@ INSN_LASX(xvsllwil_wu_hu, vv_i) INSN_LASX(xvsllwil_du_wu, vv_i) INSN_LASX(xvextl_qu_du, vv) =20 +INSN_LASX(xvsrlr_b, vvv) +INSN_LASX(xvsrlr_h, vvv) +INSN_LASX(xvsrlr_w, vvv) +INSN_LASX(xvsrlr_d, vvv) +INSN_LASX(xvsrlri_b, vv_i) +INSN_LASX(xvsrlri_h, vv_i) +INSN_LASX(xvsrlri_w, vv_i) +INSN_LASX(xvsrlri_d, vv_i) + +INSN_LASX(xvsrar_b, vvv) +INSN_LASX(xvsrar_h, vvv) +INSN_LASX(xvsrar_w, vvv) +INSN_LASX(xvsrar_d, vvv) +INSN_LASX(xvsrari_b, vv_i) +INSN_LASX(xvsrari_h, vv_i) +INSN_LASX(xvsrari_w, vv_i) +INSN_LASX(xvsrari_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 7fe9f9f34e..12a2b2a9e6 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -997,8 +997,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_= t desc) \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ } \ } @@ -1014,8 
+1015,9 @@ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, u= int32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), imm); \ } \ } @@ -1047,8 +1049,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint3= 2_t desc) \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ } \ } @@ -1064,8 +1067,9 @@ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, u= int32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), imm); \ } \ } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index eef6f28338..4a92df2cd9 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -409,6 +409,24 @@ TRANS(xvsllwil_wu_hu, LASX, gen_vv_i, 32, gen_helper_v= sllwil_wu_hu) TRANS(xvsllwil_du_wu, LASX, gen_vv_i, 32, gen_helper_vsllwil_du_wu) TRANS(xvextl_qu_du, LASX, gen_vv, 32, gen_helper_vextl_qu_du) =20 +TRANS(xvsrlr_b, LASX, gen_vvv, 32, gen_helper_vsrlr_b) +TRANS(xvsrlr_h, LASX, gen_vvv, 32, gen_helper_vsrlr_h) +TRANS(xvsrlr_w, LASX, gen_vvv, 32, gen_helper_vsrlr_w) +TRANS(xvsrlr_d, LASX, gen_vvv, 32, gen_helper_vsrlr_d) +TRANS(xvsrlri_b, LASX, gen_vv_i, 32, gen_helper_vsrlri_b) +TRANS(xvsrlri_h, LASX, gen_vv_i, 32, gen_helper_vsrlri_h) +TRANS(xvsrlri_w, LASX, gen_vv_i, 32, gen_helper_vsrlri_w) +TRANS(xvsrlri_d, LASX, gen_vv_i, 32, gen_helper_vsrlri_d) + +TRANS(xvsrar_b, LASX, gen_vvv, 32, 
gen_helper_vsrar_b) +TRANS(xvsrar_h, LASX, gen_vvv, 32, gen_helper_vsrar_h) +TRANS(xvsrar_w, LASX, gen_vvv, 32, gen_helper_vsrar_w) +TRANS(xvsrar_d, LASX, gen_vvv, 32, gen_helper_vsrar_d) +TRANS(xvsrari_b, LASX, gen_vv_i, 32, gen_helper_vsrari_b) +TRANS(xvsrari_h, LASX, gen_vv_i, 32, gen_helper_vsrari_h) +TRANS(xvsrari_w, LASX, gen_vv_i, 32, gen_helper_vsrari_w) +TRANS(xvsrari_d, LASX, gen_vv_i, 32, gen_helper_vsrari_d) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385603338349.8050545234803; Wed, 30 Aug 2023 01:53:23 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtV-00019P-Qb; Wed, 30 Aug 2023 04:49:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtU-00015p-7e for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:44 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtR-0007Wq-2g for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:43 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxtPCXAu9krAgdAA--.60098S3; Wed, 30 Aug 2023 16:49:27 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S32; Wed, 30 Aug 2023 16:49:27 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: 
richard.henderson@linaro.org Subject: [PATCH v4 30/48] target/loongarch: Implement xvsrln xvsran Date: Wed, 30 Aug 2023 16:48:44 +0800 Message-Id: <20230830084902.2113960-31-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S32 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385604849100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSRLN.{B.H/H.W/W.D}; - XVSRAN.{B.H/H.W/W.D}; - XVSRLNI.{B.H/H.W/W.D/D.Q}; - XVSRANI.{B.H/H.W/W.D/D.Q}. 
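As a rough illustration of the lane layout only (a standalone sketch, not the helper code in this patch; the name srln_b_h_model and the flat-array interface are made up for the example), XVSRLN.B.H can be modelled as follows. Each 128-bit lane narrows its eight halfwords into the low 64 bits of the same lane of the destination, and the high 64 bits of that lane are cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANE_BYTES    16   /* one 128-bit lane */
#define HALF_PER_LANE  8   /* 16-bit source elements per lane */

/*
 * Hypothetical model of XVSRLN.B.H on flat arrays (not the QEMU helper).
 * vd holds lanes * 16 bytes; vj and vk hold lanes * 8 halfwords.
 * lanes is 1 for the 128-bit form and 2 for the 256-bit form.
 */
static void srln_b_h_model(uint8_t *vd, const uint16_t *vj,
                           const uint16_t *vk, int lanes)
{
    assert(lanes == 1 || lanes == 2);

    for (int i = 0; i < lanes; i++) {
        uint8_t out[LANE_BYTES] = {0};   /* high 64 bits stay zero */

        for (int j = 0; j < HALF_PER_LANE; j++) {
            uint16_t src = vj[j + HALF_PER_LANE * i];
            unsigned shift = vk[j + HALF_PER_LANE * i] % 16;

            /* logical shift right, then keep the low 8 bits */
            out[j] = (uint8_t)(src >> shift);
        }
        memcpy(vd + LANE_BYTES * i, out, LANE_BYTES);
    }
}
```

With lanes = 2, halfword j of lane i lands in byte j + 16 * i of the result, which corresponds to the "j + ofs * 2 * i" destination index with ofs = 8 in the helper macros in this patch.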
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 2 + target/loongarch/insns.decode | 16 ++ target/loongarch/disas.c | 16 ++ target/loongarch/vec_helper.c | 168 ++++++++++--------- target/loongarch/insn_trans/trans_lasx.c.inc | 16 ++ 5 files changed, 141 insertions(+), 77 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 681afd842f..67d829f9da 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -74,4 +74,6 @@ =20 #define DO_SIGNCOV(a, b) (a =3D=3D 0 ? 0 : a < 0 ? -b : b) =20 +#define R_SHIFT(a, b) (a >> b) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ca0951e1cc..204dcfa075 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1678,6 +1678,22 @@ xvsrari_h 0111 01101010 10000 1 .... ..... ..= ... @vv_ui4 xvsrari_w 0111 01101010 10001 ..... ..... ..... @vv_ui5 xvsrari_d 0111 01101010 1001 ...... ..... ..... @vv_ui6 =20 +xvsrln_b_h 0111 01001111 01001 ..... ..... ..... @vvv +xvsrln_h_w 0111 01001111 01010 ..... ..... ..... @vvv +xvsrln_w_d 0111 01001111 01011 ..... ..... ..... @vvv +xvsran_b_h 0111 01001111 01101 ..... ..... ..... @vvv +xvsran_h_w 0111 01001111 01110 ..... ..... ..... @vvv +xvsran_w_d 0111 01001111 01111 ..... ..... ..... @vvv + +xvsrlni_b_h 0111 01110100 00000 1 .... ..... ..... @vv_ui4 +xvsrlni_h_w 0111 01110100 00001 ..... ..... ..... @vv_ui5 +xvsrlni_w_d 0111 01110100 0001 ...... ..... ..... @vv_ui6 +xvsrlni_d_q 0111 01110100 001 ....... ..... ..... @vv_ui7 +xvsrani_b_h 0111 01110101 10000 1 .... ..... ..... @vv_ui4 +xvsrani_h_w 0111 01110101 10001 ..... ..... ..... @vv_ui5 +xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @vv_ui6 +xvsrani_d_q 0111 01110101 101 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 9109203a05..14b526abd6 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2104,6 +2104,22 @@ INSN_LASX(xvsrari_h, vv_i) INSN_LASX(xvsrari_w, vv_i) INSN_LASX(xvsrari_d, vv_i) =20 +INSN_LASX(xvsrln_b_h, vvv) +INSN_LASX(xvsrln_h_w, vvv) +INSN_LASX(xvsrln_w_d, vvv) +INSN_LASX(xvsran_b_h, vvv) +INSN_LASX(xvsran_h_w, vvv) +INSN_LASX(xvsran_w_d, vvv) + +INSN_LASX(xvsrlni_b_h, vv_i) +INSN_LASX(xvsrlni_h_w, vv_i) +INSN_LASX(xvsrlni_w_d, vv_i) +INSN_LASX(xvsrlni_d_q, vv_i) +INSN_LASX(xvsrani_b_h, vv_i) +INSN_LASX(xvsrani_h_w, vv_i) +INSN_LASX(xvsrani_w_d, vv_i) +INSN_LASX(xvsrani_d_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 12a2b2a9e6..bcfa7b9530 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1079,107 +1079,121 @@ VSRARI(vsrari_h, 16, H) VSRARI(vsrari_w, 32, W) VSRARI(vsrari_d, 64, D) =20 -#define R_SHIFT(a, b) (a >> b) - -#define VSRLN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D R_SHIFT((T)Vj->E2(i),((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRLN(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), = \ + Vk->E2(j + ofs * i) % BIT); \ + } \ + Vd->D(2 * i + 1) =3D 0; = \ + } \ } =20 -VSRLN(vsrln_b_h, 16, uint16_t, B, H)
-VSRLN(vsrln_h_w, 32, uint32_t, H, W) -VSRLN(vsrln_w_d, 64, uint64_t, W, D) +VSRLN(vsrln_b_h, 16, B, UH) +VSRLN(vsrln_h_w, 32, H, UW) +VSRLN(vsrln_w_d, 64, W, UD) =20 -#define VSRAN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRAN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), = \ + Vk->E3(j + ofs * i) % BIT); \ + } \ + Vd->D(2 * i + 1) =3D 0; = \ + } \ } =20 -VSRAN(vsran_b_h, 16, uint16_t, B, H) -VSRAN(vsran_h_w, 32, uint32_t, H, W) -VSRAN(vsran_w_d, 64, uint64_t, W, D) +VSRAN(vsran_b_h, 16, B, H, UH) +VSRAN(vsran_h_w, 32, H, W, UW) +VSRAN(vsran_w_d, 64, W, D, UD) =20 -#define VSRLNI(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT((T)Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT((T)Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRLNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = 
\ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), imm)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D R_SHIFT(Vd->E2(j + ofs * i)= , \ + imm); \ + } \ + } \ + *Vd =3D temp; = \ } =20 void HELPER(vsrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - temp.D(0) =3D 0; - temp.D(1) =3D 0; - temp.D(0) =3D int128_getlo(int128_urshift(Vj->Q(0), imm % 128)); - temp.D(1) =3D int128_getlo(int128_urshift(Vd->Q(0), imm % 128)); + for (i =3D 0; i < 2; i++) { + temp.D(2 * i) =3D int128_getlo(int128_urshift(Vj->Q(i), imm % 128)= ); + temp.D(2 * i +1) =3D int128_getlo(int128_urshift(Vd->Q(i), imm % 1= 28)); + } *Vd =3D temp; } =20 -VSRLNI(vsrlni_b_h, 16, uint16_t, B, H) -VSRLNI(vsrlni_h_w, 32, uint32_t, H, W) -VSRLNI(vsrlni_w_d, 64, uint64_t, W, D) +VSRLNI(vsrlni_b_h, 16, B, UH) +VSRLNI(vsrlni_h_w, 32, H, UW) +VSRLNI(vsrlni_w_d, 64, W, UD) =20 -#define VSRANI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT(Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), imm)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D R_SHIFT(Vd->E2(j + ofs * i)= , \ + imm); \ + } \ + } \ + *Vd =3D temp; = \ } =20 void HELPER(vsrani_d_q)(void 
*vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - temp.D(0) =3D 0; - temp.D(1) =3D 0; - temp.D(0) =3D int128_getlo(int128_rshift(Vj->Q(0), imm % 128)); - temp.D(1) =3D int128_getlo(int128_rshift(Vd->Q(0), imm % 128)); + for (i =3D 0; i < 2; i++) { + temp.D(2 * i) =3D int128_getlo(int128_rshift(Vj->Q(i), imm % 128)); + temp.D(2 * i + 1) =3D int128_getlo(int128_rshift(Vd->Q(i), imm % 1= 28)); + } *Vd =3D temp; } =20 diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 4a92df2cd9..a420e8dfc9 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -427,6 +427,22 @@ TRANS(xvsrari_h, LASX, gen_vv_i, 32, gen_helper_vsrari= _h) TRANS(xvsrari_w, LASX, gen_vv_i, 32, gen_helper_vsrari_w) TRANS(xvsrari_d, LASX, gen_vv_i, 32, gen_helper_vsrari_d) =20 +TRANS(xvsrln_b_h, LASX, gen_vvv, 32, gen_helper_vsrln_b_h) +TRANS(xvsrln_h_w, LASX, gen_vvv, 32, gen_helper_vsrln_h_w) +TRANS(xvsrln_w_d, LASX, gen_vvv, 32, gen_helper_vsrln_w_d) +TRANS(xvsran_b_h, LASX, gen_vvv, 32, gen_helper_vsran_b_h) +TRANS(xvsran_h_w, LASX, gen_vvv, 32, gen_helper_vsran_h_w) +TRANS(xvsran_w_d, LASX, gen_vvv, 32, gen_helper_vsran_w_d) + +TRANS(xvsrlni_b_h, LASX, gen_vv_i, 32, gen_helper_vsrlni_b_h) +TRANS(xvsrlni_h_w, LASX, gen_vv_i, 32, gen_helper_vsrlni_h_w) +TRANS(xvsrlni_w_d, LASX, gen_vv_i, 32, gen_helper_vsrlni_w_d) +TRANS(xvsrlni_d_q, LASX, gen_vv_i, 32, gen_helper_vsrlni_d_q) +TRANS(xvsrani_b_h, LASX, gen_vv_i, 32, gen_helper_vsrani_b_h) +TRANS(xvsrani_h_w, LASX, gen_vv_i, 32, gen_helper_vsrani_h_w) +TRANS(xvsrani_w_d, LASX, gen_vv_i, 32, gen_helper_vsrani_w_d) +TRANS(xvsrani_d_q, LASX, gen_vv_i, 32, gen_helper_vsrani_d_q) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue 
May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385591017496.1477927516247; Wed, 30 Aug 2023 01:53:11 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtV-00016V-2g; Wed, 30 Aug 2023 04:49:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtT-00015T-RO for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:43 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtQ-0007Wv-Ht for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:43 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxPOuYAu9krwgdAA--.53997S3; Wed, 30 Aug 2023 16:49:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S33; Wed, 30 Aug 2023 16:49:27 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 31/48] target/loongarch: Implement xvsrlrn xvsrarn Date: Wed, 30 Aug 2023 16:48:45 +0800 Message-Id: <20230830084902.2113960-32-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S33 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385591596100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSRLRN.{B.H/H.W/W.D}; - XVSRARN.{B.H/H.W/W.D}; - XVSRLRNI.{B.H/H.W/W.D/D.Q}; - XVSRARNI.{B.H/H.W/W.D/D.Q}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 16 ++ target/loongarch/disas.c | 16 ++ target/loongarch/vec_helper.c | 198 +++++++++++-------- target/loongarch/insn_trans/trans_lasx.c.inc | 16 ++ 4 files changed, 161 insertions(+), 85 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 204dcfa075..d7c50b14ca 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1694,6 +1694,22 @@ xvsrani_h_w 0111 01110101 10001 ..... ..... ...= .. @vv_ui5 xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @vv_ui6 xvsrani_d_q 0111 01110101 101 ....... ..... ..... @vv_ui7 =20 +xvsrlrn_b_h 0111 01001111 10001 ..... ..... ..... @vvv +xvsrlrn_h_w 0111 01001111 10010 ..... ..... ..... @vvv +xvsrlrn_w_d 0111 01001111 10011 ..... ..... ..... @vvv +xvsrarn_b_h 0111 01001111 10101 ..... ..... ..... 
@vvv +xvsrarn_h_w 0111 01001111 10110 ..... ..... ..... @vvv +xvsrarn_w_d 0111 01001111 10111 ..... ..... ..... @vvv + +xvsrlrni_b_h 0111 01110100 01000 1 .... ..... ..... @vv_ui4 +xvsrlrni_h_w 0111 01110100 01001 ..... ..... ..... @vv_ui5 +xvsrlrni_w_d 0111 01110100 0101 ...... ..... ..... @vv_ui6 +xvsrlrni_d_q 0111 01110100 011 ....... ..... ..... @vv_ui7 +xvsrarni_b_h 0111 01110101 11000 1 .... ..... ..... @vv_ui4 +xvsrarni_h_w 0111 01110101 11001 ..... ..... ..... @vv_ui5 +xvsrarni_w_d 0111 01110101 1101 ...... ..... ..... @vv_ui6 +xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 14b526abd6..04b6ea713d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2120,6 +2120,22 @@ INSN_LASX(xvsrani_h_w, vv_i) INSN_LASX(xvsrani_w_d, vv_i) INSN_LASX(xvsrani_d_q, vv_i) =20 +INSN_LASX(xvsrlrn_b_h, vvv) +INSN_LASX(xvsrlrn_h_w, vvv) +INSN_LASX(xvsrlrn_w_d, vvv) +INSN_LASX(xvsrarn_b_h, vvv) +INSN_LASX(xvsrarn_h_w, vvv) +INSN_LASX(xvsrarn_w_d, vvv) + +INSN_LASX(xvsrlrni_b_h, vv_i) +INSN_LASX(xvsrlrni_h_w, vv_i) +INSN_LASX(xvsrlrni_w_d, vv_i) +INSN_LASX(xvsrlrni_d_q, vv_i) +INSN_LASX(xvsrarni_b_h, vv_i) +INSN_LASX(xvsrarni_h_w, vv_i) +INSN_LASX(xvsrarni_w_d, vv_i) +INSN_LASX(xvsrarni_d_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index bcfa7b9530..d4f2091656 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1201,76 +1201,95 @@ VSRANI(vsrani_b_h, 16, B, H) VSRANI(vsrani_h_w, 32, H, W) VSRANI(vsrani_w_d, 64, W, D) =20 -#define VSRLRN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D 
(VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRLRN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_vsrlr_ ##E2(Vj->E2(j + ofs * i)= , \ + Vk->E3(j + ofs * i) % BIT);= \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSRLRN(vsrlrn_b_h, 16, uint16_t, B, H) -VSRLRN(vsrlrn_h_w, 32, uint32_t, H, W) -VSRLRN(vsrlrn_w_d, 64, uint64_t, W, D) +VSRLRN(vsrlrn_b_h, 16, B, H, UH) +VSRLRN(vsrlrn_h_w, 32, H, W, UW) +VSRLRN(vsrlrn_w_d, 64, W, D, UD) =20 -#define VSRARN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRARN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_vsrar_ ## E2(Vj->E2(j + ofs * i= ), \ + Vk->E3(j + ofs * i) % BIT)= ; \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSRARN(vsrarn_b_h, 16, uint8_t, B, H) -VSRARN(vsrarn_h_w, 32, uint16_t, H, W) -VSRARN(vsrarn_w_d, 64, uint32_t, W, D) - -#define 
VSRLRNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrlr_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +VSRARN(vsrarn_b_h, 16, B, H, UH) +VSRARN(vsrarn_h_w, 32, H, W, UW) +VSRARN(vsrarn_w_d, 64, W, D, UD) + +#define VSRLRNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_vsrlr_ ## E2(Vj->E2(j + ofs * = i), imm); \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_vsrlr_ ## E2(Vd->E2(j + = ofs * i), \ + imm); = \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vsrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - Int128 r1, r2; - - if (imm =3D=3D 0) { - temp.D(0) =3D int128_getlo(Vj->Q(0)); - temp.D(1) =3D int128_getlo(Vd->Q(0)); - } else { - r1 =3D int128_and(int128_urshift(Vj->Q(0), (imm -1)), int128_one()= ); - r2 =3D int128_and(int128_urshift(Vd->Q(0), (imm -1)), int128_one()= ); + Int128 r[4]; + int oprsz =3D simd_oprsz(desc); =20 - temp.D(0) =3D int128_getlo(int128_add(int128_urshift(Vj->Q(0), imm)= , r1)); - temp.D(1) =3D int128_getlo(int128_add(int128_urshift(Vd->Q(0), imm)= , r2)); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + temp.D(2 * i) =3D int128_getlo(Vj->Q(i)); + temp.D(2 * i + 1) =3D int128_getlo(Vd->Q(i)); + } else { + r[2 * i] =3D int128_and(int128_urshift(Vj->Q(i), 
(imm - 1)), + int128_one()); + r[2 * i + 1] =3D int128_and(int128_urshift(Vd->Q(i), (imm - 1)= ), + int128_one()); + temp.D(2 * i) =3D int128_getlo(int128_add(int128_urshift(Vj->Q= (i), + imm), r[2 * i])); + temp.D(2 * i + 1) =3D int128_getlo(int128_add(int128_urshift(V= d->Q(i), + imm), r[ 2 * i + 1= ])); + } } *Vd =3D temp; } @@ -1279,40 +1298,49 @@ VSRLRNI(vsrlrni_b_h, 16, B, H) VSRLRNI(vsrlrni_h_w, 32, H, W) VSRLRNI(vsrlrni_w_d, 64, W, D) =20 -#define VSRARNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrar_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRARNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_vsrar_ ## E2(Vj->E2(j + ofs * = i), imm); \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_vsrar_ ## E2(Vd->E2(j + = ofs * i), \ + imm); = \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vsrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - Int128 r1, r2; - - if (imm =3D=3D 0) { - temp.D(0) =3D int128_getlo(Vj->Q(0)); - temp.D(1) =3D int128_getlo(Vd->Q(0)); - } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); + Int128 r[4]; + int oprsz =3D simd_oprsz(desc); =20 - temp.D(0) =3D 
int128_getlo(int128_add(int128_rshift(Vj->Q(0), imm),= r1)); - temp.D(1) =3D int128_getlo(int128_add(int128_rshift(Vd->Q(0), imm),= r2)); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + temp.D(2 * i) =3D int128_getlo(Vj->Q(i)); + temp.D(2 * i + 1) =3D int128_getlo(Vd->Q(i)); + } else { + r[2 * i] =3D int128_and(int128_rshift(Vj->Q(i), (imm - 1)), + int128_one()); + r[2 * i + 1] =3D int128_and(int128_rshift(Vd->Q(i), (imm - 1)), + int128_one()); + temp.D(2 * i) =3D int128_getlo(int128_add(int128_rshift(Vj->Q(= i), + imm), r[2 * i])); + temp.D(2 * i + 1) =3D int128_getlo(int128_add(int128_rshift(Vd= ->Q(i), + imm), r[2 * i + 1]= )); + } } *Vd =3D temp; } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index a420e8dfc9..702a2f770d 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -443,6 +443,22 @@ TRANS(xvsrani_h_w, LASX, gen_vv_i, 32, gen_helper_vsra= ni_h_w) TRANS(xvsrani_w_d, LASX, gen_vv_i, 32, gen_helper_vsrani_w_d) TRANS(xvsrani_d_q, LASX, gen_vv_i, 32, gen_helper_vsrani_d_q) =20 +TRANS(xvsrlrn_b_h, LASX, gen_vvv, 32, gen_helper_vsrlrn_b_h) +TRANS(xvsrlrn_h_w, LASX, gen_vvv, 32, gen_helper_vsrlrn_h_w) +TRANS(xvsrlrn_w_d, LASX, gen_vvv, 32, gen_helper_vsrlrn_w_d) +TRANS(xvsrarn_b_h, LASX, gen_vvv, 32, gen_helper_vsrarn_b_h) +TRANS(xvsrarn_h_w, LASX, gen_vvv, 32, gen_helper_vsrarn_h_w) +TRANS(xvsrarn_w_d, LASX, gen_vvv, 32, gen_helper_vsrarn_w_d) + +TRANS(xvsrlrni_b_h, LASX, gen_vv_i, 32, gen_helper_vsrlrni_b_h) +TRANS(xvsrlrni_h_w, LASX, gen_vv_i, 32, gen_helper_vsrlrni_h_w) +TRANS(xvsrlrni_w_d, LASX, gen_vv_i, 32, gen_helper_vsrlrni_w_d) +TRANS(xvsrlrni_d_q, LASX, gen_vv_i, 32, gen_helper_vsrlrni_d_q) +TRANS(xvsrarni_b_h, LASX, gen_vv_i, 32, gen_helper_vsrarni_b_h) +TRANS(xvsrarni_h_w, LASX, gen_vv_i, 32, gen_helper_vsrarni_h_w) +TRANS(xvsrarni_w_d, LASX, gen_vv_i, 32, gen_helper_vsrarni_w_d) +TRANS(xvsrarni_d_q, LASX, gen_vv_i, 
32, gen_helper_vsrarni_d_q) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 32/48] target/loongarch: Implement xvssrln xvssran Date: Wed, 30 Aug 2023 16:48:46 +0800 Message-Id: <20230830084902.2113960-33-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
This patch includes: - XVSSRLN.{B.H/H.W/W.D}; - XVSSRAN.{B.H/H.W/W.D}; - XVSSRLN.{BU.H/HU.W/WU.D}; - XVSSRAN.{BU.H/HU.W/WU.D}; - XVSSRLNI.{B.H/H.W/W.D/D.Q}; - XVSSRANI.{B.H/H.W/W.D/D.Q}; - XVSSRLNI.{BU.H/HU.W/WU.D/DU.Q}; - XVSSRANI.{BU.H/HU.W/WU.D/DU.Q}. Signed-off-by: Song Gao --- target/loongarch/insns.decode | 30 ++ target/loongarch/disas.c | 30 ++ target/loongarch/vec_helper.c | 451 ++++++++++--------- target/loongarch/insn_trans/trans_lasx.c.inc | 30 ++ 4 files changed, 337 insertions(+), 204 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d7c50b14ca..022dd9bfd1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1710,6 +1710,36 @@ xvsrarni_h_w 0111 01110101 11001 ..... ..... ...= .. @vv_ui5 xvsrarni_w_d 0111 01110101 1101 ...... ..... .....
@vv_ui6 xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @vv_ui7 =20 +xvssrln_b_h 0111 01001111 11001 ..... ..... ..... @vvv +xvssrln_h_w 0111 01001111 11010 ..... ..... ..... @vvv +xvssrln_w_d 0111 01001111 11011 ..... ..... ..... @vvv +xvssran_b_h 0111 01001111 11101 ..... ..... ..... @vvv +xvssran_h_w 0111 01001111 11110 ..... ..... ..... @vvv +xvssran_w_d 0111 01001111 11111 ..... ..... ..... @vvv +xvssrln_bu_h 0111 01010000 01001 ..... ..... ..... @vvv +xvssrln_hu_w 0111 01010000 01010 ..... ..... ..... @vvv +xvssrln_wu_d 0111 01010000 01011 ..... ..... ..... @vvv +xvssran_bu_h 0111 01010000 01101 ..... ..... ..... @vvv +xvssran_hu_w 0111 01010000 01110 ..... ..... ..... @vvv +xvssran_wu_d 0111 01010000 01111 ..... ..... ..... @vvv + +xvssrlni_b_h 0111 01110100 10000 1 .... ..... ..... @vv_ui4 +xvssrlni_h_w 0111 01110100 10001 ..... ..... ..... @vv_ui5 +xvssrlni_w_d 0111 01110100 1001 ...... ..... ..... @vv_ui6 +xvssrlni_d_q 0111 01110100 101 ....... ..... ..... @vv_ui7 +xvssrani_b_h 0111 01110110 00000 1 .... ..... ..... @vv_ui4 +xvssrani_h_w 0111 01110110 00001 ..... ..... ..... @vv_ui5 +xvssrani_w_d 0111 01110110 0001 ...... ..... ..... @vv_ui6 +xvssrani_d_q 0111 01110110 001 ....... ..... ..... @vv_ui7 +xvssrlni_bu_h 0111 01110100 11000 1 .... ..... ..... @vv_ui4 +xvssrlni_hu_w 0111 01110100 11001 ..... ..... ..... @vv_ui5 +xvssrlni_wu_d 0111 01110100 1101 ...... ..... ..... @vv_ui6 +xvssrlni_du_q 0111 01110100 111 ....... ..... ..... @vv_ui7 +xvssrani_bu_h 0111 01110110 01000 1 .... ..... ..... @vv_ui4 +xvssrani_hu_w 0111 01110110 01001 ..... ..... ..... @vv_ui5 +xvssrani_wu_d 0111 01110110 0101 ...... ..... ..... @vv_ui6 +xvssrani_du_q 0111 01110110 011 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 04b6ea713d..04e8d42044 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2136,6 +2136,36 @@ INSN_LASX(xvsrarni_h_w, vv_i) INSN_LASX(xvsrarni_w_d, vv_i) INSN_LASX(xvsrarni_d_q, vv_i) =20 +INSN_LASX(xvssrln_b_h, vvv) +INSN_LASX(xvssrln_h_w, vvv) +INSN_LASX(xvssrln_w_d, vvv) +INSN_LASX(xvssran_b_h, vvv) +INSN_LASX(xvssran_h_w, vvv) +INSN_LASX(xvssran_w_d, vvv) +INSN_LASX(xvssrln_bu_h, vvv) +INSN_LASX(xvssrln_hu_w, vvv) +INSN_LASX(xvssrln_wu_d, vvv) +INSN_LASX(xvssran_bu_h, vvv) +INSN_LASX(xvssran_hu_w, vvv) +INSN_LASX(xvssran_wu_d, vvv) + +INSN_LASX(xvssrlni_b_h, vv_i) +INSN_LASX(xvssrlni_h_w, vv_i) +INSN_LASX(xvssrlni_w_d, vv_i) +INSN_LASX(xvssrlni_d_q, vv_i) +INSN_LASX(xvssrani_b_h, vv_i) +INSN_LASX(xvssrani_h_w, vv_i) +INSN_LASX(xvssrani_w_d, vv_i) +INSN_LASX(xvssrani_d_q, vv_i) +INSN_LASX(xvssrlni_bu_h, vv_i) +INSN_LASX(xvssrlni_hu_w, vv_i) +INSN_LASX(xvssrlni_wu_d, vv_i) +INSN_LASX(xvssrlni_du_q, vv_i) +INSN_LASX(xvssrani_bu_h, vv_i) +INSN_LASX(xvssrani_hu_w, vv_i) +INSN_LASX(xvssrani_wu_d, vv_i) +INSN_LASX(xvssrani_du_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index d4f2091656..738bb452f6 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1371,23 +1371,29 @@ SSRLNS(B, uint16_t, int16_t, uint8_t) SSRLNS(H, uint32_t, int32_t, uint16_t) SSRLNS(W, uint64_t, int64_t, uint32_t) =20 -#define VSSRLN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), (T)Vk->E2(i)% BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLN(NAME, BIT, E1, E2, E3) = \ +void 
HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlns_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLN(vssrln_b_h, 16, uint16_t, B, H) -VSSRLN(vssrln_h_w, 32, uint32_t, H, W) -VSSRLN(vssrln_w_d, 64, uint64_t, W, D) +VSSRLN(vssrln_b_h, 16, B, H, UH) +VSSRLN(vssrln_h_w, 32, H, W, UW) +VSSRLN(vssrln_w_d, 64, W, D, UD) =20 #define SSRANS(E, T1, T2) \ static T1 do_ssrans_ ## E(T1 e2, int sa, int sh) \ @@ -1399,10 +1405,10 @@ static T1 do_ssrans_ ## E(T1 e2, int sa, int sh) \ shft_res =3D e2 >> sa; \ } \ T2 mask; \ - mask =3D (1ll << sh) -1; \ + mask =3D (1ll << sh) - 1; \ if (shft_res > mask) { \ return mask; \ - } else if (shft_res < -(mask +1)) { \ + } else if (shft_res < -(mask + 1)) { \ return ~mask; \ } else { \ return shft_res; \ @@ -1413,23 +1419,29 @@ SSRANS(B, int16_t, int8_t) SSRANS(H, int32_t, int16_t) SSRANS(W, int64_t, int32_t) =20 -#define VSSRAN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 = -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRAN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + 
for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrans_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRAN(vssran_b_h, 16, uint16_t, B, H) -VSSRAN(vssran_h_w, 32, uint32_t, H, W) -VSSRAN(vssran_w_d, 64, uint64_t, W, D) +VSSRAN(vssran_b_h, 16, B, H, UH) +VSSRAN(vssran_h_w, 32, H, W, UW) +VSSRAN(vssran_w_d, 64, W, D, UD) =20 #define SSRLNU(E, T1, T2, T3) \ static T1 do_ssrlnu_ ## E(T3 e2, int sa, int sh) \ @@ -1441,7 +1453,7 @@ static T1 do_ssrlnu_ ## E(T3 e2, int sa, int sh) \ shft_res =3D (((T1)e2) >> sa); \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1453,23 +1465,29 @@ SSRLNU(B, uint16_t, uint8_t, int16_t) SSRLNU(H, uint32_t, uint16_t, int32_t) SSRLNU(W, uint64_t, uint32_t, int64_t) =20 -#define VSSRLNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ - } \ - Vd->D(1) =3D 0; = \ +#define VSSRLNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlnu_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLNU(vssrln_bu_h, 16, uint16_t, B, H) -VSSRLNU(vssrln_hu_w, 32, uint32_t, H, W) -VSSRLNU(vssrln_wu_d, 64, uint64_t, W, D) +VSSRLNU(vssrln_bu_h, 16, B, H, UH) 
+VSSRLNU(vssrln_hu_w, 32, H, W, UW) +VSSRLNU(vssrln_wu_d, 64, W, D, UD) =20 #define SSRANU(E, T1, T2, T3) \ static T1 do_ssranu_ ## E(T3 e2, int sa, int sh) \ @@ -1484,7 +1502,7 @@ static T1 do_ssranu_ ## E(T3 e2, int sa, int sh) \ shft_res =3D 0; \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1496,64 +1514,76 @@ SSRANU(B, uint16_t, uint8_t, int16_t) SSRANU(H, uint32_t, uint16_t, int32_t) SSRANU(W, uint64_t, uint32_t, int64_t) =20 -#define VSSRANU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ - } \ - Vd->D(1) =3D 0; = \ +#define VSSRANU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssranu_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRANU(vssran_bu_h, 16, uint16_t, B, H) -VSSRANU(vssran_hu_w, 32, uint32_t, H, W) -VSSRANU(vssran_wu_d, 64, uint64_t, W, D) - -#define VSSRLNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlns_ ## E1(Vd->E2(i), imm, BIT/= 2 -1);\ - } = \ - *Vd =3D 
temp; = \ +VSSRANU(vssran_bu_h, 16, B, H, UH) +VSSRANU(vssran_hu_w, 32, H, W, UW) +VSSRANU(vssran_wu_d, 64, W, D, UD) + +#define VSSRLNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlns_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2 - 1); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlns_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2 - = 1); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask; + int i, j; + Int128 shft_res[4], mask; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - shft_res1 =3D int128_urshift(Vj->Q(0), imm); - shft_res2 =3D int128_urshift(Vd->Q(0), imm); - } mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); =20 - if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); - }else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); - }else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + shft_res[2 * i] =3D Vj->Q(i); + shft_res[2 * i + 1] =3D Vd->Q(i); + } else { + shft_res[2 * i] =3D int128_urshift(Vj->Q(i), imm); + shft_res[2 * i + 1] =3D int128_urshift(Vd->Q(i), imm); + } + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { + if (int128_ult(mask, shft_res[j])) { + Vd->D(j) =3D int128_getlo(mask); + }else { + Vd->D(j) =3D int128_getlo(shft_res[j]); + } + } } } =20 @@ -1561,51 +1591,55 @@ VSSRLNI(vssrlni_b_h, 16, B, H) 
VSSRLNI(vssrlni_h_w, 32, H, W) VSSRLNI(vssrlni_w_d, 64, W, D) =20 -#define VSSRANI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrans_ ## E1(Vd->E2(i), imm, BIT/= 2 -1); \ - } = \ - *Vd =3D temp; = \ +#define VSSRANI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrans_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2 - 1);= \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrans_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask, min; - VReg *Vd =3D (VReg *)vd;=20 - VReg *Vj =3D (VReg *)vj;=20 + int i, j; + Int128 shft_res[4], mask, min; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - shft_res1 =3D int128_rshift(Vj->Q(0), imm); - shft_res2 =3D int128_rshift(Vd->Q(0), imm); - } mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); min =3D int128_lshift(int128_one(), 63); =20 - if (int128_gt(shft_res1, mask)) { - Vd->D(0) =3D int128_getlo(mask); - } else if (int128_lt(shft_res1, int128_neg(min))) { - Vd->D(0) =3D int128_getlo(min); - } else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_gt(shft_res2, mask)) { - Vd->D(1) =3D 
int128_getlo(mask); - } else if (int128_lt(shft_res2, int128_neg(min))) { - Vd->D(1) =3D int128_getlo(min); - } else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + shft_res[2 * i] =3D Vj->Q(i); + shft_res[2 * i + 1] =3D Vd->Q(i); + } else { + shft_res[2 * i] =3D int128_rshift(Vj->Q(i), imm); + shft_res[2 * i + 1] =3D int128_rshift(Vd->Q(i), imm); + } + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { + if (int128_gt(shft_res[j], mask)) { + Vd->D(j) =3D int128_getlo(mask); + } else if (int128_lt(shft_res[j], int128_neg(min))) { + Vd->D(j) =3D int128_getlo(min); + } else { + Vd->D(j) =3D int128_getlo(shft_res[j]); + } + } } } =20 @@ -1613,46 +1647,52 @@ VSSRANI(vssrani_b_h, 16, B, H) VSSRANI(vssrani_h_w, 32, H, W) VSSRANI(vssrani_w_d, 64, W, D) =20 -#define VSSRLNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlnu_ ## E1(Vd->E2(i), imm, BIT/= 2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRLNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlnu_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlnu_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= ); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrlni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask; + int i, j; + Int128 shft_res[4], 
mask; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - shft_res1 =3D int128_urshift(Vj->Q(0), imm); - shft_res2 =3D int128_urshift(Vd->Q(0), imm); - } mask =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); =20 - if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); - }else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); - }else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + shft_res[2 * i] =3D Vj->Q(i); + shft_res[2 * i + 1] =3D Vd->Q(i); + } else { + shft_res[2 * i] =3D int128_urshift(Vj->Q(i), imm); + shft_res[2 * i + 1] =3D int128_urshift(Vd->Q(i), imm); + } + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { + if (int128_ult(mask, shft_res[j])) { + Vd->D(j) =3D int128_getlo(mask); + }else { + Vd->D(j) =3D int128_getlo(shft_res[j]); + } + } } } =20 @@ -1660,55 +1700,58 @@ VSSRLNUI(vssrlni_bu_h, 16, B, H) VSSRLNUI(vssrlni_hu_w, 32, H, W) VSSRLNUI(vssrlni_wu_d, 64, W, D) =20 -#define VSSRANUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssranu_ ## E1(Vd->E2(i), imm, BIT/= 2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRANUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * 
i) =3D do_ssranu_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssranu_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= ); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask; + int i, j; + Int128 shft_res[4], mask; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - shft_res1 =3D int128_rshift(Vj->Q(0), imm); - shft_res2 =3D int128_rshift(Vd->Q(0), imm); - } - - if (int128_lt(Vj->Q(0), int128_zero())) { - shft_res1 =3D int128_zero(); - } - - if (int128_lt(Vd->Q(0), int128_zero())) { - shft_res2 =3D int128_zero(); - } + int oprsz =3D simd_oprsz(desc); =20 mask =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); =20 - if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); - }else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); - }else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + shft_res[2 * i] =3D Vj->Q(i); + shft_res[2 * i + 1] =3D Vd->Q(i); + } else { + shft_res[2 * i] =3D int128_rshift(Vj->Q(i), imm); + shft_res[2 * i + 1] =3D int128_rshift(Vd->Q(i), imm); + } + if (int128_lt(Vj->Q(i), int128_zero())) { + shft_res[2 * i] =3D int128_zero(); + } + if (int128_lt(Vd->Q(i), int128_zero())) { + shft_res[2 * i + 1] =3D int128_zero(); + } + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { + if (int128_ult(mask, shft_res[j])) { + Vd->D(j) =3D int128_getlo(mask); + }else { + Vd->D(j) =3D int128_getlo(shft_res[j]); + } + } } } =20 diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 702a2f770d..9c218abb6f 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -459,6 +459,36 @@ 
TRANS(xvsrarni_h_w, LASX, gen_vv_i, 32, gen_helper_vsr= arni_h_w) TRANS(xvsrarni_w_d, LASX, gen_vv_i, 32, gen_helper_vsrarni_w_d) TRANS(xvsrarni_d_q, LASX, gen_vv_i, 32, gen_helper_vsrarni_d_q) =20 +TRANS(xvssrln_b_h, LASX, gen_vvv, 32, gen_helper_vssrln_b_h) +TRANS(xvssrln_h_w, LASX, gen_vvv, 32, gen_helper_vssrln_h_w) +TRANS(xvssrln_w_d, LASX, gen_vvv, 32, gen_helper_vssrln_w_d) +TRANS(xvssran_b_h, LASX, gen_vvv, 32, gen_helper_vssran_b_h) +TRANS(xvssran_h_w, LASX, gen_vvv, 32, gen_helper_vssran_h_w) +TRANS(xvssran_w_d, LASX, gen_vvv, 32, gen_helper_vssran_w_d) +TRANS(xvssrln_bu_h, LASX, gen_vvv, 32, gen_helper_vssrln_bu_h) +TRANS(xvssrln_hu_w, LASX, gen_vvv, 32, gen_helper_vssrln_hu_w) +TRANS(xvssrln_wu_d, LASX, gen_vvv, 32, gen_helper_vssrln_wu_d) +TRANS(xvssran_bu_h, LASX, gen_vvv, 32, gen_helper_vssran_bu_h) +TRANS(xvssran_hu_w, LASX, gen_vvv, 32, gen_helper_vssran_hu_w) +TRANS(xvssran_wu_d, LASX, gen_vvv, 32, gen_helper_vssran_wu_d) + +TRANS(xvssrlni_b_h, LASX, gen_vv_i, 32, gen_helper_vssrlni_b_h) +TRANS(xvssrlni_h_w, LASX, gen_vv_i, 32, gen_helper_vssrlni_h_w) +TRANS(xvssrlni_w_d, LASX, gen_vv_i, 32, gen_helper_vssrlni_w_d) +TRANS(xvssrlni_d_q, LASX, gen_vv_i, 32, gen_helper_vssrlni_d_q) +TRANS(xvssrani_b_h, LASX, gen_vv_i, 32, gen_helper_vssrani_b_h) +TRANS(xvssrani_h_w, LASX, gen_vv_i, 32, gen_helper_vssrani_h_w) +TRANS(xvssrani_w_d, LASX, gen_vv_i, 32, gen_helper_vssrani_w_d) +TRANS(xvssrani_d_q, LASX, gen_vv_i, 32, gen_helper_vssrani_d_q) +TRANS(xvssrlni_bu_h, LASX, gen_vv_i, 32, gen_helper_vssrlni_bu_h) +TRANS(xvssrlni_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrlni_hu_w) +TRANS(xvssrlni_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrlni_wu_d) +TRANS(xvssrlni_du_q, LASX, gen_vv_i, 32, gen_helper_vssrlni_du_q) +TRANS(xvssrani_bu_h, LASX, gen_vv_i, 32, gen_helper_vssrani_bu_h) +TRANS(xvssrani_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrani_hu_w) +TRANS(xvssrani_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrani_wu_d) +TRANS(xvssrani_du_q, LASX, gen_vv_i, 32, 
gen_helper_vssrani_du_q) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 33/48] target/loongarch: Implement xvssrlrn xvssrarn Date: Wed, 30 Aug 2023 16:48:47 +0800 Message-Id: <20230830084902.2113960-34-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
This patch includes: - XVSSRLRN.{B.H/H.W/W.D}; - XVSSRARN.{B.H/H.W/W.D}; - XVSSRLRN.{BU.H/HU.W/WU.D}; - XVSSRARN.{BU.H/HU.W/WU.D}; - XVSSRLRNI.{B.H/H.W/W.D/D.Q}; - XVSSRARNI.{B.H/H.W/W.D/D.Q}; - XVSSRLRNI.{BU.H/HU.W/WU.D/DU.Q}; - XVSSRARNI.{BU.H/HU.W/WU.D/DU.Q}. Signed-off-by: Song Gao --- target/loongarch/insns.decode | 30 ++ target/loongarch/disas.c | 30 ++ target/loongarch/vec_helper.c | 467 ++++++++++--------- target/loongarch/insn_trans/trans_lasx.c.inc | 30 ++ 4 files changed, 348 insertions(+), 209 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 022dd9bfd1..dc74bae7a5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1740,6 +1740,36 @@ xvssrani_hu_w 0111 01110110 01001 ..... ..... ...= .. @vv_ui5 xvssrani_wu_d 0111 01110110 0101 ...... ..... .....
@vv_ui6 xvssrani_du_q 0111 01110110 011 ....... ..... ..... @vv_ui7 =20 +xvssrlrn_b_h 0111 01010000 00001 ..... ..... ..... @vvv +xvssrlrn_h_w 0111 01010000 00010 ..... ..... ..... @vvv +xvssrlrn_w_d 0111 01010000 00011 ..... ..... ..... @vvv +xvssrarn_b_h 0111 01010000 00101 ..... ..... ..... @vvv +xvssrarn_h_w 0111 01010000 00110 ..... ..... ..... @vvv +xvssrarn_w_d 0111 01010000 00111 ..... ..... ..... @vvv +xvssrlrn_bu_h 0111 01010000 10001 ..... ..... ..... @vvv +xvssrlrn_hu_w 0111 01010000 10010 ..... ..... ..... @vvv +xvssrlrn_wu_d 0111 01010000 10011 ..... ..... ..... @vvv +xvssrarn_bu_h 0111 01010000 10101 ..... ..... ..... @vvv +xvssrarn_hu_w 0111 01010000 10110 ..... ..... ..... @vvv +xvssrarn_wu_d 0111 01010000 10111 ..... ..... ..... @vvv + +xvssrlrni_b_h 0111 01110101 00000 1 .... ..... ..... @vv_ui4 +xvssrlrni_h_w 0111 01110101 00001 ..... ..... ..... @vv_ui5 +xvssrlrni_w_d 0111 01110101 0001 ...... ..... ..... @vv_ui6 +xvssrlrni_d_q 0111 01110101 001 ....... ..... ..... @vv_ui7 +xvssrarni_b_h 0111 01110110 10000 1 .... ..... ..... @vv_ui4 +xvssrarni_h_w 0111 01110110 10001 ..... ..... ..... @vv_ui5 +xvssrarni_w_d 0111 01110110 1001 ...... ..... ..... @vv_ui6 +xvssrarni_d_q 0111 01110110 101 ....... ..... ..... @vv_ui7 +xvssrlrni_bu_h 0111 01110101 01000 1 .... ..... ..... @vv_ui4 +xvssrlrni_hu_w 0111 01110101 01001 ..... ..... ..... @vv_ui5 +xvssrlrni_wu_d 0111 01110101 0101 ...... ..... ..... @vv_ui6 +xvssrlrni_du_q 0111 01110101 011 ....... ..... ..... @vv_ui7 +xvssrarni_bu_h 0111 01110110 11000 1 .... ..... ..... @vv_ui4 +xvssrarni_hu_w 0111 01110110 11001 ..... ..... ..... @vv_ui5 +xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @vv_ui6 +xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 04e8d42044..f043a2f9b6 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2166,6 +2166,36 @@ INSN_LASX(xvssrani_hu_w, vv_i) INSN_LASX(xvssrani_wu_d, vv_i) INSN_LASX(xvssrani_du_q, vv_i) =20 +INSN_LASX(xvssrlrn_b_h, vvv) +INSN_LASX(xvssrlrn_h_w, vvv) +INSN_LASX(xvssrlrn_w_d, vvv) +INSN_LASX(xvssrarn_b_h, vvv) +INSN_LASX(xvssrarn_h_w, vvv) +INSN_LASX(xvssrarn_w_d, vvv) +INSN_LASX(xvssrlrn_bu_h, vvv) +INSN_LASX(xvssrlrn_hu_w, vvv) +INSN_LASX(xvssrlrn_wu_d, vvv) +INSN_LASX(xvssrarn_bu_h, vvv) +INSN_LASX(xvssrarn_hu_w, vvv) +INSN_LASX(xvssrarn_wu_d, vvv) + +INSN_LASX(xvssrlrni_b_h, vv_i) +INSN_LASX(xvssrlrni_h_w, vv_i) +INSN_LASX(xvssrlrni_w_d, vv_i) +INSN_LASX(xvssrlrni_d_q, vv_i) +INSN_LASX(xvssrlrni_bu_h, vv_i) +INSN_LASX(xvssrlrni_hu_w, vv_i) +INSN_LASX(xvssrlrni_wu_d, vv_i) +INSN_LASX(xvssrlrni_du_q, vv_i) +INSN_LASX(xvssrarni_b_h, vv_i) +INSN_LASX(xvssrarni_h_w, vv_i) +INSN_LASX(xvssrarni_w_d, vv_i) +INSN_LASX(xvssrarni_d_q, vv_i) +INSN_LASX(xvssrarni_bu_h, vv_i) +INSN_LASX(xvssrarni_hu_w, vv_i) +INSN_LASX(xvssrarni_wu_d, vv_i) +INSN_LASX(xvssrarni_du_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 738bb452f6..852c65716e 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1766,7 +1766,7 @@ static T1 do_ssrlrns_ ## E1(T2 e2, int sa, int sh) \ \ shft_res =3D do_vsrlr_ ## E2(e2, sa); \ T1 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1778,23 +1778,29 @@ SSRLRNS(B, H, uint16_t, int16_t, uint8_t) SSRLRNS(H, W, uint32_t, int32_t, uint16_t) SSRLRNS(W, D, uint64_t, int64_t, uint32_t) =20 -#define VSSRLRN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg 
*)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLRN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlrns_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLRN(vssrlrn_b_h, 16, uint16_t, B, H) -VSSRLRN(vssrlrn_h_w, 32, uint32_t, H, W) -VSSRLRN(vssrlrn_w_d, 64, uint64_t, W, D) +VSSRLRN(vssrlrn_b_h, 16, B, H, UH) +VSSRLRN(vssrlrn_h_w, 32, H, W, UW) +VSSRLRN(vssrlrn_w_d, 64, W, D, UD) =20 #define SSRARNS(E1, E2, T1, T2) \ static T1 do_ssrarns_ ## E1(T1 e2, int sa, int sh) \ @@ -1803,7 +1809,7 @@ static T1 do_ssrarns_ ## E1(T1 e2, int sa, int sh) \ \ shft_res =3D do_vsrar_ ## E2(e2, sa); \ T2 mask; \ - mask =3D (1ll << sh) -1; \ + mask =3D (1ll << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else if (shft_res < -(mask +1)) { \ @@ -1817,23 +1823,29 @@ SSRARNS(B, H, int16_t, int8_t) SSRARNS(H, W, int32_t, int16_t) SSRARNS(W, D, int64_t, int32_t) =20 -#define VSSRARN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRARN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ 
= \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrarns_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT/ 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRARN(vssrarn_b_h, 16, uint16_t, B, H) -VSSRARN(vssrarn_h_w, 32, uint32_t, H, W) -VSSRARN(vssrarn_w_d, 64, uint64_t, W, D) +VSSRARN(vssrarn_b_h, 16, B, H, UH) +VSSRARN(vssrarn_h_w, 32, H, W, UW) +VSSRARN(vssrarn_w_d, 64, W, D, UD) =20 #define SSRLRNU(E1, E2, T1, T2, T3) \ static T1 do_ssrlrnu_ ## E1(T3 e2, int sa, int sh) \ @@ -1843,7 +1855,7 @@ static T1 do_ssrlrnu_ ## E1(T3 e2, int sa, int sh) \ shft_res =3D do_vsrlr_ ## E2(e2, sa); \ \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1855,23 +1867,29 @@ SSRLRNU(B, H, uint16_t, uint8_t, int16_t) SSRLRNU(H, W, uint32_t, uint16_t, int32_t) SSRLRNU(W, D, uint64_t, uint32_t, int64_t) =20 -#define VSSRLRNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLRNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlrnu_ ## E1(Vj->E2(j + ofs 
*= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLRNU(vssrlrn_bu_h, 16, uint16_t, B, H) -VSSRLRNU(vssrlrn_hu_w, 32, uint32_t, H, W) -VSSRLRNU(vssrlrn_wu_d, 64, uint64_t, W, D) +VSSRLRNU(vssrlrn_bu_h, 16, B, H, UH) +VSSRLRNU(vssrlrn_hu_w, 32, H, W, UW) +VSSRLRNU(vssrlrn_wu_d, 64, W, D, UD) =20 #define SSRARNU(E1, E2, T1, T2, T3) \ static T1 do_ssrarnu_ ## E1(T3 e2, int sa, int sh) \ @@ -1884,7 +1902,7 @@ static T1 do_ssrarnu_ ## E1(T3 e2, int sa, int sh) \ shft_res =3D do_vsrar_ ## E2(e2, sa); \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1896,70 +1914,84 @@ SSRARNU(B, H, uint16_t, uint8_t, int16_t) SSRARNU(H, W, uint32_t, uint16_t, int32_t) SSRARNU(W, D, uint64_t, uint32_t, int64_t) =20 -#define VSSRARNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ - } = \ - Vd->D(1) =3D 0; = \ -} - -VSSRARNU(vssrarn_bu_h, 16, uint16_t, B, H) -VSSRARNU(vssrarn_hu_w, 32, uint32_t, H, W) -VSSRARNU(vssrarn_wu_d, 64, uint64_t, W, D) - -#define VSSRLRNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlrns_ ## E1(Vd->E2(i), imm, BIT= /2 -1);\ - } = \ - *Vd =3D temp; = \ +#define VSSRARNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg 
*)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrarnu_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -#define VSSRLRNI_Q(NAME, sh) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - Int128 shft_res1, shft_res2, mask, r1, r2; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - if (imm =3D=3D 0) { = \ - shft_res1 =3D Vj->Q(0); = \ - shft_res2 =3D Vd->Q(0); = \ - } else { = \ - r1 =3D int128_and(int128_urshift(Vj->Q(0), (imm -1)), int128_one()= ); \ - r2 =3D int128_and(int128_urshift(Vd->Q(0), (imm -1)), int128_one()= ); \ - = \ - shft_res1 =3D (int128_add(int128_urshift(Vj->Q(0), imm), r1)); = \ - shft_res2 =3D (int128_add(int128_urshift(Vd->Q(0), imm), r2)); = \ - } = \ - = \ - mask =3D int128_sub(int128_lshift(int128_one(), sh), int128_one()); = \ - = \ - if (int128_ult(mask, shft_res1)) { = \ - Vd->D(0) =3D int128_getlo(mask); = \ - }else { = \ - Vd->D(0) =3D int128_getlo(shft_res1); = \ - } = \ - = \ - if (int128_ult(mask, shft_res2)) { = \ - Vd->D(1) =3D int128_getlo(mask); = \ - }else { = \ - Vd->D(1) =3D int128_getlo(shft_res2); = \ - } = \ +VSSRARNU(vssrarn_bu_h, 16, B, H, UH) +VSSRARNU(vssrarn_hu_w, 32, H, W, UW) +VSSRARNU(vssrarn_wu_d, 64, W, D, UD) + +#define VSSRLRNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlrns_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2 - 1)= ; \ + temp.E1(j + ofs * (2 
* i + 1)) =3D do_ssrlrns_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2 - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +#define VSSRLRNI_Q(NAME, sh) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j; = \ + Int128 shft_res[4], mask, r[4]; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + mask =3D int128_sub(int128_lshift(int128_one(), sh), int128_one()); = \ + = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + if (imm =3D=3D 0) { = \ + shft_res[2 * i] =3D Vj->Q(i); = \ + shft_res[2 * i + 1] =3D Vd->Q(i); = \ + } else { = \ + r[2 * i] =3D int128_and(int128_urshift(Vj->Q(i), (imm - 1)), = \ + int128_one()); = \ + r[2 * i + 1] =3D int128_and(int128_urshift(Vd->Q(i), (imm - 1)= ), \ + int128_one()); = \ + shft_res[2 * i] =3D int128_add(int128_urshift(Vj->Q(i), imm), = \ + r[2 * i]); = \ + shft_res[2 * i + 1] =3D int128_add(int128_urshift(Vd->Q(i), im= m), \ + r[2 * i + 1]); = \ + } = \ + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { = \ + if (int128_ult(mask, shft_res[j])) { = \ + Vd->D(j) =3D int128_getlo(mask); = \ + }else { = \ + Vd->D(j) =3D int128_getlo(shft_res[j]); = \ + } = \ + } = \ + } = \ } =20 VSSRLRNI(vssrlrni_b_h, 16, B, H) @@ -1967,55 +1999,61 @@ VSSRLRNI(vssrlrni_h_w, 32, H, W) VSSRLRNI(vssrlrni_w_d, 64, W, D) VSSRLRNI_Q(vssrlrni_d_q, 63) =20 -#define VSSRARNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrarns_ ## E1(Vd->E2(i), imm, BIT= /2 -1); \ - } = \ - *Vd =3D temp; = \ +#define VSSRARNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D 
(VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrarns_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2 - 1)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrarns_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2 - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; + int i, j; + Int128 shft_res[4], mask1, mask2, r[4]; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); - - shft_res1 =3D int128_add(int128_rshift(Vj->Q(0), imm), r1); - shft_res2 =3D int128_add(int128_rshift(Vd->Q(0), imm), r2); - } + int oprsz =3D simd_oprsz(desc); =20 mask1 =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); mask2 =3D int128_lshift(int128_one(), 63); =20 - if (int128_gt(shft_res1, mask1)) { - Vd->D(0) =3D int128_getlo(mask1); - } else if (int128_lt(shft_res1, int128_neg(mask2))) { - Vd->D(0) =3D int128_getlo(mask2); - } else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_gt(shft_res2, mask1)) { - Vd->D(1) =3D int128_getlo(mask1); - } else if (int128_lt(shft_res2, int128_neg(mask2))) { - Vd->D(1) =3D int128_getlo(mask2); - } else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + shft_res[2 * i] =3D Vj->Q(i); + shft_res[2 * i + 1] =3D Vd->Q(i); + } else { + r[2 * i] =3D int128_and(int128_rshift(Vj->Q(i), (imm - 1)), + int128_one()); + r[2 * i + 1] =3D int128_and(int128_rshift(Vd->Q(i), (imm - 1)), + int128_one()); + shft_res[2 * i] =3D int128_add(int128_rshift(Vj->Q(i), imm), + r[2 
* i]); + shft_res[2 * i + 1] =3D int128_add(int128_rshift(Vd->Q(i), imm= ), + r[2 * i + 1]); + } + for (j =3D 2 * i; j <=3D 2 * i + 1; j++) { + if (int128_gt(shft_res[j], mask1)) { + Vd->D(j) =3D int128_getlo(mask1); + } else if (int128_lt(shft_res[j], int128_neg(mask2))) { + Vd->D(j) =3D int128_getlo(mask2); + } else { + Vd->D(j) =3D int128_getlo(shft_res[j]); + } + } } } =20 @@ -2023,19 +2061,25 @@ VSSRARNI(vssrarni_b_h, 16, B, H) VSSRARNI(vssrarni_h_w, 32, H, W) VSSRARNI(vssrarni_w_d, 64, W, D) =20 -#define VSSRLRNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlrnu_ ## E1(Vd->E2(i), imm, BIT= /2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRLRNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlrnu_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlrnu_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 VSSRLRNUI(vssrlrni_bu_h, 16, B, H) @@ -2043,62 +2087,67 @@ VSSRLRNUI(vssrlrni_hu_w, 32, H, W) VSSRLRNUI(vssrlrni_wu_d, 64, W, D) VSSRLRNI_Q(vssrlrni_du_q, 64) =20 -#define VSSRARNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D 
do_ssrarnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrarnu_ ## E1(Vd->E2(i), imm, BIT= /2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRARNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrarnu_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrarnu_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; + int i, j; + Int128 shft_res[4], mask1, mask2, r[4]; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); - - shft_res1 =3D int128_add(int128_rshift(Vj->Q(0), imm), r1); - shft_res2 =3D int128_add(int128_rshift(Vd->Q(0), imm), r2); - } - - if (int128_lt(Vj->Q(0), int128_zero())) { - shft_res1 =3D int128_zero(); - } - if (int128_lt(Vd->Q(0), int128_zero())) { - shft_res2 =3D int128_zero(); - } + int oprsz =3D simd_oprsz(desc); =20 mask1 =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); mask2 =3D int128_lshift(int128_one(), 64); =20 - if (int128_gt(shft_res1, mask1)) { - Vd->D(0) =3D int128_getlo(mask1); - } else if (int128_lt(shft_res1, int128_neg(mask2))) { - Vd->D(0) =3D int128_getlo(mask2); - } else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_gt(shft_res2, mask1)) { - Vd->D(1) =3D int128_getlo(mask1); - } else if 
(int128_lt(shft_res2, int128_neg(mask2))) { - Vd->D(1) = int128_getlo(mask2); - } else { - Vd->D(1) = int128_getlo(shft_res2); + for (i = 0; i < oprsz / 16; i++) { + if (imm == 0) { + shft_res[2 * i] = Vj->Q(i); + shft_res[2 * i + 1] = Vd->Q(i); + } else { + r[2 * i] = int128_and(int128_rshift(Vj->Q(i), (imm - 1)), + int128_one()); + r[2 * i + 1] = int128_and(int128_rshift(Vd->Q(i), (imm - 1)), + int128_one()); + shft_res[2 * i] = int128_add(int128_rshift(Vj->Q(i), imm), + r[2 * i]); + shft_res[2 * i + 1] = int128_add(int128_rshift(Vd->Q(i), imm), + r[2 * i + 1]); + } + if (int128_lt(Vj->Q(i), int128_zero())) { + shft_res[2 * i] = int128_zero(); + } + if (int128_lt(Vd->Q(i), int128_zero())) { + shft_res[2 * i + 1] = int128_zero(); + } + for (j = 2 * i; j <= 2 * i + 1; j++) { + if (int128_gt(shft_res[j], mask1)) { + Vd->D(j) = int128_getlo(mask1); + } else if (int128_lt(shft_res[j], int128_neg(mask2))) { + Vd->D(j) = int128_getlo(mask2); + } else { + Vd->D(j) = int128_getlo(shft_res[j]); + } + } } } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 9c218abb6f..dc658fc2cb 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -489,6 +489,36 @@ TRANS(xvssrani_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrani_hu_w) TRANS(xvssrani_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrani_wu_d) TRANS(xvssrani_du_q, LASX, gen_vv_i, 32, gen_helper_vssrani_du_q) +TRANS(xvssrlrn_b_h, LASX, gen_vvv, 32, gen_helper_vssrlrn_b_h) +TRANS(xvssrlrn_h_w, LASX, gen_vvv, 32, gen_helper_vssrlrn_h_w) +TRANS(xvssrlrn_w_d, LASX, gen_vvv, 32, gen_helper_vssrlrn_w_d) +TRANS(xvssrarn_b_h, LASX, gen_vvv, 32, gen_helper_vssrarn_b_h) +TRANS(xvssrarn_h_w, LASX, gen_vvv, 32, gen_helper_vssrarn_h_w) +TRANS(xvssrarn_w_d, LASX, gen_vvv, 32, gen_helper_vssrarn_w_d) +TRANS(xvssrlrn_bu_h, LASX, gen_vvv, 32, gen_helper_vssrlrn_bu_h) +TRANS(xvssrlrn_hu_w,
LASX, gen_vvv, 32, gen_helper_vssrlrn_hu_w) +TRANS(xvssrlrn_wu_d, LASX, gen_vvv, 32, gen_helper_vssrlrn_wu_d) +TRANS(xvssrarn_bu_h, LASX, gen_vvv, 32, gen_helper_vssrarn_bu_h) +TRANS(xvssrarn_hu_w, LASX, gen_vvv, 32, gen_helper_vssrarn_hu_w) +TRANS(xvssrarn_wu_d, LASX, gen_vvv, 32, gen_helper_vssrarn_wu_d) + +TRANS(xvssrlrni_b_h, LASX, gen_vv_i, 32, gen_helper_vssrlrni_b_h) +TRANS(xvssrlrni_h_w, LASX, gen_vv_i, 32, gen_helper_vssrlrni_h_w) +TRANS(xvssrlrni_w_d, LASX, gen_vv_i, 32, gen_helper_vssrlrni_w_d) +TRANS(xvssrlrni_d_q, LASX, gen_vv_i, 32, gen_helper_vssrlrni_d_q) +TRANS(xvssrarni_b_h, LASX, gen_vv_i, 32, gen_helper_vssrarni_b_h) +TRANS(xvssrarni_h_w, LASX, gen_vv_i, 32, gen_helper_vssrarni_h_w) +TRANS(xvssrarni_w_d, LASX, gen_vv_i, 32, gen_helper_vssrarni_w_d) +TRANS(xvssrarni_d_q, LASX, gen_vv_i, 32, gen_helper_vssrarni_d_q) +TRANS(xvssrlrni_bu_h, LASX, gen_vv_i, 32, gen_helper_vssrlrni_bu_h) +TRANS(xvssrlrni_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrlrni_hu_w) +TRANS(xvssrlrni_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrlrni_wu_d) +TRANS(xvssrlrni_du_q, LASX, gen_vv_i, 32, gen_helper_vssrlrni_du_q) +TRANS(xvssrarni_bu_h, LASX, gen_vv_i, 32, gen_helper_vssrarni_bu_h) +TRANS(xvssrarni_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrarni_hu_w) +TRANS(xvssrarni_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrarni_wu_d) +TRANS(xvssrarni_du_q, LASX, gen_vv_i, 32, gen_helper_vssrarni_du_q) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385461355344.54292189867965; Wed, 30 Aug 2023 01:51:01 -0700 (PDT) 
From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 34/48] target/loongarch: Implement xvclo xvclz Date: Wed, 30 Aug 2023 16:48:48 +0800 Message-Id: <20230830084902.2113960-35-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385462286100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVCLO.{B/H/W/D}; - XVCLZ.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 9 +++++++++ target/loongarch/insns.decode | 9 +++++++++ target/loongarch/disas.c | 9 +++++++++ target/loongarch/vec_helper.c | 13 ++----------- target/loongarch/insn_trans/trans_lasx.c.inc | 9 +++++++++ 5 files changed, 38 insertions(+), 11 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 67d829f9da..4497cd4a6d 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -76,4 +76,13 @@ #define R_SHIFT(a, b) (a >> b) +#define DO_CLO_B(N) (clz32(~N & 0xff) - 24) +#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16) +#define DO_CLO_W(N) (clz32(~N)) +#define DO_CLO_D(N) (clz64(~N)) +#define DO_CLZ_B(N) (clz32(N) - 24) +#define DO_CLZ_H(N) (clz32(N) - 16) +#define DO_CLZ_W(N) (clz32(N)) +#define DO_CLZ_D(N) (clz64(N)) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index dc74bae7a5..3175532045 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1770,6 +1770,15 @@ xvssrarni_hu_w 0111 01110110 11001 ..... ..... ..... @vv_ui5 xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @vv_ui6 xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @vv_ui7 +xvclo_b 0111 01101001 11000 00000 ..... ..... @vv +xvclo_h 0111 01101001 11000 00001 ..... ..... @vv +xvclo_w 0111 01101001 11000 00010 ..... .....
@vv +xvclo_d 0111 01101001 11000 00011 ..... ..... @vv +xvclz_b 0111 01101001 11000 00100 ..... ..... @vv +xvclz_h 0111 01101001 11000 00101 ..... ..... @vv +xvclz_w 0111 01101001 11000 00110 ..... ..... @vv +xvclz_d 0111 01101001 11000 00111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index f043a2f9b6..0fc58735b9 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2196,6 +2196,15 @@ INSN_LASX(xvssrarni_hu_w, vv_i) INSN_LASX(xvssrarni_wu_d, vv_i) INSN_LASX(xvssrarni_du_q, vv_i) +INSN_LASX(xvclo_b, vv) +INSN_LASX(xvclo_h, vv) +INSN_LASX(xvclo_w, vv) +INSN_LASX(xvclo_d, vv) +INSN_LASX(xvclz_b, vv) +INSN_LASX(xvclz_h, vv) +INSN_LASX(xvclz_w, vv) +INSN_LASX(xvclz_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 852c65716e..789f6b303e 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2161,22 +2161,13 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ int i; \ VReg *Vd = (VReg *)vd; \ VReg *Vj = (VReg *)vj; \ + int oprsz = simd_oprsz(desc); \ \ - for (i = 0; i < LSX_LEN/BIT; i++) \ - { \ + for (i = 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) = DO_OP(Vj->E(i)); \ } \ } -#define DO_CLO_B(N) (clz32(~N & 0xff) - 24) -#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16) -#define DO_CLO_W(N) (clz32(~N)) -#define DO_CLO_D(N) (clz64(~N)) -#define DO_CLZ_B(N) (clz32(N) - 24) -#define DO_CLZ_H(N) (clz32(N) - 16) -#define DO_CLZ_W(N) (clz32(N)) -#define DO_CLZ_D(N) (clz64(N)) - DO_2OP(vclo_b, 8, UB, DO_CLO_B) DO_2OP(vclo_h, 16, UH, DO_CLO_H) DO_2OP(vclo_w, 32, UW, DO_CLO_W) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index dc658fc2cb..4227fbe629 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -519,6 +519,15 @@ TRANS(xvssrarni_hu_w, LASX, gen_vv_i, 32, gen_helper_vssrarni_hu_w) TRANS(xvssrarni_wu_d, LASX, gen_vv_i, 32, gen_helper_vssrarni_wu_d) TRANS(xvssrarni_du_q, LASX, gen_vv_i, 32, gen_helper_vssrarni_du_q) +TRANS(xvclo_b, LASX, gen_vv, 32, gen_helper_vclo_b) +TRANS(xvclo_h, LASX, gen_vv, 32, gen_helper_vclo_h) +TRANS(xvclo_w, LASX, gen_vv, 32, gen_helper_vclo_w) +TRANS(xvclo_d, LASX, gen_vv, 32, gen_helper_vclo_d) +TRANS(xvclz_b, LASX, gen_vv, 32, gen_helper_vclz_b) +TRANS(xvclz_h, LASX, gen_vv, 32, gen_helper_vclz_h) +TRANS(xvclz_w, LASX, gen_vv, 32, gen_helper_vclz_w) +TRANS(xvclz_d, LASX, gen_vv, 32, gen_helper_vclz_d) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) -- 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385610278650.9431205044928; Wed, 30 Aug 2023 01:53:30 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGta-0001Sm-O2; Wed, 30 Aug 2023 04:49:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtZ-0001PO-18 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:49 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtU-0007Xf-6n for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:48 -0400 Received: from
loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxtPCaAu9kuAgdAA--.60100S3; Wed, 30 Aug 2023 16:49:30 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S37; Wed, 30 Aug 2023 16:49:30 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 35/48] target/loongarch: Implement xvpcnt Date: Wed, 30 Aug 2023 16:48:49 +0800 Message-Id: <20230830084902.2113960-36-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S37 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385610867100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVPCNT.{B/H/W/D}.
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 5 +++++ target/loongarch/vec_helper.c | 4 ++-- target/loongarch/insn_trans/trans_lasx.c.inc | 5 +++++ 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 3175532045..d683c6a6ab 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1779,6 +1779,11 @@ xvclz_h 0111 01101001 11000 00101 ..... ...= .. @vv xvclz_w 0111 01101001 11000 00110 ..... ..... @vv xvclz_d 0111 01101001 11000 00111 ..... ..... @vv =20 +xvpcnt_b 0111 01101001 11000 01000 ..... ..... @vv +xvpcnt_h 0111 01101001 11000 01001 ..... ..... @vv +xvpcnt_w 0111 01101001 11000 01010 ..... ..... @vv +xvpcnt_d 0111 01101001 11000 01011 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 0fc58735b9..9e31f9bbbc 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2205,6 +2205,11 @@ INSN_LASX(xvclz_h, vv) INSN_LASX(xvclz_w, vv) INSN_LASX(xvclz_d, vv) =20 +INSN_LASX(xvpcnt_b, vv) +INSN_LASX(xvpcnt_h, vv) +INSN_LASX(xvpcnt_w, vv) +INSN_LASX(xvpcnt_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 789f6b303e..9c2b52fd7d 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2183,9 +2183,9 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ - { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D FN(Vj->E(i)); \ } \ } diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 4227fbe629..2a24de178d 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -528,6 +528,11 @@ TRANS(xvclz_h, LASX, gen_vv, 32, gen_helper_vclz_h) TRANS(xvclz_w, LASX, gen_vv, 32, gen_helper_vclz_w) TRANS(xvclz_d, LASX, gen_vv, 32, gen_helper_vclz_d) =20 +TRANS(xvpcnt_b, LASX, gen_vv, 32, gen_helper_vpcnt_b) +TRANS(xvpcnt_h, LASX, gen_vv, 32, gen_helper_vpcnt_h) +TRANS(xvpcnt_w, LASX, gen_vv, 32, gen_helper_vpcnt_w) +TRANS(xvpcnt_d, LASX, gen_vv, 32, gen_helper_vpcnt_d) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385496251783.8349841088356; Wed, 30 Aug 2023 01:51:36 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtY-0001Kp-EM; Wed, 30 Aug 2023 04:49:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtX-0001Cf-44 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtU-0007Xt-13 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:46 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cxc_CbAu9kuggdAA--.59450S3; Wed, 30 Aug 2023 16:49:31 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S38; Wed, 30 Aug 2023 16:49:30 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 36/48] target/loongarch: Implement xvbitclr xvbitset xvbitrev Date: Wed, 30 Aug 2023 16:48:50 +0800 Message-Id: <20230830084902.2113960-37-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S38 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385497252100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVBITCLR[I].{B/H/W/D}; - XVBITSET[I].{B/H/W/D}; - XVBITREV[I].{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/vec.h | 4 ++ target/loongarch/insns.decode | 27 +++++++++++ target/loongarch/disas.c | 25 ++++++++++ target/loongarch/vec_helper.c | 48 ++++++++++---------- target/loongarch/insn_trans/trans_lasx.c.inc | 27 +++++++++++ 5 files changed, 106 insertions(+), 25 deletions(-) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 4497cd4a6d..aae70f9de9 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -85,4 +85,8 @@ #define DO_CLZ_W(N) (clz32(N)) #define DO_CLZ_D(N) (clz64(N)) =20 +#define DO_BITCLR(a, bit) (a & ~(1ull << bit)) +#define DO_BITSET(a, bit) (a | 1ull << bit) +#define DO_BITREV(a, bit) (a ^ (1ull << bit)) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d683c6a6ab..cb6db8002a 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1784,6 +1784,33 @@ xvpcnt_h 0111 01101001 11000 01001 ..... ...= .. @vv xvpcnt_w 0111 01101001 11000 01010 ..... ..... @vv xvpcnt_d 0111 01101001 11000 01011 ..... ..... 
@vv =20 +xvbitclr_b 0111 01010000 11000 ..... ..... ..... @vvv +xvbitclr_h 0111 01010000 11001 ..... ..... ..... @vvv +xvbitclr_w 0111 01010000 11010 ..... ..... ..... @vvv +xvbitclr_d 0111 01010000 11011 ..... ..... ..... @vvv +xvbitclri_b 0111 01110001 00000 01 ... ..... ..... @vv_ui3 +xvbitclri_h 0111 01110001 00000 1 .... ..... ..... @vv_ui4 +xvbitclri_w 0111 01110001 00001 ..... ..... ..... @vv_ui5 +xvbitclri_d 0111 01110001 0001 ...... ..... ..... @vv_ui6 + +xvbitset_b 0111 01010000 11100 ..... ..... ..... @vvv +xvbitset_h 0111 01010000 11101 ..... ..... ..... @vvv +xvbitset_w 0111 01010000 11110 ..... ..... ..... @vvv +xvbitset_d 0111 01010000 11111 ..... ..... ..... @vvv +xvbitseti_b 0111 01110001 01000 01 ... ..... ..... @vv_ui3 +xvbitseti_h 0111 01110001 01000 1 .... ..... ..... @vv_ui4 +xvbitseti_w 0111 01110001 01001 ..... ..... ..... @vv_ui5 +xvbitseti_d 0111 01110001 0101 ...... ..... ..... @vv_ui6 + +xvbitrev_b 0111 01010001 00000 ..... ..... ..... @vvv +xvbitrev_h 0111 01010001 00001 ..... ..... ..... @vvv +xvbitrev_w 0111 01010001 00010 ..... ..... ..... @vvv +xvbitrev_d 0111 01010001 00011 ..... ..... ..... @vvv +xvbitrevi_b 0111 01110001 10000 01 ... ..... ..... @vv_ui3 +xvbitrevi_h 0111 01110001 10000 1 .... ..... ..... @vv_ui4 +xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @vv_ui5 +xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 9e31f9bbbc..dad9243fd7 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2210,6 +2210,31 @@ INSN_LASX(xvpcnt_h, vv) INSN_LASX(xvpcnt_w, vv) INSN_LASX(xvpcnt_d, vv) =20 +INSN_LASX(xvbitclr_b, vvv) +INSN_LASX(xvbitclr_h, vvv) +INSN_LASX(xvbitclr_w, vvv) +INSN_LASX(xvbitclr_d, vvv) +INSN_LASX(xvbitclri_b, vv_i) +INSN_LASX(xvbitclri_h, vv_i) +INSN_LASX(xvbitclri_w, vv_i) +INSN_LASX(xvbitclri_d, vv_i) +INSN_LASX(xvbitset_b, vvv) +INSN_LASX(xvbitset_h, vvv) +INSN_LASX(xvbitset_w, vvv) +INSN_LASX(xvbitset_d, vvv) +INSN_LASX(xvbitseti_b, vv_i) +INSN_LASX(xvbitseti_h, vv_i) +INSN_LASX(xvbitseti_w, vv_i) +INSN_LASX(xvbitseti_d, vv_i) +INSN_LASX(xvbitrev_b, vvv) +INSN_LASX(xvbitrev_h, vvv) +INSN_LASX(xvbitrev_w, vvv) +INSN_LASX(xvbitrev_d, vvv) +INSN_LASX(xvbitrevi_b, vv_i) +INSN_LASX(xvbitrevi_h, vv_i) +INSN_LASX(xvbitrevi_w, vv_i) +INSN_LASX(xvbitrevi_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 9c2b52fd7d..03b42dc887 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2195,21 +2195,18 @@ VPCNT(vpcnt_h, 16, UH, ctpop16) VPCNT(vpcnt_w, 32, UW, ctpop32) VPCNT(vpcnt_d, 64, UD, ctpop64) =20 -#define DO_BITCLR(a, bit) (a & ~(1ull << bit)) -#define DO_BITSET(a, bit) (a | 1ull << bit) -#define DO_BITREV(a, bit) (a ^ (1ull << bit)) - -#define DO_BIT(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)%BIT); \ - } \ +#define DO_BIT(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D 
(VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i) % BIT); \ + } \ } =20 DO_BIT(vbitclr_b, 8, UB, DO_BITCLR) @@ -2225,16 +2222,17 @@ DO_BIT(vbitrev_h, 16, UH, DO_BITREV) DO_BIT(vbitrev_w, 32, UW, DO_BITREV) DO_BIT(vbitrev_d, 64, UD, DO_BITREV) =20 -#define DO_BITI(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), imm); \ - } \ +#define DO_BITI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), imm); \ + } \ } =20 DO_BITI(vbitclri_b, 8, UB, DO_BITCLR) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 2a24de178d..92c6506e04 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -533,6 +533,33 @@ TRANS(xvpcnt_h, LASX, gen_vv, 32, gen_helper_vpcnt_h) TRANS(xvpcnt_w, LASX, gen_vv, 32, gen_helper_vpcnt_w) TRANS(xvpcnt_d, LASX, gen_vv, 32, gen_helper_vpcnt_d) =20 +TRANS(xvbitclr_b, LASX, gvec_vvv, 32, MO_8, do_vbitclr) +TRANS(xvbitclr_h, LASX, gvec_vvv, 32, MO_16, do_vbitclr) +TRANS(xvbitclr_w, LASX, gvec_vvv, 32, MO_32, do_vbitclr) +TRANS(xvbitclr_d, LASX, gvec_vvv, 32, MO_64, do_vbitclr) +TRANS(xvbitclri_b, LASX, gvec_vv_i, 32, MO_8, do_vbitclri) +TRANS(xvbitclri_h, LASX, gvec_vv_i, 32, MO_16, do_vbitclri) +TRANS(xvbitclri_w, LASX, gvec_vv_i, 32, MO_32, do_vbitclri) +TRANS(xvbitclri_d, LASX, gvec_vv_i, 32, MO_64, do_vbitclri) + +TRANS(xvbitset_b, LASX, gvec_vvv, 32, MO_8, do_vbitset) +TRANS(xvbitset_h, LASX, gvec_vvv, 32, MO_16, do_vbitset) 
+TRANS(xvbitset_w, LASX, gvec_vvv, 32, MO_32, do_vbitset) +TRANS(xvbitset_d, LASX, gvec_vvv, 32, MO_64, do_vbitset) +TRANS(xvbitseti_b, LASX, gvec_vv_i, 32, MO_8, do_vbitseti) +TRANS(xvbitseti_h, LASX, gvec_vv_i, 32, MO_16, do_vbitseti) +TRANS(xvbitseti_w, LASX, gvec_vv_i, 32, MO_32, do_vbitseti) +TRANS(xvbitseti_d, LASX, gvec_vv_i, 32, MO_64, do_vbitseti) + +TRANS(xvbitrev_b, LASX, gvec_vvv, 32, MO_8, do_vbitrev) +TRANS(xvbitrev_h, LASX, gvec_vvv, 32, MO_16, do_vbitrev) +TRANS(xvbitrev_w, LASX, gvec_vvv, 32, MO_32, do_vbitrev) +TRANS(xvbitrev_d, LASX, gvec_vvv, 32, MO_64, do_vbitrev) +TRANS(xvbitrevi_b, LASX, gvec_vv_i, 32, MO_8, do_vbitrevi) +TRANS(xvbitrevi_h, LASX, gvec_vv_i, 32, MO_16, do_vbitrevi) +TRANS(xvbitrevi_w, LASX, gvec_vv_i, 32, MO_32, do_vbitrevi) +TRANS(xvbitrevi_d, LASX, gvec_vv_i, 32, MO_64, do_vbitrevi) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385696116848.6845360284331; Wed, 30 Aug 2023 01:54:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuI-0002EE-Ip; Wed, 30 Aug 2023 04:50:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtb-0001U6-1W for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:51 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtU-0007Y4-NY for qemu-devel@nongnu.org; Wed, 30 
Aug 2023 04:49:50 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cxh+ibAu9kvAgdAA--.23732S3; Wed, 30 Aug 2023 16:49:31 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S39; Wed, 30 Aug 2023 16:49:31 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 37/48] target/loongarch: Implement xvfrstp Date: Wed, 30 Aug 2023 16:48:51 +0800 Message-Id: <20230830084902.2113960-38-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S39 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385698120100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVFRSTP[I].{B/H}. 
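XVFRSTP operates on each 128-bit lane separately: for every lane the helper finds the first negative element of Vj and writes its index into the Vd element selected by Vk (or by the immediate for XVFRSTPI). A minimal standalone sketch of the byte variant follows. It is illustrative only and is not the QEMU code: sketch_frstp_b(), SKETCH_LANE_BYTES and the use of plain int8_t arrays in place of VReg are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LANE_BYTES 16    /* one 128-bit lane, hypothetical name */

/*
 * Hypothetical model of the byte-element helper.
 * oprsz is the operand size in bytes: 16 for LSX, 32 for LASX.
 */
static void sketch_frstp_b(int8_t *vd, const int8_t *vj,
                           const int8_t *vk, size_t oprsz)
{
    const size_t ofs = SKETCH_LANE_BYTES;   /* byte elements per lane */
    size_t i, j, m;

    assert(oprsz == 16 || oprsz == 32);
    for (i = 0; i < oprsz / SKETCH_LANE_BYTES; i++) {
        /* Destination slot comes from the first element of Vk's lane. */
        m = (size_t)(vk[i * ofs] & 0xf);
        /* Index of the first negative element, or ofs if there is none. */
        for (j = 0; j < ofs; j++) {
            if (vj[j + ofs * i] < 0) {
                break;
            }
        }
        vd[m + i * ofs] = (int8_t)j;
    }
}
```

With oprsz set to 32 the two lanes are searched independently, so a negative element in the upper lane never affects the index stored in the lower lane.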
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 ++ target/loongarch/disas.c | 5 ++ target/loongarch/vec_helper.c | 48 ++++++++++++-------- target/loongarch/insn_trans/trans_lasx.c.inc | 5 ++ 4 files changed, 43 insertions(+), 20 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index cb6db8002a..6035fe139c 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1811,6 +1811,11 @@ xvbitrevi_h 0111 01110001 10000 1 .... ..... ..= ... @vv_ui4 xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @vv_ui5 xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @vv_ui6 =20 +xvfrstp_b 0111 01010010 10110 ..... ..... ..... @vvv +xvfrstp_h 0111 01010010 10111 ..... ..... ..... @vvv +xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @vv_ui5 +xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index dad9243fd7..27d6252686 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2235,6 +2235,11 @@ INSN_LASX(xvbitrevi_h, vv_i) INSN_LASX(xvbitrevi_w, vv_i) INSN_LASX(xvbitrevi_d, vv_i) =20 +INSN_LASX(xvfrstp_b, vvv) +INSN_LASX(xvfrstp_h, vvv) +INSN_LASX(xvfrstpi_b, vv_i) +INSN_LASX(xvfrstpi_h, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 03b42dc887..5c53cc8962 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2251,37 +2251,45 @@ DO_BITI(vbitrevi_d, 64, UD, DO_BITREV) #define VFRSTP(NAME, BIT, MASK, E) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ - int i, m; \ + int i, j, m, ofs; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + m =3D Vk->E(i * ofs) & MASK; \ + for (j =3D 0; j < ofs; j++) { \ + if (Vj->E(j + ofs * i) < 0) { \ + break; \ + } \ } \ + Vd->E(m + i * ofs) =3D j; \ } \ - m =3D Vk->E(0) & MASK; \ - Vd->E(m) =3D i; \ } =20 VFRSTP(vfrstp_b, 8, 0xf, B) VFRSTP(vfrstp_h, 16, 0x7, H) =20 -#define VFRSTPI(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, m; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m =3D imm % (LSX_LEN/BIT); \ - Vd->E(m) =3D i; \ +#define VFRSTPI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, j, m, ofs; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + m =3D imm %
ofs; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + if (Vj->E(j + ofs * i) < 0) { \ + break; \ + } \ + } \ + Vd->E(m + i * ofs) =3D j; \ + } \ } =20 VFRSTPI(vfrstpi_b, 8, B) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 92c6506e04..8a7d1b41e1 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -560,6 +560,11 @@ TRANS(xvbitrevi_h, LASX, gvec_vv_i, 32, MO_16, do_vbit= revi) TRANS(xvbitrevi_w, LASX, gvec_vv_i, 32, MO_32, do_vbitrevi) TRANS(xvbitrevi_d, LASX, gvec_vv_i, 32, MO_64, do_vbitrevi) =20 +TRANS(xvfrstp_b, LASX, gen_vvv, 32, gen_helper_vfrstp_b) +TRANS(xvfrstp_h, LASX, gen_vvv, 32, gen_helper_vfrstp_h) +TRANS(xvfrstpi_b, LASX, gen_vv_i, 32, gen_helper_vfrstpi_b) +TRANS(xvfrstpi_h, LASX, gen_vv_i, 32, gen_helper_vfrstpi_h) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169338556151682.80384036716771; Wed, 30 Aug 2023 01:52:41 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuC-000237-RX; Wed, 30 Aug 2023 04:50:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtc-0001X2-UH for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:53 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1qbGtX-0007Yh-0m for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:52 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxTeudAu9kvwgdAA--.53934S3; Wed, 30 Aug 2023 16:49:33 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S40; Wed, 30 Aug 2023 16:49:31 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 38/48] target/loongarch: Implement LASX fpu arith instructions Date: Wed, 30 Aug 2023 16:48:52 +0800 Message-Id: <20230830084902.2113960-39-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S40 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385563559100007 
Content-Type: text/plain; charset="utf-8" This patch includes: - XVF{ADD/SUB/MUL/DIV}.{S/D}; - XVF{MADD/MSUB/NMADD/NMSUB}.{S/D}; - XVF{MAX/MIN}.{S/D}; - XVF{MAXA/MINA}.{S/D}; - XVFLOGB.{S/D}; - XVFCLASS.{S/D}; - XVF{SQRT/RECIP/RSQRT}.{S/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 41 ++++++++++ target/loongarch/disas.c | 46 +++++++++++ target/loongarch/vec_helper.c | 82 +++++++++++--------- target/loongarch/insn_trans/trans_lasx.c.inc | 41 ++++++++++ 4 files changed, 172 insertions(+), 38 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 6035fe139c..4224b0a4b1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1816,6 +1816,47 @@ xvfrstp_h 0111 01010010 10111 ..... ..... ...= .. @vvv xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @vv_ui5 xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @vv_ui5 =20 +xvfadd_s 0111 01010011 00001 ..... ..... ..... @vvv +xvfadd_d 0111 01010011 00010 ..... ..... ..... @vvv +xvfsub_s 0111 01010011 00101 ..... ..... ..... @vvv +xvfsub_d 0111 01010011 00110 ..... ..... ..... @vvv +xvfmul_s 0111 01010011 10001 ..... ..... ..... @vvv +xvfmul_d 0111 01010011 10010 ..... ..... ..... @vvv +xvfdiv_s 0111 01010011 10101 ..... ..... ..... @vvv +xvfdiv_d 0111 01010011 10110 ..... ..... ..... @vvv + +xvfmadd_s 0000 10100001 ..... ..... ..... ..... @vvvv +xvfmadd_d 0000 10100010 ..... ..... ..... ..... @vvvv +xvfmsub_s 0000 10100101 ..... ..... ..... ..... @vvvv +xvfmsub_d 0000 10100110 ..... ..... ..... ..... @vvvv +xvfnmadd_s 0000 10101001 ..... ..... ..... ..... @vvvv +xvfnmadd_d 0000 10101010 ..... ..... ..... ..... @vvvv +xvfnmsub_s 0000 10101101 ..... ..... ..... ..... @vvvv +xvfnmsub_d 0000 10101110 ..... ..... ..... ..... @vvvv + +xvfmax_s 0111 01010011 11001 ..... ..... ..... @vvv +xvfmax_d 0111 01010011 11010 ..... ..... ..... @vvv +xvfmin_s 0111 01010011 11101 ..... ..... ..... 
@vvv +xvfmin_d 0111 01010011 11110 ..... ..... ..... @vvv + +xvfmaxa_s 0111 01010100 00001 ..... ..... ..... @vvv +xvfmaxa_d 0111 01010100 00010 ..... ..... ..... @vvv +xvfmina_s 0111 01010100 00101 ..... ..... ..... @vvv +xvfmina_d 0111 01010100 00110 ..... ..... ..... @vvv + +xvflogb_s 0111 01101001 11001 10001 ..... ..... @vv +xvflogb_d 0111 01101001 11001 10010 ..... ..... @vv + +xvfclass_s 0111 01101001 11001 10101 ..... ..... @vv +xvfclass_d 0111 01101001 11001 10110 ..... ..... @vv + +xvfsqrt_s 0111 01101001 11001 11001 ..... ..... @vv +xvfsqrt_d 0111 01101001 11001 11010 ..... ..... @vv +xvfrecip_s 0111 01101001 11001 11101 ..... ..... @vv +xvfrecip_d 0111 01101001 11001 11110 ..... ..... @vv +xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @vv +xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 27d6252686..4af74f1ae9 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1708,6 +1708,11 @@ static void output_v_i_x(DisasContext *ctx, arg_v_i = *a, const char *mnemonic) output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); } =20 +static void output_vvvv_x(DisasContext *ctx, arg_vvvv *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "x%d, x%d, x%d, x%d", a->vd, a->vj, a->vk, a->va= ); +} + static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemo= nic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); @@ -2240,6 +2245,47 @@ INSN_LASX(xvfrstp_h, vvv) INSN_LASX(xvfrstpi_b, vv_i) INSN_LASX(xvfrstpi_h, vv_i) =20 +INSN_LASX(xvfadd_s, vvv) +INSN_LASX(xvfadd_d, vvv) +INSN_LASX(xvfsub_s, vvv) +INSN_LASX(xvfsub_d, vvv) +INSN_LASX(xvfmul_s, vvv) +INSN_LASX(xvfmul_d, vvv) +INSN_LASX(xvfdiv_s, vvv) +INSN_LASX(xvfdiv_d, vvv) + +INSN_LASX(xvfmadd_s, vvvv) +INSN_LASX(xvfmadd_d, vvvv) 
+INSN_LASX(xvfmsub_s, vvvv) +INSN_LASX(xvfmsub_d, vvvv) +INSN_LASX(xvfnmadd_s, vvvv) +INSN_LASX(xvfnmadd_d, vvvv) +INSN_LASX(xvfnmsub_s, vvvv) +INSN_LASX(xvfnmsub_d, vvvv) + +INSN_LASX(xvfmax_s, vvv) +INSN_LASX(xvfmax_d, vvv) +INSN_LASX(xvfmin_s, vvv) +INSN_LASX(xvfmin_d, vvv) + +INSN_LASX(xvfmaxa_s, vvv) +INSN_LASX(xvfmaxa_d, vvv) +INSN_LASX(xvfmina_s, vvv) +INSN_LASX(xvfmina_d, vvv) + +INSN_LASX(xvflogb_s, vv) +INSN_LASX(xvflogb_d, vv) + +INSN_LASX(xvfclass_s, vv) +INSN_LASX(xvfclass_d, vv) + +INSN_LASX(xvfsqrt_s, vv) +INSN_LASX(xvfsqrt_d, vv) +INSN_LASX(xvfrecip_s, vv) +INSN_LASX(xvfrecip_d, vv) +INSN_LASX(xvfrsqrt_s, vv) +INSN_LASX(xvfrsqrt_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 5c53cc8962..684b023ee5 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2334,9 +2334,10 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, = \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D FN(Vj->E(i), Vk->E(i), &env->fp_status); \ vec_update_fcsr0(env, GETPC()); \ } \ @@ -2368,9 +2369,10 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, void= *va, \ VReg *Vj =3D (VReg *)vj; = \ VReg *Vk =3D (VReg *)vk; = \ VReg *Va =3D (VReg *)va; = \ + int oprsz =3D simd_oprsz(desc); = \ = \ vec_clear_cause(env); = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ Vd->E(i) =3D FN(Vj->E(i), Vk->E(i), Va->E(i), flags, &env->fp_stat= us); \ vec_update_fcsr0(env, GETPC()); = \ } = \ @@ -2387,47 +2389,51 @@ DO_4OP_F(vfnmsub_s, 32, UW, float32_muladd, DO_4OP_F(vfnmsub_d, 64, UD, float64_muladd, float_muladd_negate_c | float_muladd_negate_result) =20 -#define DO_2OP_F(NAME, BIT, E, FN) = \ -void 
HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - vec_clear_cause(env); = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E(i) =3D FN(env, Vj->E(i)); = \ - } = \ -} - -#define FLOGB(BIT, T) \ -static T do_flogb_## BIT(CPULoongArchState *env, T fj) \ -{ \ - T fp, fd; \ - float_status *status =3D &env->fp_status; \ - FloatRoundMode old_mode =3D get_float_rounding_mode(status); \ - \ - set_float_rounding_mode(float_round_down, status); \ - fp =3D float ## BIT ##_log2(fj, status); \ - fd =3D float ## BIT ##_round_to_int(fp, status); \ - set_float_rounding_mode(old_mode, status); \ - vec_update_fcsr0_mask(env, GETPC(), float_flag_inexact); \ - return fd; \ +#define DO_2OP_F(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D FN(env, Vj->E(i)); \ + } \ +} + +#define FLOGB(BIT, T) \ +static T do_flogb_## BIT(CPULoongArchState *env, T fj) \ +{ \ + T fp, fd; \ + float_status *status =3D &env->fp_status; \ + FloatRoundMode old_mode =3D get_float_rounding_mode(status); \ + \ + set_float_rounding_mode(float_round_down, status); \ + fp =3D float ## BIT ##_log2(fj, status); \ + fd =3D float ## BIT ##_round_to_int(fp, status); \ + set_float_rounding_mode(old_mode, status); \ + vec_update_fcsr0_mask(env, GETPC(), float_flag_inexact); \ + return fd; \ } =20 FLOGB(32, uint32_t) FLOGB(64, uint64_t) =20 -#define FCLASS(NAME, BIT, E, FN) = \ -void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E(i) =3D FN(env, Vj->E(i)); = \ - } = \ +#define
FCLASS(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D FN(env, Vj->E(i)); \ + } \ } =20 FCLASS(vfclass_s, 32, UW, helper_fclass_s) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 8a7d1b41e1..b1b1fb939b 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -565,6 +565,47 @@ TRANS(xvfrstp_h, LASX, gen_vvv, 32, gen_helper_vfrstp_= h) TRANS(xvfrstpi_b, LASX, gen_vv_i, 32, gen_helper_vfrstpi_b) TRANS(xvfrstpi_h, LASX, gen_vv_i, 32, gen_helper_vfrstpi_h) =20 +TRANS(xvfadd_s, LASX, gen_vvv_f, 32, gen_helper_vfadd_s) +TRANS(xvfadd_d, LASX, gen_vvv_f, 32, gen_helper_vfadd_d) +TRANS(xvfsub_s, LASX, gen_vvv_f, 32, gen_helper_vfsub_s) +TRANS(xvfsub_d, LASX, gen_vvv_f, 32, gen_helper_vfsub_d) +TRANS(xvfmul_s, LASX, gen_vvv_f, 32, gen_helper_vfmul_s) +TRANS(xvfmul_d, LASX, gen_vvv_f, 32, gen_helper_vfmul_d) +TRANS(xvfdiv_s, LASX, gen_vvv_f, 32, gen_helper_vfdiv_s) +TRANS(xvfdiv_d, LASX, gen_vvv_f, 32, gen_helper_vfdiv_d) + +TRANS(xvfmadd_s, LASX, gen_vvvv_f, 32, gen_helper_vfmadd_s) +TRANS(xvfmadd_d, LASX, gen_vvvv_f, 32, gen_helper_vfmadd_d) +TRANS(xvfmsub_s, LASX, gen_vvvv_f, 32, gen_helper_vfmsub_s) +TRANS(xvfmsub_d, LASX, gen_vvvv_f, 32, gen_helper_vfmsub_d) +TRANS(xvfnmadd_s, LASX, gen_vvvv_f, 32, gen_helper_vfnmadd_s) +TRANS(xvfnmadd_d, LASX, gen_vvvv_f, 32, gen_helper_vfnmadd_d) +TRANS(xvfnmsub_s, LASX, gen_vvvv_f, 32, gen_helper_vfnmsub_s) +TRANS(xvfnmsub_d, LASX, gen_vvvv_f, 32, gen_helper_vfnmsub_d) + +TRANS(xvfmax_s, LASX, gen_vvv_f, 32, gen_helper_vfmax_s) +TRANS(xvfmax_d, LASX, gen_vvv_f, 32, gen_helper_vfmax_d) +TRANS(xvfmin_s, LASX, gen_vvv_f, 32, gen_helper_vfmin_s) +TRANS(xvfmin_d, LASX, gen_vvv_f, 32,
gen_helper_vfmin_d) + +TRANS(xvfmaxa_s, LASX, gen_vvv_f, 32, gen_helper_vfmaxa_s) +TRANS(xvfmaxa_d, LASX, gen_vvv_f, 32, gen_helper_vfmaxa_d) +TRANS(xvfmina_s, LASX, gen_vvv_f, 32, gen_helper_vfmina_s) +TRANS(xvfmina_d, LASX, gen_vvv_f, 32, gen_helper_vfmina_d) + +TRANS(xvflogb_s, LASX, gen_vv_f, 32, gen_helper_vflogb_s) +TRANS(xvflogb_d, LASX, gen_vv_f, 32, gen_helper_vflogb_d) + +TRANS(xvfclass_s, LASX, gen_vv_f, 32, gen_helper_vfclass_s) +TRANS(xvfclass_d, LASX, gen_vv_f, 32, gen_helper_vfclass_d) + +TRANS(xvfsqrt_s, LASX, gen_vv_f, 32, gen_helper_vfsqrt_s) +TRANS(xvfsqrt_d, LASX, gen_vv_f, 32, gen_helper_vfsqrt_d) +TRANS(xvfrecip_s, LASX, gen_vv_f, 32, gen_helper_vfrecip_s) +TRANS(xvfrecip_d, LASX, gen_vv_f, 32, gen_helper_vfrecip_d) +TRANS(xvfrsqrt_s, LASX, gen_vv_f, 32, gen_helper_vfrsqrt_s) +TRANS(xvfrsqrt_d, LASX, gen_vv_f, 32, gen_helper_vfrsqrt_d) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16933856279861015.4518889373884; Wed, 30 Aug 2023 01:53:47 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuR-0002s5-4E; Wed, 30 Aug 2023 04:50:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGte-0001a4-3t for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:54 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtX-0007Yl-7g 
for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:53 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cxh+ieAu9kwQgdAA--.23734S3; Wed, 30 Aug 2023 16:49:34 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S41; Wed, 30 Aug 2023 16:49:33 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 39/48] target/loongarch: Implement LASX fpu fcvt instructions Date: Wed, 30 Aug 2023 16:48:53 +0800 Message-Id: <20230830084902.2113960-40-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S41 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385630120100007 Content-Type: text/plain; charset="utf-8" This 
patch includes:
- XVFCVT{L/H}.{S.H/D.S};
- XVFCVT.{H.S/S.D};
- XVFRINT[{RNE/RZ/RP/RM}].{S/D};
- XVFTINT[{RNE/RZ/RP/RM}].{W.S/L.D};
- XVFTINT[RZ].{WU.S/LU.D};
- XVFTINT[{RNE/RZ/RP/RM}].W.D;
- XVFTINT[{RNE/RZ/RP/RM}]{L/H}.L.S;
- XVFFINT.{S.W/D.L}[U];
- XVFFINT.S.L, XVFFINT{L/H}.D.W.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode | 58 ++++
 target/loongarch/disas.c | 56 ++++
 target/loongarch/vec_helper.c | 263 ++++++++++++-------
 target/loongarch/insn_trans/trans_lasx.c.inc | 56 ++++
 4 files changed, 335 insertions(+), 98 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 4224b0a4b1..ed4f82e7fe 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1857,6 +1857,64 @@ xvfrecip_d 0111 01101001 11001 11110 ..... ..... @vv
 xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @vv
 xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @vv
 
+xvfcvtl_s_h 0111 01101001 11011 11010 ..... ..... @vv
+xvfcvth_s_h 0111 01101001 11011 11011 ..... ..... @vv
+xvfcvtl_d_s 0111 01101001 11011 11100 ..... ..... @vv
+xvfcvth_d_s 0111 01101001 11011 11101 ..... ..... @vv
+xvfcvt_h_s 0111 01010100 01100 ..... ..... ..... @vvv
+xvfcvt_s_d 0111 01010100 01101 ..... ..... ..... @vvv
+
+xvfrintrne_s 0111 01101001 11010 11101 ..... ..... @vv
+xvfrintrne_d 0111 01101001 11010 11110 ..... ..... @vv
+xvfrintrz_s 0111 01101001 11010 11001 ..... ..... @vv
+xvfrintrz_d 0111 01101001 11010 11010 ..... ..... @vv
+xvfrintrp_s 0111 01101001 11010 10101 ..... ..... @vv
+xvfrintrp_d 0111 01101001 11010 10110 ..... ..... @vv
+xvfrintrm_s 0111 01101001 11010 10001 ..... ..... @vv
+xvfrintrm_d 0111 01101001 11010 10010 ..... ..... @vv
+xvfrint_s 0111 01101001 11010 01101 ..... ..... @vv
+xvfrint_d 0111 01101001 11010 01110 ..... ..... @vv
+
+xvftintrne_w_s 0111 01101001 11100 10100 ..... ..... @vv
+xvftintrne_l_d 0111 01101001 11100 10101 ..... ..... @vv
+xvftintrz_w_s 0111 01101001 11100 10010 .....
..... @vv +xvftintrz_l_d 0111 01101001 11100 10011 ..... ..... @vv +xvftintrp_w_s 0111 01101001 11100 10000 ..... ..... @vv +xvftintrp_l_d 0111 01101001 11100 10001 ..... ..... @vv +xvftintrm_w_s 0111 01101001 11100 01110 ..... ..... @vv +xvftintrm_l_d 0111 01101001 11100 01111 ..... ..... @vv +xvftint_w_s 0111 01101001 11100 01100 ..... ..... @vv +xvftint_l_d 0111 01101001 11100 01101 ..... ..... @vv +xvftintrz_wu_s 0111 01101001 11100 11100 ..... ..... @vv +xvftintrz_lu_d 0111 01101001 11100 11101 ..... ..... @vv +xvftint_wu_s 0111 01101001 11100 10110 ..... ..... @vv +xvftint_lu_d 0111 01101001 11100 10111 ..... ..... @vv + +xvftintrne_w_d 0111 01010100 10111 ..... ..... ..... @vvv +xvftintrz_w_d 0111 01010100 10110 ..... ..... ..... @vvv +xvftintrp_w_d 0111 01010100 10101 ..... ..... ..... @vvv +xvftintrm_w_d 0111 01010100 10100 ..... ..... ..... @vvv +xvftint_w_d 0111 01010100 10011 ..... ..... ..... @vvv + +xvftintrnel_l_s 0111 01101001 11101 01000 ..... ..... @vv +xvftintrneh_l_s 0111 01101001 11101 01001 ..... ..... @vv +xvftintrzl_l_s 0111 01101001 11101 00110 ..... ..... @vv +xvftintrzh_l_s 0111 01101001 11101 00111 ..... ..... @vv +xvftintrpl_l_s 0111 01101001 11101 00100 ..... ..... @vv +xvftintrph_l_s 0111 01101001 11101 00101 ..... ..... @vv +xvftintrml_l_s 0111 01101001 11101 00010 ..... ..... @vv +xvftintrmh_l_s 0111 01101001 11101 00011 ..... ..... @vv +xvftintl_l_s 0111 01101001 11101 00000 ..... ..... @vv +xvftinth_l_s 0111 01101001 11101 00001 ..... ..... @vv + +xvffint_s_w 0111 01101001 11100 00000 ..... ..... @vv +xvffint_d_l 0111 01101001 11100 00010 ..... ..... @vv +xvffint_s_wu 0111 01101001 11100 00001 ..... ..... @vv +xvffint_d_lu 0111 01101001 11100 00011 ..... ..... @vv +xvffintl_d_w 0111 01101001 11100 00100 ..... ..... @vv +xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @vv +xvffint_s_l 0111 01010100 10000 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... 
..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 4af74f1ae9..3fd3dc3591 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2286,6 +2286,62 @@ INSN_LASX(xvfrecip_d, vv) INSN_LASX(xvfrsqrt_s, vv) INSN_LASX(xvfrsqrt_d, vv) =20 +INSN_LASX(xvfcvtl_s_h, vv) +INSN_LASX(xvfcvth_s_h, vv) +INSN_LASX(xvfcvtl_d_s, vv) +INSN_LASX(xvfcvth_d_s, vv) +INSN_LASX(xvfcvt_h_s, vvv) +INSN_LASX(xvfcvt_s_d, vvv) + +INSN_LASX(xvfrint_s, vv) +INSN_LASX(xvfrint_d, vv) +INSN_LASX(xvfrintrm_s, vv) +INSN_LASX(xvfrintrm_d, vv) +INSN_LASX(xvfrintrp_s, vv) +INSN_LASX(xvfrintrp_d, vv) +INSN_LASX(xvfrintrz_s, vv) +INSN_LASX(xvfrintrz_d, vv) +INSN_LASX(xvfrintrne_s, vv) +INSN_LASX(xvfrintrne_d, vv) + +INSN_LASX(xvftint_w_s, vv) +INSN_LASX(xvftint_l_d, vv) +INSN_LASX(xvftintrm_w_s, vv) +INSN_LASX(xvftintrm_l_d, vv) +INSN_LASX(xvftintrp_w_s, vv) +INSN_LASX(xvftintrp_l_d, vv) +INSN_LASX(xvftintrz_w_s, vv) +INSN_LASX(xvftintrz_l_d, vv) +INSN_LASX(xvftintrne_w_s, vv) +INSN_LASX(xvftintrne_l_d, vv) +INSN_LASX(xvftint_wu_s, vv) +INSN_LASX(xvftint_lu_d, vv) +INSN_LASX(xvftintrz_wu_s, vv) +INSN_LASX(xvftintrz_lu_d, vv) +INSN_LASX(xvftint_w_d, vvv) +INSN_LASX(xvftintrm_w_d, vvv) +INSN_LASX(xvftintrp_w_d, vvv) +INSN_LASX(xvftintrz_w_d, vvv) +INSN_LASX(xvftintrne_w_d, vvv) +INSN_LASX(xvftintl_l_s, vv) +INSN_LASX(xvftinth_l_s, vv) +INSN_LASX(xvftintrml_l_s, vv) +INSN_LASX(xvftintrmh_l_s, vv) +INSN_LASX(xvftintrpl_l_s, vv) +INSN_LASX(xvftintrph_l_s, vv) +INSN_LASX(xvftintrzl_l_s, vv) +INSN_LASX(xvftintrzh_l_s, vv) +INSN_LASX(xvftintrnel_l_s, vv) +INSN_LASX(xvftintrneh_l_s, vv) + +INSN_LASX(xvffint_s_w, vv) +INSN_LASX(xvffint_s_wu, vv) +INSN_LASX(xvffint_d_l, vv) +INSN_LASX(xvffint_d_lu, vv) +INSN_LASX(xvffintl_d_w, vv) +INSN_LASX(xvffinth_d_w, vv) +INSN_LASX(xvffint_s_l, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c 
b/target/loongarch/vec_helper.c index 684b023ee5..3e2757d57b 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2506,14 +2506,19 @@ static uint32_t float64_cvt_float32(uint64_t d, flo= at_status *status) void HELPER(vfcvtl_s_h)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/32; i++) { - temp.UW(i) =3D float16_cvt_float32(Vj->UH(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * i) =3Dfloat16_cvt_float32(Vj->UH(j + ofs * 2= * i), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2522,14 +2527,19 @@ void HELPER(vfcvtl_s_h)(void *vd, void *vj, void HELPER(vfcvtl_d_s)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/64; i++) { - temp.UD(i) =3D float32_cvt_float64(Vj->UW(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UD(j + ofs * i) =3D float32_cvt_float64(Vj->UW(j + ofs * = 2 * i), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2538,14 +2548,19 @@ void HELPER(vfcvtl_d_s)(void *vd, void *vj, void HELPER(vfcvth_s_h)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/32; i++) { - temp.UW(i) =3D float16_cvt_float32(Vj->UH(i + 4), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + 
for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * i) =3D float16_cvt_float32(Vj->UH(j + ofs * = (2 * i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2554,14 +2569,19 @@ void HELPER(vfcvth_s_h)(void *vd, void *vj, void HELPER(vfcvth_d_s)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/64; i++) { - temp.UD(i) =3D float32_cvt_float64(Vj->UW(i + 2), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UD(j + ofs * i) =3D float32_cvt_float64(Vj->UW(j + ofs * = (2 * i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2570,16 +2590,22 @@ void HELPER(vfcvth_d_s)(void *vd, void *vj, void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for(i =3D 0; i < LSX_LEN/32; i++) { - temp.UH(i + 4) =3D float32_cvt_float16(Vj->UW(i), &env->fp_status); - temp.UH(i) =3D float32_cvt_float16(Vk->UW(i), &env->fp_status); + for(i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UH(j + ofs * (2 * i + 1)) =3D float32_cvt_float16(Vj->UW(= j + ofs * i), + &env->fp_= status); + temp.UH(j + ofs * 2 * i) =3D float32_cvt_float16(Vk->UW(j + of= s * i), + &env->fp_status= ); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2588,16 +2614,22 @@ void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj 
=3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for(i =3D 0; i < LSX_LEN/64; i++) { - temp.UW(i + 2) =3D float64_cvt_float32(Vj->UD(i), &env->fp_status); - temp.UW(i) =3D float64_cvt_float32(Vk->UD(i), &env->fp_status); + for(i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * (2 * i + 1)) =3D float64_cvt_float32(Vj->UD(= j + ofs * i), + &env->fp_= status); + temp.UW(j + ofs * 2 * i) =3D float64_cvt_float32(Vk->UD(j + of= s * i), + &env->fp_status= ); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2609,12 +2641,14 @@ void HELPER(vfrint_s)(void *vd, void *vj, int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 vec_clear_cause(env); - for (i =3D 0; i < 4; i++) { + for (i =3D 0; i < oprsz / 4; i++) { Vd->W(i) =3D float32_round_to_int(Vj->UW(i), &env->fp_status); vec_update_fcsr0(env, GETPC()); } +} =20 =20 void HELPER(vfrint_d)(void *vd, void *vj, @@ -2623,29 +2657,32 @@ void HELPER(vfrint_d)(void *vd, void *vj, int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { + for (i =3D 0; i < oprsz / 8; i++) { Vd->D(i) =3D float64_round_to_int(Vj->UD(i), &env->fp_status); vec_update_fcsr0(env, GETPC()); } } =20 -#define FCVT_2OP(NAME, BIT, E, MODE) = \ -void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - vec_clear_cause(env); = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - FloatRoundMode old_mode =3D get_float_rounding_mode(&env->fp_statu= s); \ - set_float_rounding_mode(MODE, &env->fp_status); = \ - Vd->E(i) =3D float## BIT ## _round_to_int(Vj->E(i), &env->fp_statu= s); \ - set_float_rounding_mode(old_mode, &env->fp_status); = \ - vec_update_fcsr0(env, GETPC()); = \ - } = \ +#define FCVT_2OP(NAME, BIT, E, 
MODE) = \ +void HELPER(NAME)(void *vd, void *vj, = \ + CPULoongArchState *env, uint32_t desc) = \ +{ = \ + int i; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + vec_clear_cause(env); = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ + FloatRoundMode old_mode =3D get_float_rounding_mode(&env->fp_statu= s); \ + set_float_rounding_mode(MODE, &env->fp_status); = \ + Vd->E(i) =3D float## BIT ## _round_to_int(Vj->E(i), &env->fp_statu= s); \ + set_float_rounding_mode(old_mode, &env->fp_status); = \ + vec_update_fcsr0(env, GETPC()); = \ + } = \ } =20 FCVT_2OP(vfrintrne_s, 32, UW, float_round_nearest_even) @@ -2724,22 +2761,26 @@ FTINT(rp_w_d, float64, int32, uint64_t, uint32_t, f= loat_round_up) FTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero) FTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_eve= n) =20 -#define FTINT_W_D(NAME, FN) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, \ - CPULoongArchState *env,uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.W(i + 2) =3D FN(env, Vj->UD(i)); \ - temp.W(i) =3D FN(env, Vk->UD(i)); \ - } \ - *Vd =3D temp; \ +#define FTINT_W_D(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / 64; = \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.W(j + ofs * (2 * i + 1)) =3D FN(env, Vj->UD(j + ofs * i))= ; \ + temp.W(j + ofs * 2 * i) =3D FN(env, Vk->UD(j + ofs * i)); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 FTINT_W_D(vftint_w_d, do_float64_to_int32) @@ -2757,19 
+2798,24 @@ FTINT(rph_l_s, float32, int64, uint32_t, uint64_t, = float_round_up) FTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) FTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_ev= en) =20 -#define FTINTL_L_S(NAME, FN) = \ -void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - vec_clear_cause(env); = \ - for (i =3D 0; i < 2; i++) { = \ - temp.D(i) =3D FN(env, Vj->UW(i)); = \ - } = \ - *Vd =3D temp; = \ +#define FTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / 64; \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + temp.D(j + ofs * i) =3D FN(env, Vj->UW(j + ofs * 2 * i)); \ + } \ + } \ + *Vd =3D temp; \ } =20 FTINTL_L_S(vftintl_l_s, do_float32_to_int64) @@ -2778,19 +2824,24 @@ FTINTL_L_S(vftintrpl_l_s, do_ftintrpl_l_s) FTINTL_L_S(vftintrzl_l_s, do_ftintrzl_l_s) FTINTL_L_S(vftintrnel_l_s, do_ftintrnel_l_s) =20 -#define FTINTH_L_S(NAME, FN) = \ -void HELPER(NAME)(void *vd, void *vj, CPULoongArchState *env, uint32_t des= c) \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - vec_clear_cause(env); = \ - for (i =3D 0; i < 2; i++) { = \ - temp.D(i) =3D FN(env, Vj->UW(i + 2)); = \ - } = \ - *Vd =3D temp; = \ +#define FTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / 64; = \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j 
=3D 0; j < ofs; j++) { = \ + temp.D(j + ofs * i) =3D FN(env, Vj->UW(j + ofs * (2 * i + 1)))= ; \ + } \ + } \ + *Vd =3D temp; = \ } =20 FTINTH_L_S(vftinth_l_s, do_float32_to_int64) @@ -2822,14 +2873,19 @@ DO_2OP_F(vffint_d_lu, 64, UD, do_ffint_d_lu) void HELPER(vffintl_d_w)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc);=20 =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.D(i) =3D int32_to_float64(Vj->W(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.D(j + ofs * i) =3D int32_to_float64(Vj->W(j + ofs * 2 * i= ), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2838,14 +2894,19 @@ void HELPER(vffintl_d_w)(void *vd, void *vj, void HELPER(vffinth_d_w)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.D(i) =3D int32_to_float64(Vj->W(i + 2), &env->fp_status); + for (i =3D 0; i < oprsz /16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.D(j + ofs * i) =3D int32_to_float64(Vj->W(j + ofs * (2 * = i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2854,16 +2915,22 @@ void HELPER(vffinth_d_w)(void *vd, void *vj, void HELPER(vffint_s_l)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.W(i + 2) =3D int64_to_float32(Vj->D(i), &env->fp_status); - temp.W(i) =3D 
int64_to_float32(Vk->D(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.W(j + ofs * (2 * i + 1)) =3D int64_to_float32(Vj->D(j + o= fs * i), + &env->fp_stat= us); + temp.W(j + ofs * 2 * i) =3D int64_to_float32(Vk->D(j + ofs * i= ), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index b1b1fb939b..760160184c 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -606,6 +606,62 @@ TRANS(xvfrecip_d, LASX, gen_vv_f, 32, gen_helper_vfrec= ip_d) TRANS(xvfrsqrt_s, LASX, gen_vv_f, 32, gen_helper_vfrsqrt_s) TRANS(xvfrsqrt_d, LASX, gen_vv_f, 32, gen_helper_vfrsqrt_d) =20 +TRANS(xvfcvtl_s_h, LASX, gen_vv_f, 32, gen_helper_vfcvtl_s_h) +TRANS(xvfcvth_s_h, LASX, gen_vv_f, 32, gen_helper_vfcvth_s_h) +TRANS(xvfcvtl_d_s, LASX, gen_vv_f, 32, gen_helper_vfcvtl_d_s) +TRANS(xvfcvth_d_s, LASX, gen_vv_f, 32, gen_helper_vfcvth_d_s) +TRANS(xvfcvt_h_s, LASX, gen_vvv_f, 32, gen_helper_vfcvt_h_s) +TRANS(xvfcvt_s_d, LASX, gen_vvv_f, 32, gen_helper_vfcvt_s_d) + +TRANS(xvfrintrne_s, LASX, gen_vv_f, 32, gen_helper_vfrintrne_s) +TRANS(xvfrintrne_d, LASX, gen_vv_f, 32, gen_helper_vfrintrne_d) +TRANS(xvfrintrz_s, LASX, gen_vv_f, 32, gen_helper_vfrintrz_s) +TRANS(xvfrintrz_d, LASX, gen_vv_f, 32, gen_helper_vfrintrz_d) +TRANS(xvfrintrp_s, LASX, gen_vv_f, 32, gen_helper_vfrintrp_s) +TRANS(xvfrintrp_d, LASX, gen_vv_f, 32, gen_helper_vfrintrp_d) +TRANS(xvfrintrm_s, LASX, gen_vv_f, 32, gen_helper_vfrintrm_s) +TRANS(xvfrintrm_d, LASX, gen_vv_f, 32, gen_helper_vfrintrm_d) +TRANS(xvfrint_s, LASX, gen_vv_f, 32, gen_helper_vfrint_s) +TRANS(xvfrint_d, LASX, gen_vv_f, 32, gen_helper_vfrint_d) + +TRANS(xvftintrne_w_s, LASX, gen_vv_f, 32, gen_helper_vftintrne_w_s) +TRANS(xvftintrne_l_d, LASX, gen_vv_f, 32, gen_helper_vftintrne_l_d) +TRANS(xvftintrz_w_s, LASX, gen_vv_f, 32, 
gen_helper_vftintrz_w_s) +TRANS(xvftintrz_l_d, LASX, gen_vv_f, 32, gen_helper_vftintrz_l_d) +TRANS(xvftintrp_w_s, LASX, gen_vv_f, 32, gen_helper_vftintrp_w_s) +TRANS(xvftintrp_l_d, LASX, gen_vv_f, 32, gen_helper_vftintrp_l_d) +TRANS(xvftintrm_w_s, LASX, gen_vv_f, 32, gen_helper_vftintrm_w_s) +TRANS(xvftintrm_l_d, LASX, gen_vv_f, 32, gen_helper_vftintrm_l_d) +TRANS(xvftint_w_s, LASX, gen_vv_f, 32, gen_helper_vftint_w_s) +TRANS(xvftint_l_d, LASX, gen_vv_f, 32, gen_helper_vftint_l_d) +TRANS(xvftintrz_wu_s, LASX, gen_vv_f, 32, gen_helper_vftintrz_wu_s) +TRANS(xvftintrz_lu_d, LASX, gen_vv_f, 32, gen_helper_vftintrz_lu_d) +TRANS(xvftint_wu_s, LASX, gen_vv_f, 32, gen_helper_vftint_wu_s) +TRANS(xvftint_lu_d, LASX, gen_vv_f, 32, gen_helper_vftint_lu_d) +TRANS(xvftintrne_w_d, LASX, gen_vvv_f, 32, gen_helper_vftintrne_w_d) +TRANS(xvftintrz_w_d, LASX, gen_vvv_f, 32, gen_helper_vftintrz_w_d) +TRANS(xvftintrp_w_d, LASX, gen_vvv_f, 32, gen_helper_vftintrp_w_d) +TRANS(xvftintrm_w_d, LASX, gen_vvv_f, 32, gen_helper_vftintrm_w_d) +TRANS(xvftint_w_d, LASX, gen_vvv_f, 32, gen_helper_vftint_w_d) +TRANS(xvftintrnel_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrnel_l_s) +TRANS(xvftintrneh_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrneh_l_s) +TRANS(xvftintrzl_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrzl_l_s) +TRANS(xvftintrzh_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrzh_l_s) +TRANS(xvftintrpl_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrpl_l_s) +TRANS(xvftintrph_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrph_l_s) +TRANS(xvftintrml_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrml_l_s) +TRANS(xvftintrmh_l_s, LASX, gen_vv_f, 32, gen_helper_vftintrmh_l_s) +TRANS(xvftintl_l_s, LASX, gen_vv_f, 32, gen_helper_vftintl_l_s) +TRANS(xvftinth_l_s, LASX, gen_vv_f, 32, gen_helper_vftinth_l_s) + +TRANS(xvffint_s_w, LASX, gen_vv_f, 32, gen_helper_vffint_s_w) +TRANS(xvffint_d_l, LASX, gen_vv_f, 32, gen_helper_vffint_d_l) +TRANS(xvffint_s_wu, LASX, gen_vv_f, 32, gen_helper_vffint_s_wu) +TRANS(xvffint_d_lu, LASX, 
gen_vv_f, 32, gen_helper_vffint_d_lu) +TRANS(xvffintl_d_w, LASX, gen_vv_f, 32, gen_helper_vffintl_d_w) +TRANS(xvffinth_d_w, LASX, gen_vv_f, 32, gen_helper_vffinth_d_w) +TRANS(xvffint_s_l, LASX, gen_vvv_f, 32, gen_helper_vffint_s_l) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169338551754677.30954399292943; Wed, 30 Aug 2023 01:51:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuN-0002I1-O1; Wed, 30 Aug 2023 04:50:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtf-0001cV-Cx for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:55 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtY-0007Z1-Bs for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:55 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxuOifAu9kxQgdAA--.23634S3; Wed, 30 Aug 2023 16:49:35 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S42; Wed, 30 Aug 2023 16:49:34 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 40/48] target/loongarch: Implement xvseq xvsle xvslt Date: Wed, 30 Aug 2023 16:48:54 +0800 Message-Id: 
<20230830084902.2113960-41-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S42
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1693385519172100001
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSEQ[I].{B/H/W/D};
- XVSLE[I].{B/H/W/D}[U];
- XVSLT[I].{B/H/W/D}[U].
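
For reference, a minimal standalone sketch of how one immediate-compare helper can serve both vector widths once its loop bound comes from the operation size instead of LSX_LEN. DemoVReg, demo_vseqi_w and demo_count_matches are hypothetical names used only for illustration and are not QEMU code; the oprsz argument stands in for the value the real helpers obtain from simd_oprsz(desc), assumed here to be 16 bytes for LSX and 32 bytes for LASX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for VReg: 32 bytes cover a 256-bit LASX register. */
typedef union {
    int8_t  B[32];
    int32_t W[8];
} DemoVReg;

/* Same idea as the VSEQ macro in this patch (all-ones on a match, zero
 * otherwise), with the arguments parenthesized for safety. */
#define VSEQ(a, b) ((a) == (b) ? -1 : 0)

/* Compare every 32-bit lane of vj against imm.
 * oprsz is the operation size in bytes: 16 for LSX, 32 for LASX (assumed). */
static void demo_vseqi_w(DemoVReg *vd, const DemoVReg *vj,
                         int32_t imm, int oprsz)
{
    assert(oprsz == 16 || oprsz == 32);
    for (int i = 0; i < oprsz / (32 / 8); i++) {
        vd->W[i] = VSEQ(vj->W[i], imm);
    }
}

/* Fill all eight lanes with 7, compare against 7 using the given oprsz,
 * and count how many destination lanes ended up as all-ones. */
static int demo_count_matches(int oprsz)
{
    DemoVReg vj, vd;
    int n = 0;

    memset(&vd, 0, sizeof(vd));
    for (int i = 0; i < 8; i++) {
        vj.W[i] = 7;
    }
    demo_vseqi_w(&vd, &vj, 7, oprsz);
    for (int i = 0; i < 8; i++) {
        n += (vd.W[i] == -1);
    }
    return n;
}
```

With oprsz = 16 only the low four 32-bit lanes are written, and with oprsz = 32 all eight are, which is why the same helper body can back both the LSX and the LASX instruction.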
Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h | 4 +
 target/loongarch/insns.decode | 43 +++
 target/loongarch/disas.c | 43 +++
 target/loongarch/vec_helper.c | 27 +-
 target/loongarch/insn_trans/trans_lasx.c.inc | 43 +++
 target/loongarch/insn_trans/trans_lsx.c.inc | 263 ++++++++++---------
 6 files changed, 278 insertions(+), 145 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index aae70f9de9..bc74effb7c 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -89,4 +89,8 @@
 #define DO_BITSET(a, bit) (a | 1ull << bit)
 #define DO_BITREV(a, bit) (a ^ (1ull << bit))
 
+#define VSEQ(a, b) (a == b ? -1 : 0)
+#define VSLE(a, b) (a <= b ? -1 : 0)
+#define VSLT(a, b) (a < b ? -1 : 0)
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index ed4f82e7fe..82c26a318b 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1915,6 +1915,49 @@ xvffintl_d_w 0111 01101001 11100 00100 ..... ..... @vv
 xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @vv
 xvffint_s_l 0111 01010100 10000 ..... ..... ..... @vvv
 
+xvseq_b 0111 01000000 00000 ..... ..... ..... @vvv
+xvseq_h 0111 01000000 00001 ..... ..... ..... @vvv
+xvseq_w 0111 01000000 00010 ..... ..... ..... @vvv
+xvseq_d 0111 01000000 00011 ..... ..... ..... @vvv
+xvseqi_b 0111 01101000 00000 ..... ..... ..... @vv_i5
+xvseqi_h 0111 01101000 00001 ..... ..... ..... @vv_i5
+xvseqi_w 0111 01101000 00010 ..... ..... ..... @vv_i5
+xvseqi_d 0111 01101000 00011 ..... ..... ..... @vv_i5
+
+xvsle_b 0111 01000000 00100 ..... ..... ..... @vvv
+xvsle_h 0111 01000000 00101 ..... ..... ..... @vvv
+xvsle_w 0111 01000000 00110 ..... ..... ..... @vvv
+xvsle_d 0111 01000000 00111 ..... ..... ..... @vvv
+xvslei_b 0111 01101000 00100 ..... ..... ..... @vv_i5
+xvslei_h 0111 01101000 00101 ..... ..... ..... @vv_i5
+xvslei_w 0111 01101000 00110 ..... ..... .....
@vv_i5 +xvslei_d 0111 01101000 00111 ..... ..... ..... @vv_i5 +xvsle_bu 0111 01000000 01000 ..... ..... ..... @vvv +xvsle_hu 0111 01000000 01001 ..... ..... ..... @vvv +xvsle_wu 0111 01000000 01010 ..... ..... ..... @vvv +xvsle_du 0111 01000000 01011 ..... ..... ..... @vvv +xvslei_bu 0111 01101000 01000 ..... ..... ..... @vv_ui5 +xvslei_hu 0111 01101000 01001 ..... ..... ..... @vv_ui5 +xvslei_wu 0111 01101000 01010 ..... ..... ..... @vv_ui5 +xvslei_du 0111 01101000 01011 ..... ..... ..... @vv_ui5 + +xvslt_b 0111 01000000 01100 ..... ..... ..... @vvv +xvslt_h 0111 01000000 01101 ..... ..... ..... @vvv +xvslt_w 0111 01000000 01110 ..... ..... ..... @vvv +xvslt_d 0111 01000000 01111 ..... ..... ..... @vvv +xvslti_b 0111 01101000 01100 ..... ..... ..... @vv_i5 +xvslti_h 0111 01101000 01101 ..... ..... ..... @vv_i5 +xvslti_w 0111 01101000 01110 ..... ..... ..... @vv_i5 +xvslti_d 0111 01101000 01111 ..... ..... ..... @vv_i5 +xvslt_bu 0111 01000000 10000 ..... ..... ..... @vvv +xvslt_hu 0111 01000000 10001 ..... ..... ..... @vvv +xvslt_wu 0111 01000000 10010 ..... ..... ..... @vvv +xvslt_du 0111 01000000 10011 ..... ..... ..... @vvv +xvslti_bu 0111 01101000 10000 ..... ..... ..... @vv_ui5 +xvslti_hu 0111 01101000 10001 ..... ..... ..... @vv_ui5 +xvslti_wu 0111 01101000 10010 ..... ..... ..... @vv_ui5 +xvslti_du 0111 01101000 10011 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 3fd3dc3591..295ba74f2b 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2342,6 +2342,49 @@ INSN_LASX(xvffintl_d_w, vv) INSN_LASX(xvffinth_d_w, vv) INSN_LASX(xvffint_s_l, vvv) =20 +INSN_LASX(xvseq_b, vvv) +INSN_LASX(xvseq_h, vvv) +INSN_LASX(xvseq_w, vvv) +INSN_LASX(xvseq_d, vvv) +INSN_LASX(xvseqi_b, vv_i) +INSN_LASX(xvseqi_h, vv_i) +INSN_LASX(xvseqi_w, vv_i) +INSN_LASX(xvseqi_d, vv_i) + +INSN_LASX(xvsle_b, vvv) +INSN_LASX(xvsle_h, vvv) +INSN_LASX(xvsle_w, vvv) +INSN_LASX(xvsle_d, vvv) +INSN_LASX(xvslei_b, vv_i) +INSN_LASX(xvslei_h, vv_i) +INSN_LASX(xvslei_w, vv_i) +INSN_LASX(xvslei_d, vv_i) +INSN_LASX(xvsle_bu, vvv) +INSN_LASX(xvsle_hu, vvv) +INSN_LASX(xvsle_wu, vvv) +INSN_LASX(xvsle_du, vvv) +INSN_LASX(xvslei_bu, vv_i) +INSN_LASX(xvslei_hu, vv_i) +INSN_LASX(xvslei_wu, vv_i) +INSN_LASX(xvslei_du, vv_i) + +INSN_LASX(xvslt_b, vvv) +INSN_LASX(xvslt_h, vvv) +INSN_LASX(xvslt_w, vvv) +INSN_LASX(xvslt_d, vvv) +INSN_LASX(xvslti_b, vv_i) +INSN_LASX(xvslti_h, vv_i) +INSN_LASX(xvslti_w, vv_i) +INSN_LASX(xvslti_d, vv_i) +INSN_LASX(xvslt_bu, vvv) +INSN_LASX(xvslt_hu, vvv) +INSN_LASX(xvslt_wu, vvv) +INSN_LASX(xvslt_du, vvv) +INSN_LASX(xvslti_bu, vv_i) +INSN_LASX(xvslti_hu, vv_i) +INSN_LASX(xvslti_wu, vv_i) +INSN_LASX(xvslti_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 3e2757d57b..19958c054c 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2936,21 +2936,18 @@ void HELPER(vffint_s_l)(void *vd, void *vj, void *v= k, *Vd =3D temp; } =20 -#define VSEQ(a, b) (a =3D=3D b ? -1 : 0) -#define VSLE(a, b) (a <=3D b ? -1 : 0) -#define VSLT(a, b) (a < b ? 
-1 : 0) - -#define VCMPI(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ - } \ +#define VCMPI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ + } \ } =20 VCMPI(vseqi_b, 8, B, VSEQ) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 760160184c..c1cd02d6a1 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -662,6 +662,49 @@ TRANS(xvffintl_d_w, LASX, gen_vv_f, 32, gen_helper_vff= intl_d_w) TRANS(xvffinth_d_w, LASX, gen_vv_f, 32, gen_helper_vffinth_d_w) TRANS(xvffint_s_l, LASX, gen_vvv_f, 32, gen_helper_vffint_s_l) =20 +TRANS(xvseq_b, LASX, do_cmp, 32, MO_8, TCG_COND_EQ) +TRANS(xvseq_h, LASX, do_cmp, 32, MO_16, TCG_COND_EQ) +TRANS(xvseq_w, LASX, do_cmp, 32, MO_32, TCG_COND_EQ) +TRANS(xvseq_d, LASX, do_cmp, 32, MO_64, TCG_COND_EQ) +TRANS(xvseqi_b, LASX, do_vseqi_s, 32, MO_8) +TRANS(xvseqi_h, LASX, do_vseqi_s, 32, MO_16) +TRANS(xvseqi_w, LASX, do_vseqi_s, 32, MO_32) +TRANS(xvseqi_d, LASX, do_vseqi_s, 32, MO_64) + +TRANS(xvsle_b, LASX, do_cmp, 32, MO_8, TCG_COND_LE) +TRANS(xvsle_h, LASX, do_cmp, 32, MO_16, TCG_COND_LE) +TRANS(xvsle_w, LASX, do_cmp, 32, MO_32, TCG_COND_LE) +TRANS(xvsle_d, LASX, do_cmp, 32, MO_64, TCG_COND_LE) +TRANS(xvslei_b, LASX, do_vslei_s, 32, MO_8) +TRANS(xvslei_h, LASX, do_vslei_s, 32, MO_16) +TRANS(xvslei_w, LASX, do_vslei_s, 32, MO_32) +TRANS(xvslei_d, LASX, do_vslei_s, 32, MO_64) +TRANS(xvsle_bu, LASX, do_cmp, 
32, MO_8, TCG_COND_LEU) +TRANS(xvsle_hu, LASX, do_cmp, 32, MO_16, TCG_COND_LEU) +TRANS(xvsle_wu, LASX, do_cmp, 32, MO_32, TCG_COND_LEU) +TRANS(xvsle_du, LASX, do_cmp, 32, MO_64, TCG_COND_LEU) +TRANS(xvslei_bu, LASX, do_vslei_u, 32, MO_8) +TRANS(xvslei_hu, LASX, do_vslei_u, 32, MO_16) +TRANS(xvslei_wu, LASX, do_vslei_u, 32, MO_32) +TRANS(xvslei_du, LASX, do_vslei_u, 32, MO_64) + +TRANS(xvslt_b, LASX, do_cmp, 32, MO_8, TCG_COND_LT) +TRANS(xvslt_h, LASX, do_cmp, 32, MO_16, TCG_COND_LT) +TRANS(xvslt_w, LASX, do_cmp, 32, MO_32, TCG_COND_LT) +TRANS(xvslt_d, LASX, do_cmp, 32, MO_64, TCG_COND_LT) +TRANS(xvslti_b, LASX, do_vslti_s, 32, MO_8) +TRANS(xvslti_h, LASX, do_vslti_s, 32, MO_16) +TRANS(xvslti_w, LASX, do_vslti_s, 32, MO_32) +TRANS(xvslti_d, LASX, do_vslti_s, 32, MO_64) +TRANS(xvslt_bu, LASX, do_cmp, 32, MO_8, TCG_COND_LTU) +TRANS(xvslt_hu, LASX, do_cmp, 32, MO_16, TCG_COND_LTU) +TRANS(xvslt_wu, LASX, do_cmp, 32, MO_32, TCG_COND_LTU) +TRANS(xvslt_du, LASX, do_cmp, 32, MO_64, TCG_COND_LTU) +TRANS(xvslti_bu, LASX, do_vslti_u, 32, MO_8) +TRANS(xvslti_hu, LASX, do_vslti_u, 32, MO_16) +TRANS(xvslti_wu, LASX, do_vslti_u, 32, MO_32) +TRANS(xvslti_du, LASX, do_vslti_u, 32, MO_64) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index 64de014a58..f757db7a76 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -3733,7 +3733,8 @@ TRANS(vffintl_d_w, LSX, gen_vv_f, 16, gen_helper_vffi= ntl_d_w) TRANS(vffinth_d_w, LSX, gen_vv_f, 16, gen_helper_vffinth_d_w) TRANS(vffint_s_l, LSX, gen_vvv_f, 16, gen_helper_vffint_s_l) =20 -static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) +static bool do_cmp(DisasContext *ctx, arg_vvv *a, + uint32_t oprsz, MemOp mop, TCGCond cond) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 @@ 
-3743,7 +3744,7 @@ static bool do_cmp(DisasContext *ctx, arg_vvv *a, Mem= Op mop, TCGCond cond) vj_ofs =3D vec_full_offset(a->vj); vk_ofs =3D vec_full_offset(a->vk); =20 - tcg_gen_gvec_cmp(cond, mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8); + tcg_gen_gvec_cmp(cond, mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8= ); return true; } =20 @@ -3778,145 +3779,147 @@ static void gen_vslti_u_vec(unsigned vece, TCGv_v= ec t, TCGv_vec a, int64_t imm) do_cmpi_vec(TCG_COND_LTU, vece, t, a, imm); } =20 -#define DO_CMPI_S(NAME) \ -static bool do_## NAME ##_s(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ -{ \ - uint32_t vd_ofs, vj_ofs; \ - \ - CHECK_VEC; \ - \ - static const TCGOpcode vecop_list[] =3D { \ - INDEX_op_cmp_vec, 0 \ - }; \ - static const GVecGen2i op[4] =3D { \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_b, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_8 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_h, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_16 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_w, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_32 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_d, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_64 \ - } \ - }; \ - \ - vd_ofs =3D vec_full_offset(a->vd); \ - vj_ofs =3D vec_full_offset(a->vj); \ - \ - tcg_gen_gvec_2i(vd_ofs, vj_ofs, 16, ctx->vl/8, a->imm, &op[mop]); \ - \ - return true; \ +#define DO_CMPI_S(NAME) = \ +static bool do_## NAME ##_s(DisasContext *ctx, = \ + arg_vv_i *a, uint32_t oprsz, MemOp mop) = \ +{ = \ + uint32_t vd_ofs, vj_ofs; = \ + = \ + CHECK_VEC; = \ + = \ + static const TCGOpcode vecop_list[] =3D { = \ + INDEX_op_cmp_vec, 0 = \ + }; = \ + static const GVecGen2i op[4] =3D { = \ + { = \ + .fniv =3D gen_## NAME ##_s_vec, = \ + .fnoi =3D gen_helper_## NAME ##_b, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_8 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME 
##_s_vec, = \ + .fnoi =3D gen_helper_## NAME ##_h, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_16 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME ##_s_vec, = \ + .fnoi =3D gen_helper_## NAME ##_w, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_32 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME ##_s_vec, = \ + .fnoi =3D gen_helper_## NAME ##_d, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_64 = \ + } = \ + }; = \ + = \ + vd_ofs =3D vec_full_offset(a->vd); = \ + vj_ofs =3D vec_full_offset(a->vj); = \ + = \ + tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, ctx->vl / 8, a->imm, &op[mop]);= \ + = \ + return true; = \ } =20 DO_CMPI_S(vseqi) DO_CMPI_S(vslei) DO_CMPI_S(vslti) =20 -#define DO_CMPI_U(NAME) \ -static bool do_## NAME ##_u(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ -{ \ - uint32_t vd_ofs, vj_ofs; \ - \ - CHECK_VEC; \ - \ - static const TCGOpcode vecop_list[] =3D { \ - INDEX_op_cmp_vec, 0 \ - }; \ - static const GVecGen2i op[4] =3D { \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_bu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_8 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_hu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_16 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_wu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_32 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_du, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_64 \ - } \ - }; \ - \ - vd_ofs =3D vec_full_offset(a->vd); \ - vj_ofs =3D vec_full_offset(a->vj); \ - \ - tcg_gen_gvec_2i(vd_ofs, vj_ofs, 16, ctx->vl/8, a->imm, &op[mop]); \ - \ - return true; \ +#define DO_CMPI_U(NAME) = \ +static bool do_## NAME ##_u(DisasContext *ctx, = \ + arg_vv_i *a, uint32_t oprsz, MemOp mop) = \ +{ = \ + uint32_t vd_ofs, vj_ofs; = \ + = \ + CHECK_VEC; = \ + = \ + static const TCGOpcode vecop_list[] =3D { = \ + INDEX_op_cmp_vec, 0 = \ + }; = \ + static const GVecGen2i op[4] =3D 
{ = \ + { = \ + .fniv =3D gen_## NAME ##_u_vec, = \ + .fnoi =3D gen_helper_## NAME ##_bu, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_8 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME ##_u_vec, = \ + .fnoi =3D gen_helper_## NAME ##_hu, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_16 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME ##_u_vec, = \ + .fnoi =3D gen_helper_## NAME ##_wu, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_32 = \ + }, = \ + { = \ + .fniv =3D gen_## NAME ##_u_vec, = \ + .fnoi =3D gen_helper_## NAME ##_du, = \ + .opt_opc =3D vecop_list, = \ + .vece =3D MO_64 = \ + } = \ + }; = \ + = \ + vd_ofs =3D vec_full_offset(a->vd); = \ + vj_ofs =3D vec_full_offset(a->vj); = \ + = \ + tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, ctx->vl / 8, a->imm, &op[mop]);= \ + = \ + return true; = \ } =20 DO_CMPI_U(vslei) DO_CMPI_U(vslti) =20 -TRANS(vseq_b, LSX, do_cmp, MO_8, TCG_COND_EQ) -TRANS(vseq_h, LSX, do_cmp, MO_16, TCG_COND_EQ) -TRANS(vseq_w, LSX, do_cmp, MO_32, TCG_COND_EQ) -TRANS(vseq_d, LSX, do_cmp, MO_64, TCG_COND_EQ) -TRANS(vseqi_b, LSX, do_vseqi_s, MO_8) -TRANS(vseqi_h, LSX, do_vseqi_s, MO_16) -TRANS(vseqi_w, LSX, do_vseqi_s, MO_32) -TRANS(vseqi_d, LSX, do_vseqi_s, MO_64) - -TRANS(vsle_b, LSX, do_cmp, MO_8, TCG_COND_LE) -TRANS(vsle_h, LSX, do_cmp, MO_16, TCG_COND_LE) -TRANS(vsle_w, LSX, do_cmp, MO_32, TCG_COND_LE) -TRANS(vsle_d, LSX, do_cmp, MO_64, TCG_COND_LE) -TRANS(vslei_b, LSX, do_vslei_s, MO_8) -TRANS(vslei_h, LSX, do_vslei_s, MO_16) -TRANS(vslei_w, LSX, do_vslei_s, MO_32) -TRANS(vslei_d, LSX, do_vslei_s, MO_64) -TRANS(vsle_bu, LSX, do_cmp, MO_8, TCG_COND_LEU) -TRANS(vsle_hu, LSX, do_cmp, MO_16, TCG_COND_LEU) -TRANS(vsle_wu, LSX, do_cmp, MO_32, TCG_COND_LEU) -TRANS(vsle_du, LSX, do_cmp, MO_64, TCG_COND_LEU) -TRANS(vslei_bu, LSX, do_vslei_u, MO_8) -TRANS(vslei_hu, LSX, do_vslei_u, MO_16) -TRANS(vslei_wu, LSX, do_vslei_u, MO_32) -TRANS(vslei_du, LSX, do_vslei_u, MO_64) - -TRANS(vslt_b, LSX, do_cmp, MO_8, TCG_COND_LT) -TRANS(vslt_h, LSX, do_cmp, MO_16, 
TCG_COND_LT) -TRANS(vslt_w, LSX, do_cmp, MO_32, TCG_COND_LT) -TRANS(vslt_d, LSX, do_cmp, MO_64, TCG_COND_LT) -TRANS(vslti_b, LSX, do_vslti_s, MO_8) -TRANS(vslti_h, LSX, do_vslti_s, MO_16) -TRANS(vslti_w, LSX, do_vslti_s, MO_32) -TRANS(vslti_d, LSX, do_vslti_s, MO_64) -TRANS(vslt_bu, LSX, do_cmp, MO_8, TCG_COND_LTU) -TRANS(vslt_hu, LSX, do_cmp, MO_16, TCG_COND_LTU) -TRANS(vslt_wu, LSX, do_cmp, MO_32, TCG_COND_LTU) -TRANS(vslt_du, LSX, do_cmp, MO_64, TCG_COND_LTU) -TRANS(vslti_bu, LSX, do_vslti_u, MO_8) -TRANS(vslti_hu, LSX, do_vslti_u, MO_16) -TRANS(vslti_wu, LSX, do_vslti_u, MO_32) -TRANS(vslti_du, LSX, do_vslti_u, MO_64) +TRANS(vseq_b, LSX, do_cmp, 16, MO_8, TCG_COND_EQ) +TRANS(vseq_h, LSX, do_cmp, 16, MO_16, TCG_COND_EQ) +TRANS(vseq_w, LSX, do_cmp, 16, MO_32, TCG_COND_EQ) +TRANS(vseq_d, LSX, do_cmp, 16, MO_64, TCG_COND_EQ) +TRANS(vseqi_b, LSX, do_vseqi_s, 16, MO_8) +TRANS(vseqi_h, LSX, do_vseqi_s, 16, MO_16) +TRANS(vseqi_w, LSX, do_vseqi_s, 16, MO_32) +TRANS(vseqi_d, LSX, do_vseqi_s, 16, MO_64) + +TRANS(vsle_b, LSX, do_cmp, 16, MO_8, TCG_COND_LE) +TRANS(vsle_h, LSX, do_cmp, 16, MO_16, TCG_COND_LE) +TRANS(vsle_w, LSX, do_cmp, 16, MO_32, TCG_COND_LE) +TRANS(vsle_d, LSX, do_cmp, 16, MO_64, TCG_COND_LE) +TRANS(vslei_b, LSX, do_vslei_s, 16, MO_8) +TRANS(vslei_h, LSX, do_vslei_s, 16, MO_16) +TRANS(vslei_w, LSX, do_vslei_s, 16, MO_32) +TRANS(vslei_d, LSX, do_vslei_s, 16, MO_64) +TRANS(vsle_bu, LSX, do_cmp, 16, MO_8, TCG_COND_LEU) +TRANS(vsle_hu, LSX, do_cmp, 16, MO_16, TCG_COND_LEU) +TRANS(vsle_wu, LSX, do_cmp, 16, MO_32, TCG_COND_LEU) +TRANS(vsle_du, LSX, do_cmp, 16, MO_64, TCG_COND_LEU) +TRANS(vslei_bu, LSX, do_vslei_u, 16, MO_8) +TRANS(vslei_hu, LSX, do_vslei_u, 16, MO_16) +TRANS(vslei_wu, LSX, do_vslei_u, 16, MO_32) +TRANS(vslei_du, LSX, do_vslei_u, 16, MO_64) + +TRANS(vslt_b, LSX, do_cmp, 16, MO_8, TCG_COND_LT) +TRANS(vslt_h, LSX, do_cmp, 16, MO_16, TCG_COND_LT) +TRANS(vslt_w, LSX, do_cmp, 16, MO_32, TCG_COND_LT) +TRANS(vslt_d, LSX, do_cmp, 16, MO_64, TCG_COND_LT) 
+TRANS(vslti_b, LSX, do_vslti_s, 16, MO_8) +TRANS(vslti_h, LSX, do_vslti_s, 16, MO_16) +TRANS(vslti_w, LSX, do_vslti_s, 16, MO_32) +TRANS(vslti_d, LSX, do_vslti_s, 16, MO_64) +TRANS(vslt_bu, LSX, do_cmp, 16, MO_8, TCG_COND_LTU) +TRANS(vslt_hu, LSX, do_cmp, 16, MO_16, TCG_COND_LTU) +TRANS(vslt_wu, LSX, do_cmp, 16, MO_32, TCG_COND_LTU) +TRANS(vslt_du, LSX, do_cmp, 16, MO_64, TCG_COND_LTU) +TRANS(vslti_bu, LSX, do_vslti_u, 16, MO_8) +TRANS(vslti_hu, LSX, do_vslti_u, 16, MO_16) +TRANS(vslti_wu, LSX, do_vslti_u, 16, MO_32) +TRANS(vslti_du, LSX, do_vslti_u, 16, MO_64) =20 static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a) { --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 41/48] target/loongarch: Implement xvfcmp Date: Wed, 30 Aug 2023 16:48:55 +0800 Message-Id: <20230830084902.2113960-42-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch includes: - XVFCMP.cond.{S/D}.
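
For orientation, the change to the compare helpers in this patch amounts to bounding the lane loop by the operation size in bytes (16 for LSX, 32 for LASX) instead of the fixed LSX_LEN. The following is only a simplified sketch of that idea: ModelVReg and model_vfcmp_clt_s are hypothetical names, plain C float comparison stands in for the softfloat compare and flag handling, and lanes beyond oprsz are left untouched here as a choice of the sketch, not a statement about the real helper.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's VReg: 32 bytes, one LASX register. */
typedef union {
    float    S[8];   /* eight 32-bit float lanes */
    uint32_t UW[8];  /* the same lanes viewed as unsigned words */
} ModelVReg;

/*
 * Illustrative model of a quiet "less than" compare on 32-bit lanes.
 * This is NOT the QEMU helper: it ignores fp_status and exception causes.
 * oprsz is the operation size in bytes: 16 for LSX, 32 for LASX.
 * Each processed lane becomes all ones if vj < vk, otherwise zero.
 */
static void model_vfcmp_clt_s(ModelVReg *vd, const ModelVReg *vj,
                              const ModelVReg *vk, uint32_t oprsz)
{
    const uint32_t bits = 32;   /* lane width, called BIT in the patch */

    /* Same loop bound as the patched VFCMP macro: oprsz / (BIT / 8). */
    for (uint32_t i = 0; i < oprsz / (bits / 8); i++) {
        vd->UW[i] = (vj->S[i] < vk->S[i]) ? UINT32_MAX : 0;
    }
}
```

With oprsz = 16 the loop covers four lanes and with oprsz = 32 it covers eight, which is why the LSX and LASX TRANS entries can share one helper and differ only in the size argument.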
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 8 +- target/loongarch/insns.decode | 3 + target/loongarch/disas.c | 94 ++++++++++++++++++++ target/loongarch/vec_helper.c | 4 +- target/loongarch/insn_trans/trans_lasx.c.inc | 3 + target/loongarch/insn_trans/trans_lsx.c.inc | 17 ++-- 6 files changed, 117 insertions(+), 12 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index e9c5412267..b54ce68077 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -652,10 +652,10 @@ DEF_HELPER_FLAGS_4(vslti_hu, TCG_CALL_NO_RWG, void, p= tr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vslti_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vslti_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_5(vfcmp_c_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_s_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_c_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_s_d, void, env, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_c_s, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_s_s, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_c_d, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32, i32) =20 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 82c26a318b..0d46bd5e5e 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1958,6 +1958,9 @@ xvslti_hu 0111 01101000 10001 ..... ..... ....= . @vv_ui5 xvslti_wu 0111 01101000 10010 ..... ..... ..... @vv_ui5 xvslti_du 0111 01101000 10011 ..... ..... ..... @vv_ui5 =20 +xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... @vvv_fcond +xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @vvv_fcond + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... 
@vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 295ba74f2b..607774375c 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2385,6 +2385,100 @@ INSN_LASX(xvslti_hu, vv_i) INSN_LASX(xvslti_wu, vv_i) INSN_LASX(xvslti_du, vv_i) =20 +#define output_xvfcmp(C, PREFIX, SUFFIX) = \ +{ = \ + (C)->info->fprintf_func((C)->info->stream, "%08x %s%s\tx%d, x%d, x%d"= , \ + (C)->insn, PREFIX, SUFFIX, a->vd, = \ + a->vj, a->vk); = \ +} + +static bool output_xxx_fcond(DisasContext *ctx, arg_vvv_fcond * a, + const char *suffix) +{ + bool ret =3D true; + switch (a->fcond) { + case 0x0: + output_xvfcmp(ctx, "xvfcmp_caf_", suffix); + break; + case 0x1: + output_xvfcmp(ctx, "xvfcmp_saf_", suffix); + break; + case 0x2: + output_xvfcmp(ctx, "xvfcmp_clt_", suffix); + break; + case 0x3: + output_xvfcmp(ctx, "xvfcmp_slt_", suffix); + break; + case 0x4: + output_xvfcmp(ctx, "xvfcmp_ceq_", suffix); + break; + case 0x5: + output_xvfcmp(ctx, "xvfcmp_seq_", suffix); + break; + case 0x6: + output_xvfcmp(ctx, "xvfcmp_cle_", suffix); + break; + case 0x7: + output_xvfcmp(ctx, "xvfcmp_sle_", suffix); + break; + case 0x8: + output_xvfcmp(ctx, "xvfcmp_cun_", suffix); + break; + case 0x9: + output_xvfcmp(ctx, "xvfcmp_sun_", suffix); + break; + case 0xA: + output_xvfcmp(ctx, "xvfcmp_cult_", suffix); + break; + case 0xB: + output_xvfcmp(ctx, "xvfcmp_sult_", suffix); + break; + case 0xC: + output_xvfcmp(ctx, "xvfcmp_cueq_", suffix); + break; + case 0xD: + output_xvfcmp(ctx, "xvfcmp_sueq_", suffix); + break; + case 0xE: + output_xvfcmp(ctx, "xvfcmp_cule_", suffix); + break; + case 0xF: + output_xvfcmp(ctx, "xvfcmp_sule_", suffix); + break; + case 0x10: + output_xvfcmp(ctx, "xvfcmp_cne_", suffix); + break; + case 0x11: + output_xvfcmp(ctx, "xvfcmp_sne_", suffix); + break; + case 0x14: + output_xvfcmp(ctx, "xvfcmp_cor_", suffix); + break; + case 0x15: + output_xvfcmp(ctx, "xvfcmp_sor_", suffix); + break; + 
case 0x18: + output_xvfcmp(ctx, "xvfcmp_cune_", suffix); + break; + case 0x19: + output_xvfcmp(ctx, "xvfcmp_sune_", suffix); + break; + default: + ret =3D false; + } + return ret; +} + +#define LASX_FCMP_INSN(suffix) \ +static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \ + arg_vvv_fcond * a) \ +{ \ + return output_xxx_fcond(ctx, a, #suffix); \ +} + +LASX_FCMP_INSN(s) +LASX_FCMP_INSN(d) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 19958c054c..4970a4b39a 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3001,7 +3001,7 @@ static uint64_t vfcmp_common(CPULoongArchState *env, } =20 #define VFCMP(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t oprsz, \ uint32_t vd, uint32_t vj, uint32_t vk, uint32_t flags) \ { \ int i; \ @@ -3011,7 +3011,7 @@ void HELPER(NAME)(CPULoongArchState *env, = \ VReg *Vk =3D &(env->fpr[vk].vreg); = \ \ vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT ; i++) { = \ + for (i =3D 0; i < oprsz / (BIT / 8) ; i++) { = \ FloatRelation cmp; \ cmp =3D FN(Vj->E(i), Vk->E(i), &env->fp_status); = \ t.E(i) =3D vfcmp_common(env, cmp, flags); = \ diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index c1cd02d6a1..6efb9733a3 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -705,6 +705,9 @@ TRANS(xvslti_hu, LASX, do_vslti_u, 32, MO_16) TRANS(xvslti_wu, LASX, do_vslti_u, 32, MO_32) TRANS(xvslti_du, LASX, do_vslti_u, 32, MO_64) =20 +TRANS(xvfcmp_cond_s, LASX, do_vfcmp_cond_s, 32) +TRANS(xvfcmp_cond_d, LASX, do_vfcmp_cond_d, 32) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git 
a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index f757db7a76..a5d6cc834d 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -3921,13 +3921,14 @@ TRANS(vslti_hu, LSX, do_vslti_u, 16, MO_16) TRANS(vslti_wu, LSX, do_vslti_u, 16, MO_32) TRANS(vslti_du, LSX, do_vslti_u, 16, MO_64) =20 -static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a) +static bool do_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a, uint32_t = sz) { uint32_t flags; - void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); + void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); TCGv_i32 vd =3D tcg_constant_i32(a->vd); TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 vk =3D tcg_constant_i32(a->vk); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); =20 if (!avail_LSX(ctx)) { return false; @@ -3937,18 +3938,19 @@ static bool trans_vfcmp_cond_s(DisasContext *ctx, a= rg_vvv_fcond *a) =20 fn =3D (a->fcond & 1 ? gen_helper_vfcmp_s_s : gen_helper_vfcmp_c_s); flags =3D get_fcmp_flags(a->fcond >> 1); - fn(cpu_env, vd, vj, vk, tcg_constant_i32(flags)); + fn(cpu_env, oprsz, vd, vj, vk, tcg_constant_i32(flags)); =20 return true; } =20 -static bool trans_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a) +static bool do_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a, uint32_t = sz) { uint32_t flags; - void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); + void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); TCGv_i32 vd =3D tcg_constant_i32(a->vd); TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 vk =3D tcg_constant_i32(a->vk); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); =20 if (!avail_LSX(ctx)) { return false; @@ -3958,11 +3960,14 @@ static bool trans_vfcmp_cond_d(DisasContext *ctx, a= rg_vvv_fcond *a) =20 fn =3D (a->fcond & 1 ? 
gen_helper_vfcmp_s_d : gen_helper_vfcmp_c_d); flags =3D get_fcmp_flags(a->fcond >> 1); - fn(cpu_env, vd, vj, vk, tcg_constant_i32(flags)); + fn(cpu_env, oprsz, vd, vj, vk, tcg_constant_i32(flags)); =20 return true; } =20 +TRANS(vfcmp_cond_s, LSX, do_vfcmp_cond_s, 16) +TRANS(vfcmp_cond_d, LSX, do_vfcmp_cond_d, 16) + static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a) { if (!avail_LSX(ctx)) { --=20 2.39.1
From nobody Tue May 14 05:57:28 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 42/48] target/loongarch: Implement xvbitsel xvset Date: Wed, 30 Aug 2023 16:48:56 +0800 Message-Id: <20230830084902.2113960-43-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch includes: - XVBITSEL.V; - XVBITSELI.B; - XVSET{EQZ/NEZ}.V; - XVSETANYEQZ.{B/H/W/D}; - XVSETALLNEZ.{B/H/W/D}.
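
As a rough illustration of two of the instructions listed above, the sketch below models the per-byte select expression used by the vbitseli_b helper and the result expected from XVSETANYEQZ.B. The names model_vbitseli_b and model_vsetanyeqz_b are hypothetical, registers are modelled as plain byte arrays, and the zero test is a naive loop rather than the word-at-a-time do_match2 trick the helper uses, so treat it as a reading aid only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of XVBITSELI.B, using the same expression as the
 * vbitseli_b helper: for every bit, a 1 in the old vd selects the
 * immediate bit and a 0 selects the vj bit.
 * oprsz is the operation size in bytes: 16 for LSX, 32 for LASX.
 */
static void model_vbitseli_b(uint8_t *vd, const uint8_t *vj,
                             uint8_t imm, size_t oprsz)
{
    for (size_t i = 0; i < oprsz; i++) {
        vd[i] = (uint8_t)((~vd[i] & vj[i]) | (vd[i] & imm));
    }
}

/*
 * Naive model of XVSETANYEQZ.B: true when any byte lane within oprsz
 * is zero. The real helper checks 64-bit chunks with do_match2; this
 * loop only shows the result that check is expected to produce.
 */
static bool model_vsetanyeqz_b(const uint8_t *vj, size_t oprsz)
{
    for (size_t i = 0; i < oprsz; i++) {
        if (vj[i] == 0) {
            return true;
        }
    }
    return false;
}
```

Passing 16 or 32 as oprsz mirrors how the LSX and LASX variants share a helper in this series: a zero byte in the upper 128 bits is only seen by the 32-byte form.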
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 16 +++---- target/loongarch/insns.decode | 15 +++++++ target/loongarch/disas.c | 19 ++++++++ target/loongarch/vec_helper.c | 40 ++++++++++------- target/loongarch/insn_trans/trans_lasx.c.inc | 46 ++++++++++++++++++++ target/loongarch/insn_trans/trans_lsx.c.inc | 44 +++++++++---------- 6 files changed, 134 insertions(+), 46 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index b54ce68077..85233586e3 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -659,14 +659,14 @@ DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32= , i32) =20 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_3(vsetanyeqz_b, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_h, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_w, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_d, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_b, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_h, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_w, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_d, void, env, i32, i32) +DEF_HELPER_4(vsetanyeqz_b, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_h, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_w, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_d, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_b, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32) =20 DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0d46bd5e5e..ad6751fdfb 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1961,6 +1961,21 @@ xvslti_du 0111 01101000 10011 ..... ..... ...= .. 
@vv_ui5 xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... @vvv_fcond xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @vvv_fcond =20 +xvbitsel_v 0000 11010010 ..... ..... ..... ..... @vvvv + +xvbitseli_b 0111 01111100 01 ........ ..... ..... @vv_ui8 + +xvseteqz_v 0111 01101001 11001 00110 ..... 00 ... @cv +xvsetnez_v 0111 01101001 11001 00111 ..... 00 ... @cv +xvsetanyeqz_b 0111 01101001 11001 01000 ..... 00 ... @cv +xvsetanyeqz_h 0111 01101001 11001 01001 ..... 00 ... @cv +xvsetanyeqz_w 0111 01101001 11001 01010 ..... 00 ... @cv +xvsetanyeqz_d 0111 01101001 11001 01011 ..... 00 ... @cv +xvsetallnez_b 0111 01101001 11001 01100 ..... 00 ... @cv +xvsetallnez_h 0111 01101001 11001 01101 ..... 00 ... @cv +xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cv +xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 607774375c..3a06b5cb80 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##ty= pe * a) \ return true; \ } =20 +static void output_cv_x(DisasContext *ctx, arg_cv *a, const char *mnemonic) +{ + output(ctx, mnemonic, "fcc%d, x%d", a->cd, a->vj); +} + static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemon= ic) { output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); @@ -2479,6 +2484,20 @@ static bool trans_xvfcmp_cond_##suffix(DisasContext = *ctx, \ LASX_FCMP_INSN(s) LASX_FCMP_INSN(d) =20 +INSN_LASX(xvbitsel_v, vvvv) +INSN_LASX(xvbitseli_b, vv_i) + +INSN_LASX(xvseteqz_v, cv) +INSN_LASX(xvsetnez_v, cv) +INSN_LASX(xvsetanyeqz_b, cv) +INSN_LASX(xvsetanyeqz_h, cv) +INSN_LASX(xvsetanyeqz_w, cv) +INSN_LASX(xvsetanyeqz_d, cv) +INSN_LASX(xvsetallnez_b, cv) +INSN_LASX(xvsetallnez_h, cv) +INSN_LASX(xvsetallnez_w, cv) 
+INSN_LASX(xvsetallnez_d, cv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 4970a4b39a..1a13342c86 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3025,13 +3025,13 @@ VFCMP(vfcmp_s_s, 32, UW, float32_compare) VFCMP(vfcmp_c_d, 64, UD, float64_compare_quiet) VFCMP(vfcmp_s_d, 64, UD, float64_compare) =20 -void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t imm, uint32_t v) +void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t imm, uint32_t desc) { int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - for (i =3D 0; i < 16; i++) { + for (i =3D 0; i < simd_oprsz(desc); i++) { Vd->B(i) =3D (~Vd->B(i) & Vj->B(i)) | (Vd->B(i) & imm); } } @@ -3039,7 +3039,7 @@ void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t= imm, uint32_t v) /* Copy from target/arm/tcg/sve_helper.c */ static inline bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz) { - uint64_t bits =3D 8 << esz; + int bits =3D 8 << esz; uint64_t ones =3D dup_const(esz, 1); uint64_t signs =3D ones << (bits - 1); uint64_t cmp0, cmp1; @@ -3052,24 +3052,34 @@ static inline bool do_match2(uint64_t n, uint64_t m= 0, uint64_t m1, int esz) return (cmp0 | cmp1) & signs; } =20 -#define SETANYEQZ(NAME, MO) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \ -{ \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - env->cf[cd & 0x7] =3D do_match2(0, Vj->D(0), Vj->D(1), MO); \ +#define SETANYEQZ(NAME, MO) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t oprsz, uint32_t cd, uint32_t vj) \ +{ \ + VReg *Vj =3D &(env->fpr[vj].vreg); \ + \ + env->cf[cd & 0x7] =3D do_match2(0, Vj->D(0), Vj->D(1), MO); \ + if (oprsz =3D=3D 32) { \ + env->cf[cd & 0x7] =3D env->cf[cd & 0x7] || \ + do_match2(0, Vj->D(2), Vj->D(3), MO); \ + } \ } SETANYEQZ(vsetanyeqz_b, MO_8) SETANYEQZ(vsetanyeqz_h, MO_16) SETANYEQZ(vsetanyeqz_w, MO_32) SETANYEQZ(vsetanyeqz_d, 
MO_64) =20 -#define SETALLNEZ(NAME, MO) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \ -{ \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - env->cf[cd & 0x7]=3D !do_match2(0, Vj->D(0), Vj->D(1), MO); \ +#define SETALLNEZ(NAME, MO) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t oprsz, uint32_t cd, uint32_t vj) \ +{ \ + VReg *Vj =3D &(env->fpr[vj].vreg); \ + \ + env->cf[cd & 0x7]=3D !do_match2(0, Vj->D(0), Vj->D(1), MO); \ + if (oprsz =3D=3D 32) { \ + env->cf[cd & 0x7] =3D env->cf[cd & 0x7] && \ + !do_match2(0, Vj->D(2), Vj->D(3), MO); \ + } \ } SETALLNEZ(vsetallnez_b, MO_8) SETALLNEZ(vsetallnez_h, MO_16) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 6efb9733a3..190fe3eecb 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -708,6 +708,52 @@ TRANS(xvslti_du, LASX, do_vslti_u, 32, MO_64) TRANS(xvfcmp_cond_s, LASX, do_vfcmp_cond_s, 32) TRANS(xvfcmp_cond_d, LASX, do_vfcmp_cond_d, 32) =20 +TRANS(xvbitsel_v, LASX, do_vbitsel_v, 32) +TRANS(xvbitseli_b, LASX, do_vbitseli_b, 32) + +#define XVSET(NAME, COND) = \ +static bool trans_## NAME(DisasContext *ctx, arg_cv * a) = \ +{ = \ + TCGv_i64 t1, t2, d[4]; = \ + = \ + d[0] =3D tcg_temp_new_i64(); = \ + d[1] =3D tcg_temp_new_i64(); = \ + d[2] =3D tcg_temp_new_i64(); = \ + d[3] =3D tcg_temp_new_i64(); = \ + t1 =3D tcg_temp_new_i64(); = \ + t2 =3D tcg_temp_new_i64(); = \ + = \ + get_vreg64(d[0], a->vj, 0); = \ + get_vreg64(d[1], a->vj, 1); = \ + get_vreg64(d[2], a->vj, 2); = \ + get_vreg64(d[3], a->vj, 3); = \ + = \ + if (!avail_LASX(ctx)) { = \ + return false; = \ + } = \ + = \ + CHECK_VEC; = \ + tcg_gen_or_i64(t1, d[0], d[1]); = \ + tcg_gen_or_i64(t2, d[2], d[3]); = \ + tcg_gen_or_i64(t1, t2, t1); = \ + tcg_gen_setcondi_i64(COND, t1, t1, 0); = \ + tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7= ])); \ + = \ + return true; = \ +} + 
+XVSET(xvseteqz_v, TCG_COND_EQ) +XVSET(xvsetnez_v, TCG_COND_NE) + +TRANS(xvsetanyeqz_b, LASX, gen_cv, 32, gen_helper_vsetanyeqz_b) +TRANS(xvsetanyeqz_h, LASX, gen_cv, 32, gen_helper_vsetanyeqz_h) +TRANS(xvsetanyeqz_w, LASX, gen_cv, 32, gen_helper_vsetanyeqz_w) +TRANS(xvsetanyeqz_d, LASX, gen_cv, 32, gen_helper_vsetanyeqz_d) +TRANS(xvsetallnez_b, LASX, gen_cv, 32, gen_helper_vsetallnez_b) +TRANS(xvsetallnez_h, LASX, gen_cv, 32, gen_helper_vsetallnez_h) +TRANS(xvsetallnez_w, LASX, gen_cv, 32, gen_helper_vsetallnez_w) +TRANS(xvsetallnez_d, LASX, gen_cv, 32, gen_helper_vsetallnez_d) + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index a5d6cc834d..2928e878cf 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -91,14 +91,16 @@ static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, in= t oprsz, return true; } =20 -static bool gen_cv(DisasContext *ctx, arg_cv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +static bool gen_cv(DisasContext *ctx, arg_cv *a, uint32_t sz, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) { TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 cd =3D tcg_constant_i32(a->cd); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); =20 CHECK_VEC; - func(cpu_env, cd, vj); + + func(cpu_env, oprsz, cd, vj); return true; } =20 @@ -3968,26 +3970,24 @@ static bool do_vfcmp_cond_d(DisasContext *ctx, arg_= vvv_fcond *a, uint32_t sz) TRANS(vfcmp_cond_s, LSX, do_vfcmp_cond_s, 16) TRANS(vfcmp_cond_d, LSX, do_vfcmp_cond_d, 16) =20 -static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a) +static bool do_vbitsel_v(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz) { - if (!avail_LSX(ctx)) { - return false; - } - CHECK_VEC; =20 tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->vd), vec_full_offset(a->= va), 
vec_full_offset(a->vk), vec_full_offset(a->vj), - 16, ctx->vl/8); + oprsz, ctx->vl / 8); return true; } =20 +TRANS(vbitsel_v, LSX, do_vbitsel_v, 16) + static void gen_vbitseli(unsigned vece, TCGv_vec a, TCGv_vec b, int64_t im= m) { tcg_gen_bitsel_vec(vece, a, a, tcg_constant_vec_matching(a, vece, imm)= , b); } =20 -static bool trans_vbitseli_b(DisasContext *ctx, arg_vv_i *a) +static bool do_vbitseli_b(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) { static const GVecGen2i op =3D { .fniv =3D gen_vbitseli, @@ -3996,17 +3996,15 @@ static bool trans_vbitseli_b(DisasContext *ctx, arg= _vv_i *a) .load_dest =3D true }; =20 - if (!avail_LSX(ctx)) { - return false; - } - CHECK_VEC; =20 tcg_gen_gvec_2i(vec_full_offset(a->vd), vec_full_offset(a->vj), - 16, ctx->vl/8, a->imm, &op); + oprsz, ctx->vl / 8, a->imm, &op); return true; } =20 +TRANS(vbitseli_b, LSX, do_vbitseli_b, 16) + #define VSET(NAME, COND) = \ static bool trans_## NAME (DisasContext *ctx, arg_cv *a) = \ { = \ @@ -4034,14 +4032,14 @@ static bool trans_## NAME (DisasContext *ctx, arg_c= v *a) \ VSET(vseteqz_v, TCG_COND_EQ) VSET(vsetnez_v, TCG_COND_NE) =20 -TRANS(vsetanyeqz_b, LSX, gen_cv, gen_helper_vsetanyeqz_b) -TRANS(vsetanyeqz_h, LSX, gen_cv, gen_helper_vsetanyeqz_h) -TRANS(vsetanyeqz_w, LSX, gen_cv, gen_helper_vsetanyeqz_w) -TRANS(vsetanyeqz_d, LSX, gen_cv, gen_helper_vsetanyeqz_d) -TRANS(vsetallnez_b, LSX, gen_cv, gen_helper_vsetallnez_b) -TRANS(vsetallnez_h, LSX, gen_cv, gen_helper_vsetallnez_h) -TRANS(vsetallnez_w, LSX, gen_cv, gen_helper_vsetallnez_w) -TRANS(vsetallnez_d, LSX, gen_cv, gen_helper_vsetallnez_d) +TRANS(vsetanyeqz_b, LSX, gen_cv, 16, gen_helper_vsetanyeqz_b) +TRANS(vsetanyeqz_h, LSX, gen_cv, 16, gen_helper_vsetanyeqz_h) +TRANS(vsetanyeqz_w, LSX, gen_cv, 16, gen_helper_vsetanyeqz_w) +TRANS(vsetanyeqz_d, LSX, gen_cv, 16, gen_helper_vsetanyeqz_d) +TRANS(vsetallnez_b, LSX, gen_cv, 16, gen_helper_vsetallnez_b) +TRANS(vsetallnez_h, LSX, gen_cv, 16, gen_helper_vsetallnez_h) +TRANS(vsetallnez_w, 
LSX, gen_cv, 16, gen_helper_vsetallnez_w) +TRANS(vsetallnez_d, LSX, gen_cv, 16, gen_helper_vsetallnez_d) =20 static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a) { --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385645715476.3188670291619; Wed, 30 Aug 2023 01:54:05 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuR-0002uf-Fc; Wed, 30 Aug 2023 04:50:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtg-0001ci-26 for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:56 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtZ-0007ZQ-Fh for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:55 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxNuihAu9kywgdAA--.5927S3; Wed, 30 Aug 2023 16:49:37 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S45; Wed, 30 Aug 2023 16:49:36 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 43/48] target/loongarch: Implement xvinsgr2vr xvpickve2gr Date: Wed, 30 Aug 2023 16:48:57 +0800 Message-Id: <20230830084902.2113960-44-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S45 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385647868100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVINSGR2VR.{W/D}; - XVPICKVE2GR.{W/D}[U]. Signed-off-by: Song Gao --- target/loongarch/insns.decode | 7 +++ target/loongarch/disas.c | 18 ++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 48 ++++++++++++++++++++ 3 files changed, 73 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ad6751fdfb..bb3bb447ae 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1976,6 +1976,13 @@ xvsetallnez_h 0111 01101001 11001 01101 ..... 00 = ... @cv xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cv xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cv =20 +xvinsgr2vr_w 0111 01101110 10111 10 ... ..... ..... @vr_ui3 +xvinsgr2vr_d 0111 01101110 10111 110 .. ..... ..... @vr_ui2 +xvpickve2gr_w 0111 01101110 11111 10 ... ..... ..... 
@rv_ui3 +xvpickve2gr_d 0111 01101110 11111 110 .. ..... ..... @rv_ui2 +xvpickve2gr_wu 0111 01101111 00111 10 ... ..... ..... @rv_ui3 +xvpickve2gr_du 0111 01101111 00111 110 .. ..... ..... @rv_ui2 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 3a06b5cb80..0995d9b794 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1738,6 +1738,17 @@ static void output_vr_x(DisasContext *ctx, arg_vr *a= , const char *mnemonic) output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj); } =20 +static void output_vr_i_x(DisasContext *ctx, arg_vr_i *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "x%d, r%d, 0x%x", a->vd, a->rj, a->imm); +} + +static void output_rv_i_x(DisasContext *ctx, arg_rv_i *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm); +} + + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -2498,6 +2509,13 @@ INSN_LASX(xvsetallnez_h, cv) INSN_LASX(xvsetallnez_w, cv) INSN_LASX(xvsetallnez_d, cv) =20 +INSN_LASX(xvinsgr2vr_w, vr_i) +INSN_LASX(xvinsgr2vr_d, vr_i) +INSN_LASX(xvpickve2gr_w, rv_i) +INSN_LASX(xvpickve2gr_d, rv_i) +INSN_LASX(xvpickve2gr_wu, rv_i) +INSN_LASX(xvpickve2gr_du, rv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 190fe3eecb..541e2b1728 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -754,6 +754,54 @@ TRANS(xvsetallnez_h, LASX, gen_cv, 32, gen_helper_vset= allnez_h) TRANS(xvsetallnez_w, LASX, gen_cv, 32, gen_helper_vsetallnez_w) TRANS(xvsetallnez_d, LASX, gen_cv, 32, gen_helper_vsetallnez_d) =20 +static bool trans_xvinsgr2vr_w(DisasContext *ctx, arg_vr_i *a) +{ + 
if (!avail_LASX(ctx)) { + return false; + } + return trans_vinsgr2vr_w(ctx, a); +} + +static bool trans_xvinsgr2vr_d(DisasContext *ctx, arg_vr_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + return trans_vinsgr2vr_d(ctx, a); +} + +static bool trans_xvpickve2gr_w(DisasContext *ctx, arg_rv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + return trans_vpickve2gr_w(ctx, a); +} + +static bool trans_xvpickve2gr_d(DisasContext *ctx, arg_rv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + return trans_vpickve2gr_d(ctx, a); +} + +static bool trans_xvpickve2gr_wu(DisasContext *ctx, arg_rv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + return trans_vpickve2gr_wu(ctx, a); +} + +static bool trans_xvpickve2gr_du(DisasContext *ctx, arg_rv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + return trans_vpickve2gr_du(ctx, a); +} + TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385550807807.3177968129904; Wed, 30 Aug 2023 01:52:30 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuT-0003If-L8; Wed, 30 Aug 2023 04:50:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtg-0001gg-NI for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:00 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGta-0007Zh-BO for 
qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:56 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxXeuhAu9kzQgdAA--.54105S3; Wed, 30 Aug 2023 16:49:37 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S46; Wed, 30 Aug 2023 16:49:37 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 44/48] target/loongarch: Implement xvreplve xvinsve0 xvpickve xvb{sll/srl}v Date: Wed, 30 Aug 2023 16:48:58 +0800 Message-Id: <20230830084902.2113960-45-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S46 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385552881100003 Content-Type: text/plain; 
charset="utf-8" This patch includes: - XVREPLVE.{B/H/W/D}; - XVREPL128VEI.{B/H/W/D}; - XVREPLVE0.{B/H/W/D/Q}; - XVINSVE0.{W/D}; - XVPICKVE.{W/D}; - XVBSLL.V, XVBSRL.V. Signed-off-by: Song Gao --- target/loongarch/helper.h | 5 + target/loongarch/insns.decode | 25 ++++ target/loongarch/disas.c | 28 +++++ target/loongarch/vec_helper.c | 28 +++++ target/loongarch/insn_trans/trans_lasx.c.inc | 118 +++++++++++++++++++ target/loongarch/insn_trans/trans_lsx.c.inc | 111 +++++++++-------- 6 files changed, 269 insertions(+), 46 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 85233586e3..fb489dda2d 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -668,6 +668,11 @@ DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32) DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32) DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32) =20 +DEF_HELPER_FLAGS_4(xvinsve0_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvinsve0_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvpickve_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvpickve_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index bb3bb447ae..74383ba3bc 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1987,3 +1987,28 @@ xvreplgr2vr_b 0111 01101001 11110 00000 ..... ...= .. @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @vr + +xvreplve_b 0111 01010010 00100 ..... ..... ..... @vvr +xvreplve_h 0111 01010010 00101 ..... ..... ..... @vvr +xvreplve_w 0111 01010010 00110 ..... ..... 
..... @vvr +xvreplve_d 0111 01010010 00111 ..... ..... ..... @vvr + +xvrepl128vei_b 0111 01101111 01111 0 .... ..... ..... @vv_ui4 +xvrepl128vei_h 0111 01101111 01111 10 ... ..... ..... @vv_ui3 +xvrepl128vei_w 0111 01101111 01111 110 .. ..... ..... @vv_ui2 +xvrepl128vei_d 0111 01101111 01111 1110 . ..... ..... @vv_ui1 + +xvreplve0_b 0111 01110000 01110 00000 ..... ..... @vv +xvreplve0_h 0111 01110000 01111 00000 ..... ..... @vv +xvreplve0_w 0111 01110000 01111 10000 ..... ..... @vv +xvreplve0_d 0111 01110000 01111 11000 ..... ..... @vv +xvreplve0_q 0111 01110000 01111 11100 ..... ..... @vv + +xvinsve0_w 0111 01101111 11111 10 ... ..... ..... @vv_ui3 +xvinsve0_d 0111 01101111 11111 110 .. ..... ..... @vv_ui2 + +xvpickve_w 0111 01110000 00111 10 ... ..... ..... @vv_ui3 +xvpickve_d 0111 01110000 00111 110 .. ..... ..... @vv_ui2 + +xvbsll_v 0111 01101000 11100 ..... ..... ..... @vv_ui5 +xvbsrl_v 0111 01101000 11101 ..... ..... ..... @vv_ui5 diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 0995d9b794..ac7dd3021d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1748,6 +1748,10 @@ static void output_rv_i_x(DisasContext *ctx, arg_rv_= i *a, const char *mnemonic) output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm); } =20 +static void output_vvr_x(DisasContext *ctx, arg_vvr *a, const char *mnemon= ic) +{ + output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk); +} =20 INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) @@ -2520,3 +2524,27 @@ INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) INSN_LASX(xvreplgr2vr_d, vr) + +INSN_LASX(xvreplve_b, vvr) +INSN_LASX(xvreplve_h, vvr) +INSN_LASX(xvreplve_w, vvr) +INSN_LASX(xvreplve_d, vvr) +INSN_LASX(xvrepl128vei_b, vv_i) +INSN_LASX(xvrepl128vei_h, vv_i) +INSN_LASX(xvrepl128vei_w, vv_i) +INSN_LASX(xvrepl128vei_d, vv_i) + +INSN_LASX(xvreplve0_b, vv) +INSN_LASX(xvreplve0_h, vv) +INSN_LASX(xvreplve0_w, vv) +INSN_LASX(xvreplve0_d, vv) 
+INSN_LASX(xvreplve0_q, vv) + +INSN_LASX(xvinsve0_w, vv_i) +INSN_LASX(xvinsve0_d, vv_i) + +INSN_LASX(xvpickve_w, vv_i) +INSN_LASX(xvpickve_d, vv_i) + +INSN_LASX(xvbsll_v, vv_i) +INSN_LASX(xvbsrl_v, vv_i) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 1a13342c86..8da95f20a9 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3086,6 +3086,34 @@ SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) SETALLNEZ(vsetallnez_d, MO_64) =20 +#define XVINSVE0(NAME, E, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + Vd->E(imm & MASK) =3D Vj->E(0); \ +} + +XVINSVE0(xvinsve0_w, W, 0x7) +XVINSVE0(xvinsve0_d, D, 0x3) + +#define XVPICKVE(NAME, E, BIT, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + Vd->E(0) =3D Vj->E(imm & MASK); \ + for (i =3D 1; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D 0; \ + } \ +} + +XVPICKVE(xvpickve_w, W, 32, 0x7) +XVPICKVE(xvpickve_d, D, 64, 0x3) + #define VPACKEV(NAME, BIT, E) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 541e2b1728..5fed2d2b91 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -806,3 +806,121 @@ TRANS(xvreplgr2vr_b, LASX, gvec_dup, 32, MO_8) TRANS(xvreplgr2vr_h, LASX, gvec_dup, 32, MO_16) TRANS(xvreplgr2vr_w, LASX, gvec_dup, 32, MO_32) TRANS(xvreplgr2vr_d, LASX, gvec_dup, 32, MO_64) + +TRANS(xvreplve_b, LASX, gen_vreplve, 32, MO_8, 8, tcg_gen_ld8u_i64) +TRANS(xvreplve_h, LASX, gen_vreplve, 32, MO_16, 16, tcg_gen_ld16u_i64) +TRANS(xvreplve_w, LASX, gen_vreplve, 32, MO_32, 32, tcg_gen_ld32u_i64) +TRANS(xvreplve_d, LASX, gen_vreplve, 
32, MO_64, 64, tcg_gen_ld_i64) + +static bool trans_xvrepl128vei_b(DisasContext *ctx, arg_vv_i * a) +{ + if (!avail_LASX(ctx)) { + return false; + } + + CHECK_VEC; + + tcg_gen_gvec_dup_mem(MO_8, + offsetof(CPULoongArchState, fpr[a->vd].vreg.B(0)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.B((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_8, + offsetof(CPULoongArchState, fpr[a->vd].vreg.B(16)= ), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.B((a->imm + 16))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_h(DisasContext *ctx, arg_vv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + + CHECK_VEC; + + tcg_gen_gvec_dup_mem(MO_16, + offsetof(CPULoongArchState, fpr[a->vd].vreg.H(0)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.H((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_16, + offsetof(CPULoongArchState, fpr[a->vd].vreg.H(8)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.H((a->imm + 8))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_w(DisasContext *ctx, arg_vv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + + CHECK_VEC; + + tcg_gen_gvec_dup_mem(MO_32, + offsetof(CPULoongArchState, fpr[a->vd].vreg.W(0)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.W((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_32, + offsetof(CPULoongArchState, fpr[a->vd].vreg.W(4)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.W((a->imm + 4))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_d(DisasContext *ctx, arg_vv_i *a) +{ + if (!avail_LASX(ctx)) { + return false; + } + + CHECK_VEC; + + tcg_gen_gvec_dup_mem(MO_64, + offsetof(CPULoongArchState, fpr[a->vd].vreg.D(0)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.D((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_64, + offsetof(CPULoongArchState, fpr[a->vd].vreg.D(2)), + offsetof(CPULoongArchState, + fpr[a->vj].vreg.D((a->imm + 2))), + 16, 16); + return true; +} + +#define XVREPLVE0(NAME, MOP) = \ +static bool trans_## NAME(DisasContext *ctx, arg_vv * a) 
= \ +{ = \ + if (!avail_LASX(ctx)) { = \ + return false; = \ + } = \ + = \ + CHECK_VEC; = \ + = \ + tcg_gen_gvec_dup_mem(MOP, vec_full_offset(a->vd), vec_full_offset(a->v= j), \ + 32, 32); = \ + return true; = \ +} + +XVREPLVE0(xvreplve0_b, MO_8) +XVREPLVE0(xvreplve0_h, MO_16) +XVREPLVE0(xvreplve0_w, MO_32) +XVREPLVE0(xvreplve0_d, MO_64) +XVREPLVE0(xvreplve0_q, MO_128) + +TRANS(xvinsve0_w, LASX, gen_vv_i, 32, gen_helper_xvinsve0_w) +TRANS(xvinsve0_d, LASX, gen_vv_i, 32, gen_helper_xvinsve0_d) + +TRANS(xvpickve_w, LASX, gen_vv_i, 32, gen_helper_xvpickve_w) +TRANS(xvpickve_d, LASX, gen_vv_i, 32, gen_helper_xvpickve_d) + +TRANS(xvbsll_v, LASX, do_vbsll_v, 32) +TRANS(xvbsrl_v, LASX, do_vbsrl_v, 32) diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_lsx.c.inc index 2928e878cf..4abb03485a 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_lsx.c.inc @@ -4283,7 +4283,8 @@ static bool trans_vreplvei_d(DisasContext *ctx, arg_v= v_i *a) return true; } =20 -static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit, +static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, + uint32_t oprsz, int vece, int bit, void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long)) { TCGv_i64 t0 =3D tcg_temp_new_i64(); @@ -4296,62 +4297,73 @@ static bool gen_vreplve(DisasContext *ctx, arg_vvr = *a, int vece, int bit, =20 CHECK_VEC; =20 - tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN/bit) -1); + tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN / bit) - = 1); tcg_gen_shli_i64(t0, t0, vece); if (HOST_BIG_ENDIAN) { - tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN/bit) -1)); + tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN / bit) - 1)); } =20 tcg_gen_trunc_i64_ptr(t1, t0); tcg_gen_add_ptr(t1, t1, cpu_env); func(t2, t1, vec_full_offset(a->vj)); - tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, ctx->vl/8, t2); + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, 16, t2); + if 
(oprsz =3D=3D 32) { + func(t2, t1, offsetof(CPULoongArchState, fpr[a->vj].vreg.Q(1))); + tcg_gen_gvec_dup_i64(vece, + offsetof(CPULoongArchState, fpr[a->vd].vreg.Q= (1)), + 16, 16, t2); + } =20 return true; } =20 -TRANS(vreplve_b, LSX, gen_vreplve, MO_8, 8, tcg_gen_ld8u_i64) -TRANS(vreplve_h, LSX, gen_vreplve, MO_16, 16, tcg_gen_ld16u_i64) -TRANS(vreplve_w, LSX, gen_vreplve, MO_32, 32, tcg_gen_ld32u_i64) -TRANS(vreplve_d, LSX, gen_vreplve, MO_64, 64, tcg_gen_ld_i64) +TRANS(vreplve_b, LSX, gen_vreplve, 16, MO_8, 8, tcg_gen_ld8u_i64) +TRANS(vreplve_h, LSX, gen_vreplve, 16, MO_16, 16, tcg_gen_ld16u_i64) +TRANS(vreplve_w, LSX, gen_vreplve, 16, MO_32, 32, tcg_gen_ld32u_i64) +TRANS(vreplve_d, LSX, gen_vreplve, 16, MO_64, 64, tcg_gen_ld_i64) =20 -static bool trans_vbsll_v(DisasContext *ctx, arg_vv_i *a) +static bool do_vbsll_v(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) { + int i, max; int ofs; - TCGv_i64 desthigh, destlow, high, low; + TCGv_i64 desthigh[2], destlow[2], high[2], low[2]; =20 if (!avail_LSX(ctx)) { return false; } =20 CHECK_VEC; + max =3D (oprsz =3D=3D 16) ? 
1 : 2; + + for (i =3D 0; i < max; i++) { + desthigh[i] =3D tcg_temp_new_i64(); + destlow[i] =3D tcg_temp_new_i64(); + high[i] =3D tcg_temp_new_i64(); + low[i] =3D tcg_temp_new_i64(); + + get_vreg64(low[i], a->vj, 2 * i); + + ofs =3D ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_vreg64(high[i], a->vj, 2 * i + 1); + tcg_gen_extract2_i64(desthigh[i], low[i], high[i], 64 - ofs); + tcg_gen_shli_i64(destlow[i], low[i], ofs); + } else { + tcg_gen_shli_i64(desthigh[i], low[i], ofs - 64); + destlow[i] =3D tcg_constant_i64(0); + } =20 - desthigh =3D tcg_temp_new_i64(); - destlow =3D tcg_temp_new_i64(); - high =3D tcg_temp_new_i64(); - low =3D tcg_temp_new_i64(); - - get_vreg64(low, a->vj, 0); - - ofs =3D ((a->imm) & 0xf) * 8; - if (ofs < 64) { - get_vreg64(high, a->vj, 1); - tcg_gen_extract2_i64(desthigh, low, high, 64 - ofs); - tcg_gen_shli_i64(destlow, low, ofs); - } else { - tcg_gen_shli_i64(desthigh, low, ofs - 64); - destlow =3D tcg_constant_i64(0); + set_vreg64(desthigh[i], a->vd, 2 * i + 1); + set_vreg64(destlow[i], a->vd, 2 * i); } =20 - set_vreg64(desthigh, a->vd, 1); - set_vreg64(destlow, a->vd, 0); - return true; } =20 -static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a) +static bool do_vbsrl_v(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) { - TCGv_i64 desthigh, destlow, high, low; + int i, max; + TCGv_i64 desthigh[2], destlow[2], high[2], low[2]; int ofs; =20 if (!avail_LSX(ctx)) { @@ -4360,29 +4372,36 @@ static bool trans_vbsrl_v(DisasContext *ctx, arg_vv= _i *a) =20 CHECK_VEC; =20 - desthigh =3D tcg_temp_new_i64(); - destlow =3D tcg_temp_new_i64(); - high =3D tcg_temp_new_i64(); - low =3D tcg_temp_new_i64(); + max =3D (oprsz =3D=3D 16) ? 
1 : 2; =20 - get_vreg64(high, a->vj, 1); + for (i =3D 0; i < max; i++) { + desthigh[i] =3D tcg_temp_new_i64(); + destlow[i] =3D tcg_temp_new_i64(); + high[i] =3D tcg_temp_new_i64(); + low[i] =3D tcg_temp_new_i64(); =20 - ofs =3D ((a->imm) & 0xf) * 8; - if (ofs < 64) { - get_vreg64(low, a->vj, 0); - tcg_gen_extract2_i64(destlow, low, high, ofs); - tcg_gen_shri_i64(desthigh, high, ofs); - } else { - tcg_gen_shri_i64(destlow, high, ofs - 64); - desthigh =3D tcg_constant_i64(0); - } + get_vreg64(high[i], a->vj, 2 * i + 1); + + ofs =3D ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_vreg64(low[i], a->vj, 2 * i); + tcg_gen_extract2_i64(destlow[i], low[i], high[i], ofs); + tcg_gen_shri_i64(desthigh[i], high[i], ofs); + } else { + tcg_gen_shri_i64(destlow[i], high[i], ofs - 64); + desthigh[i] =3D tcg_constant_i64(0); + } =20 - set_vreg64(desthigh, a->vd, 1); - set_vreg64(destlow, a->vd, 0); + set_vreg64(desthigh[i], a->vd, 2 * i + 1); + set_vreg64(destlow[i], a->vd, 2 * i); + } =20 return true; } =20 +TRANS(vbsll_v, LSX, do_vbsll_v, 16) +TRANS(vbsrl_v, LSX, do_vbsrl_v, 16) + TRANS(vpackev_b, LSX, gen_vvv, 16, gen_helper_vpackev_b) TRANS(vpackev_h, LSX, gen_vvv, 16, gen_helper_vpackev_h) TRANS(vpackev_w, LSX, gen_vvv, 16, gen_helper_vpackev_w) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169338566044669.75615547531004; Wed, 30 Aug 2023 01:54:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuU-0003Tc-Sx; Wed, 30 Aug 2023 04:50:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1qbGth-0001jR-Ta for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtb-0007a3-4X for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:57 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_+uiAu9k1AgdAA--.58281S3; Wed, 30 Aug 2023 16:49:38 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S47; Wed, 30 Aug 2023 16:49:37 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 45/48] target/loongarch: Implement xvpack xvpick xvilv{l/h} Date: Wed, 30 Aug 2023 16:48:59 +0800 Message-Id: <20230830084902.2113960-46-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S47 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385662010100010 Content-Type: text/plain; charset="utf-8" This patch includes: - XVPACK{EV/OD}.{B/H/W/D}; - XVPICK{EV/OD}.{B/H/W/D}; - XVILV{L/H}.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 27 ++++ target/loongarch/disas.c | 27 ++++ target/loongarch/vec_helper.c | 138 +++++++++++-------- target/loongarch/insn_trans/trans_lasx.c.inc | 27 ++++ 4 files changed, 159 insertions(+), 60 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 74383ba3bc..a325b861c1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -2012,3 +2012,30 @@ xvpickve_d 0111 01110000 00111 110 .. ..... ..= ... @vv_ui2 =20 xvbsll_v 0111 01101000 11100 ..... ..... ..... @vv_ui5 xvbsrl_v 0111 01101000 11101 ..... ..... ..... @vv_ui5 + +xvpackev_b 0111 01010001 01100 ..... ..... ..... @vvv +xvpackev_h 0111 01010001 01101 ..... ..... ..... @vvv +xvpackev_w 0111 01010001 01110 ..... ..... ..... @vvv +xvpackev_d 0111 01010001 01111 ..... ..... ..... @vvv +xvpackod_b 0111 01010001 10000 ..... ..... ..... @vvv +xvpackod_h 0111 01010001 10001 ..... ..... ..... @vvv +xvpackod_w 0111 01010001 10010 ..... ..... ..... @vvv +xvpackod_d 0111 01010001 10011 ..... ..... ..... @vvv + +xvpickev_b 0111 01010001 11100 ..... ..... ..... @vvv +xvpickev_h 0111 01010001 11101 ..... ..... ..... @vvv +xvpickev_w 0111 01010001 11110 ..... ..... ..... @vvv +xvpickev_d 0111 01010001 11111 ..... ..... ..... @vvv +xvpickod_b 0111 01010010 00000 ..... ..... ..... @vvv +xvpickod_h 0111 01010010 00001 ..... ..... ..... @vvv +xvpickod_w 0111 01010010 00010 ..... ..... ..... @vvv +xvpickod_d 0111 01010010 00011 ..... ..... ..... @vvv + +xvilvl_b 0111 01010001 10100 ..... ..... ..... @vvv +xvilvl_h 0111 01010001 10101 ..... ..... 
..... @vvv +xvilvl_w 0111 01010001 10110 ..... ..... ..... @vvv +xvilvl_d 0111 01010001 10111 ..... ..... ..... @vvv +xvilvh_b 0111 01010001 11000 ..... ..... ..... @vvv +xvilvh_h 0111 01010001 11001 ..... ..... ..... @vvv +xvilvh_w 0111 01010001 11010 ..... ..... ..... @vvv +xvilvh_d 0111 01010001 11011 ..... ..... ..... @vvv diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index ac7dd3021d..9b6a07bbb0 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2548,3 +2548,30 @@ INSN_LASX(xvpickve_d, vv_i) =20 INSN_LASX(xvbsll_v, vv_i) INSN_LASX(xvbsrl_v, vv_i) + +INSN_LASX(xvpackev_b, vvv) +INSN_LASX(xvpackev_h, vvv) +INSN_LASX(xvpackev_w, vvv) +INSN_LASX(xvpackev_d, vvv) +INSN_LASX(xvpackod_b, vvv) +INSN_LASX(xvpackod_h, vvv) +INSN_LASX(xvpackod_w, vvv) +INSN_LASX(xvpackod_d, vvv) + +INSN_LASX(xvpickev_b, vvv) +INSN_LASX(xvpickev_h, vvv) +INSN_LASX(xvpickev_w, vvv) +INSN_LASX(xvpickev_d, vvv) +INSN_LASX(xvpickod_b, vvv) +INSN_LASX(xvpickod_h, vvv) +INSN_LASX(xvpickod_w, vvv) +INSN_LASX(xvpickod_d, vvv) + +INSN_LASX(xvilvl_b, vvv) +INSN_LASX(xvilvl_h, vvv) +INSN_LASX(xvilvl_w, vvv) +INSN_LASX(xvilvl_d, vvv) +INSN_LASX(xvilvh_b, vvv) +INSN_LASX(xvilvh_h, vvv) +INSN_LASX(xvilvh_w, vvv) +INSN_LASX(xvilvh_d, vvv) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 8da95f20a9..34be19891a 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3118,12 +3118,13 @@ XVPICKVE(xvpickve_d, D, 64, 0x3) void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg temp; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ temp.E(2 * i + 1) =3D Vj->E(2 * i); \ temp.E(2 *i) =3D Vk->E(2 * i); \ } \ @@ -3139,12 +3140,13 @@ VPACKEV(vpackev_d, 128, D) void HELPER(NAME)(void *vd, void *vj, void *vk, 
uint32_t desc) \ { \ int i; \ - VReg temp; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ temp.E(2 * i + 1) =3D Vj->E(2 * i + 1); \ temp.E(2 * i) =3D Vk->E(2 * i + 1); \ } \ @@ -3156,20 +3158,24 @@ VPACKOD(vpackod_h, 32, H) VPACKOD(vpackod_w, 64, W) VPACKOD(vpackod_d, 128, D) =20 -#define VPICKEV(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i); \ - temp.E(i) =3D Vk->E(2 * i); \ - } \ - *Vd =3D temp; \ +#define VPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + temp.E(j + ofs * (2 * i + 1)) =3D Vj->E(2 * (j + ofs * i)); \ + temp.E(j + ofs * 2 * i) =3D Vk->E(2 * (j + ofs * i)); \ + } \ + } \ + *Vd =3D temp; \ } =20 VPICKEV(vpickev_b, 16, B) @@ -3177,20 +3183,24 @@ VPICKEV(vpickev_h, 32, H) VPICKEV(vpickev_w, 64, W) VPICKEV(vpickev_d, 128, D) =20 -#define VPICKOD(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i + 1); \ - temp.E(i) =3D Vk->E(2 * i + 1); \ - } \ - *Vd =3D temp; \ +#define VPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg 
temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E(j + ofs * (2 * i + 1)) =3D Vj->E(2 * (j + ofs * i) + 1)= ; \ + temp.E(j + ofs * 2 * i) =3D Vk->E(2 * (j + ofs * i) + 1); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 VPICKOD(vpickod_b, 16, B) @@ -3198,20 +3208,24 @@ VPICKOD(vpickod_h, 32, H) VPICKOD(vpickod_w, 64, W) VPICKOD(vpickod_d, 128, D) =20 -#define VILVL(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i); \ - temp.E(2 * i) =3D Vk->E(i); \ - } \ - *Vd =3D temp; \ +#define VILVL(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + temp.E(2 * (j + ofs * i) + 1) =3D Vj->E(j + ofs * 2 * i); \ + temp.E(2 * (j + ofs * i)) =3D Vk->E(j + ofs * 2 * i); \ + } \ + } \ + *Vd =3D temp; \ } =20 VILVL(vilvl_b, 16, B) @@ -3219,20 +3233,24 @@ VILVL(vilvl_h, 32, H) VILVL(vilvl_w, 64, W) VILVL(vilvl_d, 128, D) =20 -#define VILVH(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i + LSX_LEN/BIT); \ - temp.E(2 * i) =3D Vk->E(i + LSX_LEN/BIT); \ - } \ - *Vd =3D temp; \ +#define VILVH(NAME, BIT, E) \ +void HELPER(NAME)(void 
*vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E(2 * (j + ofs * i) + 1) =3D Vj->E(j + ofs * (2 * i + 1))= ; \ + temp.E(2 * (j + ofs * i)) =3D Vk->E(j + ofs * (2 * i + 1)); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 VILVH(vilvh_b, 16, B) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index 5fed2d2b91..aa374f3a00 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -924,3 +924,30 @@ TRANS(xvpickve_d, LASX, gen_vv_i, 32, gen_helper_xvpic= kve_d) =20 TRANS(xvbsll_v, LASX, do_vbsll_v, 32) TRANS(xvbsrl_v, LASX, do_vbsrl_v, 32) + +TRANS(xvpackev_b, LASX, gen_vvv, 32, gen_helper_vpackev_b) +TRANS(xvpackev_h, LASX, gen_vvv, 32, gen_helper_vpackev_h) +TRANS(xvpackev_w, LASX, gen_vvv, 32, gen_helper_vpackev_w) +TRANS(xvpackev_d, LASX, gen_vvv, 32, gen_helper_vpackev_d) +TRANS(xvpackod_b, LASX, gen_vvv, 32, gen_helper_vpackod_b) +TRANS(xvpackod_h, LASX, gen_vvv, 32, gen_helper_vpackod_h) +TRANS(xvpackod_w, LASX, gen_vvv, 32, gen_helper_vpackod_w) +TRANS(xvpackod_d, LASX, gen_vvv, 32, gen_helper_vpackod_d) + +TRANS(xvpickev_b, LASX, gen_vvv, 32, gen_helper_vpickev_b) +TRANS(xvpickev_h, LASX, gen_vvv, 32, gen_helper_vpickev_h) +TRANS(xvpickev_w, LASX, gen_vvv, 32, gen_helper_vpickev_w) +TRANS(xvpickev_d, LASX, gen_vvv, 32, gen_helper_vpickev_d) +TRANS(xvpickod_b, LASX, gen_vvv, 32, gen_helper_vpickod_b) +TRANS(xvpickod_h, LASX, gen_vvv, 32, gen_helper_vpickod_h) +TRANS(xvpickod_w, LASX, gen_vvv, 32, gen_helper_vpickod_w) +TRANS(xvpickod_d, LASX, gen_vvv, 32, gen_helper_vpickod_d) + +TRANS(xvilvl_b, LASX, gen_vvv, 32, gen_helper_vilvl_b) +TRANS(xvilvl_h, LASX, gen_vvv, 32, 
gen_helper_vilvl_h) +TRANS(xvilvl_w, LASX, gen_vvv, 32, gen_helper_vilvl_w) +TRANS(xvilvl_d, LASX, gen_vvv, 32, gen_helper_vilvl_d) +TRANS(xvilvh_b, LASX, gen_vvv, 32, gen_helper_vilvh_b) +TRANS(xvilvh_h, LASX, gen_vvv, 32, gen_helper_vilvh_h) +TRANS(xvilvh_w, LASX, gen_vvv, 32, gen_helper_vilvh_w) +TRANS(xvilvh_d, LASX, gen_vvv, 32, gen_helper_vilvh_d) --=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1693385482273696.8402543000336; Wed, 30 Aug 2023 01:51:22 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuV-0003WQ-9m; Wed, 30 Aug 2023 04:50:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGtj-0001jX-6b for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtd-0007aM-AE for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:58 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxfOqiAu9k1ggdAA--.32239S3; Wed, 30 Aug 2023 16:49:38 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S48; Wed, 30 Aug 2023 16:49:38 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 46/48] target/loongarch: Implement xvshuf xvperm{i} xvshuf4i xvextrins Date: Wed, 30 Aug 2023 16:49:00 +0800 Message-Id: <20230830084902.2113960-47-gaosong@loongson.cn> 
X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S48 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385483138100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSHUF.{B/H/W/D}; - XVPERM.W; - XVSHUF4i.{B/H/W/D}; - XVPERMI.{W/D/Q}; - XVEXTRINS.{B/H/W/D}. 
Signed-off-by: Song Gao --- target/loongarch/helper.h | 3 + target/loongarch/vec.h | 2 + target/loongarch/insns.decode | 21 ++++ target/loongarch/disas.c | 21 ++++ target/loongarch/vec_helper.c | 112 +++++++++++++++---- target/loongarch/insn_trans/trans_lasx.c.inc | 21 ++++ 6 files changed, 161 insertions(+), 19 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index fb489dda2d..b3b64a0215 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -709,7 +709,10 @@ DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, p= tr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 +DEF_HELPER_FLAGS_4(vperm_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vpermi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vpermi_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index bc74effb7c..61e5b69c1e 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -93,4 +93,6 @@ #define VSLE(a, b) (a <=3D b ? -1 : 0) #define VSLT(a, b) (a < b ? -1 : 0) =20 +#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0= x03)) + #endif /* LOONGARCH_VEC_H */ diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index a325b861c1..64b67ee9ac 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -2039,3 +2039,24 @@ xvilvh_b 0111 01010001 11000 ..... ..... ...= .. @vvv xvilvh_h 0111 01010001 11001 ..... ..... ..... @vvv xvilvh_w 0111 01010001 11010 ..... ..... ..... @vvv xvilvh_d 0111 01010001 11011 ..... ..... ..... @vvv + +xvshuf_b 0000 11010110 ..... 
..... ..... ..... @vvvv +xvshuf_h 0111 01010111 10101 ..... ..... ..... @vvv +xvshuf_w 0111 01010111 10110 ..... ..... ..... @vvv +xvshuf_d 0111 01010111 10111 ..... ..... ..... @vvv + +xvperm_w 0111 01010111 11010 ..... ..... ..... @vvv + +xvshuf4i_b 0111 01111001 00 ........ ..... ..... @vv_ui8 +xvshuf4i_h 0111 01111001 01 ........ ..... ..... @vv_ui8 +xvshuf4i_w 0111 01111001 10 ........ ..... ..... @vv_ui8 +xvshuf4i_d 0111 01111001 11 ........ ..... ..... @vv_ui8 + +xvpermi_w 0111 01111110 01 ........ ..... ..... @vv_ui8 +xvpermi_d 0111 01111110 10 ........ ..... ..... @vv_ui8 +xvpermi_q 0111 01111110 11 ........ ..... ..... @vv_ui8 + +xvextrins_d 0111 01111000 00 ........ ..... ..... @vv_ui8 +xvextrins_w 0111 01111000 01 ........ ..... ..... @vv_ui8 +xvextrins_h 0111 01111000 10 ........ ..... ..... @vv_ui8 +xvextrins_b 0111 01111000 11 ........ ..... ..... @vv_ui8 diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 9b6a07bbb0..a518c59772 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2575,3 +2575,24 @@ INSN_LASX(xvilvh_b, vvv) INSN_LASX(xvilvh_h, vvv) INSN_LASX(xvilvh_w, vvv) INSN_LASX(xvilvh_d, vvv) + +INSN_LASX(xvshuf_b, vvvv) +INSN_LASX(xvshuf_h, vvv) +INSN_LASX(xvshuf_w, vvv) +INSN_LASX(xvshuf_d, vvv) + +INSN_LASX(xvperm_w, vvv) + +INSN_LASX(xvshuf4i_b, vv_i) +INSN_LASX(xvshuf4i_h, vv_i) +INSN_LASX(xvshuf4i_w, vv_i) +INSN_LASX(xvshuf4i_d, vv_i) + +INSN_LASX(xvpermi_w, vv_i) +INSN_LASX(xvpermi_d, vv_i) +INSN_LASX(xvpermi_q, vv_i) + +INSN_LASX(xvextrins_d, vv_i) +INSN_LASX(xvextrins_w, vv_i) +INSN_LASX(xvextrins_h, vv_i) +INSN_LASX(xvextrins_b, vv_i) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 34be19891a..97058ac2b3 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3261,17 +3261,24 @@ VILVH(vilvh_d, 128, D) void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) { int i, m; - VReg temp; + VReg temp =3D {}; VReg *Vd =3D (VReg 
*)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; VReg *Va =3D (VReg *)va; + int oprsz =3D simd_oprsz(desc); =20 - m =3D LSX_LEN/8; - for (i =3D 0; i < m ; i++) { + m =3D LSX_LEN / 8; + for (i =3D 0; i < m; i++) { uint64_t k =3D (uint8_t)Va->B(i) % (2 * m); temp.B(i) =3D k < m ? Vk->B(k) : Vj->B(k - m); } + if (oprsz =3D=3D 32) { + for (i =3D m; i < 2 * m; i++) { + uint64_t j =3D (uint8_t)Va->B(i) % (2 * m); + temp.B(i) =3D j < m ? Vk->B(j + m) : Vj->B(j); + } + } *Vd =3D temp; } =20 @@ -3279,16 +3286,23 @@ void HELPER(vshuf_b)(void *vd, void *vj, void *vk, = void *va, uint32_t desc) void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i, m; \ - VReg temp; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - m =3D LSX_LEN/BIT; \ + m =3D LSX_LEN / BIT; \ for (i =3D 0; i < m; i++) { \ - uint64_t k =3D ((uint8_t) Vd->E(i)) % (2 * m); \ + uint64_t k =3D (uint8_t)Vd->E(i) % (2 * m); \ temp.E(i) =3D k < m ? Vk->E(k) : Vj->E(k - m); \ } \ + if (oprsz =3D=3D 32) { \ + for (i =3D m; i < 2 * m; i++) { \ + uint64_t j =3D (uint8_t)Vd->E(i) % (2 * m); \ + temp.E(i) =3D j < m ?
Vk->E(j + m) : Vj->E(j); \ + } \ + } \ *Vd =3D temp; \ } =20 @@ -3299,14 +3313,20 @@ VSHUF(vshuf_d, 64, D) #define VSHUF4I(NAME, BIT, E) \ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ - int i; \ - VReg temp; \ + int i, max; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i) =3D Vj->E(((i) & 0xfc) + (((imm) >> \ - (2 * ((i) & 0x03))) & 0x03)); \ + max =3D LSX_LEN / BIT; \ + for (i =3D 0; i < max; i++) { \ + temp.E(i) =3D Vj->E(SHF_POS(i, imm)); \ + } \ + if (oprsz =3D=3D 32) { \ + for (i =3D max; i < 2 * max; i++) { \ + temp.E(i) =3D Vj->E(SHF_POS(i - max, imm) + max); \ + } \ } \ *Vd =3D temp; \ } @@ -3317,38 +3337,92 @@ VSHUF4I(vshuf4i_w, 32, W) =20 void HELPER(vshuf4i_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) { + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - VReg temp; - temp.D(0) =3D (imm & 2 ? Vj : Vd)->D(imm & 1); - temp.D(1) =3D (imm & 8 ? Vj : Vd)->D((imm >> 2) & 1); + for (i =3D 0; i < oprsz / 16; i++) { + temp.D(2 * i) =3D (imm & 2 ? Vj : Vd)->D((imm & 1) + 2 * i); + temp.D(2 * i + 1) =3D (imm & 8 ?
Vj : Vd)->D(((imm >> 2) & 1) + 2 = * i); + } + *Vd =3D temp; +} + +void HELPER(vperm_w)(void *vd, void *vj, void *vk, uint32_t desc) +{ + int i, m; + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; + + m =3D LASX_LEN / 32; + for (i =3D 0; i < m; i++) { + uint64_t k =3D (uint8_t)Vk->W(i) % 8; + temp.W(i) =3D Vj->W(k); + } *Vd =3D temp; } =20 void HELPER(vpermi_w)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz / 16; i++) { + temp.W(4 * i) =3D Vj->W((imm & 0x3) + 4 * i); + temp.W(4 * i + 1) =3D Vj->W(((imm >> 2) & 0x3) + 4 * i); + temp.W(4 * i + 2) =3D Vd->W(((imm >> 4) & 0x3) + 4 * i); + temp.W(4 * i + 3) =3D Vd->W(((imm >> 6) & 0x3) + 4 * i); + } + *Vd =3D temp; +} + +void HELPER(vpermi_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + + temp.D(0) =3D Vj->D(imm & 0x3); + temp.D(1) =3D Vj->D((imm >> 2) & 0x3); + temp.D(2) =3D Vj->D((imm >> 4) & 0x3); + temp.D(3) =3D Vj->D((imm >> 6) & 0x3); + *Vd =3D temp; +} + +void HELPER(vpermi_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - temp.W(0) =3D Vj->W(imm & 0x3); - temp.W(1) =3D Vj->W((imm >> 2) & 0x3); - temp.W(2) =3D Vd->W((imm >> 4) & 0x3); - temp.W(3) =3D Vd->W((imm >> 6) & 0x3); + temp.Q(0) =3D (imm & 0x3) > 1 ? Vd->Q((imm & 0x3) - 2) : Vj->Q(imm & 0= x3); + temp.Q(1) =3D ((imm >> 4) & 0x3) > 1 ?
Vd->Q(((imm >> 4) & 0x3) - 2) : + Vj->Q((imm >> 4) & 0x3); *Vd =3D temp; } =20 #define VEXTRINS(NAME, BIT, E, MASK) \ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ - int ins, extr; \ + int ins, extr, max; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ + max =3D LSX_LEN / BIT; \ ins =3D (imm >> 4) & MASK; \ extr =3D imm & MASK; \ Vd->E(ins) =3D Vj->E(extr); \ + if (oprsz =3D=3D 32) { \ + Vd->E(ins + max) =3D Vj->E(extr + max); \ + } \ } =20 VEXTRINS(vextrins_b, 8, B, 0xf) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarc= h/insn_trans/trans_lasx.c.inc index aa374f3a00..ebbbc5a6bb 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -951,3 +951,24 @@ TRANS(xvilvh_b, LASX, gen_vvv, 32, gen_helper_vilvh_b) TRANS(xvilvh_h, LASX, gen_vvv, 32, gen_helper_vilvh_h) TRANS(xvilvh_w, LASX, gen_vvv, 32, gen_helper_vilvh_w) TRANS(xvilvh_d, LASX, gen_vvv, 32, gen_helper_vilvh_d) + +TRANS(xvshuf_b, LASX, gen_vvvv, 32, gen_helper_vshuf_b) +TRANS(xvshuf_h, LASX, gen_vvv, 32, gen_helper_vshuf_h) +TRANS(xvshuf_w, LASX, gen_vvv, 32, gen_helper_vshuf_w) +TRANS(xvshuf_d, LASX, gen_vvv, 32, gen_helper_vshuf_d) + +TRANS(xvperm_w, LASX, gen_vvv, 32, gen_helper_vperm_w) + +TRANS(xvshuf4i_b, LASX, gen_vv_i, 32, gen_helper_vshuf4i_b) +TRANS(xvshuf4i_h, LASX, gen_vv_i, 32, gen_helper_vshuf4i_h) +TRANS(xvshuf4i_w, LASX, gen_vv_i, 32, gen_helper_vshuf4i_w) +TRANS(xvshuf4i_d, LASX, gen_vv_i, 32, gen_helper_vshuf4i_d) + +TRANS(xvpermi_w, LASX, gen_vv_i, 32, gen_helper_vpermi_w) +TRANS(xvpermi_d, LASX, gen_vv_i, 32, gen_helper_vpermi_d) +TRANS(xvpermi_q, LASX, gen_vv_i, 32, gen_helper_vpermi_q) + +TRANS(xvextrins_b, LASX, gen_vv_i, 32, gen_helper_vextrins_b) +TRANS(xvextrins_h, LASX, gen_vv_i, 32, gen_helper_vextrins_h) +TRANS(xvextrins_w, LASX, gen_vv_i, 32, gen_helper_vextrins_w) +TRANS(xvextrins_d, LASX, gen_vv_i, 32, gen_helper_vextrins_d) 
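A minimal sketch, assuming plain arrays in place of VReg, of how the 256-bit VSHUF4I path in the vec_helper.c hunk treats the register as two independent 128-bit lanes. SHF_POS is the same expression this patch adds to vec.h. The function name shuf4i_w, its array arguments and the lanes parameter are hypothetical stand-ins for the helper's E() accessors and for simd_oprsz(desc) / 16; they are illustration only and appear nowhere in QEMU.

```c
#include <stdint.h>

/* Same expression as the SHF_POS macro added to vec.h in this patch. */
#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0x03))

/*
 * Hypothetical model of the 32-bit-element shuffle on plain arrays.
 * lanes == 1 mirrors the 128-bit form (oprsz == 16), lanes == 2 the
 * 256-bit form (oprsz == 32). Each 128-bit lane holds four elements and
 * is shuffled on its own, so no element crosses a lane boundary.
 * dst must not alias src; the real helper uses a temporary for that.
 */
static void shuf4i_w(uint32_t *dst, const uint32_t *src, uint8_t imm, int lanes)
{
    const int per_lane = 128 / 32;

    for (int lane = 0; lane < lanes; lane++) {
        for (int i = 0; i < per_lane; i++) {
            /* Two bits of imm select the source element for slot i. */
            dst[lane * per_lane + i] = src[lane * per_lane + SHF_POS(i, imm)];
        }
    }
}
```

With imm = 0x1b the four 2-bit selectors are 3, 2, 1, 0, so each lane is reversed in place: {10, 11, 12, 13, 20, 21, 22, 23} becomes {13, 12, 11, 10, 23, 22, 21, 20}. The second loop in the patch's VSHUF4I macro expresses the same thing as SHF_POS(i - max, imm) + max.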
--=20 2.39.1 From nobody Tue May 14 05:57:28 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169338553918133.01566678712618; Wed, 30 Aug 2023 01:52:19 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGuT-00036V-Ej; Wed, 30 Aug 2023 04:50:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qbGti-0001jT-4G for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:50:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qbGtc-0007bI-7h for qemu-devel@nongnu.org; Wed, 30 Aug 2023 04:49:57 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bx5fCjAu9k2AgdAA--.59592S3; Wed, 30 Aug 2023 16:49:39 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxF81+Au9kHhxnAA--.49766S49; Wed, 30 Aug 2023 16:49:38 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v4 47/48] target/loongarch: Implement xvld xvst Date: Wed, 30 Aug 2023 16:49:01 +0800 Message-Id: <20230830084902.2113960-48-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn> References: <20230830084902.2113960-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxF81+Au9kHhxnAA--.49766S49 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1693385539488100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVLD[X], XVST[X]; - XVLDREPL.{B/H/W/D}; - XVSTELM.{B/H/W/D}. Signed-off-by: Song Gao --- target/loongarch/insns.decode | 18 +++++ target/loongarch/disas.c | 24 ++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 80 ++++++++++++++++++++ target/loongarch/insn_trans/trans_lsx.c.inc | 54 ++++++------- 4 files changed, 149 insertions(+), 27 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 64b67ee9ac..64b308f9fb 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -550,6 +550,10 @@ dbcl 0000 00000010 10101 ............... = @i15 @vr_i8i2 .... ........ imm2:2 ........ rj:5 vd:5 &vr_ii imm=3D%i8s2 @vr_i8i3 .... ....... imm2:3 ........ rj:5 vd:5 &vr_ii imm=3D%i8s1 @vr_i8i4 .... ...... imm2:4 imm:s8 rj:5 vd:5 &vr_ii +@vr_i8i2x .... ........ imm2:2 ........ rj:5 vd:5 &vr_ii imm=3D%i8s3 +@vr_i8i3x .... ....... imm2:3 ........ rj:5 vd:5 &vr_ii imm=3D%i8s2 +@vr_i8i4x .... ...... imm2:4 ........ 
rj:5 vd:5    &vr_ii imm=%i8s1
+@vr_i8i5x          .... ..... imm2:5 imm:s8 rj:5 vd:5    &vr_ii
 @vrr               .... ........ ..... rk:5 rj:5 vd:5    &vrr
 @v_i13             .... ........ .. imm:13 vd:5    &v_i
 
@@ -2060,3 +2064,17 @@ xvextrins_d      0111 01111000 00 ........ ..... .....   @vv_ui8
 xvextrins_w      0111 01111000 01 ........ ..... .....   @vv_ui8
 xvextrins_h      0111 01111000 10 ........ ..... .....   @vv_ui8
 xvextrins_b      0111 01111000 11 ........ ..... .....   @vv_ui8
+
+xvld             0010 110010 ............ ..... .....    @vr_i12
+xvst             0010 110011 ............ ..... .....    @vr_i12
+xvldx            0011 10000100 10000 ..... ..... .....   @vrr
+xvstx            0011 10000100 11000 ..... ..... .....   @vrr
+
+xvldrepl_d       0011 00100001 0 ......... ..... .....   @vr_i9
+xvldrepl_w       0011 00100010 .......... ..... .....    @vr_i10
+xvldrepl_h       0011 0010010 ........... ..... .....    @vr_i11
+xvldrepl_b       0011 001010 ............ ..... .....    @vr_i12
+xvstelm_d        0011 00110001 .. ........ ..... .....   @vr_i8i2x
+xvstelm_w        0011 0011001 ... ........ ..... .....   @vr_i8i3x
+xvstelm_h        0011 001101 .... ........ ..... .....   @vr_i8i4x
+xvstelm_b        0011 00111 ..... ........ ..... .....   @vr_i8i5x
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index a518c59772..e5fb362d7f 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1753,6 +1753,16 @@ static void output_vvr_x(DisasContext *ctx, arg_vvr *a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk);
 }
 
+static void output_vrr_x(DisasContext *ctx, arg_vrr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d, r%d", a->vd, a->rj, a->rk);
+}
+
+static void output_vr_ii_x(DisasContext *ctx, arg_vr_ii *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d, 0x%x, 0x%x", a->vd, a->rj, a->imm, a->imm2);
+}
+
 INSN_LASX(xvadd_b,           vvv)
 INSN_LASX(xvadd_h,           vvv)
 INSN_LASX(xvadd_w,           vvv)
@@ -2596,3 +2606,17 @@ INSN_LASX(xvextrins_d,       vv_i)
 INSN_LASX(xvextrins_w,       vv_i)
 INSN_LASX(xvextrins_h,       vv_i)
 INSN_LASX(xvextrins_b,       vv_i)
+
+INSN_LASX(xvld,              vr_i)
+INSN_LASX(xvst,              vr_i)
+INSN_LASX(xvldx,             vrr)
+INSN_LASX(xvstx,             vrr)
+
+INSN_LASX(xvldrepl_d,        vr_i)
+INSN_LASX(xvldrepl_w,        vr_i)
+INSN_LASX(xvldrepl_h,        vr_i)
+INSN_LASX(xvldrepl_b,        vr_i)
+INSN_LASX(xvstelm_d,         vr_ii)
+INSN_LASX(xvstelm_w,         vr_ii)
+INSN_LASX(xvstelm_h,         vr_ii)
+INSN_LASX(xvstelm_b,         vr_ii)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index ebbbc5a6bb..b44e9e6d77 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -972,3 +972,83 @@ TRANS(xvextrins_b, LASX, gen_vv_i, 32, gen_helper_vextrins_b)
 TRANS(xvextrins_h, LASX, gen_vv_i, 32, gen_helper_vextrins_h)
 TRANS(xvextrins_w, LASX, gen_vv_i, 32, gen_helper_vextrins_w)
 TRANS(xvextrins_d, LASX, gen_vv_i, 32, gen_helper_vextrins_d)
+
+static bool gen_lasx_memory(DisasContext *ctx, arg_vr_i *a,
+                            void (*func)(DisasContext *, int, TCGv))
+{
+    TCGv addr = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv temp = NULL;
+
+    CHECK_VEC;
+
+    if (a->imm) {
+        temp = tcg_temp_new();
+        tcg_gen_addi_tl(temp, addr, a->imm);
+        addr = temp;
+    }
+
+    func(ctx, a->vd, addr);
+    return true;
+}
+
+static void gen_xvld(DisasContext *ctx, int vreg, TCGv addr)
+{
+    int i;
+    TCGv temp = tcg_temp_new();
+    TCGv dest = tcg_temp_new();
+
+    tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+    set_vreg64(dest, vreg, 0);
+
+    for (i = 1; i < 4; i++) {
+        tcg_gen_addi_tl(temp, addr, 8 * i);
+        tcg_gen_qemu_ld_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+        set_vreg64(dest, vreg, i);
+    }
+}
+
+static void gen_xvst(DisasContext * ctx, int vreg, TCGv addr)
+{
+    int i;
+    TCGv temp = tcg_temp_new();
+    TCGv dest = tcg_temp_new();
+
+    get_vreg64(dest, vreg, 0);
+    tcg_gen_qemu_st_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+
+    for (i = 1; i < 4; i++) {
+        tcg_gen_addi_tl(temp, addr, 8 * i);
+        get_vreg64(dest, vreg, i);
+        tcg_gen_qemu_st_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+    }
+}
+
+TRANS(xvld, LASX, gen_lasx_memory, gen_xvld)
+TRANS(xvst, LASX, gen_lasx_memory, gen_xvst)
+
+static bool gen_lasx_memoryx(DisasContext *ctx, arg_vrr *a,
+                             void (*func)(DisasContext*, int, TCGv))
+{
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv src2 = gpr_src(ctx, a->rk, EXT_NONE);
+    TCGv addr = tcg_temp_new();
+
+    CHECK_VEC;
+
+    tcg_gen_add_tl(addr, src1, src2);
+    func(ctx, a->vd, addr);
+
+    return true;
+}
+
+TRANS(xvldx, LASX, gen_lasx_memoryx, gen_xvld)
+TRANS(xvstx, LASX, gen_lasx_memoryx, gen_xvst)
+
+TRANS(xvldrepl_b, LASX, do_vldrepl, 32, MO_8)
+TRANS(xvldrepl_h, LASX, do_vldrepl, 32, MO_16)
+TRANS(xvldrepl_w, LASX, do_vldrepl, 32, MO_32)
+TRANS(xvldrepl_d, LASX, do_vldrepl, 32, MO_64)
+VSTELM(xvstelm_b, MO_8, B)
+VSTELM(xvstelm_h, MO_16, H)
+VSTELM(xvstelm_w, MO_32, W)
+VSTELM(xvstelm_d, MO_64, D)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index 4abb03485a..86f333981c 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -4553,33 +4553,33 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr *a)
     return true;
 }
 
-#define VLDREPL(NAME, MO)                                                 \
-static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a)                \
-{                                                                         \
-    TCGv addr;                                                            \
-    TCGv_i64 val;                                                         \
-                                                                          \
-    if (!avail_LSX(ctx)) {                                                \
-        return false;                                                     \
-    }                                                                     \
-                                                                          \
-    CHECK_VEC;                                                            \
-                                                                          \
-    addr = gpr_src(ctx, a->rj, EXT_NONE);                                 \
-    val = tcg_temp_new_i64();                                             \
-                                                                          \
-    addr = make_address_i(ctx, addr, a->imm);                             \
-                                                                          \
-    tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, MO);                     \
-    tcg_gen_gvec_dup_i64(MO, vec_full_offset(a->vd), 16, ctx->vl/8, val); \
-                                                                          \
-    return true;                                                          \
-}
-
-VLDREPL(vldrepl_b, MO_8)
-VLDREPL(vldrepl_h, MO_16)
-VLDREPL(vldrepl_w, MO_32)
-VLDREPL(vldrepl_d, MO_64)
+static bool do_vldrepl(DisasContext *ctx, arg_vr_i * a,
+                       uint32_t oprsz, MemOp mop)
+{
+    TCGv addr, temp;
+    TCGv_i64 val;
+
+    CHECK_VEC;
+
+    addr = gpr_src(ctx, a->rj, EXT_NONE);
+    val = tcg_temp_new_i64();
+
+    if (a->imm) {
+        temp = tcg_temp_new();
+        tcg_gen_addi_tl(temp, addr, a->imm);
+        addr = temp;
+    }
+
+    tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, mop);
+    tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd), oprsz, ctx->vl / 8, val);
+
+    return true;
+}
+
+TRANS(vldrepl_b, LSX, do_vldrepl, 16, MO_8)
+TRANS(vldrepl_h, LSX, do_vldrepl, 16, MO_16)
+TRANS(vldrepl_w, LSX, do_vldrepl, 16, MO_32)
+TRANS(vldrepl_d, LSX, do_vldrepl, 16, MO_64)
 
 #define VSTELM(NAME, MO, E)                                               \
 static bool trans_## NAME (DisasContext *ctx, arg_vr_ii *a)               \
-- 
2.39.1


From nobody Tue May 14 05:57:28 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v4 48/48] target/loongarch: CPUCFG support LASX
Date: Wed, 30 Aug 2023 16:49:02 +0800
Message-Id: <20230830084902.2113960-49-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230830084902.2113960-1-gaosong@loongson.cn>
References: <20230830084902.2113960-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 4deae22104..e03f71222a 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -392,6 +392,7 @@ static void loongarch_la464_initfn(Object *obj)
     data = FIELD_DP32(data, CPUCFG2, FP_DP, 1);
     data = FIELD_DP32(data, CPUCFG2, FP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LSX, 1),
+    data = FIELD_DP32(data, CPUCFG2, LASX, 1),
     data = FIELD_DP32(data, CPUCFG2, LLFTP, 1);
     data = FIELD_DP32(data, CPUCFG2, LLFTP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LSPW, 1);
-- 
2.39.1
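
A note on the trailing commas in the CPUCFG2 hunk of patch 48/48: the LSX and LASX lines end in ',' rather than ';', so together with the LLFTP line they form a single expression statement chained by the comma operator. The assignments are still evaluated left to right, so the resulting value should be the same as with semicolons. The sketch below illustrates this in plain C. It is not QEMU code: field_dp32() is a hypothetical stand-in for the FIELD_DP32 macro, and the bit positions are assumptions chosen for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's FIELD_DP32(): return `reg` with the
 * `len`-bit field starting at bit `shift` replaced by `val`.
 */
static uint32_t field_dp32(uint32_t reg, unsigned shift, unsigned len,
                           uint32_t val)
{
    uint32_t mask;

    assert(len >= 1 && len <= 32 && shift <= 32 - len);
    mask = (len == 32 ? UINT32_MAX : (UINT32_C(1) << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Assumed bit positions, for illustration only. */
enum { LSX_SHIFT = 6, LASX_SHIFT = 7, LLFTP_SHIFT = 14 };

/* Same shape as the hunk: two lines end in ',' and the last in ';'. */
static uint32_t set_bits_with_commas(uint32_t data)
{
    data = field_dp32(data, LSX_SHIFT, 1, 1),
    data = field_dp32(data, LASX_SHIFT, 1, 1),
    data = field_dp32(data, LLFTP_SHIFT, 1, 1);
    return data;
}

/* Conventional form: one statement per assignment. */
static uint32_t set_bits_with_semicolons(uint32_t data)
{
    data = field_dp32(data, LSX_SHIFT, 1, 1);
    data = field_dp32(data, LASX_SHIFT, 1, 1);
    data = field_dp32(data, LLFTP_SHIFT, 1, 1);
    return data;
}
```

Both helpers return the same value for any input, which suggests the commas in the hunk are harmless, if unconventional; switching them to semicolons would be a purely cosmetic cleanup.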