From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658987598674.4966244782183; Wed, 13 Sep 2023 19:36:27 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4R-00037w-7r; Wed, 13 Sep 2023 22:27:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4O-00036n-4j for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4J-0004ps-N6 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:03 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxIvBsbwJlTvgmAA--.9199S3; Thu, 14 Sep 2023 10:26:52 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S3; Thu, 14 Sep 2023 10:26:50 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 01/57] target/loongarch: Renamed lsx*.c to vec* .c Date: Thu, 14 Sep 2023 10:25:49 +0800 Message-Id: <20230914022645.1151356-2-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S3 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658988721100003 Content-Type: text/plain; charset="utf-8" Renamed lsx_helper.c to vec_helper.c and trans_lsx.c.inc to trans_vec.c.inc so that LASX can use them.
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/translate.c | 2 +- target/loongarch/{lsx_helper.c =3D> vec_helper.c} | 2 +- .../loongarch/insn_trans/{trans_lsx.c.inc =3D> trans_vec.c.inc} | 2 +- target/loongarch/meson.build | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) rename target/loongarch/{lsx_helper.c =3D> vec_helper.c} (99%) rename target/loongarch/insn_trans/{trans_lsx.c.inc =3D> trans_vec.c.inc} = (99%) diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index fd393ed76d..288727181b 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -261,7 +261,7 @@ static uint64_t make_address_pc(DisasContext *ctx, uint= 64_t addr) #include "insn_trans/trans_fmemory.c.inc" #include "insn_trans/trans_branch.c.inc" #include "insn_trans/trans_privileged.c.inc" -#include "insn_trans/trans_lsx.c.inc" +#include "insn_trans/trans_vec.c.inc" =20 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState= *cs) { diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/vec_helper.c similarity index 99% rename from target/loongarch/lsx_helper.c rename to target/loongarch/vec_helper.c index 9571f0aef0..73f0974744 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/vec_helper.c @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * QEMU LoongArch LSX helper functions. + * QEMU LoongArch vector helper functions. 
* * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc similarity index 99% rename from target/loongarch/insn_trans/trans_lsx.c.inc rename to target/loongarch/insn_trans/trans_vec.c.inc index 5fbf2718f7..aed5bac5bc 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * LSX translate functions + * LoongArch vector translate functions * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ =20 diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build index b7a27df5a9..7fbf045a5d 100644 --- a/target/loongarch/meson.build +++ b/target/loongarch/meson.build @@ -11,7 +11,7 @@ loongarch_tcg_ss.add(files( 'op_helper.c', 'translate.c', 'gdbstub.c', - 'lsx_helper.c', + 'vec_helper.c', )) loongarch_tcg_ss.add(zlib) =20 --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658568009518.0743547370463; Wed, 13 Sep 2023 19:29:28 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4R-000384-DO; Wed, 13 Sep 2023 22:27:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4M-00036V-V3 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:03 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4J-0004pt-MG for 
qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:02 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxHOtsbwJlUfgmAA--.5142S3; Thu, 14 Sep 2023 10:26:52 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S4; Thu, 14 Sep 2023 10:26:52 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 02/57] target/loongarch: Implement gvec_*_vl functions Date: Thu, 14 Sep 2023 10:25:50 +0800 Message-Id: <20230914022645.1151356-3-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S4 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658569222100001 Content-Type: text/plain; charset="utf-8" 
Create gvec_*_vl functions in order to hide oprsz. This is used by gvec_v* functions for oprsz 16, and will be used by gvec_x* functions for oprsz 32. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insn_trans/trans_vec.c.inc | 68 +++++++++++++-------- 1 file changed, 44 insertions(+), 24 deletions(-) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index aed5bac5bc..aeeb2df41c 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -76,34 +76,58 @@ static bool gen_cv(DisasContext *ctx, arg_cv *a, return true; } =20 +static bool gvec_vvv_vl(DisasContext *ctx, arg_vvv *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t, uint32_t)) +{ + uint32_t vd_ofs =3D vec_full_offset(a->vd); + uint32_t vj_ofs =3D vec_full_offset(a->vj); + uint32_t vk_ofs =3D vec_full_offset(a->vk); + + func(mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8); + return true; +} + static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs, vk_ofs; - CHECK_SXE; + return gvec_vvv_vl(ctx, a, 16, mop, func); +} =20 - vd_ofs =3D vec_full_offset(a->vd); - vj_ofs =3D vec_full_offset(a->vj); - vk_ofs =3D vec_full_offset(a->vk); =20 - func(mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8); +static bool gvec_vv_vl(DisasContext *ctx, arg_vv *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t)) +{ + uint32_t vd_ofs =3D vec_full_offset(a->vd); + uint32_t vj_ofs =3D vec_full_offset(a->vj); + + func(mop, vd_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } =20 + static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; + return gvec_vv_vl(ctx, a, 16, mop, func); +} =20 - vd_ofs =3D 
vec_full_offset(a->vd); - vj_ofs =3D vec_full_offset(a->vj); +static bool gvec_vv_i_vl(DisasContext *ctx, arg_vv_i *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + uint32_t vd_ofs =3D vec_full_offset(a->vd); + uint32_t vj_ofs =3D vec_full_offset(a->vj); =20 - func(mop, vd_ofs, vj_ofs, 16, ctx->vl/8); + func(mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / 8); return true; } =20 @@ -111,28 +135,24 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a,= MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; + return gvec_vv_i_vl(ctx, a, 16, mop, func); +} =20 - vd_ofs =3D vec_full_offset(a->vd); - vj_ofs =3D vec_full_offset(a->vj); +static bool gvec_subi_vl(DisasContext *ctx, arg_vv_i *a, + uint32_t oprsz, MemOp mop) +{ + uint32_t vd_ofs =3D vec_full_offset(a->vd); + uint32_t vj_ofs =3D vec_full_offset(a->vj); =20 - func(mop, vd_ofs, vj_ofs, a->imm , 16, ctx->vl/8); + tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, oprsz, ctx->vl / 8); return true; } =20 static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; - - vd_ofs =3D vec_full_offset(a->vd); - vj_ofs =3D vec_full_offset(a->vj); - - tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, 16, ctx->vl/8); - return true; + return gvec_subi_vl(ctx, a, 16, mop); } =20 TRANS(vadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_add) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658680153208.94298618767516; Wed, 13 Sep 2023 19:31:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org 
with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4X-00038W-IG; Wed, 13 Sep 2023 22:27:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4P-00037T-JR for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:05 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4M-0004pv-Un for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:05 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxXOptbwJlUvgmAA--.48370S3; Thu, 14 Sep 2023 10:26:53 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S5; Thu, 14 Sep 2023 10:26:52 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 03/57] target/loongarch: Use gen_helper_gvec_4_ptr for 4OP + env vector instructions Date: Thu, 14 Sep 2023 10:25:51 +0800 Message-Id: <20230914022645.1151356-4-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S5 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: 
(-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658680493100003 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 16 +++++----- target/loongarch/vec_helper.c | 12 +++---- target/loongarch/insn_trans/trans_vec.c.inc | 35 ++++++++++++++++----- 3 files changed, 41 insertions(+), 22 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index ffb1e0b0bf..ead16567c2 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -528,14 +528,14 @@ DEF_HELPER_4(vfmul_d, void, env, i32, i32, i32) DEF_HELPER_4(vfdiv_s, void, env, i32, i32, i32) DEF_HELPER_4(vfdiv_d, void, env, i32, i32, i32) =20 -DEF_HELPER_5(vfmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmadd_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmsub_d, void, env, i32, i32, i32, i32) +DEF_HELPER_FLAGS_6(vfmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, en= v, i32) +DEF_HELPER_FLAGS_6(vfnmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmadd_d, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) +DEF_HELPER_FLAGS_6(vfnmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, e= nv, i32) =20 DEF_HELPER_4(vfmax_s, void, env, i32, i32, i32) DEF_HELPER_4(vfmax_d, void, env, i32, i32, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 73f0974744..3a7a620227 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2129,14 +2129,14 @@ DO_3OP_F(vfmina_s, 32, UW, float32_minnummag) DO_3OP_F(vfmina_d, 64, UD, float64_minnummag) =20 #define DO_4OP_F(NAME, BIT, E, FN, flags) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, void *va, = \ + CPULoongArchState *env, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ - VReg *Va =3D &(env->fpr[va].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + VReg *Va =3D (VReg *)va; = \ = \ vec_clear_cause(env); = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index aeeb2df41c..6a518523f0 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -15,6 +15,25 @@ #define CHECK_SXE #endif =20 +static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, + gen_helper_gvec_4_ptr *fn) +{ + tcg_gen_gvec_4_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + cpu_env, + oprsz, ctx->vl / 8, 0, fn); + return true; +} + +static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4_ptr *fn) +{ + CHECK_SXE; + return gen_vvvv_ptr_vl(ctx, a, 16, fn); +} + 
static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) @@ -3634,14 +3653,14 @@ TRANS(vfmul_d, LSX, gen_vvv, gen_helper_vfmul_d) TRANS(vfdiv_s, LSX, gen_vvv, gen_helper_vfdiv_s) TRANS(vfdiv_d, LSX, gen_vvv, gen_helper_vfdiv_d) =20 -TRANS(vfmadd_s, LSX, gen_vvvv, gen_helper_vfmadd_s) -TRANS(vfmadd_d, LSX, gen_vvvv, gen_helper_vfmadd_d) -TRANS(vfmsub_s, LSX, gen_vvvv, gen_helper_vfmsub_s) -TRANS(vfmsub_d, LSX, gen_vvvv, gen_helper_vfmsub_d) -TRANS(vfnmadd_s, LSX, gen_vvvv, gen_helper_vfnmadd_s) -TRANS(vfnmadd_d, LSX, gen_vvvv, gen_helper_vfnmadd_d) -TRANS(vfnmsub_s, LSX, gen_vvvv, gen_helper_vfnmsub_s) -TRANS(vfnmsub_d, LSX, gen_vvvv, gen_helper_vfnmsub_d) +TRANS(vfmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfmadd_s) +TRANS(vfmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfmadd_d) +TRANS(vfmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfmsub_s) +TRANS(vfmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfmsub_d) +TRANS(vfnmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_s) +TRANS(vfnmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_d) +TRANS(vfnmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_s) +TRANS(vfnmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_d) =20 TRANS(vfmax_s, LSX, gen_vvv, gen_helper_vfmax_s) TRANS(vfmax_d, LSX, gen_vvv, gen_helper_vfmax_d) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658467624989.7382536978176; Wed, 13 Sep 2023 19:27:47 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4Y-000393-Pw; Wed, 13 Sep 2023 22:27:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4O-00036m-3x for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4J-0004px-MC for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:03 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Ax1fBtbwJlVfgmAA--.9913S3; Thu, 14 Sep 2023 10:26:53 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S6; Thu, 14 Sep 2023 10:26:53 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 04/57] target/loongarch: Use gen_helper_gvec_4 for 4OP vector instructions Date: Thu, 14 Sep 2023 10:25:52 +0800 Message-Id: <20230914022645.1151356-5-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S6 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 
2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658469296100001 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 2 +- target/loongarch/vec_helper.c | 11 +++++------ target/loongarch/insn_trans/trans_vec.c.inc | 22 ++++++++++++--------- 3 files changed, 19 insertions(+), 16 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index ead16567c2..727ccfb32c 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -682,7 +682,7 @@ DEF_HELPER_4(vilvh_h, void, env, i32, i32, i32) DEF_HELPER_4(vilvh_w, void, env, i32, i32, i32) DEF_HELPER_4(vilvh_d, void, env, i32, i32, i32) =20 -DEF_HELPER_5(vshuf_b, void, env, i32, i32, i32, i32) +DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_4(vshuf_h, void, env, i32, i32, i32) DEF_HELPER_4(vshuf_w, void, env, i32, i32, i32) DEF_HELPER_4(vshuf_d, void, env, i32, i32, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 3a7a620227..7078c4c845 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2899,15 +2899,14 @@ VILVH(vilvh_h, 32, H) VILVH(vilvh_w, 64, W) VILVH(vilvh_d, 128, D) =20 -void HELPER(vshuf_b)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) +void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) { int i, m; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); - VReg *Vk =3D &(env->fpr[vk].vreg); - VReg *Va =3D &(env->fpr[va].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; + VReg *Va =3D (VReg *)va; =20 m =3D LSX_LEN/8; for (i =3D 0; i < m ; i++) { diff --git 
a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 6a518523f0..71b1b5ef58 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -34,18 +34,22 @@ static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, return gen_vvvv_ptr_vl(ctx, a, 16, fn); } =20 -static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, - TCGv_i32, TCGv_i32)) +static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, + gen_helper_gvec_4 *fn) { - TCGv_i32 vd =3D tcg_constant_i32(a->vd); - TCGv_i32 vj =3D tcg_constant_i32(a->vj); - TCGv_i32 vk =3D tcg_constant_i32(a->vk); - TCGv_i32 va =3D tcg_constant_i32(a->va); + tcg_gen_gvec_4_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + oprsz, ctx->vl / 8, 0, fn); + return true; +} =20 +static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4 *fn) +{ CHECK_SXE; - func(cpu_env, vd, vj, vk, va); - return true; + return gen_vvvv_vl(ctx, a, 16, fn); } =20 static bool gen_vvv(DisasContext *ctx, arg_vvv *a, --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16946586662071004.8887445652969; Wed, 13 Sep 2023 19:31:06 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4T-000387-CT; Wed, 13 Sep 2023 22:27:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4O-00037J-Od for qemu-devel@nongnu.org; Wed, 13 Sep 2023 
22:27:04 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4J-0004q7-LF for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:04 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bxd+hubwJlV_gmAA--.39337S3; Thu, 14 Sep 2023 10:26:54 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S7; Thu, 14 Sep 2023 10:26:53 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 05/57] target/loongarch: Use gen_helper_gvec_3_ptr for 3OP + env vector instructions Date: Thu, 14 Sep 2023 10:25:53 +0800 Message-Id: <20230914022645.1151356-6-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S7 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1694658668050100002
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/helper.h                   | 48 +++++++--------
 target/loongarch/vec_helper.c               | 50 ++++++++--------
 target/loongarch/insn_trans/trans_vec.c.inc | 66 +++++++++++++--------
 3 files changed, 91 insertions(+), 73 deletions(-)

diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 727ccfb32c..bcf82597aa 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -519,14 +519,14 @@ DEF_HELPER_4(vfrstp_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32)
 DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vfadd_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfadd_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vfsub_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfsub_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmul_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmul_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vfdiv_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfdiv_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vfadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_FLAGS_6(vfmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_FLAGS_6(vfmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32)
@@ -537,15 +537,15 @@ DEF_HELPER_FLAGS_6(vfnmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i3
 DEF_HELPER_FLAGS_6(vfnmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_FLAGS_6(vfnmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32)
 
-DEF_HELPER_4(vfmax_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmax_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmin_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmin_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vfmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
-DEF_HELPER_4(vfmaxa_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmaxa_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmina_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfmina_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vfmaxa_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmaxa_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmina_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfmina_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_3(vflogb_s, void, env, i32, i32)
 DEF_HELPER_3(vflogb_d, void, env, i32, i32)
@@ -564,8 +564,8 @@ DEF_HELPER_3(vfcvtl_s_h, void, env, i32, i32)
 DEF_HELPER_3(vfcvth_s_h, void, env, i32, i32)
 DEF_HELPER_3(vfcvtl_d_s, void, env, i32, i32)
 DEF_HELPER_3(vfcvth_d_s, void, env, i32, i32)
-DEF_HELPER_4(vfcvt_h_s, void, env, i32, i32, i32)
-DEF_HELPER_4(vfcvt_s_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vfcvt_h_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vfcvt_s_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_3(vfrintrne_s, void, env, i32, i32)
 DEF_HELPER_3(vfrintrne_d, void, env, i32, i32)
@@ -592,11 +592,11 @@ DEF_HELPER_3(vftintrz_wu_s, void, env, i32, i32)
 DEF_HELPER_3(vftintrz_lu_d, void, env, i32, i32)
 DEF_HELPER_3(vftint_wu_s, void, env, i32, i32)
 DEF_HELPER_3(vftint_lu_d, void, env, i32, i32)
-DEF_HELPER_4(vftintrne_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vftintrz_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vftintrp_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vftintrm_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vftint_w_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vftintrne_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vftintrz_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vftintrp_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vftintrm_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(vftint_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_3(vftintrnel_l_s, void, env, i32, i32)
 DEF_HELPER_3(vftintrneh_l_s, void, env, i32, i32)
 DEF_HELPER_3(vftintrzl_l_s, void, env, i32, i32)
@@ -614,7 +614,7 @@ DEF_HELPER_3(vffint_s_wu, void, env, i32, i32)
 DEF_HELPER_3(vffint_d_lu, void, env, i32, i32)
 DEF_HELPER_3(vffintl_d_w, void, env, i32, i32)
 DEF_HELPER_3(vffinth_d_w, void, env, i32, i32)
-DEF_HELPER_4(vffint_s_l, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_5(vffint_s_l, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_FLAGS_4(vseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vseqi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 7078c4c845..eab94a8b76 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -2096,13 +2096,13 @@ static inline void vec_clear_cause(CPULoongArchState *env)
 }
 
 #define DO_3OP_F(NAME, BIT, E, FN) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, \
+                  CPULoongArchState *env, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     vec_clear_cause(env); \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
@@ -2326,14 +2326,14 @@ void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj)
     *Vd = temp;
 }
 
-void HELPER(vfcvt_h_s)(CPULoongArchState *env,
-                       uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk,
+                       CPULoongArchState *env, uint32_t desc)
 {
     int i;
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     vec_clear_cause(env);
     for(i = 0; i < LSX_LEN/32; i++) {
@@ -2344,14 +2344,14 @@ void HELPER(vfcvt_h_s)(CPULoongArchState *env,
     *Vd = temp;
 }
 
-void HELPER(vfcvt_s_d)(CPULoongArchState *env,
-                       uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk,
+                       CPULoongArchState *env, uint32_t desc)
 {
     int i;
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     vec_clear_cause(env);
     for(i = 0; i < LSX_LEN/64; i++) {
@@ -2482,14 +2482,14 @@ FTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero)
 FTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_even)
 
 #define FTINT_W_D(NAME, FN) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, \
+                  CPULoongArchState *env, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     vec_clear_cause(env); \
     for (i = 0; i < 2; i++) { \
@@ -2606,14 +2606,14 @@ void HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj)
     *Vd = temp;
 }
 
-void HELPER(vffint_s_l)(CPULoongArchState *env,
-                        uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vffint_s_l)(void *vd, void *vj, void *vk,
+                        CPULoongArchState *env, uint32_t desc)
 {
     int i;
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     vec_clear_cause(env);
     for (i = 0; i < 2; i++) {
diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc
index 71b1b5ef58..4547ba304d 100644
--- a/target/loongarch/insn_trans/trans_vec.c.inc
+++ b/target/loongarch/insn_trans/trans_vec.c.inc
@@ -52,6 +52,24 @@ static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a,
     return gen_vvvv_vl(ctx, a, 16, fn);
 }
 
+static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz,
+                           gen_helper_gvec_3_ptr *fn)
+{
+    tcg_gen_gvec_3_ptr(vec_full_offset(a->vd),
+                       vec_full_offset(a->vj),
+                       vec_full_offset(a->vk),
+                       cpu_env,
+                       oprsz, ctx->vl / 8, 0, fn);
+    return true;
+}
+
+static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a,
+                        gen_helper_gvec_3_ptr *fn)
+{
+    CHECK_SXE;
+    return gen_vvv_ptr_vl(ctx, a, 16, fn);
+}
+
 static bool gen_vvv(DisasContext *ctx, arg_vvv *a,
                     void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32))
 {
@@ -3648,14 +3666,14 @@ TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h)
 TRANS(vfrstpi_b, LSX, gen_vv_i, gen_helper_vfrstpi_b)
 TRANS(vfrstpi_h, LSX, gen_vv_i, gen_helper_vfrstpi_h)
 
-TRANS(vfadd_s, LSX, gen_vvv, gen_helper_vfadd_s)
-TRANS(vfadd_d, LSX, gen_vvv, gen_helper_vfadd_d)
-TRANS(vfsub_s, LSX, gen_vvv, gen_helper_vfsub_s)
-TRANS(vfsub_d, LSX, gen_vvv, gen_helper_vfsub_d)
-TRANS(vfmul_s, LSX, gen_vvv, gen_helper_vfmul_s)
-TRANS(vfmul_d, LSX, gen_vvv, gen_helper_vfmul_d)
-TRANS(vfdiv_s, LSX, gen_vvv, gen_helper_vfdiv_s)
-TRANS(vfdiv_d, LSX, gen_vvv, gen_helper_vfdiv_d)
+TRANS(vfadd_s, LSX, gen_vvv_ptr, gen_helper_vfadd_s)
+TRANS(vfadd_d, LSX, gen_vvv_ptr, gen_helper_vfadd_d)
+TRANS(vfsub_s, LSX, gen_vvv_ptr, gen_helper_vfsub_s)
+TRANS(vfsub_d, LSX, gen_vvv_ptr, gen_helper_vfsub_d)
+TRANS(vfmul_s, LSX, gen_vvv_ptr, gen_helper_vfmul_s)
+TRANS(vfmul_d, LSX, gen_vvv_ptr, gen_helper_vfmul_d)
+TRANS(vfdiv_s, LSX, gen_vvv_ptr, gen_helper_vfdiv_s)
+TRANS(vfdiv_d, LSX, gen_vvv_ptr, gen_helper_vfdiv_d)
 
 TRANS(vfmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfmadd_s)
 TRANS(vfmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfmadd_d)
@@ -3666,15 +3684,15 @@ TRANS(vfnmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_d)
 TRANS(vfnmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_s)
 TRANS(vfnmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_d)
 
-TRANS(vfmax_s, LSX, gen_vvv, gen_helper_vfmax_s)
-TRANS(vfmax_d, LSX, gen_vvv, gen_helper_vfmax_d)
-TRANS(vfmin_s, LSX, gen_vvv, gen_helper_vfmin_s)
-TRANS(vfmin_d, LSX, gen_vvv, gen_helper_vfmin_d)
+TRANS(vfmax_s, LSX, gen_vvv_ptr, gen_helper_vfmax_s)
+TRANS(vfmax_d, LSX, gen_vvv_ptr, gen_helper_vfmax_d)
+TRANS(vfmin_s, LSX, gen_vvv_ptr, gen_helper_vfmin_s)
+TRANS(vfmin_d, LSX, gen_vvv_ptr, gen_helper_vfmin_d)
 
-TRANS(vfmaxa_s, LSX, gen_vvv, gen_helper_vfmaxa_s)
-TRANS(vfmaxa_d, LSX, gen_vvv, gen_helper_vfmaxa_d)
-TRANS(vfmina_s, LSX, gen_vvv, gen_helper_vfmina_s)
-TRANS(vfmina_d, LSX, gen_vvv, gen_helper_vfmina_d)
+TRANS(vfmaxa_s, LSX, gen_vvv_ptr, gen_helper_vfmaxa_s)
+TRANS(vfmaxa_d, LSX, gen_vvv_ptr, gen_helper_vfmaxa_d)
+TRANS(vfmina_s, LSX, gen_vvv_ptr, gen_helper_vfmina_s)
+TRANS(vfmina_d, LSX, gen_vvv_ptr, gen_helper_vfmina_d)
 
 TRANS(vflogb_s, LSX, gen_vv, gen_helper_vflogb_s)
 TRANS(vflogb_d, LSX, gen_vv, gen_helper_vflogb_d)
@@ -3693,8 +3711,8 @@ TRANS(vfcvtl_s_h, LSX, gen_vv, gen_helper_vfcvtl_s_h)
 TRANS(vfcvth_s_h, LSX, gen_vv, gen_helper_vfcvth_s_h)
 TRANS(vfcvtl_d_s, LSX, gen_vv, gen_helper_vfcvtl_d_s)
 TRANS(vfcvth_d_s, LSX, gen_vv, gen_helper_vfcvth_d_s)
-TRANS(vfcvt_h_s, LSX, gen_vvv, gen_helper_vfcvt_h_s)
-TRANS(vfcvt_s_d, LSX, gen_vvv, gen_helper_vfcvt_s_d)
+TRANS(vfcvt_h_s, LSX, gen_vvv_ptr, gen_helper_vfcvt_h_s)
+TRANS(vfcvt_s_d, LSX, gen_vvv_ptr, gen_helper_vfcvt_s_d)
 
 TRANS(vfrintrne_s, LSX, gen_vv, gen_helper_vfrintrne_s)
 TRANS(vfrintrne_d, LSX, gen_vv, gen_helper_vfrintrne_d)
@@ -3721,11 +3739,11 @@ TRANS(vftintrz_wu_s, LSX, gen_vv, gen_helper_vftintrz_wu_s)
 TRANS(vftintrz_lu_d, LSX, gen_vv, gen_helper_vftintrz_lu_d)
 TRANS(vftint_wu_s, LSX, gen_vv, gen_helper_vftint_wu_s)
 TRANS(vftint_lu_d, LSX, gen_vv, gen_helper_vftint_lu_d)
-TRANS(vftintrne_w_d, LSX, gen_vvv, gen_helper_vftintrne_w_d)
-TRANS(vftintrz_w_d, LSX, gen_vvv, gen_helper_vftintrz_w_d)
-TRANS(vftintrp_w_d, LSX, gen_vvv, gen_helper_vftintrp_w_d)
-TRANS(vftintrm_w_d, LSX, gen_vvv, gen_helper_vftintrm_w_d)
-TRANS(vftint_w_d, LSX, gen_vvv, gen_helper_vftint_w_d)
+TRANS(vftintrne_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrne_w_d)
+TRANS(vftintrz_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrz_w_d)
+TRANS(vftintrp_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrp_w_d)
+TRANS(vftintrm_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrm_w_d)
+TRANS(vftint_w_d, LSX, gen_vvv_ptr, gen_helper_vftint_w_d)
 TRANS(vftintrnel_l_s, LSX, gen_vv, gen_helper_vftintrnel_l_s)
 TRANS(vftintrneh_l_s, LSX, gen_vv, gen_helper_vftintrneh_l_s)
 TRANS(vftintrzl_l_s, LSX, gen_vv, gen_helper_vftintrzl_l_s)
@@ -3743,7 +3761,7 @@ TRANS(vffint_s_wu, LSX, gen_vv, gen_helper_vffint_s_wu)
 TRANS(vffint_d_lu, LSX, gen_vv, gen_helper_vffint_d_lu)
 TRANS(vffintl_d_w, LSX, gen_vv, gen_helper_vffintl_d_w)
 TRANS(vffinth_d_w, LSX, gen_vv, gen_helper_vffinth_d_w)
-TRANS(vffint_s_l, LSX, gen_vvv, gen_helper_vffint_s_l)
+TRANS(vffint_s_l, LSX, gen_vvv_ptr, gen_helper_vffint_s_l)
 
 static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond)
 {
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694659010351155.97904243479957; Wed, 13 Sep 2023 19:36:50 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4Z-00039B-F7; Wed, 13 Sep 2023 22:27:15 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4Q-00037v-C4 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:07 -0400
Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4J-0004qC-V6 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:06 -0400
Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx5_FubwJlWPgmAA--.9471S3; Thu, 14 Sep 2023 10:26:54 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S8; Thu, 14 Sep 2023 10:26:53 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 06/57] target/loongarch: Use gen_helper_gvec_3 for 3OP vector instructions
Date: Thu, 14 Sep 2023 10:25:54 +0800
Message-Id: <20230914022645.1151356-7-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S8
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1694659011704100005
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/helper.h                   | 214 +++++-----
 target/loongarch/vec_helper.c               | 444 +++++++++-----------
 target/loongarch/insn_trans/trans_vec.c.inc |  19 +-
 3 files changed, 326 insertions(+), 351 deletions(-)

diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index bcf82597aa..4b681e948f 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -133,22 +133,22 @@ DEF_HELPER_1(idle, void, env)
 #endif
 
 /* LoongArch LSX */
-DEF_HELPER_4(vhaddw_h_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_w_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_d_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_q_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_hu_bu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_wu_hu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_du_wu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhaddw_qu_du, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_h_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_w_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_d_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_q_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_hu_bu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_wu_hu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_du_wu, void, env, i32, i32, i32)
-DEF_HELPER_4(vhsubw_qu_du, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vhaddw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhaddw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vhsubw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(vaddwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vaddwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
@@ -305,22 +305,22 @@ DEF_HELPER_FLAGS_4(vmaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vmaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vmaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
-DEF_HELPER_4(vdiv_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_bu, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_hu, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_wu, void, env, i32, i32, i32)
-DEF_HELPER_4(vdiv_du, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_bu, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_hu, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_wu, void, env, i32, i32, i32)
-DEF_HELPER_4(vmod_du, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vdiv_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vdiv_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vmod_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(vsat_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vsat_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
@@ -363,30 +363,30 @@ DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32)
 DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32)
 DEF_HELPER_3(vextl_qu_du, void, env, i32, i32)
 
-DEF_HELPER_4(vsrlr_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlr_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlr_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlr_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrlr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrlr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_4(vsrlri_b, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrlri_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrlri_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrlri_d, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vsrar_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrar_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrar_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrar_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vsrar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrar_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_4(vsrari_b, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrari_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrari_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrari_d, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vsrln_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrln_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrln_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vsran_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsran_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsran_w_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vsrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_4(vsrlni_b_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrlni_h_w, void, env, i32, i32, i32)
@@ -397,12 +397,12 @@ DEF_HELPER_4(vsrani_h_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrani_w_d, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrani_d_q, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vsrlrn_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlrn_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlrn_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarn_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarn_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarn_w_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vsrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vsrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_4(vsrlrni_b_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrlrni_h_w, void, env, i32, i32, i32)
@@ -413,18 +413,18 @@ DEF_HELPER_4(vsrarni_h_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrarni_w_d, void, env, i32, i32, i32)
 DEF_HELPER_4(vsrarni_d_q, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vssrln_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrln_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrln_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrln_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrln_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrln_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssran_wu_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vssrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrln_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrln_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrln_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssran_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_4(vssrlni_b_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vssrlni_h_w, void, env, i32, i32, i32)
@@ -443,18 +443,18 @@ DEF_HELPER_4(vssrani_hu_w, void, env, i32, i32, i32)
 DEF_HELPER_4(vssrani_wu_d, void, env, i32, i32, i32)
 DEF_HELPER_4(vssrani_du_q, void, env, i32, i32, i32)
 
-DEF_HELPER_4(vssrlrn_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrn_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrn_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrn_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrn_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrn_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarn_wu_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrlrn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vssrarn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_4(vssrlrni_b_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vssrlrni_h_w, void, env, i32, i32, i32)
@@ -514,8 +514,8 @@ DEF_HELPER_FLAGS_4(vbitrevi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vbitrevi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
-DEF_HELPER_4(vfrstp_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vfrstp_h, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vfrstp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vfrstp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32)
 DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32)
 
@@ -655,37 +655,37 @@ DEF_HELPER_3(vsetallnez_h, void, env, i32, i32)
 DEF_HELPER_3(vsetallnez_w, void, env, i32, i32)
 DEF_HELPER_3(vsetallnez_d, void, env, i32, i32)
 
-DEF_HELPER_4(vpackev_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackev_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackev_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackev_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackod_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackod_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackod_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vpackod_d, void, env, i32, i32, i32)
-
-DEF_HELPER_4(vpickev_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickev_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickev_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickev_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickod_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickod_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickod_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vpickod_d, void, env, i32, i32, i32)
-
-DEF_HELPER_4(vilvl_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvl_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvl_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvl_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvh_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvh_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvh_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vilvh_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpackod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(vpickev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vpickod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(vilvl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvl_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvh_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vilvh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_4(vshuf_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vshuf_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vshuf_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vshuf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vshuf_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vshuf_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_4(vshuf4i_b, void, env, i32, i32, i32)
 DEF_HELPER_4(vshuf4i_h, void, env, i32, i32, i32)
 DEF_HELPER_4(vshuf4i_w, void, env, i32, i32, i32)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index eab94a8b76..15b361c6b3 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -17,13 +17,12 @@
 #define DO_SUB(a, b) (a - b)
 
 #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->E1(0)) TD; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
@@ -35,12 +34,11 @@ DO_ODD_EVEN(vhaddw_h_b, 16, H, B, DO_ADD)
 DO_ODD_EVEN(vhaddw_w_h, 32, W, H, DO_ADD)
 DO_ODD_EVEN(vhaddw_d_w, 64, D, W, DO_ADD)
 
-void HELPER(vhaddw_q_d)(CPULoongArchState *env,
-                        uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vhaddw_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     Vd->Q(0) = int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0)));
 }
@@ -49,12 +47,11 @@ DO_ODD_EVEN(vhsubw_h_b, 16, H, B, DO_SUB)
 DO_ODD_EVEN(vhsubw_w_h, 32, W, H, DO_SUB)
 DO_ODD_EVEN(vhsubw_d_w, 64, D, W, DO_SUB)
 
-void HELPER(vhsubw_q_d)(CPULoongArchState *env,
-                        uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vhsubw_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     Vd->Q(0) = int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0)));
 }
@@ -63,12 +60,11 @@ DO_ODD_EVEN(vhaddw_hu_bu, 16, UH, UB, DO_ADD)
 DO_ODD_EVEN(vhaddw_wu_hu, 32, UW, UH, DO_ADD)
 DO_ODD_EVEN(vhaddw_du_wu, 64, UD, UW, DO_ADD)
 
-void HELPER(vhaddw_qu_du)(CPULoongArchState *env,
-                          uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vhaddw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(1)),
                           int128_make64((uint64_t)Vk->D(0)));
@@ -78,12 +74,11 @@ DO_ODD_EVEN(vhsubw_hu_bu, 16, UH, UB, DO_SUB)
 DO_ODD_EVEN(vhsubw_wu_hu, 32, UW, UH, DO_SUB)
 DO_ODD_EVEN(vhsubw_du_wu, 64, UD, UW, DO_SUB)
 
-void HELPER(vhsubw_qu_du)(CPULoongArchState *env,
-                          uint32_t vd, uint32_t vj, uint32_t vk)
+void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
-    VReg *Vk = &(env->fpr[vk].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
+    VReg *Vk = (VReg *)vk;
 
     Vd->Q(0) = int128_sub(int128_make64((uint64_t)Vj->D(1)),
                           int128_make64((uint64_t)Vk->D(0)));
@@ -564,17 +559,16 @@ VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MUL)
 #define DO_REM(N, M) (unlikely(M == 0) ? 0 :\
         unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M)
 
-#define VDIV(NAME, BIT, E, DO_OP) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
-{ \
-    int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)); \
-    } \
+#define VDIV(NAME, BIT, E, DO_OP) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)); \
+    } \
 }
 
 VDIV(vdiv_b, 8, B, DO_DIV)
@@ -854,13 +848,12 @@ do_vsrlr(W, uint32_t)
 do_vsrlr(D, uint64_t)
 
 #define VSRLR(NAME, BIT, T, E) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \
@@ -906,13 +899,12 @@ do_vsrar(W, int32_t)
 do_vsrar(D, int64_t)
 
 #define VSRAR(NAME, BIT, T, E) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         Vd->E(i) = do_vsrar_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \
@@ -945,13 +937,12 @@ VSRARI(vsrari_d, 64, D)
 #define R_SHIFT(a, b) (a >> b)
 
 #define VSRLN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         Vd->E1(i) = R_SHIFT((T)Vj->E2(i),((T)Vk->E2(i)) % BIT); \
@@ -963,19 +954,18 @@ VSRLN(vsrln_b_h, 16, uint16_t, B, H)
 VSRLN(vsrln_h_w, 32, uint32_t, H, W)
 VSRLN(vsrln_w_d, 64, uint64_t, W, D)
 
-#define VSRAN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
-{ \
-    int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E1(i) = R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \
-    } \
-    Vd->D(1) = 0; \
+#define VSRAN(NAME, BIT, T, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
+ \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        Vd->E1(i) = R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \
+    } \
+    Vd->D(1) = 0; \
 }
 
 VSRAN(vsran_b_h, 16, uint16_t, B, H)
@@ -1057,13 +1047,12 @@ VSRANI(vsrani_h_w, 32, H, W)
 VSRANI(vsrani_w_d, 64, W, D)
 
 #define VSRLRN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         Vd->E1(i) = do_vsrlr_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \
@@ -1076,13 +1065,12 @@ VSRLRN(vsrlrn_h_w, 32, uint32_t, H, W)
 VSRLRN(vsrlrn_w_d, 64, uint64_t, W, D)
 
 #define VSRARN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    VReg *Vk = (VReg *)vk; \
 \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         Vd->E1(i) = do_vsrar_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \
@@ -1205,13 +1193,12 @@ SSRLNS(H, uint32_t, int32_t, uint16_t)
 SSRLNS(W, uint64_t, int64_t, uint32_t)
 
 #define VSSRLN(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t vk) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    VReg *Vk = &(env->fpr[vk].vreg); \
+    VReg *Vd = (VReg *)vd; \
+
VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), (T)Vk->E2(i)% BIT, BIT/2= -1); \ @@ -1248,13 +1235,12 @@ SSRANS(H, int32_t, int16_t) SSRANS(W, int64_t, int32_t) =20 #define VSSRAN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 = -1); \ @@ -1289,13 +1275,12 @@ SSRLNU(H, uint32_t, uint16_t, int32_t) SSRLNU(W, uint64_t, uint32_t, int64_t) =20 #define VSSRLNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ @@ -1333,13 +1318,12 @@ SSRANU(H, uint32_t, uint16_t, int32_t) SSRANU(W, uint64_t, uint32_t, int64_t) =20 #define VSSRANU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ \ for 
(i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ @@ -1581,13 +1565,12 @@ SSRLRNS(H, W, uint32_t, int32_t, uint16_t) SSRLRNS(W, D, uint64_t, int64_t, uint32_t) =20 #define VSSRLRN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ @@ -1621,13 +1604,12 @@ SSRARNS(H, W, int32_t, int16_t) SSRARNS(W, D, int64_t, int32_t) =20 #define VSSRARN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ @@ -1660,13 +1642,12 @@ SSRLRNU(H, W, uint32_t, uint16_t, int32_t) SSRLRNU(W, D, uint64_t, uint32_t, int64_t) =20 #define VSSRLRNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < 
LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ @@ -1702,13 +1683,12 @@ SSRARNU(H, W, uint32_t, uint16_t, int32_t) SSRARNU(W, D, uint64_t, uint32_t, int64_t) =20 #define VSSRARNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(CPULoongArchState *env, = \ - uint32_t vd, uint32_t vj, uint32_t vk) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ - VReg *Vk =3D &(env->fpr[vk].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ Vd->E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ @@ -2023,22 +2003,21 @@ DO_BITI(vbitrevi_h, 16, UH, DO_BITREV) DO_BITI(vbitrevi_w, 32, UW, DO_BITREV) DO_BITI(vbitrevi_d, 64, UD, DO_BITREV) =20 -#define VFRSTP(NAME, BIT, MASK, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m =3D Vk->E(0) & MASK; \ - Vd->E(m) =3D i; \ +#define VFRSTP(NAME, BIT, MASK, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + if (Vj->E(i) < 0) { \ + break; \ + } \ + } \ + m =3D Vk->E(0) & MASK; \ + Vd->E(m) =3D i; \ } =20 VFRSTP(vfrstp_b, 8, 0xf, B) @@ -2767,21 +2746,20 @@ SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) SETALLNEZ(vsetallnez_d, MO_64) =20 -#define VPACKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D 
&(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(2 * i); \ - temp.E(2 *i) =3D Vk->E(2 * i); \ - } \ - *Vd =3D temp; \ +#define VPACKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) =3D Vj->E(2 * i); \ + temp.E(2 *i) =3D Vk->E(2 * i); \ + } \ + *Vd =3D temp; \ } =20 VPACKEV(vpackev_b, 16, B) @@ -2789,21 +2767,20 @@ VPACKEV(vpackev_h, 32, H) VPACKEV(vpackev_w, 64, W) VPACKEV(vpackev_d, 128, D) =20 -#define VPACKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(2 * i + 1); \ - temp.E(2 * i) =3D Vk->E(2 * i + 1); \ - } \ - *Vd =3D temp; \ +#define VPACKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) =3D Vj->E(2 * i + 1); \ + temp.E(2 * i) =3D Vk->E(2 * i + 1); \ + } \ + *Vd =3D temp; \ } =20 VPACKOD(vpackod_b, 16, B) @@ -2811,21 +2788,20 @@ VPACKOD(vpackod_h, 32, H) VPACKOD(vpackod_w, 64, W) VPACKOD(vpackod_d, 128, D) =20 -#define VPICKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - 
temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i); \ - temp.E(i) =3D Vk->E(2 * i); \ - } \ - *Vd =3D temp; \ +#define VPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i); \ + temp.E(i) =3D Vk->E(2 * i); \ + } \ + *Vd =3D temp; \ } =20 VPICKEV(vpickev_b, 16, B) @@ -2833,21 +2809,20 @@ VPICKEV(vpickev_h, 32, H) VPICKEV(vpickev_w, 64, W) VPICKEV(vpickev_d, 128, D) =20 -#define VPICKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i + 1); \ - temp.E(i) =3D Vk->E(2 * i + 1); \ - } \ - *Vd =3D temp; \ +#define VPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i + 1); \ + temp.E(i) =3D Vk->E(2 * i + 1); \ + } \ + *Vd =3D temp; \ } =20 VPICKOD(vpickod_b, 16, B) @@ -2855,21 +2830,20 @@ VPICKOD(vpickod_h, 32, H) VPICKOD(vpickod_w, 64, W) VPICKOD(vpickod_d, 128, D) =20 -#define VILVL(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i); \ - temp.E(2 * i) =3D Vk->E(i); \ - } \ - *Vd =3D temp; \ +#define VILVL(NAME, BIT, E) \ +void 
HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) =3D Vj->E(i); \ + temp.E(2 * i) =3D Vk->E(i); \ + } \ + *Vd =3D temp; \ } =20 VILVL(vilvl_b, 16, B) @@ -2877,21 +2851,20 @@ VILVL(vilvl_h, 32, H) VILVL(vilvl_w, 64, W) VILVL(vilvl_d, 128, D) =20 -#define VILVH(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i + LSX_LEN/BIT); \ - temp.E(2 * i) =3D Vk->E(i + LSX_LEN/BIT); \ - } \ - *Vd =3D temp; \ +#define VILVH(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) =3D Vj->E(i + LSX_LEN/BIT); \ + temp.E(2 * i) =3D Vk->E(i + LSX_LEN/BIT); \ + } \ + *Vd =3D temp; \ } =20 VILVH(vilvh_b, 16, B) @@ -2916,22 +2889,21 @@ void HELPER(vshuf_b)(void *vd, void *vj, void *vk, = void *va, uint32_t desc) *Vd =3D temp; } =20 -#define VSHUF(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - VReg *Vk =3D &(env->fpr[vk].vreg); \ - \ - m =3D LSX_LEN/BIT; \ - for (i =3D 0; i < m; i++) { \ - uint64_t k =3D ((uint8_t) Vd->E(i)) % (2 * m); \ - temp.E(i) =3D k < m ? 
Vk->E(k) : Vj->E(k - m); \ - } \ - *Vd =3D temp; \ +#define VSHUF(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + \ + m =3D LSX_LEN/BIT; \ + for (i =3D 0; i < m; i++) { \ + uint64_t k =3D ((uint8_t) Vd->E(i)) % (2 * m); \ + temp.E(i) =3D k < m ? Vk->E(k) : Vj->E(k - m); \ + } \ + *Vd =3D temp; \ } =20 VSHUF(vshuf_h, 16, H) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 4547ba304d..9f67774fc9 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -70,17 +70,20 @@ static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, return gen_vvv_ptr_vl(ctx, a, 16, fn); } =20 -static bool gen_vvv(DisasContext *ctx, arg_vvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, + gen_helper_gvec_3 *fn) { - TCGv_i32 vd =3D tcg_constant_i32(a->vd); - TCGv_i32 vj =3D tcg_constant_i32(a->vj); - TCGv_i32 vk =3D tcg_constant_i32(a->vk); + tcg_gen_gvec_3_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + oprsz, ctx->vl / 8, 0, fn); + return true; +} =20 +static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn) +{ CHECK_SXE; - - func(cpu_env, vd, vj, vk); - return true; + return gen_vvv_vl(ctx, a, 16, fn); } =20 static bool gen_vv(DisasContext *ctx, arg_vv *a, --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694659021005705.1026055600597; Wed, 13 Sep 2023 19:37:01 
-0700 (PDT) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 07/57] target/loongarch: Use gen_helper_gvec_2_ptr for 2OP + env vector instructions Date: Thu, 14 Sep 2023 10:25:55 +0800 Message-Id: <20230914022645.1151356-8-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn;
helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659022191100003 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 118 +++++++------- target/loongarch/vec_helper.c | 161 +++++++++++--------- target/loongarch/insn_trans/trans_vec.c.inc | 129 +++++++++------- 3 files changed, 219 insertions(+), 189 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 4b681e948f..0752cc7212 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -547,73 +547,73 @@ DEF_HELPER_FLAGS_5(vfmaxa_d, TCG_CALL_NO_RWG, void, p= tr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfmina_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) DEF_HELPER_FLAGS_5(vfmina_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i3= 2) =20 -DEF_HELPER_3(vflogb_s, void, env, i32, i32) -DEF_HELPER_3(vflogb_d, void, env, i32, i32) - -DEF_HELPER_3(vfclass_s, void, env, i32, i32) -DEF_HELPER_3(vfclass_d, void, env, i32, i32) - -DEF_HELPER_3(vfsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfsqrt_d, void, env, i32, i32) -DEF_HELPER_3(vfrecip_s, void, env, i32, i32) -DEF_HELPER_3(vfrecip_d, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_d, void, env, i32, i32) - -DEF_HELPER_3(vfcvtl_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvth_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvtl_d_s, void, env, i32, i32) -DEF_HELPER_3(vfcvth_d_s, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vflogb_s, TCG_CALL_NO_RWG, void, 
ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vflogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfclass_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfclass_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfcvtl_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvtl_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfcvt_h_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i= 32) DEF_HELPER_FLAGS_5(vfcvt_s_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i= 32) =20 -DEF_HELPER_3(vfrintrne_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrne_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_d, void, env, i32, i32) -DEF_HELPER_3(vfrint_s, void, env, i32, i32) -DEF_HELPER_3(vfrint_d, void, env, i32, i32) - -DEF_HELPER_3(vftintrne_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrne_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrz_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrp_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrp_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrm_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrm_l_d, 
void, env, i32, i32) -DEF_HELPER_3(vftint_w_s, void, env, i32, i32) -DEF_HELPER_3(vftint_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_wu_s, void, env, i32, i32) -DEF_HELPER_3(vftintrz_lu_d, void, env, i32, i32) -DEF_HELPER_3(vftint_wu_s, void, env, i32, i32) -DEF_HELPER_3(vftint_lu_d, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vfrintrne_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrne_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vftintrne_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrne_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrz_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrz_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) 
+DEF_HELPER_FLAGS_4(vftint_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vftintrne_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, en= v, i32) DEF_HELPER_FLAGS_5(vftintrz_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) DEF_HELPER_FLAGS_5(vftintrp_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) DEF_HELPER_FLAGS_5(vftintrm_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env= , i32) DEF_HELPER_FLAGS_5(vftint_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, = i32) -DEF_HELPER_3(vftintrnel_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrneh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrpl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrph_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrml_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrmh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftinth_l_s, void, env, i32, i32) - -DEF_HELPER_3(vffint_s_w, void, env, i32, i32) -DEF_HELPER_3(vffint_d_l, void, env, i32, i32) -DEF_HELPER_3(vffint_s_wu, void, env, i32, i32) -DEF_HELPER_3(vffint_d_lu, void, env, i32, i32) -DEF_HELPER_3(vffintl_d_w, void, env, i32, i32) -DEF_HELPER_3(vffinth_d_w, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vftintrnel_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i= 32) +DEF_HELPER_FLAGS_4(vftintrneh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i= 32) +DEF_HELPER_FLAGS_4(vftintrzl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrzh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrpl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrph_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrml_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) +DEF_HELPER_FLAGS_4(vftintrmh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i3= 2) 
+DEF_HELPER_FLAGS_4(vftintl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftinth_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vffint_s_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_l, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_s_wu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_lu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffintl_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffinth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vffint_s_l, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, = i32) =20 DEF_HELPER_FLAGS_4(vseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 15b361c6b3..2898ae06ce 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2135,17 +2135,18 @@ DO_4OP_F(vfnmsub_s, 32, UW, float32_muladd, DO_4OP_F(vfnmsub_d, 64, UD, float64_muladd, float_muladd_negate_c | float_muladd_negate_result) =20 -#define DO_2OP_F(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D FN(env, Vj->E(i)); \ - } \ +#define DO_2OP_F(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) =3D FN(env, Vj->E(i)); \ + } \ } =20 #define FLOGB(BIT, T) \ @@ -2166,16 +2167,17 @@ static T do_flogb_## BIT(CPULoongArchState *env, T = fj) \ FLOGB(32, uint32_t) FLOGB(64, uint64_t) =20 -#define FCLASS(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t 
vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D FN(env, Vj->E(i)); \ - } \ +#define FCLASS(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) =3D FN(env, Vj->E(i)); \ + } \ } =20 FCLASS(vfclass_s, 32, UW, helper_fclass_s) @@ -2245,12 +2247,13 @@ static uint32_t float64_cvt_float32(uint64_t d, flo= at_status *status) return float64_to_float32(d, status); } =20 -void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < LSX_LEN/32; i++) { @@ -2260,12 +2263,13 @@ void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uin= t32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_d_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < LSX_LEN/64; i++) { @@ -2275,12 +2279,13 @@ void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uin= t32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vfcvth_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < 
LSX_LEN/32; i++) { @@ -2290,12 +2295,13 @@ void HELPER(vfcvth_s_h)(CPULoongArchState *env, uin= t32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_d_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < LSX_LEN/64; i++) { @@ -2341,11 +2347,12 @@ void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, *Vd =3D temp; } =20 -void HELPER(vfrint_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfrint_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < 4; i++) { @@ -2354,11 +2361,12 @@ void HELPER(vfrint_s)(CPULoongArchState *env, uint3= 2_t vd, uint32_t vj) } } =20 -void HELPER(vfrint_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfrint_d)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < 2; i++) { @@ -2368,11 +2376,12 @@ void HELPER(vfrint_d)(CPULoongArchState *env, uint3= 2_t vd, uint32_t vj) } =20 #define FCVT_2OP(NAME, BIT, E, MODE) = \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) = \ +void HELPER(NAME)(void *vd, void *vj, = \ + CPULoongArchState *env, uint32_t desc) = \ { = \ int i; = \ - VReg *Vd =3D &(env->fpr[vd].vreg); = \ - VReg *Vj =3D &(env->fpr[vj].vreg); = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ = \ vec_clear_cause(env); = \ for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ @@ -2493,19 +2502,20 @@ FTINT(rph_l_s, float32, 
int64, uint32_t, uint64_t, = float_round_up) FTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) FTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_ev= en) =20 -#define FTINTL_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.D(i) =3D FN(env, Vj->UW(i)); \ - } \ - *Vd =3D temp; \ +#define FTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i =3D 0; i < 2; i++) { \ + temp.D(i) =3D FN(env, Vj->UW(i)); \ + } \ + *Vd =3D temp; \ } =20 FTINTL_L_S(vftintl_l_s, do_float32_to_int64) @@ -2514,19 +2524,20 @@ FTINTL_L_S(vftintrpl_l_s, do_ftintrpl_l_s) FTINTL_L_S(vftintrzl_l_s, do_ftintrzl_l_s) FTINTL_L_S(vftintrnel_l_s, do_ftintrnel_l_s) =20 -#define FTINTH_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.D(i) =3D FN(env, Vj->UW(i + 2)); \ - } \ - *Vd =3D temp; \ +#define FTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i =3D 0; i < 2; i++) { \ + temp.D(i) =3D FN(env, Vj->UW(i + 2)); \ + } \ + *Vd =3D temp; \ } =20 FTINTH_L_S(vftinth_l_s, do_float32_to_int64) @@ -2555,12 +2566,13 @@ DO_2OP_F(vffint_d_l, 64, D, do_ffint_d_l) DO_2OP_F(vffint_s_wu, 32, UW, do_ffint_s_wu) DO_2OP_F(vffint_d_lu, 64, UD, do_ffint_d_lu) =20 -void 
HELPER(vffintl_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffintl_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < 2; i++) { @@ -2570,12 +2582,13 @@ void HELPER(vffintl_d_w)(CPULoongArchState *env, ui= nt32_t vd, uint32_t vj) *Vd =3D temp; } =20 -void HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffinth_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 vec_clear_cause(env); for (i =3D 0; i < 2; i++) { diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 9f67774fc9..06aa037e3b 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -86,6 +86,23 @@ static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_h= elper_gvec_3 *fn) return gen_vvv_vl(ctx, a, 16, fn); } =20 +static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, + gen_helper_gvec_2_ptr *fn) +{ + tcg_gen_gvec_2_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + cpu_env, + oprsz, ctx->vl / 8, 0, fn); + return true; +} + +static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, + gen_helper_gvec_2_ptr *fn) +{ + CHECK_SXE; + return gen_vv_ptr_vl(ctx, a, 16, fn); +} + static bool gen_vv(DisasContext *ctx, arg_vv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) { @@ -3697,73 +3714,73 @@ TRANS(vfmaxa_d, LSX, gen_vvv_ptr, gen_helper_vfmaxa= _d) TRANS(vfmina_s, LSX, gen_vvv_ptr, gen_helper_vfmina_s) TRANS(vfmina_d, LSX, gen_vvv_ptr, gen_helper_vfmina_d) =20 -TRANS(vflogb_s, LSX, gen_vv, gen_helper_vflogb_s) -TRANS(vflogb_d, LSX, gen_vv, gen_helper_vflogb_d) 
+TRANS(vflogb_s, LSX, gen_vv_ptr, gen_helper_vflogb_s) +TRANS(vflogb_d, LSX, gen_vv_ptr, gen_helper_vflogb_d) =20 -TRANS(vfclass_s, LSX, gen_vv, gen_helper_vfclass_s) -TRANS(vfclass_d, LSX, gen_vv, gen_helper_vfclass_d) +TRANS(vfclass_s, LSX, gen_vv_ptr, gen_helper_vfclass_s) +TRANS(vfclass_d, LSX, gen_vv_ptr, gen_helper_vfclass_d) =20 -TRANS(vfsqrt_s, LSX, gen_vv, gen_helper_vfsqrt_s) -TRANS(vfsqrt_d, LSX, gen_vv, gen_helper_vfsqrt_d) -TRANS(vfrecip_s, LSX, gen_vv, gen_helper_vfrecip_s) -TRANS(vfrecip_d, LSX, gen_vv, gen_helper_vfrecip_d) -TRANS(vfrsqrt_s, LSX, gen_vv, gen_helper_vfrsqrt_s) -TRANS(vfrsqrt_d, LSX, gen_vv, gen_helper_vfrsqrt_d) +TRANS(vfsqrt_s, LSX, gen_vv_ptr, gen_helper_vfsqrt_s) +TRANS(vfsqrt_d, LSX, gen_vv_ptr, gen_helper_vfsqrt_d) +TRANS(vfrecip_s, LSX, gen_vv_ptr, gen_helper_vfrecip_s) +TRANS(vfrecip_d, LSX, gen_vv_ptr, gen_helper_vfrecip_d) +TRANS(vfrsqrt_s, LSX, gen_vv_ptr, gen_helper_vfrsqrt_s) +TRANS(vfrsqrt_d, LSX, gen_vv_ptr, gen_helper_vfrsqrt_d) =20 -TRANS(vfcvtl_s_h, LSX, gen_vv, gen_helper_vfcvtl_s_h) -TRANS(vfcvth_s_h, LSX, gen_vv, gen_helper_vfcvth_s_h) -TRANS(vfcvtl_d_s, LSX, gen_vv, gen_helper_vfcvtl_d_s) -TRANS(vfcvth_d_s, LSX, gen_vv, gen_helper_vfcvth_d_s) +TRANS(vfcvtl_s_h, LSX, gen_vv_ptr, gen_helper_vfcvtl_s_h) +TRANS(vfcvth_s_h, LSX, gen_vv_ptr, gen_helper_vfcvth_s_h) +TRANS(vfcvtl_d_s, LSX, gen_vv_ptr, gen_helper_vfcvtl_d_s) +TRANS(vfcvth_d_s, LSX, gen_vv_ptr, gen_helper_vfcvth_d_s) TRANS(vfcvt_h_s, LSX, gen_vvv_ptr, gen_helper_vfcvt_h_s) TRANS(vfcvt_s_d, LSX, gen_vvv_ptr, gen_helper_vfcvt_s_d) =20 -TRANS(vfrintrne_s, LSX, gen_vv, gen_helper_vfrintrne_s) -TRANS(vfrintrne_d, LSX, gen_vv, gen_helper_vfrintrne_d) -TRANS(vfrintrz_s, LSX, gen_vv, gen_helper_vfrintrz_s) -TRANS(vfrintrz_d, LSX, gen_vv, gen_helper_vfrintrz_d) -TRANS(vfrintrp_s, LSX, gen_vv, gen_helper_vfrintrp_s) -TRANS(vfrintrp_d, LSX, gen_vv, gen_helper_vfrintrp_d) -TRANS(vfrintrm_s, LSX, gen_vv, gen_helper_vfrintrm_s) -TRANS(vfrintrm_d, LSX, gen_vv, 
gen_helper_vfrintrm_d) -TRANS(vfrint_s, LSX, gen_vv, gen_helper_vfrint_s) -TRANS(vfrint_d, LSX, gen_vv, gen_helper_vfrint_d) - -TRANS(vftintrne_w_s, LSX, gen_vv, gen_helper_vftintrne_w_s) -TRANS(vftintrne_l_d, LSX, gen_vv, gen_helper_vftintrne_l_d) -TRANS(vftintrz_w_s, LSX, gen_vv, gen_helper_vftintrz_w_s) -TRANS(vftintrz_l_d, LSX, gen_vv, gen_helper_vftintrz_l_d) -TRANS(vftintrp_w_s, LSX, gen_vv, gen_helper_vftintrp_w_s) -TRANS(vftintrp_l_d, LSX, gen_vv, gen_helper_vftintrp_l_d) -TRANS(vftintrm_w_s, LSX, gen_vv, gen_helper_vftintrm_w_s) -TRANS(vftintrm_l_d, LSX, gen_vv, gen_helper_vftintrm_l_d) -TRANS(vftint_w_s, LSX, gen_vv, gen_helper_vftint_w_s) -TRANS(vftint_l_d, LSX, gen_vv, gen_helper_vftint_l_d) -TRANS(vftintrz_wu_s, LSX, gen_vv, gen_helper_vftintrz_wu_s) -TRANS(vftintrz_lu_d, LSX, gen_vv, gen_helper_vftintrz_lu_d) -TRANS(vftint_wu_s, LSX, gen_vv, gen_helper_vftint_wu_s) -TRANS(vftint_lu_d, LSX, gen_vv, gen_helper_vftint_lu_d) +TRANS(vfrintrne_s, LSX, gen_vv_ptr, gen_helper_vfrintrne_s) +TRANS(vfrintrne_d, LSX, gen_vv_ptr, gen_helper_vfrintrne_d) +TRANS(vfrintrz_s, LSX, gen_vv_ptr, gen_helper_vfrintrz_s) +TRANS(vfrintrz_d, LSX, gen_vv_ptr, gen_helper_vfrintrz_d) +TRANS(vfrintrp_s, LSX, gen_vv_ptr, gen_helper_vfrintrp_s) +TRANS(vfrintrp_d, LSX, gen_vv_ptr, gen_helper_vfrintrp_d) +TRANS(vfrintrm_s, LSX, gen_vv_ptr, gen_helper_vfrintrm_s) +TRANS(vfrintrm_d, LSX, gen_vv_ptr, gen_helper_vfrintrm_d) +TRANS(vfrint_s, LSX, gen_vv_ptr, gen_helper_vfrint_s) +TRANS(vfrint_d, LSX, gen_vv_ptr, gen_helper_vfrint_d) + +TRANS(vftintrne_w_s, LSX, gen_vv_ptr, gen_helper_vftintrne_w_s) +TRANS(vftintrne_l_d, LSX, gen_vv_ptr, gen_helper_vftintrne_l_d) +TRANS(vftintrz_w_s, LSX, gen_vv_ptr, gen_helper_vftintrz_w_s) +TRANS(vftintrz_l_d, LSX, gen_vv_ptr, gen_helper_vftintrz_l_d) +TRANS(vftintrp_w_s, LSX, gen_vv_ptr, gen_helper_vftintrp_w_s) +TRANS(vftintrp_l_d, LSX, gen_vv_ptr, gen_helper_vftintrp_l_d) +TRANS(vftintrm_w_s, LSX, gen_vv_ptr, gen_helper_vftintrm_w_s) 
+TRANS(vftintrm_l_d, LSX, gen_vv_ptr, gen_helper_vftintrm_l_d) +TRANS(vftint_w_s, LSX, gen_vv_ptr, gen_helper_vftint_w_s) +TRANS(vftint_l_d, LSX, gen_vv_ptr, gen_helper_vftint_l_d) +TRANS(vftintrz_wu_s, LSX, gen_vv_ptr, gen_helper_vftintrz_wu_s) +TRANS(vftintrz_lu_d, LSX, gen_vv_ptr, gen_helper_vftintrz_lu_d) +TRANS(vftint_wu_s, LSX, gen_vv_ptr, gen_helper_vftint_wu_s) +TRANS(vftint_lu_d, LSX, gen_vv_ptr, gen_helper_vftint_lu_d) TRANS(vftintrne_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrne_w_d) TRANS(vftintrz_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrz_w_d) TRANS(vftintrp_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrp_w_d) TRANS(vftintrm_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrm_w_d) TRANS(vftint_w_d, LSX, gen_vvv_ptr, gen_helper_vftint_w_d) -TRANS(vftintrnel_l_s, LSX, gen_vv, gen_helper_vftintrnel_l_s) -TRANS(vftintrneh_l_s, LSX, gen_vv, gen_helper_vftintrneh_l_s) -TRANS(vftintrzl_l_s, LSX, gen_vv, gen_helper_vftintrzl_l_s) -TRANS(vftintrzh_l_s, LSX, gen_vv, gen_helper_vftintrzh_l_s) -TRANS(vftintrpl_l_s, LSX, gen_vv, gen_helper_vftintrpl_l_s) -TRANS(vftintrph_l_s, LSX, gen_vv, gen_helper_vftintrph_l_s) -TRANS(vftintrml_l_s, LSX, gen_vv, gen_helper_vftintrml_l_s) -TRANS(vftintrmh_l_s, LSX, gen_vv, gen_helper_vftintrmh_l_s) -TRANS(vftintl_l_s, LSX, gen_vv, gen_helper_vftintl_l_s) -TRANS(vftinth_l_s, LSX, gen_vv, gen_helper_vftinth_l_s) - -TRANS(vffint_s_w, LSX, gen_vv, gen_helper_vffint_s_w) -TRANS(vffint_d_l, LSX, gen_vv, gen_helper_vffint_d_l) -TRANS(vffint_s_wu, LSX, gen_vv, gen_helper_vffint_s_wu) -TRANS(vffint_d_lu, LSX, gen_vv, gen_helper_vffint_d_lu) -TRANS(vffintl_d_w, LSX, gen_vv, gen_helper_vffintl_d_w) -TRANS(vffinth_d_w, LSX, gen_vv, gen_helper_vffinth_d_w) +TRANS(vftintrnel_l_s, LSX, gen_vv_ptr, gen_helper_vftintrnel_l_s) +TRANS(vftintrneh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrneh_l_s) +TRANS(vftintrzl_l_s, LSX, gen_vv_ptr, gen_helper_vftintrzl_l_s) +TRANS(vftintrzh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrzh_l_s) +TRANS(vftintrpl_l_s, LSX, gen_vv_ptr, 
gen_helper_vftintrpl_l_s) +TRANS(vftintrph_l_s, LSX, gen_vv_ptr, gen_helper_vftintrph_l_s) +TRANS(vftintrml_l_s, LSX, gen_vv_ptr, gen_helper_vftintrml_l_s) +TRANS(vftintrmh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrmh_l_s) +TRANS(vftintl_l_s, LSX, gen_vv_ptr, gen_helper_vftintl_l_s) +TRANS(vftinth_l_s, LSX, gen_vv_ptr, gen_helper_vftinth_l_s) + +TRANS(vffint_s_w, LSX, gen_vv_ptr, gen_helper_vffint_s_w) +TRANS(vffint_d_l, LSX, gen_vv_ptr, gen_helper_vffint_d_l) +TRANS(vffint_s_wu, LSX, gen_vv_ptr, gen_helper_vffint_s_wu) +TRANS(vffint_d_lu, LSX, gen_vv_ptr, gen_helper_vffint_d_lu) +TRANS(vffintl_d_w, LSX, gen_vv_ptr, gen_helper_vffintl_d_w) +TRANS(vffinth_d_w, LSX, gen_vv_ptr, gen_helper_vffinth_d_w) TRANS(vffint_s_l, LSX, gen_vvv_ptr, gen_helper_vffint_s_l) =20 static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) --=20 2.39.1
From nobody Wed May 15 16:20:20 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 08/57] target/loongarch: Use gen_helper_gvec_2 for 2OP vector instructions Date: Thu, 14 Sep 2023 10:25:56 +0800 Message-Id: <20230914022645.1151356-9-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 58 ++++----- 
target/loongarch/vec_helper.c | 124 ++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 16 ++- 3 files changed, 101 insertions(+), 97 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 0752cc7212..523591035d 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -331,37 +331,37 @@ DEF_HELPER_FLAGS_4(vsat_hu, TCG_CALL_NO_RWG, void, pt= r, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_3(vexth_h_b, void, env, i32, i32) -DEF_HELPER_3(vexth_w_h, void, env, i32, i32) -DEF_HELPER_3(vexth_d_w, void, env, i32, i32) -DEF_HELPER_3(vexth_q_d, void, env, i32, i32) -DEF_HELPER_3(vexth_hu_bu, void, env, i32, i32) -DEF_HELPER_3(vexth_wu_hu, void, env, i32, i32) -DEF_HELPER_3(vexth_du_wu, void, env, i32, i32) -DEF_HELPER_3(vexth_qu_du, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vexth_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 -DEF_HELPER_3(vmskltz_b, void, env, i32, i32) -DEF_HELPER_3(vmskltz_h, void, env, i32, i32) -DEF_HELPER_3(vmskltz_w, void, env, i32, i32) -DEF_HELPER_3(vmskltz_d, void, env, i32, i32) 
-DEF_HELPER_3(vmskgez_b, void, env, i32, i32) -DEF_HELPER_3(vmsknz_b, void, env, i32,i32) +DEF_HELPER_FLAGS_3(vmskltz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskgez_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmsknz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_4(vsllwil_h_b, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_w_h, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_d_w, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_q_d, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vextl_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_4(vsllwil_hu_bu, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_qu_du, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vextl_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -473,19 +473,19 @@ DEF_HELPER_4(vssrarni_hu_w, void, env, i32, i32, i32) DEF_HELPER_4(vssrarni_wu_d, void, env, i32, i32, i32) DEF_HELPER_4(vssrarni_du_q, void, env, i32, i32, i32) =20 -DEF_HELPER_3(vclo_b, void, env, i32, i32) -DEF_HELPER_3(vclo_h, void, env, i32, i32) -DEF_HELPER_3(vclo_w, void, env, i32, i32) -DEF_HELPER_3(vclo_d, void, env, i32, i32) -DEF_HELPER_3(vclz_b, void, env, i32, i32) -DEF_HELPER_3(vclz_h, void, env, i32, i32) -DEF_HELPER_3(vclz_w, void, env, i32, i32) -DEF_HELPER_3(vclz_d, void, env, i32, i32) - -DEF_HELPER_3(vpcnt_b, void, env, i32, i32) -DEF_HELPER_3(vpcnt_h, void, env, i32, i32) -DEF_HELPER_3(vpcnt_w, void, env, i32, i32) -DEF_HELPER_3(vpcnt_d, void, env, i32, i32) 
+DEF_HELPER_FLAGS_3(vclo_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(vpcnt_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vbitclr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vbitclr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 2898ae06ce..fd38b47c28 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -625,30 +625,30 @@ VSAT_U(vsat_hu, 16, UH) VSAT_U(vsat_wu, 32, UW) VSAT_U(vsat_du, 64, UD) =20 -#define VEXTH(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ - } \ +#define VEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ + } \ } =20 -void HELPER(vexth_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D 
&(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_makes64(Vj->D(1)); } =20 -void HELPER(vexth_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_make64((uint64_t)Vj->D(1)); } @@ -677,11 +677,11 @@ static uint64_t do_vmskltz_b(int64_t val) return c >> 56; } =20 -void HELPER(vmskltz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_b(Vj->D(0)); temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); @@ -698,11 +698,11 @@ static uint64_t do_vmskltz_h(int64_t val) return c >> 60; } =20 -void HELPER(vmskltz_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_h(Vj->D(0)); temp |=3D (do_vmskltz_h(Vj->D(1)) << 4); @@ -718,11 +718,11 @@ static uint64_t do_vmskltz_w(int64_t val) return c >> 62; } =20 -void HELPER(vmskltz_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_w(Vj->D(0)); temp |=3D (do_vmskltz_w(Vj->D(1)) << 2); @@ -734,11 +734,11 @@ static uint64_t do_vmskltz_d(int64_t val) { return (uint64_t)val >> 63; } -void HELPER(vmskltz_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_d)(void *vd, void 
*vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_d(Vj->D(0)); temp |=3D (do_vmskltz_d(Vj->D(1)) << 1); @@ -746,11 +746,11 @@ void HELPER(vmskltz_d)(CPULoongArchState *env, uint32= _t vd, uint32_t vj) Vd->D(1) =3D 0; } =20 -void HELPER(vmskgez_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskgez_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskltz_b(Vj->D(0)); temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); @@ -768,11 +768,11 @@ static uint64_t do_vmskez_b(uint64_t a) return c >> 56; } =20 -void HELPER(vmsknz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp =3D 0; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp =3D do_vmskez_b(Vj->D(0)); temp |=3D (do_vmskez_b(Vj->D(1)) << 8); @@ -809,18 +809,18 @@ void HELPER(NAME)(CPULoongArchState *env, = \ *Vd =3D temp; \ } =20 -void HELPER(vextl_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_makes64(Vj->D(0)); } =20 -void HELPER(vextl_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 Vd->Q(0) =3D int128_make64(Vj->D(0)); } @@ -1899,17 +1899,17 @@ VSSRARNUI(vssrarni_bu_h, 16, B, H) VSSRARNUI(vssrarni_hu_w, 32, H, W) 
VSSRARNUI(vssrarni_wu_d, 64, W, D) =20 -#define DO_2OP(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) =3D DO_OP(Vj->E(i)); \ - } \ +#define DO_2OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) \ + { \ + Vd->E(i) =3D DO_OP(Vj->E(i)); \ + } \ } =20 #define DO_CLO_B(N) (clz32(~N & 0xff) - 24) @@ -1930,17 +1930,17 @@ DO_2OP(vclz_h, 16, UH, DO_CLZ_H) DO_2OP(vclz_w, 32, UW, DO_CLZ_W) DO_2OP(vclz_d, 64, UD, DO_CLZ_D) =20 -#define VPCNT(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) =3D FN(Vj->E(i)); \ - } \ +#define VPCNT(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) \ + { \ + Vd->E(i) =3D FN(Vj->E(i)); \ + } \ } =20 VPCNT(vpcnt_b, 8, UB, ctpop8) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 06aa037e3b..e78c6bba0a 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -103,15 +103,19 @@ static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, return gen_vv_ptr_vl(ctx, a, 16, fn); } =20 -static bool gen_vv(DisasContext *ctx, arg_vv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, + gen_helper_gvec_2 *fn) { - TCGv_i32 vd =3D tcg_constant_i32(a->vd); - TCGv_i32 vj =3D tcg_constant_i32(a->vj); + 
tcg_gen_gvec_2_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + oprsz, ctx->vl / 8, 0, fn); + return true; +} =20 +static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) +{ CHECK_SXE; - func(cpu_env, vd, vj); - return true; + return gen_vv_vl(ctx, a, 16, fn); } =20 static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, --=20 2.39.1
From nobody Wed May 15 16:20:20 2024 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 09/57] target/loongarch: Use gen_helper_gvec_2i for 2OP + imm vector instructions Date: Thu, 14 Sep 2023 10:25:57 +0800 Message-Id: <20230914022645.1151356-10-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 146 +++---- target/loongarch/vec_helper.c | 445 +++++++++----------- target/loongarch/insn_trans/trans_vec.c.inc | 18 +- 3 files changed, 291 insertions(+), 318 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 523591035d..1abd9e1410 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -354,32 +354,32 @@ DEF_HELPER_FLAGS_3(vmsknz_b, TCG_CALL_NO_RWG, void, p= tr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 
-DEF_HELPER_4(vsllwil_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_d_w, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsllwil_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_3(vextl_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) -DEF_HELPER_4(vsllwil_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsllwil_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vsllwil_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) +DEF_HELPER_FLAGS_4(vsllwil_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i3= 2) DEF_HELPER_FLAGS_3(vextl_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vsrlri_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(vsrar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
-DEF_HELPER_4(vsrari_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrari_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(vsrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -388,14 +388,14 @@ DEF_HELPER_FLAGS_4(vsran_b_h, TCG_CALL_NO_RWG, void, = ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 -DEF_HELPER_4(vsrlni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_d_q, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(vsrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
 DEF_HELPER_FLAGS_4(vsrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
@@ -404,14 +404,14 @@ DEF_HELPER_FLAGS_4(vsrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vsrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vsrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
-DEF_HELPER_4(vsrlrni_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlrni_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlrni_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrlrni_d_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarni_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarni_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarni_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vsrarni_d_q, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vsrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vsrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_4(vssrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
@@ -426,22 +426,22 @@ DEF_HELPER_FLAGS_4(vssran_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssran_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssran_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
-DEF_HELPER_4(vssrlni_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_d_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_d_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlni_du_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrani_du_q, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vssrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrani_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_4(vssrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
@@ -456,22 +456,22 @@ DEF_HELPER_FLAGS_4(vssrarn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssrarn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vssrarn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
-DEF_HELPER_4(vssrlrni_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_d_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_b_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_h_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_w_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_d_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrlrni_du_q, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_bu_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_hu_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_wu_d, void, env, i32, i32, i32)
-DEF_HELPER_4(vssrarni_du_q, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrlrni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vssrarni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_3(vclo_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(vclo_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
@@ -516,8 +516,8 @@ DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_4(vfrstp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vfrstp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
-DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vfrstpi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vfrstpi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
 DEF_HELPER_FLAGS_5(vfadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_FLAGS_5(vfadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
@@ -686,14 +686,14 @@ DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vshuf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vshuf_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(vshuf_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
-DEF_HELPER_4(vshuf4i_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vshuf4i_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vshuf4i_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vshuf4i_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vshuf4i_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
-DEF_HELPER_4(vpermi_w, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 
-DEF_HELPER_4(vextrins_b, void, env, i32, i32, i32)
-DEF_HELPER_4(vextrins_h, void, env, i32, i32, i32)
-DEF_HELPER_4(vextrins_w, void, env, i32, i32, i32)
-DEF_HELPER_4(vextrins_d, void, env, i32, i32, i32)
+DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vextrins_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vextrins_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index fd38b47c28..4e10957b90 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -791,22 +791,21 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v)
     }
 }
 
-#define VSLLWIL(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i; \
-    VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
-    typedef __typeof(temp.E1(0)) TD; \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \
-    } \
-    *Vd = temp; \
+#define VSLLWIL(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i; \
+    VReg temp; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    typedef __typeof(temp.E1(0)) TD; \
+ \
+    temp.D(0) = 0; \
+    temp.D(1) = 0; \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \
+    } \
+    *Vd = temp; \
 }
 
 void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc)
@@ -865,17 +864,16 @@ VSRLR(vsrlr_h, 16, uint16_t, H)
 VSRLR(vsrlr_w, 32, uint32_t, W)
 VSRLR(vsrlr_d, 64, uint64_t, D)
 
-#define VSRLRI(NAME, BIT, E) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), imm); \
-    } \
+#define VSRLRI(NAME, BIT, E) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), imm); \
+    } \
 }
 
 VSRLRI(vsrlri_b, 8, B)
@@ -916,17 +914,16 @@ VSRAR(vsrar_h, 16, uint16_t, H)
 VSRAR(vsrar_w, 32, uint32_t, W)
 VSRAR(vsrar_d, 64, uint64_t, D)
 
-#define VSRARI(NAME, BIT, E) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        Vd->E(i) = do_vsrar_ ## E(Vj->E(i), imm); \
-    } \
+#define VSRARI(NAME, BIT, E) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        Vd->E(i) = do_vsrar_ ## E(Vj->E(i), imm); \
+    } \
 }
 
 VSRARI(vsrari_b, 8, B)
@@ -972,31 +969,29 @@ VSRAN(vsran_b_h, 16, uint16_t, B, H)
 VSRAN(vsran_h_w, 32, uint32_t, H, W)
 VSRAN(vsran_w_d, 64, uint64_t, W, D)
 
-#define VSRLNI(NAME, BIT, T, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i, max; \
-    VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    max = LSX_LEN/BIT; \
-    for (i = 0; i < max; i++) { \
-        temp.E1(i) = R_SHIFT((T)Vj->E2(i), imm); \
-        temp.E1(i + max) = R_SHIFT((T)Vd->E2(i), imm); \
-    } \
-    *Vd = temp; \
-}
-
-void HELPER(vsrlni_d_q)(CPULoongArchState *env,
-                        uint32_t vd, uint32_t vj, uint32_t imm)
+#define VSRLNI(NAME, BIT, T, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, max; \
+    VReg temp; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    temp.D(0) = 0; \
+    temp.D(1) = 0; \
+    max = LSX_LEN/BIT; \
+    for (i = 0; i < max; i++) { \
+        temp.E1(i) = R_SHIFT((T)Vj->E2(i), imm); \
+        temp.E1(i + max) = R_SHIFT((T)Vd->E2(i), imm); \
+    } \
+    *Vd = temp; \
+}
+
+void HELPER(vsrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     temp.D(0) = 0;
     temp.D(1) = 0;
@@ -1009,31 +1004,29 @@ VSRLNI(vsrlni_b_h, 16, uint16_t, B, H)
 VSRLNI(vsrlni_h_w, 32, uint32_t, H, W)
 VSRLNI(vsrlni_w_d, 64, uint64_t, W, D)
 
-#define VSRANI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i, max; \
-    VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    max = LSX_LEN/BIT; \
-    for (i = 0; i < max; i++) { \
-        temp.E1(i) = R_SHIFT(Vj->E2(i), imm); \
-        temp.E1(i + max) = R_SHIFT(Vd->E2(i), imm); \
-    } \
-    *Vd = temp; \
-}
-
-void HELPER(vsrani_d_q)(CPULoongArchState *env,
-                        uint32_t vd, uint32_t vj, uint32_t imm)
+#define VSRANI(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, max; \
+    VReg temp; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    temp.D(0) = 0; \
+    temp.D(1) = 0; \
+    max = LSX_LEN/BIT; \
+    for (i = 0; i < max; i++) { \
+        temp.E1(i) = R_SHIFT(Vj->E2(i), imm); \
+        temp.E1(i + max) = R_SHIFT(Vd->E2(i), imm); \
+    } \
+    *Vd = temp; \
+}
+
+void HELPER(vsrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     temp.D(0) = 0;
     temp.D(1) = 0;
@@ -1082,31 +1075,29 @@ VSRARN(vsrarn_b_h, 16, uint8_t, B, H)
 VSRARN(vsrarn_h_w, 32, uint16_t, H, W)
 VSRARN(vsrarn_w_d, 64, uint32_t, W, D)
 
-#define VSRLRNI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i, max; \
-    VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    max = LSX_LEN/BIT; \
-    for (i = 0; i < max; i++) { \
-        temp.E1(i) = do_vsrlr_ ## E2(Vj->E2(i), imm); \
-        temp.E1(i + max) = do_vsrlr_ ## E2(Vd->E2(i), imm); \
-    } \
-    *Vd = temp; \
-}
-
-void HELPER(vsrlrni_d_q)(CPULoongArchState *env,
-                         uint32_t vd, uint32_t vj, uint32_t imm)
+#define VSRLRNI(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, max; \
+    VReg temp; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    temp.D(0) = 0; \
+    temp.D(1) = 0; \
+    max = LSX_LEN/BIT; \
+    for (i = 0; i < max; i++) { \
+        temp.E1(i) = do_vsrlr_ ## E2(Vj->E2(i), imm); \
+        temp.E1(i + max) = do_vsrlr_ ## E2(Vd->E2(i), imm); \
+    } \
+    *Vd = temp; \
+}
+
+void HELPER(vsrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
     Int128 r1, r2;
 
     if (imm == 0) {
@@ -1126,31 +1117,29 @@ VSRLRNI(vsrlrni_b_h, 16, B, H)
 VSRLRNI(vsrlrni_h_w, 32, H, W)
 VSRLRNI(vsrlrni_w_d, 64, W, D)
 
-#define VSRARNI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i, max; \
-    VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    max = LSX_LEN/BIT; \
-    for (i = 0; i < max; i++) { \
-        temp.E1(i) = do_vsrar_ ## E2(Vj->E2(i), imm); \
-        temp.E1(i + max) = do_vsrar_ ## E2(Vd->E2(i), imm); \
-    } \
-    *Vd = temp; \
-}
-
-void HELPER(vsrarni_d_q)(CPULoongArchState *env,
-                         uint32_t vd, uint32_t vj, uint32_t imm)
+#define VSRARNI(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, max; \
+    VReg temp; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    temp.D(0) = 0; \
+    temp.D(1) = 0; \
+    max = LSX_LEN/BIT; \
+    for (i = 0; i < max; i++) { \
+        temp.E1(i) = do_vsrar_ ## E2(Vj->E2(i), imm); \
+        temp.E1(i + max) = do_vsrar_ ## E2(Vd->E2(i), imm); \
+    } \
+    *Vd = temp; \
+}
+
+void HELPER(vsrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     VReg temp;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
     Int128 r1, r2;
 
     if (imm == 0) {
@@ -1336,13 +1325,12 @@ VSSRANU(vssran_hu_w, 32, uint32_t, H, W)
 VSSRANU(vssran_wu_d, 64, uint64_t, W, D)
 
 #define VSSRLNI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrlns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \
@@ -1351,12 +1339,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
     *Vd = temp; \
 }
 
-void HELPER(vssrlni_d_q)(CPULoongArchState *env,
-                         uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -1385,13 +1372,12 @@ VSSRLNI(vssrlni_h_w, 32, H, W)
 VSSRLNI(vssrlni_w_d, 64, W, D)
 
 #define VSSRANI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrans_ ## E1(Vj->E2(i), imm, BIT/2 -1); \
@@ -1400,12 +1386,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
     *Vd = temp; \
 }
 
-void HELPER(vssrani_d_q)(CPULoongArchState *env,
-                         uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask, min;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -1439,13 +1424,12 @@ VSSRANI(vssrani_h_w, 32, H, W)
 VSSRANI(vssrani_w_d, 64, W, D)
 
 #define VSSRLNUI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrlnu_ ## E1(Vj->E2(i), imm, BIT/2); \
@@ -1454,12 +1438,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
     *Vd = temp; \
 }
 
-void HELPER(vssrlni_du_q)(CPULoongArchState *env,
-                          uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrlni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -1488,13 +1471,12 @@ VSSRLNUI(vssrlni_hu_w, 32, H, W)
 VSSRLNUI(vssrlni_wu_d, 64, W, D)
 
 #define VSSRANUI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssranu_ ## E1(Vj->E2(i), imm, BIT/2); \
@@ -1503,12 +1485,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
     *Vd = temp; \
 }
 
-void HELPER(vssrani_du_q)(CPULoongArchState *env,
-                          uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -1701,13 +1682,12 @@ VSSRARNU(vssrarn_hu_w, 32, uint32_t, H, W)
 VSSRARNU(vssrarn_wu_d, 64, uint64_t, W, D)
 
 #define VSSRLRNI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrlrns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \
@@ -1717,12 +1697,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
 }
 
 #define VSSRLRNI_Q(NAME, sh) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     Int128 shft_res1, shft_res2, mask, r1, r2; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     if (imm == 0) { \
         shft_res1 = Vj->Q(0); \
@@ -1756,13 +1735,12 @@ VSSRLRNI(vssrlrni_w_d, 64, W, D)
 VSSRLRNI_Q(vssrlrni_d_q, 63)
 
 #define VSSRARNI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrarns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \
@@ -1771,12 +1749,11 @@ void HELPER(NAME)(CPULoongArchState *env,
     *Vd = temp; \
 }
 
-void HELPER(vssrarni_d_q)(CPULoongArchState *env,
-                          uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask1, mask2, r1, r2;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -1814,13 +1791,12 @@ VSSRARNI(vssrarni_h_w, 32, H, W)
 VSSRARNI(vssrarni_w_d, 64, W, D)
 
 #define VSSRLRNUI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrlrnu_ ## E1(Vj->E2(i), imm, BIT/2); \
@@ -1835,13 +1811,12 @@ VSSRLRNUI(vssrlrni_wu_d, 64, W, D)
 VSSRLRNI_Q(vssrlrni_du_q, 64)
 
 #define VSSRARNUI(NAME, BIT, E1, E2) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
     int i; \
     VReg temp; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
         temp.E1(i) = do_ssrarnu_ ## E1(Vj->E2(i), imm, BIT/2); \
@@ -1850,12 +1825,11 @@ void HELPER(NAME)(CPULoongArchState *env, \
     *Vd = temp; \
 }
 
-void HELPER(vssrarni_du_q)(CPULoongArchState *env,
-                           uint32_t vd, uint32_t vj, uint32_t imm)
+void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc)
 {
     Int128 shft_res1, shft_res2, mask1, mask2, r1, r2;
-    VReg *Vd = &(env->fpr[vd].vreg);
-    VReg *Vj = &(env->fpr[vj].vreg);
+    VReg *Vd = (VReg *)vd;
+    VReg *Vj = (VReg *)vj;
 
     if (imm == 0) {
         shft_res1 = Vj->Q(0);
@@ -2023,21 +1997,20 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 VFRSTP(vfrstp_b, 8, 0xf, B)
 VFRSTP(vfrstp_h, 16, 0x7, H)
 
-#define VFRSTPI(NAME, BIT, E) \
-void HELPER(NAME)(CPULoongArchState *env, \
-                  uint32_t vd, uint32_t vj, uint32_t imm) \
-{ \
-    int i, m; \
-    VReg *Vd = &(env->fpr[vd].vreg); \
-    VReg *Vj = &(env->fpr[vj].vreg); \
- \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        if (Vj->E(i) < 0) { \
-            break; \
-        } \
-    } \
-    m = imm % (LSX_LEN/BIT); \
-    Vd->E(m) = i; \
+#define VFRSTPI(NAME, BIT, E) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, m; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+ \
+    for (i = 0; i < LSX_LEN/BIT; i++) { \
+        if (Vj->E(i) < 0) { \
+            break; \
+        } \
+    } \
+    m = imm % (LSX_LEN/BIT); \
+    Vd->E(m) = i; \
 }
 
 VFRSTPI(vfrstpi_b, 8, B)
@@ -2923,31 +2896,29 @@ VSHUF(vshuf_h, 16, H)
 VSHUF(vshuf_w, 32, W)
VSHUF(vshuf_d, 64, D) =20 -#define VSHUF4I(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i) =3D Vj->E(((i) & 0xfc) + (((imm) >> \ - (2 * ((i) & 0x03))) & 0x03)); \ - } \ - *Vd =3D temp; \ +#define VSHUF4I(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i) =3D Vj->E(((i) & 0xfc) + (((imm) >> \ + (2 * ((i) & 0x03))) & 0x03)); \ + } \ + *Vd =3D temp; \ } =20 VSHUF4I(vshuf4i_b, 8, B) VSHUF4I(vshuf4i_h, 16, H) VSHUF4I(vshuf4i_w, 32, W) =20 -void HELPER(vshuf4i_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vshuf4i_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 VReg temp; temp.D(0) =3D (imm & 2 ? 
Vj : Vd)->D(imm & 1); @@ -2955,12 +2926,11 @@ void HELPER(vshuf4i_d)(CPULoongArchState *env, *Vd =3D temp; } =20 -void HELPER(vpermi_w)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vpermi_w)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd =3D &(env->fpr[vd].vreg); - VReg *Vj =3D &(env->fpr[vj].vreg); + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; =20 temp.W(0) =3D Vj->W(imm & 0x3); temp.W(1) =3D Vj->W((imm >> 2) & 0x3); @@ -2969,17 +2939,16 @@ void HELPER(vpermi_w)(CPULoongArchState *env, *Vd =3D temp; } =20 -#define VEXTRINS(NAME, BIT, E, MASK) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int ins, extr; \ - VReg *Vd =3D &(env->fpr[vd].vreg); \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - ins =3D (imm >> 4) & MASK; \ - extr =3D imm & MASK; \ - Vd->E(ins) =3D Vj->E(extr); \ +#define VEXTRINS(NAME, BIT, E, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int ins, extr; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + \ + ins =3D (imm >> 4) & MASK; \ + extr =3D imm & MASK; \ + Vd->E(ins) =3D Vj->E(extr); \ } =20 VEXTRINS(vextrins_b, 8, B, 0xf) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index e78c6bba0a..7504e3a62d 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -118,16 +118,20 @@ static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_= helper_gvec_2 *fn) return gen_vv_vl(ctx, a, 16, fn); } =20 -static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, + gen_helper_gvec_2i *fn) { - TCGv_i32 vd =3D tcg_constant_i32(a->vd); - TCGv_i32 vj =3D tcg_constant_i32(a->vj); - TCGv_i32 imm =3D tcg_constant_i32(a->imm); + tcg_gen_gvec_2i_ool(vec_full_offset(a->vd), + 
vec_full_offset(a->vj), + tcg_constant_i64(a->imm), + oprsz, ctx->vl / 8, 0, fn); + return true; +} =20 +static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *f= n) +{ CHECK_SXE; - func(cpu_env, vd, vj, imm); - return true; + return gen_vv_i_vl(ctx, a, 16, fn); } =20 static bool gen_cv(DisasContext *ctx, arg_cv *a, --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658774825639.9330346901104; Wed, 13 Sep 2023 19:32:54 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6M-0006mQ-1H; Wed, 13 Sep 2023 22:29:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5s-0004pX-Bq for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:36 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5j-0005FH-36 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:36 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxIvBwbwJlYPgmAA--.9201S3; Thu, 14 Sep 2023 10:26:56 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S12; Thu, 14 Sep 2023 10:26:56 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 10/57] target/loongarch: Replace CHECK_SXE to check_vec(ctx, 16) Date: Thu, 14 Sep 2023 10:25:58 +0800 Message-Id: <20230914022645.1151356-11-gaosong@loongson.cn> 
X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S12 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658775747100002 Content-Type: text/plain; charset="utf-8" Introduce a new function check_vec to replace CHECK_SXE Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insn_trans/trans_vec.c.inc | 248 +++++++++++++++----- 1 file changed, 192 insertions(+), 56 deletions(-) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 7504e3a62d..d8ab7c3417 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -5,14 +5,23 @@ */ =20 #ifndef CONFIG_USER_ONLY -#define CHECK_SXE do { \ - if ((ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) =3D=3D 0) { \ - generate_exception(ctx, EXCCODE_SXD); \ - 
return true; \ - } \ -} while (0) + +static bool check_vec(DisasContext *ctx, uint32_t oprsz) +{ + if ((oprsz == 16) && ((ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) == 0)) { + generate_exception(ctx, EXCCODE_SXD); + return false; + } + return true; +} + #else -#define CHECK_SXE + +static bool check_vec(DisasContext *ctx, uint32_t oprsz) +{ + return true; +} + #endif static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, @@ -30,7 +39,10 @@ static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvvv_ptr_vl(ctx, a, 16, fn); } @@ -48,7 +60,10 @@ static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvvv_vl(ctx, a, 16, fn); } @@ -66,7 +81,10 @@ static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvv_ptr_vl(ctx, a, 16, fn); } @@ -82,7 +100,10 @@ static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvv_vl(ctx, a, 16, fn); } @@ -99,7 +120,10 @@ static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_ptr_vl(ctx, a, 16, fn); } @@ -114,7 +138,10 @@ static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, static bool gen_vv(DisasContext
*ctx, arg_vv *a, gen_helper_gvec_2 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_vl(ctx, a, 16, fn); } @@ -130,7 +157,10 @@ static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_i_vl(ctx, a, 16, fn); } @@ -140,7 +170,10 @@ static bool gen_cv(DisasContext *ctx, arg_cv *a, TCGv_i32 vj = tcg_constant_i32(a->vj); TCGv_i32 cd = tcg_constant_i32(a->cd); - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + func(cpu_env, cd, vj); return true; } @@ -162,7 +195,10 @@ static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vvv_vl(ctx, a, 16, mop, func); } @@ -184,7 +220,10 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vv_vl(ctx, a, 16, mop, func); } @@ -204,7 +243,10 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vv_i_vl(ctx, a, 16, mop, func); } @@ -220,7 +262,10 @@ static bool gvec_subi_vl(DisasContext *ctx, arg_vv_i *a, static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_subi_vl(ctx, a, 16, mop); } @@ -238,7 +283,9 @@ static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \ return false; \ } \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ \ rh = tcg_temp_new_i64(); \ rl = tcg_temp_new_i64(); \ @@
-3138,7 +3185,9 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } sel = (a->imm >> 12) & 0x1; @@ -3168,7 +3217,9 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } vd_ofs = vec_full_offset(a->vd); vj_ofs = vec_full_offset(a->vj); @@ -3795,7 +3846,9 @@ static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) { uint32_t vd_ofs, vj_ofs, vk_ofs; - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } vd_ofs = vec_full_offset(a->vd); vj_ofs = vec_full_offset(a->vj); @@ -3841,7 +3894,9 @@ static bool do_## NAME ##_s(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ { \ uint32_t vd_ofs, vj_ofs; \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ \ static const TCGOpcode vecop_list[] = { \ INDEX_op_cmp_vec, 0 \ @@ -3890,7 +3945,9 @@ static bool do_## NAME ##_u(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ { \ uint32_t vd_ofs, vj_ofs; \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ \ static const TCGOpcode vecop_list[] = { \ INDEX_op_cmp_vec, 0 \ @@ -3988,7 +4045,9 @@ static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } fn = (a->fcond & 1 ? gen_helper_vfcmp_s_s : gen_helper_vfcmp_c_s); flags = get_fcmp_flags(a->fcond >> 1); @@ -4009,7 +4068,9 @@ static bool trans_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } fn = (a->fcond & 1 ?
gen_helper_vfcmp_s_d : gen_helper_vfcmp_c_d); flags = get_fcmp_flags(a->fcond >> 1); @@ -4024,7 +4085,9 @@ static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->vd), vec_full_offset(a->va), vec_full_offset(a->vk), vec_full_offset(a->vj), @@ -4050,7 +4113,9 @@ static bool trans_vbitseli_b(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } tcg_gen_gvec_2i(vec_full_offset(a->vd), vec_full_offset(a->vj), 16, ctx->vl/8, a->imm, &op); @@ -4073,7 +4138,10 @@ static bool trans_## NAME (DisasContext *ctx, arg_cv *a) \ return false; \ } \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ + \ tcg_gen_or_i64(t1, al, ah); \ tcg_gen_setcondi_i64(COND, t1, t1, 0); \ tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7])); \ @@ -4101,7 +4169,10 @@ static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_st8_i64(src, cpu_env, offsetof(CPULoongArchState, fpr[a->vd].vreg.B(a->imm))); return true; @@ -4115,7 +4186,10 @@ static bool trans_vinsgr2vr_h(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_st16_i64(src, cpu_env, offsetof(CPULoongArchState, fpr[a->vd].vreg.H(a->imm))); return true; @@ -4129,7 +4203,10 @@ static bool trans_vinsgr2vr_w(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_st32_i64(src, cpu_env, offsetof(CPULoongArchState, fpr[a->vd].vreg.W(a->imm))); return true; @@ -4143,7 +4220,10 @@ static bool trans_vinsgr2vr_d(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_st_i64(src,
cpu_env, offsetof(CPULoongArchState, fpr[a->vd].vreg.D(a->imm))); return true; @@ -4157,7 +4237,10 @@ static bool trans_vpickve2gr_b(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld8s_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm))); return true; @@ -4171,7 +4254,10 @@ static bool trans_vpickve2gr_h(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld16s_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm))); return true; @@ -4185,7 +4271,10 @@ static bool trans_vpickve2gr_w(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld32s_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm))); return true; @@ -4199,7 +4288,10 @@ static bool trans_vpickve2gr_d(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm))); return true; @@ -4213,7 +4305,10 @@ static bool trans_vpickve2gr_bu(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld8u_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm))); return true; @@ -4227,7 +4322,10 @@ static bool trans_vpickve2gr_hu(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld16u_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm))); return true; @@ -4241,7 +4339,10 @@ static bool trans_vpickve2gr_wu(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld32u_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm))); return true;
@@ -4255,7 +4356,10 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_ld_i64(dst, cpu_env, offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm))); return true; @@ -4269,7 +4373,9 @@ static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd), 16, ctx->vl/8, src); @@ -4287,7 +4393,10 @@ static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_gvec_dup_mem(MO_8,vec_full_offset(a->vd), offsetof(CPULoongArchState, fpr[a->vj].vreg.B((a->imm))), @@ -4301,7 +4410,10 @@ static bool trans_vreplvei_h(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_gvec_dup_mem(MO_16, vec_full_offset(a->vd), offsetof(CPULoongArchState, fpr[a->vj].vreg.H((a->imm))), @@ -4314,7 +4426,10 @@ static bool trans_vreplvei_w(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_gvec_dup_mem(MO_32, vec_full_offset(a->vd), offsetof(CPULoongArchState, fpr[a->vj].vreg.W((a->imm))), @@ -4327,7 +4442,10 @@ static bool trans_vreplvei_d(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + tcg_gen_gvec_dup_mem(MO_64, vec_full_offset(a->vd), offsetof(CPULoongArchState, fpr[a->vj].vreg.D((a->imm))), @@ -4346,7 +4464,9 @@ static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit, return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN/bit) -1); tcg_gen_shli_i64(t0, t0, vece); @@ -4376,7 +4496,9 @@ static bool trans_vbsll_v(DisasContext *ctx, arg_vv_i *a)
return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } desthigh = tcg_temp_new_i64(); destlow = tcg_temp_new_i64(); @@ -4410,7 +4532,9 @@ static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } desthigh = tcg_temp_new_i64(); destlow = tcg_temp_new_i64(); @@ -4488,7 +4612,9 @@ static bool trans_vld(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } addr = gpr_src(ctx, a->rj, EXT_NONE); val = tcg_temp_new_i128(); @@ -4515,7 +4641,9 @@ static bool trans_vst(DisasContext *ctx, arg_vr_i *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } addr = gpr_src(ctx, a->rj, EXT_NONE); val = tcg_temp_new_i128(); @@ -4542,7 +4670,9 @@ static bool trans_vldx(DisasContext *ctx, arg_vrr *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } src1 = gpr_src(ctx, a->rj, EXT_NONE); src2 = gpr_src(ctx, a->rk, EXT_NONE); @@ -4569,7 +4699,9 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr *a) return false; } - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } src1 = gpr_src(ctx, a->rj, EXT_NONE); src2 = gpr_src(ctx, a->rk, EXT_NONE); @@ -4596,7 +4728,9 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a) \ return false; \ } \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ \ addr = gpr_src(ctx, a->rj, EXT_NONE); \ val = tcg_temp_new_i64(); \ @@ -4624,7 +4758,9 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_ii *a) \ return false; \ } \ \ - CHECK_SXE; \ + if (!check_vec(ctx, 16)) { \ + return true; \ + } \ \ addr = gpr_src(ctx, a->rj, EXT_NONE); \ val = tcg_temp_new_i64(); \ -- 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com:
domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16946587652431002.0048766458419; Wed, 13 Sep 2023 19:32:45 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4e-0003Az-OJ; Wed, 13 Sep 2023 22:27:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4Z-000398-7t for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:15 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4T-0004ql-92 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:15 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxPOtxbwJlY_gmAA--.4368S3; Thu, 14 Sep 2023 10:26:57 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S13; Thu, 14 Sep 2023 10:26:56 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 11/57] target/loongarch: Add LASX data support Date: Thu, 14 Sep 2023 10:25:59 +0800 Message-Id: <20230914022645.1151356-12-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S13 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: 
domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658765893100001 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/cpu.h | 24 ++++++++++++---------- target/loongarch/internals.h | 22 -------------------- target/loongarch/vec.h | 33 ++++++++++++++++++++++++++++++ linux-user/loongarch64/signal.c | 1 + target/loongarch/cpu.c | 1 + target/loongarch/gdbstub.c | 1 + target/loongarch/machine.c | 36 ++++++++++++++++++++++++++++++++- target/loongarch/translate.c | 1 + target/loongarch/vec_helper.c | 1 + 9 files changed, 86 insertions(+), 34 deletions(-) create mode 100644 target/loongarch/vec.h diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h index 4d7201995a..347ad1c8a9 100644 --- a/target/loongarch/cpu.h +++ b/target/loongarch/cpu.h @@ -251,18 +251,20 @@ FIELD(TLB_MISC, ASID, 1, 10) FIELD(TLB_MISC, VPPN, 13, 35) FIELD(TLB_MISC, PS, 48, 6) -#define LSX_LEN (128) +#define LSX_LEN (128) +#define LASX_LEN (256) + typedef union VReg { - int8_t B[LSX_LEN / 8]; - int16_t H[LSX_LEN / 16]; - int32_t W[LSX_LEN / 32]; - int64_t D[LSX_LEN / 64]; - uint8_t UB[LSX_LEN / 8]; - uint16_t UH[LSX_LEN / 16]; - uint32_t UW[LSX_LEN / 32]; - uint64_t UD[LSX_LEN / 64]; - Int128 Q[LSX_LEN / 128];
-}VReg; + int8_t B[LASX_LEN / 8]; + int16_t H[LASX_LEN / 16]; + int32_t W[LASX_LEN / 32]; + int64_t D[LASX_LEN / 64]; + uint8_t UB[LASX_LEN / 8]; + uint16_t UH[LASX_LEN / 16]; + uint32_t UW[LASX_LEN / 32]; + uint64_t UD[LASX_LEN / 64]; + Int128 Q[LASX_LEN / 128]; +} VReg; typedef union fpr_t fpr_t; union fpr_t { diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h index 7b0f29c942..c492863cc5 100644 --- a/target/loongarch/internals.h +++ b/target/loongarch/internals.h @@ -21,28 +21,6 @@ /* Global bit for huge page */ #define LOONGARCH_HGLOBAL_SHIFT 12 -#if HOST_BIG_ENDIAN -#define B(x) B[15 - (x)] -#define H(x) H[7 - (x)] -#define W(x) W[3 - (x)] -#define D(x) D[1 - (x)] -#define UB(x) UB[15 - (x)] -#define UH(x) UH[7 - (x)] -#define UW(x) UW[3 - (x)] -#define UD(x) UD[1 -(x)] -#define Q(x) Q[x] -#else -#define B(x) B[x] -#define H(x) H[x] -#define W(x) W[x] -#define D(x) D[x] -#define UB(x) UB[x] -#define UH(x) UH[x] -#define UW(x) UW[x] -#define UD(x) UD[x] -#define Q(x) Q[x] -#endif - void loongarch_translate_init(void); void loongarch_cpu_dump_state(CPUState *cpu, FILE *f, int flags); diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h new file mode 100644 index 0000000000..2f23cae7d7 --- /dev/null +++ b/target/loongarch/vec.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * QEMU LoongArch vector utilities + * + * Copyright (c) 2023 Loongson Technology Corporation Limited + */ + +#ifndef LOONGARCH_VEC_H +#define LOONGARCH_VEC_H + +#if HOST_BIG_ENDIAN +#define B(x) B[(x) ^ 15] +#define H(x) H[(x) ^ 7] +#define W(x) W[(x) ^ 3] +#define D(x) D[(x) ^ 1] +#define UB(x) UB[(x) ^ 15] +#define UH(x) UH[(x) ^ 7] +#define UW(x) UW[(x) ^ 3] +#define UD(x) UD[(x) ^ 1] +#define Q(x) Q[x] +#else +#define B(x) B[x] +#define H(x) H[x] +#define W(x) W[x] +#define D(x) D[x] +#define UB(x) UB[x] +#define UH(x) UH[x] +#define UW(x) UW[x] +#define UD(x) UD[x] +#define Q(x) Q[x] +#endif /* HOST_BIG_ENDIAN */ +
+#endif /* LOONGARCH_VEC_H */ diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signal.c index bb8efb1172..39572c1190 100644 --- a/linux-user/loongarch64/signal.c +++ b/linux-user/loongarch64/signal.c @@ -12,6 +12,7 @@ #include "linux-user/trace.h" #include "target/loongarch/internals.h" +#include "target/loongarch/vec.h" /* FP context was used */ #define SC_USED_FP (1 << 0) diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c index 65f9320e34..4d72e905aa 100644 --- a/target/loongarch/cpu.c +++ b/target/loongarch/cpu.c @@ -19,6 +19,7 @@ #include "cpu-csr.h" #include "sysemu/reset.h" #include "tcg/tcg.h" +#include "vec.h" const char * const regnames[32] = { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", diff --git a/target/loongarch/gdbstub.c b/target/loongarch/gdbstub.c index b09804b62f..5fc2f19e96 100644 --- a/target/loongarch/gdbstub.c +++ b/target/loongarch/gdbstub.c @@ -11,6 +11,7 @@ #include "internals.h" #include "exec/gdbstub.h" #include "gdbstub/helpers.h" +#include "vec.h" uint64_t read_fcc(CPULoongArchState *env) { diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c index d8ac99c9a4..1c4e01d076 100644 --- a/target/loongarch/machine.c +++ b/target/loongarch/machine.c @@ -8,7 +8,7 @@ #include "qemu/osdep.h" #include "cpu.h" #include "migration/cpu.h" -#include "internals.h" +#include "vec.h" static const VMStateDescription vmstate_fpu_reg = { .name = "fpu_reg", @@ -76,6 +76,39 @@ static const VMStateDescription vmstate_lsx = { }, }; +static const VMStateDescription vmstate_lasxh_reg = { + .name = "lasxh_reg", + .version_id = 1, + .minimum_version_id = 1, + .fields = (VMStateField[]) { + VMSTATE_UINT64(UD(2), VReg), + VMSTATE_UINT64(UD(3), VReg), + VMSTATE_END_OF_LIST() + } +}; + +#define VMSTATE_LASXH_REGS(_field, _state, _start) \ + VMSTATE_STRUCT_SUB_ARRAY(_field, _state, _start, 32, 0, \ + vmstate_lasxh_reg, fpr_t) + +static bool lasx_needed(void *opaque) +{ +
LoongArchCPU *cpu = opaque; + + return FIELD_EX64(cpu->env.cpucfg[2], CPUCFG2, LASX); +} + +static const VMStateDescription vmstate_lasx = { + .name = "cpu/lasx", + .version_id = 1, + .minimum_version_id = 1, + .needed = lasx_needed, + .fields = (VMStateField[]) { + VMSTATE_LASXH_REGS(env.fpr, LoongArchCPU, 0), + VMSTATE_END_OF_LIST() + }, +}; + /* TLB state */ const VMStateDescription vmstate_tlb = { .name = "cpu/tlb", @@ -163,6 +196,7 @@ const VMStateDescription vmstate_loongarch_cpu = { .subsections = (const VMStateDescription*[]) { &vmstate_fpu, &vmstate_lsx, + &vmstate_lasx, NULL } }; diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index 288727181b..7f3958a1f4 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -18,6 +18,7 @@ #include "fpu/softfloat.h" #include "translate.h" #include "internals.h" +#include "vec.h" /* Global register indices */ TCGv cpu_gpr[32], cpu_pc; diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 4e10957b90..c784f98ab2 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -12,6 +12,7 @@ #include "fpu/softfloat.h" #include "internals.h" #include "tcg/tcg.h" +#include "vec.h" #define DO_ADD(a, b) (a + b) #define DO_SUB(a, b) (a - b) -- 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694659029676263.32630296221805; Wed, 13 Sep 2023 19:37:09 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6N-0006wj-5u; Wed, 13 Sep 2023 22:29:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5q-0004mO-Ob for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5h-0005FC-TT for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:34 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxqOhybwJlbPgmAA--.39106S3; Thu, 14 Sep 2023 10:26:58 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S14; Thu, 14 Sep 2023 10:26:57 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 12/57] target/loongarch: check_vec support check LASX instructions Date: Thu, 14 Sep 2023 10:26:00 +0800 Message-Id: <20230914022645.1151356-13-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S14 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659030205100003 Content-Type: text/plain; charset="utf-8" Reviewed-by: Richard Henderson Signed-off-by: Song Gao --- target/loongarch/cpu.h | 2 ++ target/loongarch/cpu.c | 2 ++ target/loongarch/insn_trans/trans_vec.c.inc | 6 ++++++ 3 files changed, 10 insertions(+) diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h index 347ad1c8a9..f125a8e49b 100644 --- a/target/loongarch/cpu.h +++ b/target/loongarch/cpu.h @@ -462,6 +462,7 @@ static inline void set_pc(CPULoongArchState *env, uint64_t value) #define HW_FLAGS_CRMD_PG R_CSR_CRMD_PG_MASK /* 0x10 */ #define HW_FLAGS_EUEN_FPE 0x04 #define HW_FLAGS_EUEN_SXE 0x08 +#define HW_FLAGS_EUEN_ASXE 0x10 #define HW_FLAGS_VA32 0x20 static inline void cpu_get_tb_cpu_state(CPULoongArchState *env, vaddr *pc, @@ -472,6 +473,7 @@ static inline void cpu_get_tb_cpu_state(CPULoongArchState *env, vaddr *pc, *flags = env->CSR_CRMD & (R_CSR_CRMD_PLV_MASK | R_CSR_CRMD_PG_MASK); *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, FPE) * HW_FLAGS_EUEN_FPE; *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, SXE) * HW_FLAGS_EUEN_SXE; + *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, ASXE) * HW_FLAGS_EUEN_ASXE; *flags |= is_va32(env) * HW_FLAGS_VA32; } diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c index 4d72e905aa..a1d3f680d8 100644 --- a/target/loongarch/cpu.c +++ b/target/loongarch/cpu.c @@ -55,6 +55,7 @@ static const char * const excp_names[] = { [EXCCODE_DBP] = "Debug breakpoint", [EXCCODE_BCE] = "Bound Check Exception", [EXCCODE_SXD] = "128 bit vector instructions Disable exception", + [EXCCODE_ASXD] = "256 bit vector instructions Disable exception", }; const char *loongarch_exception_name(int32_t exception) @@ -190,6 +191,7 @@
static void loongarch_cpu_do_interrupt(CPUState *cs) case EXCCODE_FPD: case EXCCODE_FPE: case EXCCODE_SXD: + case EXCCODE_ASXD: env->CSR_BADV = env->pc; QEMU_FALLTHROUGH; case EXCCODE_BCE: diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index d8ab7c3417..b5ca65c250 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -12,6 +12,12 @@ static bool check_vec(DisasContext *ctx, uint32_t oprsz) generate_exception(ctx, EXCCODE_SXD); return false; } + + if ((oprsz == 32) && ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) == 0)) { + generate_exception(ctx, EXCCODE_ASXD); + return false; + } + return true; } -- 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658916661188.44617752204624; Wed, 13 Sep 2023 19:35:16 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4f-0003BL-62; Wed, 13 Sep 2023 22:27:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4b-00039i-9E for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:17 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4X-0004r4-N0 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:16 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cxh+hybwJlb_gmAA--.39772S3; Thu, 14 Sep 2023 10:26:58 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by
localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S15; Thu, 14 Sep 2023 10:26:58 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 13/57] target/loongarch: Add avail_LASX to check LASX instructions Date: Thu, 14 Sep 2023 10:26:01 +0800 Message-Id: <20230914022645.1151356-14-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S15 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658917024100002 Content-Type: text/plain; charset="utf-8" Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/translate.h | 1 + 1 file changed, 1 insertion(+) diff --git a/target/loongarch/translate.h b/target/loongarch/translate.h index 89b49a859e..195f53573a 100644 --- 
a/target/loongarch/translate.h +++ b/target/loongarch/translate.h @@ -23,6 +23,7 @@ #define avail_LSPW(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LSPW)) #define avail_LAM(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LAM)) #define avail_LSX(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LSX)) +#define avail_LASX(C) (FIELD_EX32((C)->cpucfg2, CPUCFG2, LASX)) #define avail_IOCSR(C) (FIELD_EX32((C)->cpucfg1, CPUCFG1, IOCSR)) /* -- 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465901245777.64461171808102; Wed, 13 Sep 2023 19:36:52 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6O-0007CG-Ne; Wed, 13 Sep 2023 22:29:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5r-0004o5-PT for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:36 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5h-0005FD-Tw for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Ax1fBzbwJlc_gmAA--.9915S3; Thu, 14 Sep 2023 10:26:59 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S16; Thu, 14 Sep 2023 10:26:58 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 14/57] target/loongarch: Implement xvadd/xvsub Date: Thu, 14 Sep 2023 10:26:02 +0800
Message-Id: <20230914022645.1151356-15-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S16 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659013561100009 Content-Type: text/plain; charset="utf-8" This patch includes: - XVADD.{B/H/W/D/Q}; - XVSUB.{B/H/W/D/Q}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 14 +++ target/loongarch/disas.c | 23 +++++ target/loongarch/translate.c | 4 + target/loongarch/insn_trans/trans_vec.c.inc | 107 +++++++++++++------- 4 files changed, 109 insertions(+), 39 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index c9c3bc2c73..bcc18fb6c5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1296,3 +1296,17 @@ vstelm_d 0011 00010001 0 . ........ 
..... ..... @vr_i8i1 vstelm_w 0011 00010010 .. ........ ..... ..... @vr_i8i2 vstelm_h 0011 0001010 ... ........ ..... ..... @vr_i8i3 vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 + +# +# LoongArch LASX instructions +# +xvadd_b 0111 01000000 10100 ..... ..... ..... @vvv +xvadd_h 0111 01000000 10101 ..... ..... ..... @vvv +xvadd_w 0111 01000000 10110 ..... ..... ..... @vvv +xvadd_d 0111 01000000 10111 ..... ..... ..... @vvv +xvadd_q 0111 01010010 11010 ..... ..... ..... @vvv +xvsub_b 0111 01000000 11000 ..... ..... ..... @vvv +xvsub_h 0111 01000000 11001 ..... ..... ..... @vvv +xvsub_w 0111 01000000 11010 ..... ..... ..... @vvv +xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv +xvsub_q 0111 01010010 11011 ..... ..... ..... @vvv diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5c402d944d..d8b62ba532 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1695,3 +1695,26 @@ INSN_LSX(vstelm_d, vr_ii) INSN_LSX(vstelm_w, vr_ii) INSN_LSX(vstelm_h, vr_ii) INSN_LSX(vstelm_b, vr_ii) + +#define INSN_LASX(insn, type) \ +static bool trans_##insn(DisasContext *ctx, arg_##type * a) \ +{ \ + output_##type ## _x(ctx, a, #insn); \ + return true; \ +} + +static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); +} + +INSN_LASX(xvadd_b, vvv) +INSN_LASX(xvadd_h, vvv) +INSN_LASX(xvadd_w, vvv) +INSN_LASX(xvadd_d, vvv) +INSN_LASX(xvadd_q, vvv) +INSN_LASX(xvsub_b, vvv) +INSN_LASX(xvsub_h, vvv) +INSN_LASX(xvsub_w, vvv) +INSN_LASX(xvsub_d, vvv) +INSN_LASX(xvsub_q, vvv) diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index 7f3958a1f4..10e2fe8ff6 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -124,6 +124,10 @@ static void loongarch_tr_init_disas_context(DisasContextBase *dcbase, ctx->vl = LSX_LEN; } + if (FIELD_EX64(env->cpucfg[2], CPUCFG2, LASX)) { + ctx->vl = LASX_LEN; + } + ctx->la64
= is_la64(env); ctx->va32 = (ctx->base.tb->flags & HW_FLAGS_VA32) != 0; diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index b5ca65c250..3252e1d809 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -193,6 +193,10 @@ static bool gvec_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t vj_ofs = vec_full_offset(a->vj); uint32_t vk_ofs = vec_full_offset(a->vk); + if (!check_vec(ctx, oprsz)) { + return true; + } + func(mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8); return true; } @@ -201,13 +205,15 @@ static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { - if (!check_vec(ctx, 16)) { - return true; - } - return gvec_vvv_vl(ctx, a, 16, mop, func); } +static bool gvec_xxx(DisasContext *ctx, arg_vvv *a, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t, uint32_t)) +{ + return gvec_vvv_vl(ctx, a, 32, mop, func); +} static bool gvec_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, MemOp mop, @@ -279,47 +285,70 @@ TRANS(vadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_add) TRANS(vadd_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_add) TRANS(vadd_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_add) TRANS(vadd_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_add) +TRANS(xvadd_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_add) +TRANS(xvadd_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_add) +TRANS(xvadd_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_add) +TRANS(xvadd_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_add) + +static bool gen_vaddsub_q_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, + TCGv_i64, TCGv_i64, TCGv_i64)) +{ + int i; + TCGv_i64 rh, rl, ah, al, bh, bl; + + if (!check_vec(ctx, oprsz)) { + return true; + } -#define VADDSUB_Q(NAME) \ -static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \ -{ \ - TCGv_i64 rh, rl, ah,
al, bh, bl; \ - \ - if (!avail_LSX(ctx)) { \ - return false; \ - } \ - \ - if (!check_vec(ctx, 16)) { \ - return true; \ - } \ - \ - rh = tcg_temp_new_i64(); \ - rl = tcg_temp_new_i64(); \ - ah = tcg_temp_new_i64(); \ - al = tcg_temp_new_i64(); \ - bh = tcg_temp_new_i64(); \ - bl = tcg_temp_new_i64(); \ - \ - get_vreg64(ah, a->vj, 1); \ - get_vreg64(al, a->vj, 0); \ - get_vreg64(bh, a->vk, 1); \ - get_vreg64(bl, a->vk, 0); \ - \ - tcg_gen_## NAME ##2_i64(rl, rh, al, ah, bl, bh); \ - \ - set_vreg64(rh, a->vd, 1); \ - set_vreg64(rl, a->vd, 0); \ - \ - return true; \ -} - -VADDSUB_Q(add) -VADDSUB_Q(sub) + rh = tcg_temp_new_i64(); + rl = tcg_temp_new_i64(); + ah = tcg_temp_new_i64(); + al = tcg_temp_new_i64(); + bh = tcg_temp_new_i64(); + bl = tcg_temp_new_i64(); + + for (i = 0; i < oprsz / 16; i++) { + get_vreg64(ah, a->vj, 1 + i * 2); + get_vreg64(al, a->vj, i * 2); + get_vreg64(bh, a->vk, 1 + i * 2); + get_vreg64(bl, a->vk, i * 2); + + func(rl, rh, al, ah, bl, bh); + + set_vreg64(rh, a->vd, 1 + i * 2); + set_vreg64(rl, a->vd, i * 2); + } + return true; +} + +static bool gen_vaddsub_q(DisasContext *ctx, arg_vvv *a, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, + TCGv_i64, TCGv_i64, TCGv_i64)) +{ + return gen_vaddsub_q_vl(ctx, a, 16, func); +} + +static bool gen_xvaddsub_q(DisasContext *ctx, arg_vvv *a, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, + TCGv_i64, TCGv_i64, TCGv_i64)) +{ + return gen_vaddsub_q_vl(ctx, a, 32, func); +} TRANS(vsub_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_sub) TRANS(vsub_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_sub) TRANS(vsub_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_sub) TRANS(vsub_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_sub) +TRANS(xvsub_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_sub) +TRANS(xvsub_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_sub) +TRANS(xvsub_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_sub) +TRANS(xvsub_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_sub) + +TRANS(vadd_q, LSX, gen_vaddsub_q, tcg_gen_add2_i64) +TRANS(vsub_q, LSX,
gen_vaddsub_q, tcg_gen_sub2_i64) +TRANS(xvadd_q, LASX, gen_xvaddsub_q, tcg_gen_add2_i64) +TRANS(xvsub_q, LASX, gen_xvaddsub_q, tcg_gen_sub2_i64) TRANS(vaddi_bu, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_addi) TRANS(vaddi_hu, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_addi) -- 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658795491440.27734046325827; Wed, 13 Sep 2023 19:33:15 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4e-0003Ao-Fa; Wed, 13 Sep 2023 22:27:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4b-00039k-9T for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:17 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4X-0004rJ-Mu for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:16 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEvB0bwJldfgmAA--.9048S3; Thu, 14 Sep 2023 10:27:00 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S17; Thu, 14 Sep 2023 10:26:59 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 15/57] target/loongarch: Implement xvreplgr2vr Date: Thu, 14 Sep 2023 10:26:03 +0800 Message-Id: <20230914022645.1151356-16-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S17 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658796364100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVREPLGR2VR.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 10 +++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 25 +++++++++++++++------ 3 files changed, 33 insertions(+), 7 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index bcc18fb6c5..04bd238995 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1310,3 +1310,8 @@ xvsub_h 0111 01000000 11001 ..... ..... ....= . @vvv xvsub_w 0111 01000000 11010 ..... ..... ..... @vvv xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv xvsub_q 0111 01010010 11011 ..... ..... ..... 
@vvv + +xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr +xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr +xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr +xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d8b62ba532..c47f455ed0 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1708,6 +1708,11 @@ static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); } +static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -1718,3 +1723,8 @@ INSN_LASX(xvsub_h, vvv) INSN_LASX(xvsub_w, vvv) INSN_LASX(xvsub_d, vvv) INSN_LASX(xvsub_q, vvv) + +INSN_LASX(xvreplgr2vr_b, vr) +INSN_LASX(xvreplgr2vr_h, vr) +INSN_LASX(xvreplgr2vr_w, vr) +INSN_LASX(xvreplgr2vr_d, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 3252e1d809..5001042870 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4400,27 +4400,38 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a) return true; } -static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop) +static bool gvec_dup_vl(DisasContext *ctx, arg_vr *a, + uint32_t oprsz, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd), - 16, ctx->vl/8, src); + oprsz, ctx->vl/8, src); return true; } +static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop) +{ + return gvec_dup_vl(ctx, a, 16, mop); +} + +static bool gvec_dupx(DisasContext *ctx, arg_vr *a, MemOp mop) +{ + return
gvec_dup_vl(ctx, a, 32, mop); +} + TRANS(vreplgr2vr_b, LSX, gvec_dup, MO_8) TRANS(vreplgr2vr_h, LSX, gvec_dup, MO_16) TRANS(vreplgr2vr_w, LSX, gvec_dup, MO_32) TRANS(vreplgr2vr_d, LSX, gvec_dup, MO_64) +TRANS(xvreplgr2vr_b, LASX, gvec_dupx, MO_8) +TRANS(xvreplgr2vr_h, LASX, gvec_dupx, MO_16) +TRANS(xvreplgr2vr_w, LASX, gvec_dupx, MO_32) +TRANS(xvreplgr2vr_d, LASX, gvec_dupx, MO_64) =20 static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658698090331.2816059729994; Wed, 13 Sep 2023 19:31:38 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6J-0006Qs-Ux; Wed, 13 Sep 2023 22:29:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5r-0004o4-M2 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5h-0005FB-Sg for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bxyep1bwJld_gmAA--.447S3; Thu, 14 Sep 2023 10:27:01 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S18; Thu, 14 Sep 2023 10:27:00 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 16/57] target/loongarch: Implement xvaddi/xvsubi Date: 
Thu, 14 Sep 2023 10:26:04 +0800 Message-Id: <20230914022645.1151356-17-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S18 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658699415100006 Content-Type: text/plain; charset="utf-8" This patch includes: - XVADDI.{B/H/W/D}U; - XVSUBI.{B/H/W/D}U. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 ++++++ target/loongarch/disas.c | 14 ++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 36 ++++++++++++++++----- 3 files changed, 51 insertions(+), 8 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 04bd238995..c48dca70b8 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1311,6 +1311,15 @@ xvsub_w 0111 01000000 11010 ..... 
..... ..... @vvv xvsub_d 0111 01000000 11011 ..... ..... ..... @vvv xvsub_q 0111 01010010 11011 ..... ..... ..... @vvv +xvaddi_bu 0111 01101000 10100 ..... ..... ..... @vv_ui5 +xvaddi_hu 0111 01101000 10101 ..... ..... ..... @vv_ui5 +xvaddi_wu 0111 01101000 10110 ..... ..... ..... @vv_ui5 +xvaddi_du 0111 01101000 10111 ..... ..... ..... @vv_ui5 +xvsubi_bu 0111 01101000 11000 ..... ..... ..... @vv_ui5 +xvsubi_hu 0111 01101000 11001 ..... ..... ..... @vv_ui5 +xvsubi_wu 0111 01101000 11010 ..... ..... ..... @vv_ui5 +xvsubi_du 0111 01101000 11011 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index c47f455ed0..20df9c7c99 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1713,6 +1713,11 @@ static void output_vr_x(DisasContext *ctx, arg_vr *a, const char *mnemonic) output(ctx, mnemonic, "x%d, r%d", a->vd, a->rj); } +static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -1724,6 +1729,15 @@ INSN_LASX(xvsub_w, vvv) INSN_LASX(xvsub_d, vvv) INSN_LASX(xvsub_q, vvv) +INSN_LASX(xvaddi_bu, vv_i) +INSN_LASX(xvaddi_hu, vv_i) +INSN_LASX(xvaddi_wu, vv_i) +INSN_LASX(xvaddi_du, vv_i) +INSN_LASX(xvsubi_bu, vv_i) +INSN_LASX(xvsubi_hu, vv_i) +INSN_LASX(xvsubi_wu, vv_i) +INSN_LASX(xvsubi_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 5001042870..689db12d71 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -247,6 +247,10 @@ static bool
gvec_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t vd_ofs = vec_full_offset(a->vd); uint32_t vj_ofs = vec_full_offset(a->vj); + if (!check_vec(ctx, oprsz)) { + return true; + } + func(mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / 8); return true; } @@ -255,32 +259,40 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t)) { - if (!check_vec(ctx, 16)) { - return true; - } - return gvec_vv_i_vl(ctx, a, 16, mop, func); } +static bool gvec_xx_i(DisasContext *ctx, arg_vv_i *a, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + return gvec_vv_i_vl(ctx, a, 32, mop, func); +} + static bool gvec_subi_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop) { uint32_t vd_ofs = vec_full_offset(a->vd); uint32_t vj_ofs = vec_full_offset(a->vj); + if (!check_vec(ctx, oprsz)) { + return true; + } + tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, oprsz, ctx->vl / 8); return true; } static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop) { - if (!check_vec(ctx, 16)) { - return true; - } - return gvec_subi_vl(ctx, a, 16, mop); } +static bool gvec_xsubi(DisasContext *ctx, arg_vv_i *a, MemOp mop) +{ + return gvec_subi_vl(ctx, a, 32, mop); +} + TRANS(vadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_add) TRANS(vadd_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_add) TRANS(vadd_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_add) @@ -358,6 +370,14 @@ TRANS(vsubi_bu, LSX, gvec_subi, MO_8) TRANS(vsubi_hu, LSX, gvec_subi, MO_16) TRANS(vsubi_wu, LSX, gvec_subi, MO_32) TRANS(vsubi_du, LSX, gvec_subi, MO_64) +TRANS(xvaddi_bu, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_addi) +TRANS(xvaddi_hu, LASX, gvec_xx_i, MO_16, tcg_gen_gvec_addi) +TRANS(xvaddi_wu, LASX, gvec_xx_i, MO_32, tcg_gen_gvec_addi) +TRANS(xvaddi_du, LASX, gvec_xx_i, MO_64, tcg_gen_gvec_addi) +TRANS(xvsubi_bu, LASX, gvec_xsubi, MO_8) +TRANS(xvsubi_hu, LASX, gvec_xsubi, MO_16) +TRANS(xvsubi_wu,
LASX, gvec_xsubi, MO_32) +TRANS(xvsubi_du, LASX, gvec_xsubi, MO_64) =20 TRANS(vneg_b, LSX, gvec_vv, MO_8, tcg_gen_gvec_neg) TRANS(vneg_h, LSX, gvec_vv, MO_16, tcg_gen_gvec_neg) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658835818934.6035813340687; Wed, 13 Sep 2023 19:33:55 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4e-0003Ap-Gn; Wed, 13 Sep 2023 22:27:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4b-00039j-9S for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:17 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4Y-0004rt-Gm for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:16 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Axuep1bwJlePgmAA--.65452S3; Thu, 14 Sep 2023 10:27:01 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S19; Thu, 14 Sep 2023 10:27:00 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 17/57] target/loongarch: Implement xvneg Date: Thu, 14 Sep 2023 10:26:05 +0800 Message-Id: <20230914022645.1151356-18-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S19 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658836938100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVNEG.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 10 ++++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 19 +++++++++++++++---- 3 files changed, 30 insertions(+), 4 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index c48dca70b8..759172628f 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1320,6 +1320,11 @@ xvsubi_hu 0111 01101000 11001 ..... ..... ...= .. @vv_ui5 xvsubi_wu 0111 01101000 11010 ..... ..... ..... @vv_ui5 xvsubi_du 0111 01101000 11011 ..... ..... ..... @vv_ui5 =20 +xvneg_b 0111 01101001 11000 01100 ..... ..... @vv +xvneg_h 0111 01101001 11000 01101 ..... ..... @vv +xvneg_w 0111 01101001 11000 01110 ..... ..... 
@vv +xvneg_d 0111 01101001 11000 01111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 20df9c7c99..a7455840a0 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1718,6 +1718,11 @@ static void output_vv_i_x(DisasContext *ctx, arg_vv_i *a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, 0x%x", a->vd, a->vj, a->imm); } +static void output_vv_x(DisasContext *ctx, arg_vv *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d", a->vd, a->vj); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -1738,6 +1743,11 @@ INSN_LASX(xvsubi_hu, vv_i) INSN_LASX(xvsubi_wu, vv_i) INSN_LASX(xvsubi_du, vv_i) +INSN_LASX(xvneg_b, vv) +INSN_LASX(xvneg_h, vv) +INSN_LASX(xvneg_w, vv) +INSN_LASX(xvneg_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 689db12d71..f837d695d1 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -223,6 +223,10 @@ static bool gvec_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t vd_ofs = vec_full_offset(a->vd); uint32_t vj_ofs = vec_full_offset(a->vj); + if (!check_vec(ctx, oprsz)) { + return true; + } + func(mop, vd_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } @@ -232,13 +236,16 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { - if (!check_vec(ctx, 16)) { - return true; - } - return gvec_vv_vl(ctx, a, 16, mop, func); } +static bool gvec_xx(DisasContext *ctx, arg_vv *a, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t)) +{ + return gvec_vv_vl(ctx, a, 32,
mop, func); +} + static bool gvec_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, @@ -383,6 +390,10 @@ TRANS(vneg_b, LSX, gvec_vv, MO_8, tcg_gen_gvec_neg) TRANS(vneg_h, LSX, gvec_vv, MO_16, tcg_gen_gvec_neg) TRANS(vneg_w, LSX, gvec_vv, MO_32, tcg_gen_gvec_neg) TRANS(vneg_d, LSX, gvec_vv, MO_64, tcg_gen_gvec_neg) +TRANS(xvneg_b, LASX, gvec_xx, MO_8, tcg_gen_gvec_neg) +TRANS(xvneg_h, LASX, gvec_xx, MO_16, tcg_gen_gvec_neg) +TRANS(xvneg_w, LASX, gvec_xx, MO_32, tcg_gen_gvec_neg) +TRANS(xvneg_d, LASX, gvec_xx, MO_64, tcg_gen_gvec_neg) =20 TRANS(vsadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_ssadd) TRANS(vsadd_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_ssadd) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658562795311.85153639498446; Wed, 13 Sep 2023 19:29:22 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4g-0003Bi-7Y; Wed, 13 Sep 2023 22:27:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4c-0003AR-S0 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:18 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4a-0004sQ-2z for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:18 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxRuh3bwJle_gmAA--.21965S3; Thu, 14 Sep 2023 10:27:03 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by 
localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S20; Thu, 14 Sep 2023 10:27:01 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 18/57] target/loongarch: Implement xvsadd/xvssub Date: Thu, 14 Sep 2023 10:26:06 +0800 Message-Id: <20230914022645.1151356-19-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S20 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658563077100009 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSADD.{B/H/W/D}[U]; - XVSSUB.{B/H/W/D}[U]. 
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 18 ++++++++++++++++++ target/loongarch/disas.c | 17 +++++++++++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 17 +++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 759172628f..32f857ff7c 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1325,6 +1325,24 @@ xvneg_h 0111 01101001 11000 01101 ..... ...= .. @vv xvneg_w 0111 01101001 11000 01110 ..... ..... @vv xvneg_d 0111 01101001 11000 01111 ..... ..... @vv =20 +xvsadd_b 0111 01000100 01100 ..... ..... ..... @vvv +xvsadd_h 0111 01000100 01101 ..... ..... ..... @vvv +xvsadd_w 0111 01000100 01110 ..... ..... ..... @vvv +xvsadd_d 0111 01000100 01111 ..... ..... ..... @vvv +xvsadd_bu 0111 01000100 10100 ..... ..... ..... @vvv +xvsadd_hu 0111 01000100 10101 ..... ..... ..... @vvv +xvsadd_wu 0111 01000100 10110 ..... ..... ..... @vvv +xvsadd_du 0111 01000100 10111 ..... ..... ..... @vvv + +xvssub_b 0111 01000100 10000 ..... ..... ..... @vvv +xvssub_h 0111 01000100 10001 ..... ..... ..... @vvv +xvssub_w 0111 01000100 10010 ..... ..... ..... @vvv +xvssub_d 0111 01000100 10011 ..... ..... ..... @vvv +xvssub_bu 0111 01000100 11000 ..... ..... ..... @vvv +xvssub_hu 0111 01000100 11001 ..... ..... ..... @vvv +xvssub_wu 0111 01000100 11010 ..... ..... ..... @vvv +xvssub_du 0111 01000100 11011 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index a7455840a0..4ba4fbfc64 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1748,6 +1748,23 @@ INSN_LASX(xvneg_h, vv)
 INSN_LASX(xvneg_w, vv)
 INSN_LASX(xvneg_d, vv)
 
+INSN_LASX(xvsadd_b, vvv)
+INSN_LASX(xvsadd_h, vvv)
+INSN_LASX(xvsadd_w, vvv)
+INSN_LASX(xvsadd_d, vvv)
+INSN_LASX(xvsadd_bu, vvv)
+INSN_LASX(xvsadd_hu, vvv)
+INSN_LASX(xvsadd_wu, vvv)
+INSN_LASX(xvsadd_du, vvv)
+INSN_LASX(xvssub_b, vvv)
+INSN_LASX(xvssub_h, vvv)
+INSN_LASX(xvssub_w, vvv)
+INSN_LASX(xvssub_d, vvv)
+INSN_LASX(xvssub_bu, vvv)
+INSN_LASX(xvssub_hu, vvv)
+INSN_LASX(xvssub_wu, vvv)
+INSN_LASX(xvssub_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc
index f837d695d1..0f9a85bbb2 100644
--- a/target/loongarch/insn_trans/trans_vec.c.inc
+++ b/target/loongarch/insn_trans/trans_vec.c.inc
@@ -412,6 +412,23 @@ TRANS(vssub_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_ussub)
 TRANS(vssub_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_ussub)
 TRANS(vssub_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_ussub)
 
+TRANS(xvsadd_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_ssadd)
+TRANS(xvsadd_bu, LASX, gvec_xxx, MO_8, tcg_gen_gvec_usadd)
+TRANS(xvsadd_hu, LASX, gvec_xxx, MO_16, tcg_gen_gvec_usadd)
+TRANS(xvsadd_wu, LASX, gvec_xxx, MO_32, tcg_gen_gvec_usadd)
+TRANS(xvsadd_du, LASX, gvec_xxx, MO_64, tcg_gen_gvec_usadd)
+TRANS(xvssub_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_sssub)
+TRANS(xvssub_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_sssub)
+TRANS(xvssub_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_sssub)
+TRANS(xvssub_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_sssub)
+TRANS(xvssub_bu, LASX, gvec_xxx, MO_8, tcg_gen_gvec_ussub)
+TRANS(xvssub_hu, LASX, gvec_xxx, MO_16, tcg_gen_gvec_ussub)
+TRANS(xvssub_wu, LASX, gvec_xxx, MO_32, tcg_gen_gvec_ussub)
+TRANS(xvssub_du, LASX, gvec_xxx, MO_64, tcg_gen_gvec_ussub)
+
 TRANS(vhaddw_h_b, LSX, gen_vvv, gen_helper_vhaddw_h_b)
 TRANS(vhaddw_w_h, LSX, gen_vvv, gen_helper_vhaddw_w_h)
 TRANS(vhaddw_d_w, LSX, gen_vvv, gen_helper_vhaddw_d_w)
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 19/57] target/loongarch: Implement xvhaddw/xvhsubw
Date: Thu, 14 Sep 2023 10:26:07 +0800
Message-Id: <20230914022645.1151356-20-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVHADDW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU};
- XVHSUBW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}.
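The widening horizontal add can be pictured with a small standalone C sketch of the H.B case, modelled on the DO_ODD_EVEN loop in this patch: each result element is the sum of an odd-indexed element of vj and the matching even-indexed element of vk, and the element count follows from the operand size in bytes. The function name haddw_h_b and the use of plain arrays are assumptions made for illustration; the real helper works on VReg through accessors such as H() and B() and takes its size from simd_oprsz(desc).

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not QEMU code: widening horizontal add, H.B form.
 * oprsz is the vector size in bytes (16 for LSX, 32 for LASX), so the
 * number of 16-bit results is oprsz / (16 / 8).
 */
static void haddw_h_b(int16_t *vd, const int8_t *vj, const int8_t *vk,
                      int oprsz)
{
    int i;

    for (i = 0; i < oprsz / (16 / 8); i++) {
        /* Widen both inputs before adding so the sum cannot overflow. */
        vd[i] = (int16_t)((int16_t)vj[2 * i + 1] + (int16_t)vk[2 * i]);
    }
}
```

With oprsz = 16 the loop produces 8 halfword results; with oprsz = 32 it produces 16, which is why one helper can serve both vhaddw_h_b and xvhaddw_h_b.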
Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode               | 18 +++++++++++
 target/loongarch/disas.c                    | 17 +++++++++++
 target/loongarch/vec_helper.c               | 34 ++++++++++++++++-----
 target/loongarch/insn_trans/trans_vec.c.inc | 30 +++++++++++++++---
 4 files changed, 88 insertions(+), 11 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 32f857ff7c..ba0b36f4a7 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1343,6 +1343,24 @@ xvssub_hu 0111 01000100 11001 ..... ..... ..... @vvv
 xvssub_wu 0111 01000100 11010 ..... ..... ..... @vvv
 xvssub_du 0111 01000100 11011 ..... ..... ..... @vvv
 
+xvhaddw_h_b 0111 01000101 01000 ..... ..... ..... @vvv
+xvhaddw_w_h 0111 01000101 01001 ..... ..... ..... @vvv
+xvhaddw_d_w 0111 01000101 01010 ..... ..... ..... @vvv
+xvhaddw_q_d 0111 01000101 01011 ..... ..... ..... @vvv
+xvhaddw_hu_bu 0111 01000101 10000 ..... ..... ..... @vvv
+xvhaddw_wu_hu 0111 01000101 10001 ..... ..... ..... @vvv
+xvhaddw_du_wu 0111 01000101 10010 ..... ..... ..... @vvv
+xvhaddw_qu_du 0111 01000101 10011 ..... ..... ..... @vvv
+
+xvhsubw_h_b 0111 01000101 01100 ..... ..... ..... @vvv
+xvhsubw_w_h 0111 01000101 01101 ..... ..... ..... @vvv
+xvhsubw_d_w 0111 01000101 01110 ..... ..... ..... @vvv
+xvhsubw_q_d 0111 01000101 01111 ..... ..... ..... @vvv
+xvhsubw_hu_bu 0111 01000101 10100 ..... ..... ..... @vvv
+xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ..... @vvv
+xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @vvv
+xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @vvv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... .....
@vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 4ba4fbfc64..c810a52f0d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1765,6 +1765,23 @@ INSN_LASX(xvssub_hu, vvv)
 INSN_LASX(xvssub_wu, vvv)
 INSN_LASX(xvssub_du, vvv)
 
+INSN_LASX(xvhaddw_h_b, vvv)
+INSN_LASX(xvhaddw_w_h, vvv)
+INSN_LASX(xvhaddw_d_w, vvv)
+INSN_LASX(xvhaddw_q_d, vvv)
+INSN_LASX(xvhaddw_hu_bu, vvv)
+INSN_LASX(xvhaddw_wu_hu, vvv)
+INSN_LASX(xvhaddw_du_wu, vvv)
+INSN_LASX(xvhaddw_qu_du, vvv)
+INSN_LASX(xvhsubw_h_b, vvv)
+INSN_LASX(xvhsubw_w_h, vvv)
+INSN_LASX(xvhsubw_d_w, vvv)
+INSN_LASX(xvhsubw_q_d, vvv)
+INSN_LASX(xvhsubw_hu_bu, vvv)
+INSN_LASX(xvhsubw_wu_hu, vvv)
+INSN_LASX(xvhsubw_du_wu, vvv)
+INSN_LASX(xvhsubw_qu_du, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index c784f98ab2..2ce0ca41a7 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -13,6 +13,7 @@
 #include "internals.h"
 #include "tcg/tcg.h"
 #include "vec.h"
+#include "tcg/tcg-gvec-desc.h"
 
 #define DO_ADD(a, b) (a + b)
 #define DO_SUB(a, b) (a - b)
@@ -25,8 +26,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
     VReg *Vj = (VReg *)vj; \
     VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->E1(0)) TD; \
+    int oprsz = simd_oprsz(desc); \
  \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
         Vd->E1(i) = DO_OP((TD)Vj->E2(2 * i + 1), (TD)Vk->E2(2 * i)); \
     } \
 }
@@ -37,11 +39,16 @@ DO_ODD_EVEN(vhaddw_d_w, 64, D, W, DO_ADD)
 
 void HELPER(vhaddw_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0)));
+    for (i = 0; i < oprsz / 16 ; i++) {
+        Vd->Q(i) = int128_add(int128_makes64(Vj->D(2 *
i + 1)),
+                              int128_makes64(Vk->D(2 * i)));
+    }
 }
 
 DO_ODD_EVEN(vhsubw_h_b, 16, H, B, DO_SUB)
@@ -50,11 +57,16 @@ DO_ODD_EVEN(vhsubw_d_w, 64, D, W, DO_SUB)
 
 void HELPER(vhsubw_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_makes64(Vj->D(2 * i + 1)),
+                              int128_makes64(Vk->D(2 * i)));
+    }
 }
 
 DO_ODD_EVEN(vhaddw_hu_bu, 16, UH, UB, DO_ADD)
@@ -63,12 +75,16 @@ DO_ODD_EVEN(vhaddw_du_wu, 64, UD, UW, DO_ADD)
 
 void HELPER(vhaddw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(1)),
-                          int128_make64((uint64_t)Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i ++) {
+        Vd->Q(i) = int128_add(int128_make64(Vj->UD(2 * i + 1)),
+                              int128_make64(Vk->UD(2 * i)));
+    }
 }
 
 DO_ODD_EVEN(vhsubw_hu_bu, 16, UH, UB, DO_SUB)
@@ -77,12 +93,16 @@ DO_ODD_EVEN(vhsubw_du_wu, 64, UD, UW, DO_SUB)
 
 void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_make64((uint64_t)Vj->D(1)),
-                          int128_make64((uint64_t)Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_make64(Vj->UD(2 * i + 1)),
+                              int128_make64(Vk->UD(2 * i)));
+    }
 }
 
 #define DO_EVEN(NAME, BIT, E1, E2, DO_OP) \
diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc
index 0f9a85bbb2..7786f21022 100644
--- a/target/loongarch/insn_trans/trans_vec.c.inc
+++ b/target/loongarch/insn_trans/trans_vec.c.inc
@@ -97,6 +97,10 @@ static bool
gen_vvv_ptr(DisasContext *ctx, arg_vvv *a,
 static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz,
                        gen_helper_gvec_3 *fn)
 {
+    if (!check_vec(ctx, oprsz)) {
+        return true;
+    }
+
     tcg_gen_gvec_3_ool(vec_full_offset(a->vd),
                        vec_full_offset(a->vj),
                        vec_full_offset(a->vk),
@@ -106,13 +110,14 @@ static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz,
 
 static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn)
 {
-    if (!check_vec(ctx, 16)) {
-        return true;
-    }
-
     return gen_vvv_vl(ctx, a, 16, fn);
 }
 
+static bool gen_xxx(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn)
+{
+    return gen_vvv_vl(ctx, a, 32, fn);
+}
+
 static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz,
                           gen_helper_gvec_2_ptr *fn)
 {
@@ -446,6 +451,23 @@ TRANS(vhsubw_wu_hu, LSX, gen_vvv, gen_helper_vhsubw_wu_hu)
 TRANS(vhsubw_du_wu, LSX, gen_vvv, gen_helper_vhsubw_du_wu)
 TRANS(vhsubw_qu_du, LSX, gen_vvv, gen_helper_vhsubw_qu_du)
 
+TRANS(xvhaddw_h_b, LASX, gen_xxx, gen_helper_vhaddw_h_b)
+TRANS(xvhaddw_w_h, LASX, gen_xxx, gen_helper_vhaddw_w_h)
+TRANS(xvhaddw_d_w, LASX, gen_xxx, gen_helper_vhaddw_d_w)
+TRANS(xvhaddw_q_d, LASX, gen_xxx, gen_helper_vhaddw_q_d)
+TRANS(xvhaddw_hu_bu, LASX, gen_xxx, gen_helper_vhaddw_hu_bu)
+TRANS(xvhaddw_wu_hu, LASX, gen_xxx, gen_helper_vhaddw_wu_hu)
+TRANS(xvhaddw_du_wu, LASX, gen_xxx, gen_helper_vhaddw_du_wu)
+TRANS(xvhaddw_qu_du, LASX, gen_xxx, gen_helper_vhaddw_qu_du)
+TRANS(xvhsubw_h_b, LASX, gen_xxx, gen_helper_vhsubw_h_b)
+TRANS(xvhsubw_w_h, LASX, gen_xxx, gen_helper_vhsubw_w_h)
+TRANS(xvhsubw_d_w, LASX, gen_xxx, gen_helper_vhsubw_d_w)
+TRANS(xvhsubw_q_d, LASX, gen_xxx, gen_helper_vhsubw_q_d)
+TRANS(xvhsubw_hu_bu, LASX, gen_xxx, gen_helper_vhsubw_hu_bu)
+TRANS(xvhsubw_wu_hu, LASX, gen_xxx, gen_helper_vhsubw_wu_hu)
+TRANS(xvhsubw_du_wu, LASX, gen_xxx, gen_helper_vhsubw_du_wu)
+TRANS(xvhsubw_qu_du, LASX, gen_xxx, gen_helper_vhsubw_qu_du)
+
 static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a,
TCGv_vec b)
 {
     TCGv_vec t1, t2;
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 20/57] target/loongarch: Implement xvaddw/xvsubw
Date: Thu, 14 Sep 2023 10:26:08 +0800
Message-Id: <20230914022645.1151356-21-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVSUBW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode               |  45 ++++++++
 target/loongarch/disas.c                    |  43 +++++++
 target/loongarch/vec_helper.c               | 120 ++++++++++++++------
 target/loongarch/insn_trans/trans_vec.c.inc |  41 +++++++
 4 files changed, 215 insertions(+), 34 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index ba0b36f4a7..e1d8b30179 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1361,6 +1361,51 @@ xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ..... @vvv
 xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @vvv
 xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @vvv
 
+xvaddwev_h_b 0111 01000001 11100 ..... ..... ..... @vvv
+xvaddwev_w_h 0111 01000001 11101 ..... ..... .....
@vvv
+xvaddwev_d_w 0111 01000001 11110 ..... ..... ..... @vvv
+xvaddwev_q_d 0111 01000001 11111 ..... ..... ..... @vvv
+xvaddwod_h_b 0111 01000010 00100 ..... ..... ..... @vvv
+xvaddwod_w_h 0111 01000010 00101 ..... ..... ..... @vvv
+xvaddwod_d_w 0111 01000010 00110 ..... ..... ..... @vvv
+xvaddwod_q_d 0111 01000010 00111 ..... ..... ..... @vvv
+
+xvsubwev_h_b 0111 01000010 00000 ..... ..... ..... @vvv
+xvsubwev_w_h 0111 01000010 00001 ..... ..... ..... @vvv
+xvsubwev_d_w 0111 01000010 00010 ..... ..... ..... @vvv
+xvsubwev_q_d 0111 01000010 00011 ..... ..... ..... @vvv
+xvsubwod_h_b 0111 01000010 01000 ..... ..... ..... @vvv
+xvsubwod_w_h 0111 01000010 01001 ..... ..... ..... @vvv
+xvsubwod_d_w 0111 01000010 01010 ..... ..... ..... @vvv
+xvsubwod_q_d 0111 01000010 01011 ..... ..... ..... @vvv
+
+xvaddwev_h_bu 0111 01000010 11100 ..... ..... ..... @vvv
+xvaddwev_w_hu 0111 01000010 11101 ..... ..... ..... @vvv
+xvaddwev_d_wu 0111 01000010 11110 ..... ..... ..... @vvv
+xvaddwev_q_du 0111 01000010 11111 ..... ..... ..... @vvv
+xvaddwod_h_bu 0111 01000011 00100 ..... ..... ..... @vvv
+xvaddwod_w_hu 0111 01000011 00101 ..... ..... ..... @vvv
+xvaddwod_d_wu 0111 01000011 00110 ..... ..... ..... @vvv
+xvaddwod_q_du 0111 01000011 00111 ..... ..... ..... @vvv
+
+xvsubwev_h_bu 0111 01000011 00000 ..... ..... ..... @vvv
+xvsubwev_w_hu 0111 01000011 00001 ..... ..... ..... @vvv
+xvsubwev_d_wu 0111 01000011 00010 ..... ..... ..... @vvv
+xvsubwev_q_du 0111 01000011 00011 ..... ..... ..... @vvv
+xvsubwod_h_bu 0111 01000011 01000 ..... ..... ..... @vvv
+xvsubwod_w_hu 0111 01000011 01001 ..... ..... ..... @vvv
+xvsubwod_d_wu 0111 01000011 01010 ..... ..... ..... @vvv
+xvsubwod_q_du 0111 01000011 01011 ..... ..... ..... @vvv
+
+xvaddwev_h_bu_b 0111 01000011 11100 ..... ..... ..... @vvv
+xvaddwev_w_hu_h 0111 01000011 11101 ..... ..... ..... @vvv
+xvaddwev_d_wu_w 0111 01000011 11110 ..... ..... ..... @vvv
+xvaddwev_q_du_d 0111 01000011 11111 ..... ..... .....
@vvv
+xvaddwod_h_bu_b 0111 01000100 00000 ..... ..... ..... @vvv
+xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ..... @vvv
+xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @vvv
+xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @vvv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index c810a52f0d..e3e57e1d05 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1782,6 +1782,49 @@ INSN_LASX(xvhsubw_wu_hu, vvv)
 INSN_LASX(xvhsubw_du_wu, vvv)
 INSN_LASX(xvhsubw_qu_du, vvv)
 
+INSN_LASX(xvaddwev_h_b, vvv)
+INSN_LASX(xvaddwev_w_h, vvv)
+INSN_LASX(xvaddwev_d_w, vvv)
+INSN_LASX(xvaddwev_q_d, vvv)
+INSN_LASX(xvaddwod_h_b, vvv)
+INSN_LASX(xvaddwod_w_h, vvv)
+INSN_LASX(xvaddwod_d_w, vvv)
+INSN_LASX(xvaddwod_q_d, vvv)
+INSN_LASX(xvsubwev_h_b, vvv)
+INSN_LASX(xvsubwev_w_h, vvv)
+INSN_LASX(xvsubwev_d_w, vvv)
+INSN_LASX(xvsubwev_q_d, vvv)
+INSN_LASX(xvsubwod_h_b, vvv)
+INSN_LASX(xvsubwod_w_h, vvv)
+INSN_LASX(xvsubwod_d_w, vvv)
+INSN_LASX(xvsubwod_q_d, vvv)
+
+INSN_LASX(xvaddwev_h_bu, vvv)
+INSN_LASX(xvaddwev_w_hu, vvv)
+INSN_LASX(xvaddwev_d_wu, vvv)
+INSN_LASX(xvaddwev_q_du, vvv)
+INSN_LASX(xvaddwod_h_bu, vvv)
+INSN_LASX(xvaddwod_w_hu, vvv)
+INSN_LASX(xvaddwod_d_wu, vvv)
+INSN_LASX(xvaddwod_q_du, vvv)
+INSN_LASX(xvsubwev_h_bu, vvv)
+INSN_LASX(xvsubwev_w_hu, vvv)
+INSN_LASX(xvsubwev_d_wu, vvv)
+INSN_LASX(xvsubwev_q_du, vvv)
+INSN_LASX(xvsubwod_h_bu, vvv)
+INSN_LASX(xvsubwod_w_hu, vvv)
+INSN_LASX(xvsubwod_d_wu, vvv)
+INSN_LASX(xvsubwod_q_du, vvv)
+
+INSN_LASX(xvaddwev_h_bu_b, vvv)
+INSN_LASX(xvaddwev_w_hu_h, vvv)
+INSN_LASX(xvaddwev_d_wu_w, vvv)
+INSN_LASX(xvaddwev_q_du_d, vvv)
+INSN_LASX(xvaddwod_h_bu_b, vvv)
+INSN_LASX(xvaddwod_w_hu_h, vvv)
+INSN_LASX(xvaddwod_d_wu_w, vvv)
+INSN_LASX(xvaddwod_q_du_d, vvv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 2ce0ca41a7..fc3b07e8d2 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -106,133 +106,173 @@ void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc)
 }
 
 #define DO_EVEN(NAME, BIT, E1, E2, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
     VReg *Vd = (VReg *)vd; \
     VReg *Vj = (VReg *)vj; \
     VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->E1(0)) TD; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
         Vd->E1(i) = DO_OP((TD)Vj->E2(2 * i) ,(TD)Vk->E2(2 * i)); \
     } \
 }
 
 #define DO_ODD(NAME, BIT, E1, E2, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
     VReg *Vd = (VReg *)vd; \
     VReg *Vj = (VReg *)vj; \
     VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->E1(0)) TD; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
         Vd->E1(i) = DO_OP((TD)Vj->E2(2 * i + 1), (TD)Vk->E2(2 * i + 1)); \
     } \
 }
 
-void HELPER(vaddwev_q_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vaddwev_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_makes64(Vj->D(0)), int128_makes64(Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_add(int128_makes64(Vj->D(2 * i)),
+                              int128_makes64(Vk->D(2 * i)));
+    }
 }
 
 DO_EVEN(vaddwev_h_b, 16, H, B, DO_ADD)
 DO_EVEN(vaddwev_w_h, 32, W, H, DO_ADD)
 DO_EVEN(vaddwev_d_w, 64, D, W, DO_ADD)
 
-void HELPER(vaddwod_q_d)(void *vd, void *vj, void *vk,
uint32_t v)
+void HELPER(vaddwod_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(1)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_add(int128_makes64(Vj->D(2 * i +1)),
+                              int128_makes64(Vk->D(2 * i +1)));
+    }
 }
 
 DO_ODD(vaddwod_h_b, 16, H, B, DO_ADD)
 DO_ODD(vaddwod_w_h, 32, W, H, DO_ADD)
 DO_ODD(vaddwod_d_w, 64, D, W, DO_ADD)
 
-void HELPER(vsubwev_q_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vsubwev_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_makes64(Vj->D(0)), int128_makes64(Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_makes64(Vj->D(2 * i)),
+                              int128_makes64(Vk->D(2 * i)));
+    }
 }
 
 DO_EVEN(vsubwev_h_b, 16, H, B, DO_SUB)
 DO_EVEN(vsubwev_w_h, 32, W, H, DO_SUB)
 DO_EVEN(vsubwev_d_w, 64, D, W, DO_SUB)
 
-void HELPER(vsubwod_q_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vsubwod_q_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(1)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_makes64(Vj->D(2 * i + 1)),
+                              int128_makes64(Vk->D(2 * i + 1)));
+    }
 }
 
 DO_ODD(vsubwod_h_b, 16, H, B, DO_SUB)
 DO_ODD(vsubwod_w_h, 32, W, H, DO_SUB)
 DO_ODD(vsubwod_d_w, 64, D, W, DO_SUB)
 
-void HELPER(vaddwev_q_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vaddwev_q_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz
= simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(0)),
-                          int128_make64((uint64_t)Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_add(int128_make64(Vj->UD(2 * i)),
+                              int128_make64(Vk->UD(2 * i)));
+    }
 }
 
 DO_EVEN(vaddwev_h_bu, 16, UH, UB, DO_ADD)
 DO_EVEN(vaddwev_w_hu, 32, UW, UH, DO_ADD)
 DO_EVEN(vaddwev_d_wu, 64, UD, UW, DO_ADD)
 
-void HELPER(vaddwod_q_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vaddwod_q_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(1)),
-                          int128_make64((uint64_t)Vk->D(1)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_add(int128_make64(Vj->UD(2 * i + 1)),
+                              int128_make64(Vk->UD(2 * i + 1)));
+    }
 }
 
 DO_ODD(vaddwod_h_bu, 16, UH, UB, DO_ADD)
 DO_ODD(vaddwod_w_hu, 32, UW, UH, DO_ADD)
 DO_ODD(vaddwod_d_wu, 64, UD, UW, DO_ADD)
 
-void HELPER(vsubwev_q_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vsubwev_q_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_make64((uint64_t)Vj->D(0)),
-                          int128_make64((uint64_t)Vk->D(0)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_make64(Vj->UD(2 * i)),
+                              int128_make64(Vk->UD(2 * i)));
+    }
 }
 
 DO_EVEN(vsubwev_h_bu, 16, UH, UB, DO_SUB)
 DO_EVEN(vsubwev_w_hu, 32, UW, UH, DO_SUB)
 DO_EVEN(vsubwev_d_wu, 64, UD, UW, DO_SUB)
 
-void HELPER(vsubwod_q_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vsubwod_q_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_sub(int128_make64((uint64_t)Vj->D(1)),
-
int128_make64((uint64_t)Vk->D(1)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_sub(int128_make64(Vj->UD(2 * i + 1)),
+                              int128_make64(Vk->UD(2 * i + 1)));
+    }
 }
 
 DO_ODD(vsubwod_h_bu, 16, UH, UB, DO_SUB)
@@ -240,7 +280,7 @@ DO_ODD(vsubwod_w_hu, 32, UW, UH, DO_SUB)
 DO_ODD(vsubwod_d_wu, 64, UD, UW, DO_SUB)
 
 #define DO_EVEN_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
     VReg *Vd = (VReg *)vd; \
@@ -248,13 +288,15 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
     VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->ES1(0)) TDS; \
     typedef __typeof(Vd->EU1(0)) TDU; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
         Vd->ES1(i) = DO_OP((TDU)Vj->EU2(2 * i) ,(TDS)Vk->ES2(2 * i)); \
     } \
 }
 
 #define DO_ODD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
     int i; \
     VReg *Vd = (VReg *)vd; \
@@ -262,33 +304,43 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
     VReg *Vk = (VReg *)vk; \
     typedef __typeof(Vd->ES1(0)) TDS; \
     typedef __typeof(Vd->EU1(0)) TDU; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
+    int oprsz = simd_oprsz(desc); \
+ \
+    for (i = 0; i < oprsz / (BIT / 8); i++) { \
         Vd->ES1(i) = DO_OP((TDU)Vj->EU2(2 * i + 1), (TDS)Vk->ES2(2 * i + 1)); \
     } \
 }
 
-void HELPER(vaddwev_q_du_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vaddwev_q_du_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(0)),
-                          int128_makes64(Vk->D(0)));
+    for (i = 0; i < oprsz / 16;
i++) {
+        Vd->Q(i) = int128_add(int128_make64(Vj->UD(2 * i)),
+                              int128_makes64(Vk->D(2 * i)));
+    }
 }
 
 DO_EVEN_U_S(vaddwev_h_bu_b, 16, H, UH, B, UB, DO_ADD)
 DO_EVEN_U_S(vaddwev_w_hu_h, 32, W, UW, H, UH, DO_ADD)
 DO_EVEN_U_S(vaddwev_d_wu_w, 64, D, UD, W, UW, DO_ADD)
 
-void HELPER(vaddwod_q_du_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vaddwod_q_du_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(1)),
-                          int128_makes64(Vk->D(1)));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_add(int128_make64(Vj->UD(2 * i + 1)),
+                              int128_makes64(Vk->D(2 * i + 1)));
+    }
 }
 
 DO_ODD_U_S(vaddwod_h_bu_b, 16, H, UH, B, UB, DO_ADD)
diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc
index 7786f21022..322d43d50c 100644
--- a/target/loongarch/insn_trans/trans_vec.c.inc
+++ b/target/loongarch/insn_trans/trans_vec.c.inc
@@ -550,6 +550,10 @@ TRANS(vaddwev_h_b, LSX, gvec_vvv, MO_8, do_vaddwev_s)
 TRANS(vaddwev_w_h, LSX, gvec_vvv, MO_16, do_vaddwev_s)
 TRANS(vaddwev_d_w, LSX, gvec_vvv, MO_32, do_vaddwev_s)
 TRANS(vaddwev_q_d, LSX, gvec_vvv, MO_64, do_vaddwev_s)
+TRANS(xvaddwev_h_b, LASX, gvec_xxx, MO_8, do_vaddwev_s)
+TRANS(xvaddwev_w_h, LASX, gvec_xxx, MO_16, do_vaddwev_s)
+TRANS(xvaddwev_d_w, LASX, gvec_xxx, MO_32, do_vaddwev_s)
+TRANS(xvaddwev_q_d, LASX, gvec_xxx, MO_64, do_vaddwev_s)
 
 static void gen_vaddwod_w_h(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
 {
@@ -629,6 +633,11 @@ TRANS(vaddwod_h_b, LSX, gvec_vvv, MO_8, do_vaddwod_s)
 TRANS(vaddwod_w_h, LSX, gvec_vvv, MO_16, do_vaddwod_s)
 TRANS(vaddwod_d_w, LSX, gvec_vvv, MO_32, do_vaddwod_s)
 TRANS(vaddwod_q_d, LSX, gvec_vvv, MO_64, do_vaddwod_s)
+TRANS(xvaddwod_h_b, LASX, gvec_xxx, MO_8, do_vaddwod_s)
+TRANS(xvaddwod_w_h, LASX, gvec_xxx, MO_16, do_vaddwod_s)
+TRANS(xvaddwod_d_w, LASX,
gvec_xxx, MO_32, do_vaddwod_s)
+TRANS(xvaddwod_q_d, LASX, gvec_xxx, MO_64, do_vaddwod_s)
+
 
 static void gen_vsubwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -712,6 +721,10 @@ TRANS(vsubwev_h_b, LSX, gvec_vvv, MO_8, do_vsubwev_s)
 TRANS(vsubwev_w_h, LSX, gvec_vvv, MO_16, do_vsubwev_s)
 TRANS(vsubwev_d_w, LSX, gvec_vvv, MO_32, do_vsubwev_s)
 TRANS(vsubwev_q_d, LSX, gvec_vvv, MO_64, do_vsubwev_s)
+TRANS(xvsubwev_h_b, LASX, gvec_xxx, MO_8, do_vsubwev_s)
+TRANS(xvsubwev_w_h, LASX, gvec_xxx, MO_16, do_vsubwev_s)
+TRANS(xvsubwev_d_w, LASX, gvec_xxx, MO_32, do_vsubwev_s)
+TRANS(xvsubwev_q_d, LASX, gvec_xxx, MO_64, do_vsubwev_s)
 
 static void gen_vsubwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -791,6 +804,10 @@ TRANS(vsubwod_h_b, LSX, gvec_vvv, MO_8, do_vsubwod_s)
 TRANS(vsubwod_w_h, LSX, gvec_vvv, MO_16, do_vsubwod_s)
 TRANS(vsubwod_d_w, LSX, gvec_vvv, MO_32, do_vsubwod_s)
 TRANS(vsubwod_q_d, LSX, gvec_vvv, MO_64, do_vsubwod_s)
+TRANS(xvsubwod_h_b, LASX, gvec_xxx, MO_8, do_vsubwod_s)
+TRANS(xvsubwod_w_h, LASX, gvec_xxx, MO_16, do_vsubwod_s)
+TRANS(xvsubwod_d_w, LASX, gvec_xxx, MO_32, do_vsubwod_s)
+TRANS(xvsubwod_q_d, LASX, gvec_xxx, MO_64, do_vsubwod_s)
 
 static void gen_vaddwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -866,6 +883,10 @@ TRANS(vaddwev_h_bu, LSX, gvec_vvv, MO_8, do_vaddwev_u)
 TRANS(vaddwev_w_hu, LSX, gvec_vvv, MO_16, do_vaddwev_u)
 TRANS(vaddwev_d_wu, LSX, gvec_vvv, MO_32, do_vaddwev_u)
 TRANS(vaddwev_q_du, LSX, gvec_vvv, MO_64, do_vaddwev_u)
+TRANS(xvaddwev_h_bu, LASX, gvec_xxx, MO_8, do_vaddwev_u)
+TRANS(xvaddwev_w_hu, LASX, gvec_xxx, MO_16, do_vaddwev_u)
+TRANS(xvaddwev_d_wu, LASX, gvec_xxx, MO_32, do_vaddwev_u)
+TRANS(xvaddwev_q_du, LASX, gvec_xxx, MO_64, do_vaddwev_u)
 
 static void gen_vaddwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -945,6 +966,10 @@ TRANS(vaddwod_h_bu, LSX, gvec_vvv, MO_8, do_vaddwod_u)
 TRANS(vaddwod_w_hu, LSX, gvec_vvv, MO_16, do_vaddwod_u)
TRANS(vaddwod_d_wu, LSX, gvec_vvv, MO_32, do_vaddwod_u) TRANS(vaddwod_q_du, LSX, gvec_vvv, MO_64, do_vaddwod_u) +TRANS(xvaddwod_h_bu, LASX, gvec_xxx, MO_8, do_vaddwod_u) +TRANS(xvaddwod_w_hu, LASX, gvec_xxx, MO_16, do_vaddwod_u) +TRANS(xvaddwod_d_wu, LASX, gvec_xxx, MO_32, do_vaddwod_u) +TRANS(xvaddwod_q_du, LASX, gvec_xxx, MO_64, do_vaddwod_u) =20 static void gen_vsubwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { @@ -1020,6 +1045,10 @@ TRANS(vsubwev_h_bu, LSX, gvec_vvv, MO_8, do_vsubwev_= u) TRANS(vsubwev_w_hu, LSX, gvec_vvv, MO_16, do_vsubwev_u) TRANS(vsubwev_d_wu, LSX, gvec_vvv, MO_32, do_vsubwev_u) TRANS(vsubwev_q_du, LSX, gvec_vvv, MO_64, do_vsubwev_u) +TRANS(xvsubwev_h_bu, LASX, gvec_xxx, MO_8, do_vsubwev_u) +TRANS(xvsubwev_w_hu, LASX, gvec_xxx, MO_16, do_vsubwev_u) +TRANS(xvsubwev_d_wu, LASX, gvec_xxx, MO_32, do_vsubwev_u) +TRANS(xvsubwev_q_du, LASX, gvec_xxx, MO_64, do_vsubwev_u) =20 static void gen_vsubwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { @@ -1099,6 +1128,10 @@ TRANS(vsubwod_h_bu, LSX, gvec_vvv, MO_8, do_vsubwod_= u) TRANS(vsubwod_w_hu, LSX, gvec_vvv, MO_16, do_vsubwod_u) TRANS(vsubwod_d_wu, LSX, gvec_vvv, MO_32, do_vsubwod_u) TRANS(vsubwod_q_du, LSX, gvec_vvv, MO_64, do_vsubwod_u) +TRANS(xvsubwod_h_bu, LASX, gvec_xxx, MO_8, do_vsubwod_u) +TRANS(xvsubwod_w_hu, LASX, gvec_xxx, MO_16, do_vsubwod_u) +TRANS(xvsubwod_d_wu, LASX, gvec_xxx, MO_32, do_vsubwod_u) +TRANS(xvsubwod_q_du, LASX, gvec_xxx, MO_64, do_vsubwod_u) =20 static void gen_vaddwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_ve= c b) { @@ -1182,6 +1215,10 @@ TRANS(vaddwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vaddwe= v_u_s) TRANS(vaddwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vaddwev_u_s) TRANS(vaddwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vaddwev_u_s) TRANS(vaddwev_q_du_d, LSX, gvec_vvv, MO_64, do_vaddwev_u_s) +TRANS(xvaddwev_h_bu_b, LASX, gvec_xxx, MO_8, do_vaddwev_u_s) +TRANS(xvaddwev_w_hu_h, LASX, gvec_xxx, MO_16, do_vaddwev_u_s) +TRANS(xvaddwev_d_wu_w, LASX, gvec_xxx, 
MO_32, do_vaddwev_u_s) +TRANS(xvaddwev_q_du_d, LASX, gvec_xxx, MO_64, do_vaddwev_u_s) =20 static void gen_vaddwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_ve= c b) { @@ -1262,6 +1299,10 @@ TRANS(vaddwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vaddwo= d_u_s) TRANS(vaddwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vaddwod_u_s) TRANS(vaddwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vaddwod_u_s) TRANS(vaddwod_q_du_d, LSX, gvec_vvv, MO_64, do_vaddwod_u_s) +TRANS(xvaddwod_h_bu_b, LASX, gvec_xxx, MO_8, do_vaddwod_u_s) +TRANS(xvaddwod_w_hu_h, LASX, gvec_xxx, MO_16, do_vaddwod_u_s) +TRANS(xvaddwod_d_wu_w, LASX, gvec_xxx, MO_32, do_vaddwod_u_s) +TRANS(xvaddwod_q_du_d, LASX, gvec_xxx, MO_64, do_vaddwod_u_s) =20 static void do_vavg(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b, void (*gen_shr_vec)(unsigned, TCGv_vec, --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658987304149.13209600625385; Wed, 13 Sep 2023 19:36:27 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5I-0003bz-Q0; Wed, 13 Sep 2023 22:28:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4f-0003Bj-Tu for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:22 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4d-0004uf-89 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:21 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Ax1fB5bwJlgPgmAA--.9916S3; Thu, 14 Sep 2023 10:27:05 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S23; Thu, 14 Sep 2023 10:27:05 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 21/57] target/loongarch: Implement xvavg/xvavgr Date: Thu, 14 Sep 2023 10:26:09 +0800 Message-Id: <20230914022645.1151356-22-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S23 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658988895100005 Content-Type: text/plain; charset="utf-8" This patch includes: - XVAVG.{B/H/W/D}[U]; - XVAVGR.{B/H/W/D}[U].
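
For reference, the truncating (XVAVG) and rounding (XVAVGR) averages listed above can be computed per element without widening. The following is a minimal standalone sketch, not the QEMU helper itself. The names avg_s8, avgr_s8 and avg_selfcheck are made up for illustration. The two expressions mirror the DO_VAVG and DO_VAVGR macros in vec_helper.c, and the sketch assumes that right-shifting a negative value is an arithmetic shift, as it is with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: floor((a + b) / 2) without overflowing the element. */
int8_t avg_s8(int8_t a, int8_t b)
{
    /* Assumes >> on a negative value is an arithmetic shift. */
    return (int8_t)((a >> 1) + (b >> 1) + (a & b & 1));
}

/* Hypothetical name: floor((a + b + 1) / 2), the rounding variant. */
int8_t avgr_s8(int8_t a, int8_t b)
{
    return (int8_t)((a >> 1) + (b >> 1) + ((a | b) & 1));
}

/* Floor division by 2 on a widened value, used only as a reference. */
static int floor_half(int v)
{
    return (v >= 0) ? v / 2 : -((-v + 1) / 2);
}

/* Usage example: compare both sketches against the widened reference. */
void avg_selfcheck(void)
{
    for (int a = -128; a <= 127; a++) {
        for (int b = -128; b <= 127; b++) {
            assert(avg_s8((int8_t)a, (int8_t)b) == floor_half(a + b));
            assert(avgr_s8((int8_t)a, (int8_t)b) == floor_half(a + b + 1));
        }
    }
}
```

Halving each operand before adding keeps the intermediate sum inside the element's range, so the same expression works for the 64-bit (D) variants where no wider type is available.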
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 17 ++++++++++++++++ target/loongarch/disas.c | 17 ++++++++++++++++ target/loongarch/vec_helper.c | 22 +++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 16 +++++++++++++++ 4 files changed, 62 insertions(+), 10 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index e1d8b30179..a2cb39750d 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1406,6 +1406,23 @@ xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ...= .. @vvv xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @vvv xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @vvv =20 +xvavg_b 0111 01000110 01000 ..... ..... ..... @vvv +xvavg_h 0111 01000110 01001 ..... ..... ..... @vvv +xvavg_w 0111 01000110 01010 ..... ..... ..... @vvv +xvavg_d 0111 01000110 01011 ..... ..... ..... @vvv +xvavg_bu 0111 01000110 01100 ..... ..... ..... @vvv +xvavg_hu 0111 01000110 01101 ..... ..... ..... @vvv +xvavg_wu 0111 01000110 01110 ..... ..... ..... @vvv +xvavg_du 0111 01000110 01111 ..... ..... ..... @vvv +xvavgr_b 0111 01000110 10000 ..... ..... ..... @vvv +xvavgr_h 0111 01000110 10001 ..... ..... ..... @vvv +xvavgr_w 0111 01000110 10010 ..... ..... ..... @vvv +xvavgr_d 0111 01000110 10011 ..... ..... ..... @vvv +xvavgr_bu 0111 01000110 10100 ..... ..... ..... @vvv +xvavgr_hu 0111 01000110 10101 ..... ..... ..... @vvv +xvavgr_wu 0111 01000110 10110 ..... ..... ..... @vvv +xvavgr_du 0111 01000110 10111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e3e57e1d05..f9d9583fcc 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1825,6 +1825,23 @@ INSN_LASX(xvaddwod_w_hu_h, vvv) INSN_LASX(xvaddwod_d_wu_w, vvv) INSN_LASX(xvaddwod_q_du_d, vvv) =20 +INSN_LASX(xvavg_b, vvv) +INSN_LASX(xvavg_h, vvv) +INSN_LASX(xvavg_w, vvv) +INSN_LASX(xvavg_d, vvv) +INSN_LASX(xvavg_bu, vvv) +INSN_LASX(xvavg_hu, vvv) +INSN_LASX(xvavg_wu, vvv) +INSN_LASX(xvavg_du, vvv) +INSN_LASX(xvavgr_b, vvv) +INSN_LASX(xvavgr_h, vvv) +INSN_LASX(xvavgr_w, vvv) +INSN_LASX(xvavgr_d, vvv) +INSN_LASX(xvavgr_bu, vvv) +INSN_LASX(xvavgr_hu, vvv) +INSN_LASX(xvavgr_wu, vvv) +INSN_LASX(xvavgr_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index fc3b07e8d2..35b207aae1 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -350,16 +350,18 @@ DO_ODD_U_S(vaddwod_d_wu_w, 64, D, UD, W, UW, DO_ADD) #define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1)) #define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1)) =20 -#define DO_3OP(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ - } \ +#define DO_3OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ + } \ } =20 DO_3OP(vavg_b, 8, B, DO_VAVG) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 322d43d50c..050a96088d 100644 --- 
a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1419,6 +1419,14 @@ TRANS(vavg_bu, LSX, gvec_vvv, MO_8, do_vavg_u) TRANS(vavg_hu, LSX, gvec_vvv, MO_16, do_vavg_u) TRANS(vavg_wu, LSX, gvec_vvv, MO_32, do_vavg_u) TRANS(vavg_du, LSX, gvec_vvv, MO_64, do_vavg_u) +TRANS(xvavg_b, LASX, gvec_xxx, MO_8, do_vavg_s) +TRANS(xvavg_h, LASX, gvec_xxx, MO_16, do_vavg_s) +TRANS(xvavg_w, LASX, gvec_xxx, MO_32, do_vavg_s) +TRANS(xvavg_d, LASX, gvec_xxx, MO_64, do_vavg_s) +TRANS(xvavg_bu, LASX, gvec_xxx, MO_8, do_vavg_u) +TRANS(xvavg_hu, LASX, gvec_xxx, MO_16, do_vavg_u) +TRANS(xvavg_wu, LASX, gvec_xxx, MO_32, do_vavg_u) +TRANS(xvavg_du, LASX, gvec_xxx, MO_64, do_vavg_u) =20 static void do_vavgr_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz) @@ -1500,6 +1508,14 @@ TRANS(vavgr_bu, LSX, gvec_vvv, MO_8, do_vavgr_u) TRANS(vavgr_hu, LSX, gvec_vvv, MO_16, do_vavgr_u) TRANS(vavgr_wu, LSX, gvec_vvv, MO_32, do_vavgr_u) TRANS(vavgr_du, LSX, gvec_vvv, MO_64, do_vavgr_u) +TRANS(xvavgr_b, LASX, gvec_xxx, MO_8, do_vavgr_s) +TRANS(xvavgr_h, LASX, gvec_xxx, MO_16, do_vavgr_s) +TRANS(xvavgr_w, LASX, gvec_xxx, MO_32, do_vavgr_s) +TRANS(xvavgr_d, LASX, gvec_xxx, MO_64, do_vavgr_s) +TRANS(xvavgr_bu, LASX, gvec_xxx, MO_8, do_vavgr_u) +TRANS(xvavgr_hu, LASX, gvec_xxx, MO_16, do_vavgr_u) +TRANS(xvavgr_wu, LASX, gvec_xxx, MO_32, do_vavgr_u) +TRANS(xvavgr_du, LASX, gvec_xxx, MO_64, do_vavgr_u) =20 static void gen_vabsd_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16946585368081005.2183577148679; Wed, 13 Sep 2023 
19:28:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4k-0003EW-0s; Wed, 13 Sep 2023 22:27:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4f-0003Ba-LH for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:21 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4d-0004up-8B for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:21 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxVuh6bwJlg_gmAA--.40123S3; Thu, 14 Sep 2023 10:27:06 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S24; Thu, 14 Sep 2023 10:27:05 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 22/57] target/loongarch: Implement xvabsd Date: Thu, 14 Sep 2023 10:26:10 +0800 Message-Id: <20230914022645.1151356-23-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S24 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: 
-18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658537497100005 Content-Type: text/plain; charset="utf-8" This patch includes: - XVABSD.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 +++++++++ target/loongarch/disas.c | 9 +++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 8 ++++++++ 3 files changed, 26 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index a2cb39750d..c086ee9b22 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1423,6 +1423,15 @@ xvavgr_hu 0111 01000110 10101 ..... ..... ...= .. @vvv xvavgr_wu 0111 01000110 10110 ..... ..... ..... @vvv xvavgr_du 0111 01000110 10111 ..... ..... ..... @vvv =20 +xvabsd_b 0111 01000110 00000 ..... ..... ..... @vvv +xvabsd_h 0111 01000110 00001 ..... ..... ..... @vvv +xvabsd_w 0111 01000110 00010 ..... ..... ..... @vvv +xvabsd_d 0111 01000110 00011 ..... ..... ..... @vvv +xvabsd_bu 0111 01000110 00100 ..... ..... ..... @vvv +xvabsd_hu 0111 01000110 00101 ..... ..... ..... @vvv +xvabsd_wu 0111 01000110 00110 ..... ..... ..... @vvv +xvabsd_du 0111 01000110 00111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index f9d9583fcc..bbe7ad8322 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1842,6 +1842,15 @@ INSN_LASX(xvavgr_hu, vvv) INSN_LASX(xvavgr_wu, vvv) INSN_LASX(xvavgr_du, vvv) =20 +INSN_LASX(xvabsd_b, vvv) +INSN_LASX(xvabsd_h, vvv) +INSN_LASX(xvabsd_w, vvv) +INSN_LASX(xvabsd_d, vvv) +INSN_LASX(xvabsd_bu, vvv) +INSN_LASX(xvabsd_hu, vvv) +INSN_LASX(xvabsd_wu, vvv) +INSN_LASX(xvabsd_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 050a96088d..37b842075d 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1611,6 +1611,14 @@ TRANS(vabsd_bu, LSX, gvec_vvv, MO_8, do_vabsd_u) TRANS(vabsd_hu, LSX, gvec_vvv, MO_16, do_vabsd_u) TRANS(vabsd_wu, LSX, gvec_vvv, MO_32, do_vabsd_u) TRANS(vabsd_du, LSX, gvec_vvv, MO_64, do_vabsd_u) +TRANS(xvabsd_b, LASX, gvec_xxx, MO_8, do_vabsd_s) +TRANS(xvabsd_h, LASX, gvec_xxx, MO_16, do_vabsd_s) +TRANS(xvabsd_w, LASX, gvec_xxx, MO_32, do_vabsd_s) +TRANS(xvabsd_d, LASX, gvec_xxx, MO_64, do_vabsd_s) +TRANS(xvabsd_bu, LASX, gvec_xxx, MO_8, do_vabsd_u) +TRANS(xvabsd_hu, LASX, gvec_xxx, MO_16, do_vabsd_u) +TRANS(xvabsd_wu, LASX, gvec_xxx, MO_32, do_vabsd_u) +TRANS(xvabsd_du, LASX, gvec_xxx, MO_64, do_vabsd_u) =20 static void gen_vadda(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658559128367.73697355112574; Wed, 13 Sep 2023 19:29:19 -0700 (PDT) Received: 
from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4p-0003Gy-CO; Wed, 13 Sep 2023 22:27:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4f-0003Bk-U3 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:22 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4d-0004uy-ES for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:21 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxDOt6bwJlhfgmAA--.4552S3; Thu, 14 Sep 2023 10:27:06 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S25; Thu, 14 Sep 2023 10:27:06 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 23/57] target/loongarch: Implement xvadda Date: Thu, 14 Sep 2023 10:26:11 +0800 Message-Id: <20230914022645.1151356-24-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S25 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 
X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658560514100005 Content-Type: text/plain; charset="utf-8" This patch includes: - XVADDA.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 ++++ target/loongarch/disas.c | 5 ++++ target/loongarch/vec_helper.c | 30 +++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 4 +++ 4 files changed, 30 insertions(+), 14 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index c086ee9b22..f3722e3aa7 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1432,6 +1432,11 @@ xvabsd_hu 0111 01000110 00101 ..... ..... ...= .. @vvv xvabsd_wu 0111 01000110 00110 ..... ..... ..... @vvv xvabsd_du 0111 01000110 00111 ..... ..... ..... @vvv =20 +xvadda_b 0111 01000101 11000 ..... ..... ..... @vvv +xvadda_h 0111 01000101 11001 ..... ..... ..... @vvv +xvadda_w 0111 01000101 11010 ..... ..... ..... @vvv +xvadda_d 0111 01000101 11011 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index bbe7ad8322..51fbd78279 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1851,6 +1851,11 @@ INSN_LASX(xvabsd_hu, vvv) INSN_LASX(xvabsd_wu, vvv) INSN_LASX(xvabsd_du, vvv) =20 +INSN_LASX(xvadda_b, vvv) +INSN_LASX(xvadda_h, vvv) +INSN_LASX(xvadda_w, vvv) +INSN_LASX(xvadda_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 35b207aae1..ec6d86cc83 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -394,22 +394,24 @@ DO_3OP(vabsd_du, 64, UD, DO_VABSD) =20 #define DO_VABS(a) ((a < 0) ? (-a) : (a)) =20 -#define DO_VADDA(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i)) + DO_OP(Vk->E(i)); \ - } \ +#define DO_VADDA(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_VABS(Vj->E(i)) + DO_VABS(Vk->E(i)); \ + } \ } =20 -DO_VADDA(vadda_b, 8, B, DO_VABS) -DO_VADDA(vadda_h, 16, H, DO_VABS) -DO_VADDA(vadda_w, 32, W, DO_VABS) -DO_VADDA(vadda_d, 64, D, DO_VABS) +DO_VADDA(vadda_b, 8, B) +DO_VADDA(vadda_h, 16, H) +DO_VADDA(vadda_w, 32, W) +DO_VADDA(vadda_d, 64, D) =20 #define DO_MIN(a, b) (a < b ? a : b) #define DO_MAX(a, b) (a > b ? 
a : b) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 37b842075d..c8d356788e 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1672,6 +1672,10 @@ TRANS(vadda_b, LSX, gvec_vvv, MO_8, do_vadda) TRANS(vadda_h, LSX, gvec_vvv, MO_16, do_vadda) TRANS(vadda_w, LSX, gvec_vvv, MO_32, do_vadda) TRANS(vadda_d, LSX, gvec_vvv, MO_64, do_vadda) +TRANS(xvadda_b, LASX, gvec_xxx, MO_8, do_vadda) +TRANS(xvadda_h, LASX, gvec_xxx, MO_16, do_vadda) +TRANS(xvadda_w, LASX, gvec_xxx, MO_32, do_vadda) +TRANS(xvadda_d, LASX, gvec_xxx, MO_64, do_vadda) =20 TRANS(vmax_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_smax) TRANS(vmax_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_smax) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658848967965.9136076375264; Wed, 13 Sep 2023 19:34:08 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc51-0003Jo-AL; Wed, 13 Sep 2023 22:27:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4j-0003ER-Vi for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:25 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4h-0004v9-N3 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:25 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxY_B7bwJlh_gmAA--.9971S3; Thu, 14 Sep 2023 10:27:07 +0800 (CST) Received: from 
localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S26; Thu, 14 Sep 2023 10:27:06 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 24/57] target/loongarch: Implement xvmax/xvmin Date: Thu, 14 Sep 2023 10:26:12 +0800 Message-Id: <20230914022645.1151356-25-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S26 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658849309100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVMAX[I].{B/H/W/D}[U]; - XVMIN[I].{B/H/W/D}[U]. 
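
As an illustration of how one helper can serve both the 16-byte LSX and the 32-byte LASX forms of the immediate variants, here is a minimal standalone sketch. It is an assumption-laden model rather than the code in this patch: maxi_s8 is a hypothetical name, plain arrays stand in for VReg, and the operation size is passed directly instead of being extracted from the descriptor with simd_oprsz(desc).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a VMINMAXI-style helper on signed bytes.
 * oprsz is the operation size in bytes: 16 for LSX, 32 for LASX.
 */
void maxi_s8(int8_t *vd, const int8_t *vj, int64_t imm, size_t oprsz)
{
    const size_t elem_bytes = sizeof(int8_t);  /* BIT / 8 in the macro */
    const int8_t limit = (int8_t)imm;          /* models the (TD)imm cast */

    assert(oprsz == 16 || oprsz == 32);
    /* The element count follows from the operation size, not a fixed width. */
    for (size_t i = 0; i < oprsz / elem_bytes; i++) {
        vd[i] = vj[i] > limit ? vj[i] : limit;
    }
}
```

With oprsz equal to 16 only the low 16 bytes of the destination are written, so the LSX behaviour is unchanged while the LASX entry points reuse the same loop body.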
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 36 +++++++++++++++++++++ target/loongarch/disas.c | 34 +++++++++++++++++++ target/loongarch/vec_helper.c | 23 ++++++------- target/loongarch/insn_trans/trans_vec.c.inc | 32 ++++++++++++++++++ 4 files changed, 114 insertions(+), 11 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index f3722e3aa7..99aefcb651 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1437,6 +1437,42 @@ xvadda_h 0111 01000101 11001 ..... ..... ...= .. @vvv xvadda_w 0111 01000101 11010 ..... ..... ..... @vvv xvadda_d 0111 01000101 11011 ..... ..... ..... @vvv =20 +xvmax_b 0111 01000111 00000 ..... ..... ..... @vvv +xvmax_h 0111 01000111 00001 ..... ..... ..... @vvv +xvmax_w 0111 01000111 00010 ..... ..... ..... @vvv +xvmax_d 0111 01000111 00011 ..... ..... ..... @vvv +xvmax_bu 0111 01000111 01000 ..... ..... ..... @vvv +xvmax_hu 0111 01000111 01001 ..... ..... ..... @vvv +xvmax_wu 0111 01000111 01010 ..... ..... ..... @vvv +xvmax_du 0111 01000111 01011 ..... ..... ..... @vvv + +xvmaxi_b 0111 01101001 00000 ..... ..... ..... @vv_i5 +xvmaxi_h 0111 01101001 00001 ..... ..... ..... @vv_i5 +xvmaxi_w 0111 01101001 00010 ..... ..... ..... @vv_i5 +xvmaxi_d 0111 01101001 00011 ..... ..... ..... @vv_i5 +xvmaxi_bu 0111 01101001 01000 ..... ..... ..... @vv_ui5 +xvmaxi_hu 0111 01101001 01001 ..... ..... ..... @vv_ui5 +xvmaxi_wu 0111 01101001 01010 ..... ..... ..... @vv_ui5 +xvmaxi_du 0111 01101001 01011 ..... ..... ..... @vv_ui5 + +xvmin_b 0111 01000111 00100 ..... ..... ..... @vvv +xvmin_h 0111 01000111 00101 ..... ..... ..... @vvv +xvmin_w 0111 01000111 00110 ..... ..... ..... @vvv +xvmin_d 0111 01000111 00111 ..... ..... ..... @vvv +xvmin_bu 0111 01000111 01100 ..... ..... ..... @vvv +xvmin_hu 0111 01000111 01101 ..... ..... ..... @vvv +xvmin_wu 0111 01000111 01110 ..... ..... ..... @vvv +xvmin_du 0111 01000111 01111 ..... ..... ..... 
@vvv + +xvmini_b 0111 01101001 00100 ..... ..... ..... @vv_i5 +xvmini_h 0111 01101001 00101 ..... ..... ..... @vv_i5 +xvmini_w 0111 01101001 00110 ..... ..... ..... @vv_i5 +xvmini_d 0111 01101001 00111 ..... ..... ..... @vv_i5 +xvmini_bu 0111 01101001 01100 ..... ..... ..... @vv_ui5 +xvmini_hu 0111 01101001 01101 ..... ..... ..... @vv_ui5 +xvmini_wu 0111 01101001 01110 ..... ..... ..... @vv_ui5 +xvmini_du 0111 01101001 01111 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 51fbd78279..ef2c78147e 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1856,6 +1856,40 @@ INSN_LASX(xvadda_h, vvv) INSN_LASX(xvadda_w, vvv) INSN_LASX(xvadda_d, vvv) =20 +INSN_LASX(xvmax_b, vvv) +INSN_LASX(xvmax_h, vvv) +INSN_LASX(xvmax_w, vvv) +INSN_LASX(xvmax_d, vvv) +INSN_LASX(xvmin_b, vvv) +INSN_LASX(xvmin_h, vvv) +INSN_LASX(xvmin_w, vvv) +INSN_LASX(xvmin_d, vvv) +INSN_LASX(xvmax_bu, vvv) +INSN_LASX(xvmax_hu, vvv) +INSN_LASX(xvmax_wu, vvv) +INSN_LASX(xvmax_du, vvv) +INSN_LASX(xvmin_bu, vvv) +INSN_LASX(xvmin_hu, vvv) +INSN_LASX(xvmin_wu, vvv) +INSN_LASX(xvmin_du, vvv) + +INSN_LASX(xvmaxi_b, vv_i) +INSN_LASX(xvmaxi_h, vv_i) +INSN_LASX(xvmaxi_w, vv_i) +INSN_LASX(xvmaxi_d, vv_i) +INSN_LASX(xvmini_b, vv_i) +INSN_LASX(xvmini_h, vv_i) +INSN_LASX(xvmini_w, vv_i) +INSN_LASX(xvmini_d, vv_i) +INSN_LASX(xvmaxi_bu, vv_i) +INSN_LASX(xvmaxi_hu, vv_i) +INSN_LASX(xvmaxi_wu, vv_i) +INSN_LASX(xvmaxi_du, vv_i) +INSN_LASX(xvmini_bu, vv_i) +INSN_LASX(xvmini_hu, vv_i) +INSN_LASX(xvmini_wu, vv_i) +INSN_LASX(xvmini_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index ec6d86cc83..fdf8b3dd64 100644 --- a/target/loongarch/vec_helper.c +++ 
b/target/loongarch/vec_helper.c @@ -416,17 +416,18 @@ DO_VADDA(vadda_d, 64, D) #define DO_MIN(a, b) (a < b ? a : b) #define DO_MAX(a, b) (a > b ? a : b) =20 -#define VMINMAXI(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ - } \ +#define VMINMAXI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ + } \ } =20 VMINMAXI(vmini_b, 8, B, DO_MIN) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index c8d356788e..78baf39bd9 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1685,6 +1685,14 @@ TRANS(vmax_bu, LSX, gvec_vvv, MO_8, tcg_gen_gvec_uma= x) TRANS(vmax_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_umax) TRANS(vmax_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_umax) TRANS(vmax_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_umax) +TRANS(xvmax_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_smax) +TRANS(xvmax_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_smax) +TRANS(xvmax_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_smax) +TRANS(xvmax_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_smax) +TRANS(xvmax_bu, LASX, gvec_xxx, MO_8, tcg_gen_gvec_umax) +TRANS(xvmax_hu, LASX, gvec_xxx, MO_16, tcg_gen_gvec_umax) +TRANS(xvmax_wu, LASX, gvec_xxx, MO_32, tcg_gen_gvec_umax) +TRANS(xvmax_du, LASX, gvec_xxx, MO_64, tcg_gen_gvec_umax) =20 TRANS(vmin_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_smin) TRANS(vmin_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_smin) @@ -1694,6 +1702,14 @@ TRANS(vmin_bu, LSX, gvec_vvv, MO_8, 
tcg_gen_gvec_umi= n) TRANS(vmin_hu, LSX, gvec_vvv, MO_16, tcg_gen_gvec_umin) TRANS(vmin_wu, LSX, gvec_vvv, MO_32, tcg_gen_gvec_umin) TRANS(vmin_du, LSX, gvec_vvv, MO_64, tcg_gen_gvec_umin) +TRANS(xvmin_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_smin) +TRANS(xvmin_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_smin) +TRANS(xvmin_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_smin) +TRANS(xvmin_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_smin) +TRANS(xvmin_bu, LASX, gvec_xxx, MO_8, tcg_gen_gvec_umin) +TRANS(xvmin_hu, LASX, gvec_xxx, MO_16, tcg_gen_gvec_umin) +TRANS(xvmin_wu, LASX, gvec_xxx, MO_32, tcg_gen_gvec_umin) +TRANS(xvmin_du, LASX, gvec_xxx, MO_64, tcg_gen_gvec_umin) =20 static void gen_vmini_s(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm) { @@ -1795,6 +1811,14 @@ TRANS(vmini_bu, LSX, gvec_vv_i, MO_8, do_vmini_u) TRANS(vmini_hu, LSX, gvec_vv_i, MO_16, do_vmini_u) TRANS(vmini_wu, LSX, gvec_vv_i, MO_32, do_vmini_u) TRANS(vmini_du, LSX, gvec_vv_i, MO_64, do_vmini_u) +TRANS(xvmini_b, LASX, gvec_xx_i, MO_8, do_vmini_s) +TRANS(xvmini_h, LASX, gvec_xx_i, MO_16, do_vmini_s) +TRANS(xvmini_w, LASX, gvec_xx_i, MO_32, do_vmini_s) +TRANS(xvmini_d, LASX, gvec_xx_i, MO_64, do_vmini_s) +TRANS(xvmini_bu, LASX, gvec_xx_i, MO_8, do_vmini_u) +TRANS(xvmini_hu, LASX, gvec_xx_i, MO_16, do_vmini_u) +TRANS(xvmini_wu, LASX, gvec_xx_i, MO_32, do_vmini_u) +TRANS(xvmini_du, LASX, gvec_xx_i, MO_64, do_vmini_u) =20 static void do_vmaxi_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, int64_t imm, uint32_t oprsz, uint32_t maxsz) @@ -1876,6 +1900,14 @@ TRANS(vmaxi_bu, LSX, gvec_vv_i, MO_8, do_vmaxi_u) TRANS(vmaxi_hu, LSX, gvec_vv_i, MO_16, do_vmaxi_u) TRANS(vmaxi_wu, LSX, gvec_vv_i, MO_32, do_vmaxi_u) TRANS(vmaxi_du, LSX, gvec_vv_i, MO_64, do_vmaxi_u) +TRANS(xvmaxi_b, LASX, gvec_xx_i, MO_8, do_vmaxi_s) +TRANS(xvmaxi_h, LASX, gvec_xx_i, MO_16, do_vmaxi_s) +TRANS(xvmaxi_w, LASX, gvec_xx_i, MO_32, do_vmaxi_s) +TRANS(xvmaxi_d, LASX, gvec_xx_i, MO_64, do_vmaxi_s) +TRANS(xvmaxi_bu, LASX, gvec_xx_i, MO_8, do_vmaxi_u) 
+TRANS(xvmaxi_hu, LASX, gvec_xx_i, MO_16, do_vmaxi_u)
+TRANS(xvmaxi_wu, LASX, gvec_xx_i, MO_32, do_vmaxi_u)
+TRANS(xvmaxi_du, LASX, gvec_xx_i, MO_64, do_vmaxi_u)
 
 TRANS(vmul_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_mul)
 TRANS(vmul_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_mul)
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 25/57] target/loongarch: Implement xvmul/xvmuh/xvmulw{ev/od}
Date: Thu, 14 Sep 2023 10:26:13 +0800
Message-Id: <20230914022645.1151356-26-gaosong@loongson.cn>
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>

This patch includes:
- XVMUL.{B/H/W/D};
- XVMUH.{B/H/W/D}[U];
- XVMULW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMULW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode               |  38 +++++++
 target/loongarch/disas.c                    |  38 +++++++
 target/loongarch/vec_helper.c               |  55 +++++-----
 target/loongarch/insn_trans/trans_vec.c.inc | 113 +++++++++++++++-----
 4 files changed, 189 insertions(+), 55 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 99aefcb651..0f9ebe641f 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1473,6 +1473,44 @@ xvmini_hu 0111 01101001 01101 ..... .....
...= .. @vv_ui5 xvmini_wu 0111 01101001 01110 ..... ..... ..... @vv_ui5 xvmini_du 0111 01101001 01111 ..... ..... ..... @vv_ui5 =20 +xvmul_b 0111 01001000 01000 ..... ..... ..... @vvv +xvmul_h 0111 01001000 01001 ..... ..... ..... @vvv +xvmul_w 0111 01001000 01010 ..... ..... ..... @vvv +xvmul_d 0111 01001000 01011 ..... ..... ..... @vvv +xvmuh_b 0111 01001000 01100 ..... ..... ..... @vvv +xvmuh_h 0111 01001000 01101 ..... ..... ..... @vvv +xvmuh_w 0111 01001000 01110 ..... ..... ..... @vvv +xvmuh_d 0111 01001000 01111 ..... ..... ..... @vvv +xvmuh_bu 0111 01001000 10000 ..... ..... ..... @vvv +xvmuh_hu 0111 01001000 10001 ..... ..... ..... @vvv +xvmuh_wu 0111 01001000 10010 ..... ..... ..... @vvv +xvmuh_du 0111 01001000 10011 ..... ..... ..... @vvv + +xvmulwev_h_b 0111 01001001 00000 ..... ..... ..... @vvv +xvmulwev_w_h 0111 01001001 00001 ..... ..... ..... @vvv +xvmulwev_d_w 0111 01001001 00010 ..... ..... ..... @vvv +xvmulwev_q_d 0111 01001001 00011 ..... ..... ..... @vvv +xvmulwod_h_b 0111 01001001 00100 ..... ..... ..... @vvv +xvmulwod_w_h 0111 01001001 00101 ..... ..... ..... @vvv +xvmulwod_d_w 0111 01001001 00110 ..... ..... ..... @vvv +xvmulwod_q_d 0111 01001001 00111 ..... ..... ..... @vvv +xvmulwev_h_bu 0111 01001001 10000 ..... ..... ..... @vvv +xvmulwev_w_hu 0111 01001001 10001 ..... ..... ..... @vvv +xvmulwev_d_wu 0111 01001001 10010 ..... ..... ..... @vvv +xvmulwev_q_du 0111 01001001 10011 ..... ..... ..... @vvv +xvmulwod_h_bu 0111 01001001 10100 ..... ..... ..... @vvv +xvmulwod_w_hu 0111 01001001 10101 ..... ..... ..... @vvv +xvmulwod_d_wu 0111 01001001 10110 ..... ..... ..... @vvv +xvmulwod_q_du 0111 01001001 10111 ..... ..... ..... @vvv +xvmulwev_h_bu_b 0111 01001010 00000 ..... ..... ..... @vvv +xvmulwev_w_hu_h 0111 01001010 00001 ..... ..... ..... @vvv +xvmulwev_d_wu_w 0111 01001010 00010 ..... ..... ..... @vvv +xvmulwev_q_du_d 0111 01001010 00011 ..... ..... ..... @vvv +xvmulwod_h_bu_b 0111 01001010 00100 ..... ..... ..... 
@vvv +xvmulwod_w_hu_h 0111 01001010 00101 ..... ..... ..... @vvv +xvmulwod_d_wu_w 0111 01001010 00110 ..... ..... ..... @vvv +xvmulwod_q_du_d 0111 01001010 00111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index ef2c78147e..f839373a7a 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1890,6 +1890,44 @@ INSN_LASX(xvmini_hu, vv_i) INSN_LASX(xvmini_wu, vv_i) INSN_LASX(xvmini_du, vv_i) =20 +INSN_LASX(xvmul_b, vvv) +INSN_LASX(xvmul_h, vvv) +INSN_LASX(xvmul_w, vvv) +INSN_LASX(xvmul_d, vvv) +INSN_LASX(xvmuh_b, vvv) +INSN_LASX(xvmuh_h, vvv) +INSN_LASX(xvmuh_w, vvv) +INSN_LASX(xvmuh_d, vvv) +INSN_LASX(xvmuh_bu, vvv) +INSN_LASX(xvmuh_hu, vvv) +INSN_LASX(xvmuh_wu, vvv) +INSN_LASX(xvmuh_du, vvv) + +INSN_LASX(xvmulwev_h_b, vvv) +INSN_LASX(xvmulwev_w_h, vvv) +INSN_LASX(xvmulwev_d_w, vvv) +INSN_LASX(xvmulwev_q_d, vvv) +INSN_LASX(xvmulwod_h_b, vvv) +INSN_LASX(xvmulwod_w_h, vvv) +INSN_LASX(xvmulwod_d_w, vvv) +INSN_LASX(xvmulwod_q_d, vvv) +INSN_LASX(xvmulwev_h_bu, vvv) +INSN_LASX(xvmulwev_w_hu, vvv) +INSN_LASX(xvmulwev_d_wu, vvv) +INSN_LASX(xvmulwev_q_du, vvv) +INSN_LASX(xvmulwod_h_bu, vvv) +INSN_LASX(xvmulwod_w_hu, vvv) +INSN_LASX(xvmulwod_d_wu, vvv) +INSN_LASX(xvmulwod_q_du, vvv) +INSN_LASX(xvmulwev_h_bu_b, vvv) +INSN_LASX(xvmulwev_w_hu_h, vvv) +INSN_LASX(xvmulwev_d_wu_w, vvv) +INSN_LASX(xvmulwev_q_du_d, vvv) +INSN_LASX(xvmulwod_h_bu_b, vvv) +INSN_LASX(xvmulwod_w_hu_h, vvv) +INSN_LASX(xvmulwod_d_wu_w, vvv) +INSN_LASX(xvmulwod_q_du_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index fdf8b3dd64..e152998094 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -447,50 +447,53 @@ VMINMAXI(vmaxi_hu, 
16, UH, DO_MAX)
 VMINMAXI(vmaxi_wu, 32, UW, DO_MAX)
 VMINMAXI(vmaxi_du, 64, UD, DO_MAX)
 
-#define DO_VMUH(NAME, BIT, E1, E2, DO_OP)                         \
-void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v)       \
-{                                                                 \
-    int i;                                                        \
-    VReg *Vd = (VReg *)vd;                                        \
-    VReg *Vj = (VReg *)vj;                                        \
-    VReg *Vk = (VReg *)vk;                                        \
-    typedef __typeof(Vd->E1(0)) T;                                \
-                                                                  \
-    for (i = 0; i < LSX_LEN/BIT; i++) {                           \
-        Vd->E2(i) = ((T)Vj->E2(i)) * ((T)Vk->E2(i)) >> BIT;       \
-    }                                                             \
+#define DO_VMUH(NAME, BIT, E1, E2, DO_OP)                         \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc)    \
+{                                                                 \
+    int i;                                                        \
+    VReg *Vd = (VReg *)vd;                                        \
+    VReg *Vj = (VReg *)vj;                                        \
+    VReg *Vk = (VReg *)vk;                                        \
+    typedef __typeof(Vd->E1(0)) T;                                \
+    int oprsz = simd_oprsz(desc);                                 \
+                                                                  \
+    for (i = 0; i < oprsz / (BIT / 8); i++) {                     \
+        Vd->E2(i) = ((T)Vj->E2(i)) * ((T)Vk->E2(i)) >> BIT;       \
+    }                                                             \
 }
 
-void HELPER(vmuh_d)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vmuh_d)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    uint64_t l, h1, h2;
+    int i;
+    uint64_t l, h;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    muls64(&l, &h1, Vj->D(0), Vk->D(0));
-    muls64(&l, &h2, Vj->D(1), Vk->D(1));
-
-    Vd->D(0) = h1;
-    Vd->D(1) = h2;
+    for (i = 0; i < oprsz / 8; i++) {
+        muls64(&l, &h, Vj->D(i), Vk->D(i));
+        Vd->D(i) = h;
+    }
 }
 
 DO_VMUH(vmuh_b, 8, H, B, DO_MUH)
 DO_VMUH(vmuh_h, 16, W, H, DO_MUH)
 DO_VMUH(vmuh_w, 32, D, W, DO_MUH)
 
-void HELPER(vmuh_du)(void *vd, void *vj, void *vk, uint32_t v)
+void HELPER(vmuh_du)(void *vd, void *vj, void *vk, uint32_t desc)
 {
-    uint64_t l, h1, h2;
+    int i;
+    uint64_t l, h;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
     VReg *Vk = (VReg *)vk;
+    int oprsz = simd_oprsz(desc);
 
-    mulu64(&l, &h1, Vj->D(0), Vk->D(0));
-    mulu64(&l, &h2, Vj->D(1), Vk->D(1));
-
-    Vd->D(0) = h1;
-    Vd->D(1) = h2;
+    for (i = 0; i < oprsz / 8; i++) {
+        mulu64(&l, &h, Vj->D(i), Vk->D(i));
+        Vd->D(i) = h;
+    }
 }
 
DO_VMUH(vmuh_bu, 8, UH, UB, DO_MUH) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 78baf39bd9..714cea848f 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1913,6 +1913,10 @@ TRANS(vmul_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_mul) TRANS(vmul_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_mul) TRANS(vmul_w, LSX, gvec_vvv, MO_32, tcg_gen_gvec_mul) TRANS(vmul_d, LSX, gvec_vvv, MO_64, tcg_gen_gvec_mul) +TRANS(xvmul_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_mul) +TRANS(xvmul_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_mul) +TRANS(xvmul_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_mul) +TRANS(xvmul_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_mul) =20 static void gen_vmuh_w(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b) { @@ -1957,6 +1961,10 @@ TRANS(vmuh_b, LSX, gvec_vvv, MO_8, do_vmuh_s) TRANS(vmuh_h, LSX, gvec_vvv, MO_16, do_vmuh_s) TRANS(vmuh_w, LSX, gvec_vvv, MO_32, do_vmuh_s) TRANS(vmuh_d, LSX, gvec_vvv, MO_64, do_vmuh_s) +TRANS(xvmuh_b, LASX, gvec_xxx, MO_8, do_vmuh_s) +TRANS(xvmuh_h, LASX, gvec_xxx, MO_16, do_vmuh_s) +TRANS(xvmuh_w, LASX, gvec_xxx, MO_32, do_vmuh_s) +TRANS(xvmuh_d, LASX, gvec_xxx, MO_64, do_vmuh_s) =20 static void gen_vmuh_wu(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b) { @@ -2001,6 +2009,10 @@ TRANS(vmuh_bu, LSX, gvec_vvv, MO_8, do_vmuh_u) TRANS(vmuh_hu, LSX, gvec_vvv, MO_16, do_vmuh_u) TRANS(vmuh_wu, LSX, gvec_vvv, MO_32, do_vmuh_u) TRANS(vmuh_du, LSX, gvec_vvv, MO_64, do_vmuh_u) +TRANS(xvmuh_bu, LASX, gvec_xxx, MO_8, do_vmuh_u) +TRANS(xvmuh_hu, LASX, gvec_xxx, MO_16, do_vmuh_u) +TRANS(xvmuh_wu, LASX, gvec_xxx, MO_32, do_vmuh_u) +TRANS(xvmuh_du, LASX, gvec_xxx, MO_64, do_vmuh_u) =20 static void gen_vmulwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { @@ -2073,6 +2085,9 @@ static void do_vmulwev_s(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, TRANS(vmulwev_h_b, LSX, gvec_vvv, MO_8, do_vmulwev_s) TRANS(vmulwev_w_h, LSX, gvec_vvv, MO_16, do_vmulwev_s) 
 TRANS(vmulwev_d_w, LSX, gvec_vvv, MO_32, do_vmulwev_s)
+TRANS(xvmulwev_h_b, LASX, gvec_xxx, MO_8, do_vmulwev_s)
+TRANS(xvmulwev_w_h, LASX, gvec_xxx, MO_16, do_vmulwev_s)
+TRANS(xvmulwev_d_w, LASX, gvec_xxx, MO_32, do_vmulwev_s)
 
 static void tcg_gen_mulus2_i64(TCGv_i64 rl, TCGv_i64 rh,
                                TCGv_i64 arg1, TCGv_i64 arg2)
@@ -2080,37 +2095,62 @@ static void tcg_gen_mulus2_i64(TCGv_i64 rl, TCGv_i64 rh,
     tcg_gen_mulsu2_i64(rl, rh, arg2, arg1);
 }
 
-#define VMUL_Q(NAME, FN, idx1, idx2)                          \
-static bool trans_## NAME (DisasContext *ctx, arg_vvv *a)     \
-{                                                             \
-    TCGv_i64 rh, rl, arg1, arg2;                              \
-                                                              \
-    if (!avail_LSX(ctx)) {                                    \
-        return false;                                         \
-    }                                                         \
-                                                              \
-    rh = tcg_temp_new_i64();                                  \
-    rl = tcg_temp_new_i64();                                  \
-    arg1 = tcg_temp_new_i64();                                \
-    arg2 = tcg_temp_new_i64();                                \
-                                                              \
-    get_vreg64(arg1, a->vj, idx1);                            \
-    get_vreg64(arg2, a->vk, idx2);                            \
-                                                              \
-    tcg_gen_## FN ##_i64(rl, rh, arg1, arg2);                 \
-                                                              \
-    set_vreg64(rh, a->vd, 1);                                 \
-    set_vreg64(rl, a->vd, 0);                                 \
-                                                              \
-    return true;                                              \
+static bool gen_vmul_q_vl(DisasContext *ctx,
+                          arg_vvv *a, uint32_t oprsz, int idx1, int idx2,
+                          void (*func)(TCGv_i64, TCGv_i64,
+                                       TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 rh, rl, arg1, arg2;
+    int i;
+
+    if (!check_vec(ctx, oprsz)) {
+        return true;
+    }
+
+    rh = tcg_temp_new_i64();
+    rl = tcg_temp_new_i64();
+    arg1 = tcg_temp_new_i64();
+    arg2 = tcg_temp_new_i64();
+
+    for (i = 0; i < oprsz / 16; i++) {
+        get_vreg64(arg1, a->vj, 2 * i + idx1);
+        get_vreg64(arg2, a->vk, 2 * i + idx2);
+
+        func(rl, rh, arg1, arg2);
+
+        set_vreg64(rh, a->vd, 2 * i + 1);
+        set_vreg64(rl, a->vd, 2 * i);
+    }
+
+    return true;
+}
+
+static bool gen_vmul_q(DisasContext *ctx, arg_vvv *a, int idx1, int idx2,
+                       void (*func)(TCGv_i64, TCGv_i64,
+                                    TCGv_i64, TCGv_i64))
+{
+    return gen_vmul_q_vl(ctx, a, 16, idx1, idx2, func);
+}
+
+static bool gen_xvmul_q(DisasContext *ctx, arg_vvv *a, int idx1, int idx2,
+                        void (*func)(TCGv_i64, TCGv_i64,
+                                     TCGv_i64, TCGv_i64))
+{
+    return gen_vmul_q_vl(ctx, a, 32, idx1, idx2, func);
 }
 
-VMUL_Q(vmulwev_q_d, muls2, 0, 0) -VMUL_Q(vmulwod_q_d, muls2, 1, 1) -VMUL_Q(vmulwev_q_du, mulu2, 0, 0) -VMUL_Q(vmulwod_q_du, mulu2, 1, 1) -VMUL_Q(vmulwev_q_du_d, mulus2, 0, 0) -VMUL_Q(vmulwod_q_du_d, mulus2, 1, 1) +TRANS(vmulwev_q_d, LSX, gen_vmul_q, 0, 0, tcg_gen_muls2_i64) +TRANS(vmulwod_q_d, LSX, gen_vmul_q, 1, 1, tcg_gen_muls2_i64) +TRANS(vmulwev_q_du, LSX, gen_vmul_q, 0, 0, tcg_gen_mulu2_i64) +TRANS(vmulwod_q_du, LSX, gen_vmul_q, 1, 1, tcg_gen_mulu2_i64) +TRANS(vmulwev_q_du_d, LSX, gen_vmul_q, 0, 0, tcg_gen_mulus2_i64) +TRANS(vmulwod_q_du_d, LSX, gen_vmul_q, 1, 1, tcg_gen_mulus2_i64) +TRANS(xvmulwev_q_d, LASX, gen_xvmul_q, 0, 0, tcg_gen_muls2_i64) +TRANS(xvmulwod_q_d, LASX, gen_xvmul_q, 1, 1, tcg_gen_muls2_i64) +TRANS(xvmulwev_q_du, LASX, gen_xvmul_q, 0, 0, tcg_gen_mulu2_i64) +TRANS(xvmulwod_q_du, LASX, gen_xvmul_q, 1, 1, tcg_gen_mulu2_i64) +TRANS(xvmulwev_q_du_d, LASX, gen_xvmul_q, 0, 0, tcg_gen_mulus2_i64) +TRANS(xvmulwod_q_du_d, LASX, gen_xvmul_q, 1, 1, tcg_gen_mulus2_i64) =20 static void gen_vmulwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { @@ -2181,6 +2221,9 @@ static void do_vmulwod_s(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, TRANS(vmulwod_h_b, LSX, gvec_vvv, MO_8, do_vmulwod_s) TRANS(vmulwod_w_h, LSX, gvec_vvv, MO_16, do_vmulwod_s) TRANS(vmulwod_d_w, LSX, gvec_vvv, MO_32, do_vmulwod_s) +TRANS(xvmulwod_h_b, LASX, gvec_xxx, MO_8, do_vmulwod_s) +TRANS(xvmulwod_w_h, LASX, gvec_xxx, MO_16, do_vmulwod_s) +TRANS(xvmulwod_d_w, LASX, gvec_xxx, MO_32, do_vmulwod_s) =20 static void gen_vmulwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec = b) { @@ -2251,6 +2294,9 @@ static void do_vmulwev_u(unsigned vece, uint32_t vd_o= fs, uint32_t vj_ofs, TRANS(vmulwev_h_bu, LSX, gvec_vvv, MO_8, do_vmulwev_u) TRANS(vmulwev_w_hu, LSX, gvec_vvv, MO_16, do_vmulwev_u) TRANS(vmulwev_d_wu, LSX, gvec_vvv, MO_32, do_vmulwev_u) +TRANS(xvmulwev_h_bu, LASX, gvec_xxx, MO_8, do_vmulwev_u) +TRANS(xvmulwev_w_hu, LASX, gvec_xxx, MO_16, do_vmulwev_u) 
+TRANS(xvmulwev_d_wu, LASX, gvec_xxx, MO_32, do_vmulwev_u)
 
 static void gen_vmulwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2321,6 +2367,9 @@ static void do_vmulwod_u(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
 TRANS(vmulwod_h_bu, LSX, gvec_vvv, MO_8, do_vmulwod_u)
 TRANS(vmulwod_w_hu, LSX, gvec_vvv, MO_16, do_vmulwod_u)
 TRANS(vmulwod_d_wu, LSX, gvec_vvv, MO_32, do_vmulwod_u)
+TRANS(xvmulwod_h_bu, LASX, gvec_xxx, MO_8, do_vmulwod_u)
+TRANS(xvmulwod_w_hu, LASX, gvec_xxx, MO_16, do_vmulwod_u)
+TRANS(xvmulwod_d_wu, LASX, gvec_xxx, MO_32, do_vmulwod_u)
 
 static void gen_vmulwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2393,6 +2442,9 @@ static void do_vmulwev_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
 TRANS(vmulwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vmulwev_u_s)
 TRANS(vmulwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vmulwev_u_s)
 TRANS(vmulwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vmulwev_u_s)
+TRANS(xvmulwev_h_bu_b, LASX, gvec_xxx, MO_8, do_vmulwev_u_s)
+TRANS(xvmulwev_w_hu_h, LASX, gvec_xxx, MO_16, do_vmulwev_u_s)
+TRANS(xvmulwev_d_wu_w, LASX, gvec_xxx, MO_32, do_vmulwev_u_s)
 
 static void gen_vmulwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
@@ -2462,6 +2514,9 @@ static void do_vmulwod_u_s(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
 TRANS(vmulwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vmulwod_u_s)
 TRANS(vmulwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vmulwod_u_s)
 TRANS(vmulwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vmulwod_u_s)
+TRANS(xvmulwod_h_bu_b, LASX, gvec_xxx, MO_8, do_vmulwod_u_s)
+TRANS(xvmulwod_w_hu_h, LASX, gvec_xxx, MO_16, do_vmulwod_u_s)
+TRANS(xvmulwod_d_wu_w, LASX, gvec_xxx, MO_32, do_vmulwod_u_s)
 
 static void gen_vmadd(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
 {
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 26/57] target/loongarch: Implement xvmadd/xvmsub/xvmaddw{ev/od}
Date: Thu, 14 Sep 2023 10:26:14 +0800
Message-Id: <20230914022645.1151356-27-gaosong@loongson.cn>
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>

This patch includes:
- XVMADD.{B/H/W/D};
- XVMSUB.{B/H/W/D};
- XVMADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode               |  34 ++++++
 target/loongarch/disas.c                    |  34 ++++++
 target/loongarch/vec_helper.c               | 112 +++++++++---------
 target/loongarch/insn_trans/trans_vec.c.inc | 121 ++++++++++++++------
 4 files changed, 212 insertions(+), 89 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 0f9ebe641f..d6fb51ae64 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1511,6 +1511,40 @@ xvmulwod_w_hu_h 0111 01001010 00101 ..... ..... ..... @vvv
 xvmulwod_d_wu_w  0111 01001010 00110 ..... ..... .....    @vvv
 xvmulwod_q_du_d  0111 01001010 00111 ..... ..... .....    @vvv
 
+xvmadd_b         0111 01001010 10000 ..... ..... .....    @vvv
+xvmadd_h         0111 01001010 10001 ..... ..... .....    @vvv
+xvmadd_w         0111 01001010 10010 ..... ..... .....    @vvv
+xvmadd_d         0111 01001010 10011 ..... ..... .....    @vvv
+xvmsub_b         0111 01001010 10100 ..... ..... .....    @vvv
+xvmsub_h         0111 01001010 10101 ..... ..... .....    @vvv
+xvmsub_w         0111 01001010 10110 .....
..... ..... @vvv +xvmsub_d 0111 01001010 10111 ..... ..... ..... @vvv + +xvmaddwev_h_b 0111 01001010 11000 ..... ..... ..... @vvv +xvmaddwev_w_h 0111 01001010 11001 ..... ..... ..... @vvv +xvmaddwev_d_w 0111 01001010 11010 ..... ..... ..... @vvv +xvmaddwev_q_d 0111 01001010 11011 ..... ..... ..... @vvv +xvmaddwod_h_b 0111 01001010 11100 ..... ..... ..... @vvv +xvmaddwod_w_h 0111 01001010 11101 ..... ..... ..... @vvv +xvmaddwod_d_w 0111 01001010 11110 ..... ..... ..... @vvv +xvmaddwod_q_d 0111 01001010 11111 ..... ..... ..... @vvv +xvmaddwev_h_bu 0111 01001011 01000 ..... ..... ..... @vvv +xvmaddwev_w_hu 0111 01001011 01001 ..... ..... ..... @vvv +xvmaddwev_d_wu 0111 01001011 01010 ..... ..... ..... @vvv +xvmaddwev_q_du 0111 01001011 01011 ..... ..... ..... @vvv +xvmaddwod_h_bu 0111 01001011 01100 ..... ..... ..... @vvv +xvmaddwod_w_hu 0111 01001011 01101 ..... ..... ..... @vvv +xvmaddwod_d_wu 0111 01001011 01110 ..... ..... ..... @vvv +xvmaddwod_q_du 0111 01001011 01111 ..... ..... ..... @vvv +xvmaddwev_h_bu_b 0111 01001011 11000 ..... ..... ..... @vvv +xvmaddwev_w_hu_h 0111 01001011 11001 ..... ..... ..... @vvv +xvmaddwev_d_wu_w 0111 01001011 11010 ..... ..... ..... @vvv +xvmaddwev_q_du_d 0111 01001011 11011 ..... ..... ..... @vvv +xvmaddwod_h_bu_b 0111 01001011 11100 ..... ..... ..... @vvv +xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ..... @vvv +xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... @vvv +xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index f839373a7a..e4369fd08b 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1928,6 +1928,40 @@ INSN_LASX(xvmulwod_w_hu_h, vvv) INSN_LASX(xvmulwod_d_wu_w, vvv) INSN_LASX(xvmulwod_q_du_d, vvv) =20 +INSN_LASX(xvmadd_b, vvv) +INSN_LASX(xvmadd_h, vvv) +INSN_LASX(xvmadd_w, vvv) +INSN_LASX(xvmadd_d, vvv) +INSN_LASX(xvmsub_b, vvv) +INSN_LASX(xvmsub_h, vvv) +INSN_LASX(xvmsub_w, vvv) +INSN_LASX(xvmsub_d, vvv) + +INSN_LASX(xvmaddwev_h_b, vvv) +INSN_LASX(xvmaddwev_w_h, vvv) +INSN_LASX(xvmaddwev_d_w, vvv) +INSN_LASX(xvmaddwev_q_d, vvv) +INSN_LASX(xvmaddwod_h_b, vvv) +INSN_LASX(xvmaddwod_w_h, vvv) +INSN_LASX(xvmaddwod_d_w, vvv) +INSN_LASX(xvmaddwod_q_d, vvv) +INSN_LASX(xvmaddwev_h_bu, vvv) +INSN_LASX(xvmaddwev_w_hu, vvv) +INSN_LASX(xvmaddwev_d_wu, vvv) +INSN_LASX(xvmaddwev_q_du, vvv) +INSN_LASX(xvmaddwod_h_bu, vvv) +INSN_LASX(xvmaddwod_w_hu, vvv) +INSN_LASX(xvmaddwod_d_wu, vvv) +INSN_LASX(xvmaddwod_q_du, vvv) +INSN_LASX(xvmaddwev_h_bu_b, vvv) +INSN_LASX(xvmaddwev_w_hu_h, vvv) +INSN_LASX(xvmaddwev_d_wu_w, vvv) +INSN_LASX(xvmaddwev_q_du_d, vvv) +INSN_LASX(xvmaddwod_h_bu_b, vvv) +INSN_LASX(xvmaddwod_w_hu_h, vvv) +INSN_LASX(xvmaddwod_d_wu_w, vvv) +INSN_LASX(xvmaddwod_q_du_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index e152998094..a800554159 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -529,16 +529,18 @@ DO_ODD_U_S(vmulwod_d_wu_w, 64, D, UD, W, UW, DO_MUL) #define DO_MADD(a, b, c) (a + b * c) #define DO_MSUB(a, b, c) (a - b * c) =20 -#define VMADDSUB(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vd->E(i), Vj->E(i) 
,Vk->E(i)); \ - } \ +#define VMADDSUB(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vd->E(i), Vj->E(i) ,Vk->E(i)); \ + } \ } =20 VMADDSUB(vmadd_b, 8, B, DO_MADD) @@ -551,15 +553,16 @@ VMADDSUB(vmsub_w, 32, W, DO_MSUB) VMADDSUB(vmsub_d, 64, D, DO_MSUB) =20 #define VMADDWEV(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E1(i) +=3D DO_OP((TD)Vj->E2(2 * i), (TD)Vk->E2(2 * i)); \ } \ } @@ -571,19 +574,20 @@ VMADDWEV(vmaddwev_h_bu, 16, UH, UB, DO_MUL) VMADDWEV(vmaddwev_w_hu, 32, UW, UH, DO_MUL) VMADDWEV(vmaddwev_d_wu, 64, UD, UW, DO_MUL) =20 -#define VMADDWOD(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->E1(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) +=3D DO_OP((TD)Vj->E2(2 * i + 1), \ - (TD)Vk->E2(2 * i + 1)); \ - } \ +#define VMADDWOD(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + typedef __typeof(Vd->E1(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E1(i) +=3D DO_OP((TD)Vj->E2(2 * i + 1), \ + (TD)Vk->E2(2 * i + 1)); \ + } \ } =20 VMADDWOD(vmaddwod_h_b, 16, H, 
B, DO_MUL) @@ -593,40 +597,42 @@ VMADDWOD(vmaddwod_h_bu, 16, UH, UB, DO_MUL) VMADDWOD(vmaddwod_w_hu, 32, UW, UH, DO_MUL) VMADDWOD(vmaddwod_d_wu, 64, UD, UW, DO_MUL) =20 -#define VMADDWEV_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->ES1(0)) TS1; \ - typedef __typeof(Vd->EU1(0)) TU1; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i), \ - (TS1)Vk->ES2(2 * i)); \ - } \ +#define VMADDWEV_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + typedef __typeof(Vd->ES1(0)) TS1; \ + typedef __typeof(Vd->EU1(0)) TU1; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i), \ + (TS1)Vk->ES2(2 * i)); \ + } \ } =20 VMADDWEV_U_S(vmaddwev_h_bu_b, 16, H, UH, B, UB, DO_MUL) VMADDWEV_U_S(vmaddwev_w_hu_h, 32, W, UW, H, UH, DO_MUL) VMADDWEV_U_S(vmaddwev_d_wu_w, 64, D, UD, W, UW, DO_MUL) =20 -#define VMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - typedef __typeof(Vd->ES1(0)) TS1; \ - typedef __typeof(Vd->EU1(0)) TU1; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ - (TS1)Vk->ES2(2 * i + 1)); \ - } \ +#define VMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + typedef __typeof(Vd->ES1(0)) TS1; \ + typedef __typeof(Vd->EU1(0)) TU1; \ + int 
oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->ES1(i) +=3D DO_OP((TU1)Vj->EU2(2 * i + 1), \ + (TS1)Vk->ES2(2 * i + 1)); \ + } \ } =20 VMADDWOD_U_S(vmaddwod_h_bu_b, 16, H, UH, B, UB, DO_MUL) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 714cea848f..370e29ad99 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -2591,6 +2591,10 @@ TRANS(vmadd_b, LSX, gvec_vvv, MO_8, do_vmadd) TRANS(vmadd_h, LSX, gvec_vvv, MO_16, do_vmadd) TRANS(vmadd_w, LSX, gvec_vvv, MO_32, do_vmadd) TRANS(vmadd_d, LSX, gvec_vvv, MO_64, do_vmadd) +TRANS(xvmadd_b, LASX, gvec_xxx, MO_8, do_vmadd) +TRANS(xvmadd_h, LASX, gvec_xxx, MO_16, do_vmadd) +TRANS(xvmadd_w, LASX, gvec_xxx, MO_32, do_vmadd) +TRANS(xvmadd_d, LASX, gvec_xxx, MO_64, do_vmadd) =20 static void gen_vmsub(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { @@ -2665,6 +2669,10 @@ TRANS(vmsub_b, LSX, gvec_vvv, MO_8, do_vmsub) TRANS(vmsub_h, LSX, gvec_vvv, MO_16, do_vmsub) TRANS(vmsub_w, LSX, gvec_vvv, MO_32, do_vmsub) TRANS(vmsub_d, LSX, gvec_vvv, MO_64, do_vmsub) +TRANS(xvmsub_b, LASX, gvec_xxx, MO_8, do_vmsub) +TRANS(xvmsub_h, LASX, gvec_xxx, MO_16, do_vmsub) +TRANS(xvmsub_w, LASX, gvec_xxx, MO_32, do_vmsub) +TRANS(xvmsub_d, LASX, gvec_xxx, MO_64, do_vmsub) =20 static void gen_vmaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec= b) { @@ -2739,43 +2747,69 @@ static void do_vmaddwev_s(unsigned vece, uint32_t v= d_ofs, uint32_t vj_ofs, TRANS(vmaddwev_h_b, LSX, gvec_vvv, MO_8, do_vmaddwev_s) TRANS(vmaddwev_w_h, LSX, gvec_vvv, MO_16, do_vmaddwev_s) TRANS(vmaddwev_d_w, LSX, gvec_vvv, MO_32, do_vmaddwev_s) +TRANS(xvmaddwev_h_b, LASX, gvec_xxx, MO_8, do_vmaddwev_s) +TRANS(xvmaddwev_w_h, LASX, gvec_xxx, MO_16, do_vmaddwev_s) +TRANS(xvmaddwev_d_w, LASX, gvec_xxx, MO_32, do_vmaddwev_s) =20 -#define VMADD_Q(NAME, FN, idx1, idx2) \ -static bool trans_## NAME (DisasContext *ctx, 
arg_vvv *a) \ -{ \ - TCGv_i64 rh, rl, arg1, arg2, th, tl; \ - \ - if (!avail_LSX(ctx)) { \ - return false; \ - } \ - \ - rh =3D tcg_temp_new_i64(); \ - rl =3D tcg_temp_new_i64(); \ - arg1 =3D tcg_temp_new_i64(); \ - arg2 =3D tcg_temp_new_i64(); \ - th =3D tcg_temp_new_i64(); \ - tl =3D tcg_temp_new_i64(); \ - \ - get_vreg64(arg1, a->vj, idx1); \ - get_vreg64(arg2, a->vk, idx2); \ - get_vreg64(rh, a->vd, 1); \ - get_vreg64(rl, a->vd, 0); \ - \ - tcg_gen_## FN ##_i64(tl, th, arg1, arg2); \ - tcg_gen_add2_i64(rl, rh, rl, rh, tl, th); \ - \ - set_vreg64(rh, a->vd, 1); \ - set_vreg64(rl, a->vd, 0); \ - \ - return true; \ -} - -VMADD_Q(vmaddwev_q_d, muls2, 0, 0) -VMADD_Q(vmaddwod_q_d, muls2, 1, 1) -VMADD_Q(vmaddwev_q_du, mulu2, 0, 0) -VMADD_Q(vmaddwod_q_du, mulu2, 1, 1) -VMADD_Q(vmaddwev_q_du_d, mulus2, 0, 0) -VMADD_Q(vmaddwod_q_du_d, mulus2, 1, 1) +static bool gen_vmadd_q_vl(DisasContext * ctx, + arg_vvv *a, uint32_t oprsz, int idx1, int idx2, + void (*func)(TCGv_i64, TCGv_i64, + TCGv_i64, TCGv_i64)) +{ + TCGv_i64 rh, rl, arg1, arg2, th, tl; + int i; + + if (!check_vec(ctx, oprsz)) { + return true; + } + + rh =3D tcg_temp_new_i64(); + rl =3D tcg_temp_new_i64(); + arg1 =3D tcg_temp_new_i64(); + arg2 =3D tcg_temp_new_i64(); + th =3D tcg_temp_new_i64(); + tl =3D tcg_temp_new_i64(); + + for (i =3D 0; i < oprsz / 16; i++) { + get_vreg64(arg1, a->vj, 2 * i + idx1); + get_vreg64(arg2, a->vk, 2 * i + idx2); + get_vreg64(rh, a->vd, 2 * i + 1); + get_vreg64(rl, a->vd, 2 * i); + + func(tl, th, arg1, arg2); + tcg_gen_add2_i64(rl, rh, rl, rh, tl, th); + + set_vreg64(rh, a->vd, 2 * i + 1); + set_vreg64(rl, a->vd, 2 * i); + } + + return true; +} + +static bool gen_vmadd_q(DisasContext *ctx, arg_vvv *a, int idx1, int idx2, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i6= 4)) +{ + return gen_vmadd_q_vl(ctx, a, 16, idx1, idx2, func); +} + +static bool gen_xvmadd_q(DisasContext *ctx, arg_vvv *a, int idx1, int idx2, + void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i= 64)) +{ + return 
gen_vmadd_q_vl(ctx, a, 32, idx1, idx2, func); +} + +TRANS(vmaddwev_q_d, LSX, gen_vmadd_q, 0, 0, tcg_gen_muls2_i64) +TRANS(vmaddwod_q_d, LSX, gen_vmadd_q, 1, 1, tcg_gen_muls2_i64) +TRANS(vmaddwev_q_du, LSX, gen_vmadd_q, 0, 0, tcg_gen_mulu2_i64) +TRANS(vmaddwod_q_du, LSX, gen_vmadd_q, 1, 1, tcg_gen_mulu2_i64) +TRANS(vmaddwev_q_du_d, LSX, gen_vmadd_q, 0, 0, tcg_gen_mulus2_i64) +TRANS(vmaddwod_q_du_d, LSX, gen_vmadd_q, 1, 1, tcg_gen_mulus2_i64) +TRANS(xvmaddwev_q_d, LASX, gen_xvmadd_q, 0, 0, tcg_gen_muls2_i64) +TRANS(xvmaddwod_q_d, LASX, gen_xvmadd_q, 1, 1, tcg_gen_muls2_i64) +TRANS(xvmaddwev_q_du, LASX, gen_xvmadd_q, 0, 0, tcg_gen_mulu2_i64) +TRANS(xvmaddwod_q_du, LASX, gen_xvmadd_q, 1, 1, tcg_gen_mulu2_i64) +TRANS(xvmaddwev_q_du_d, LASX, gen_xvmadd_q, 0, 0, tcg_gen_mulus2_i64) +TRANS(xvmaddwod_q_du_d, LASX, gen_xvmadd_q, 1, 1, tcg_gen_mulus2_i64) =20 static void gen_vmaddwod_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec= b) { @@ -2847,6 +2881,9 @@ static void do_vmaddwod_s(unsigned vece, uint32_t vd_= ofs, uint32_t vj_ofs, TRANS(vmaddwod_h_b, LSX, gvec_vvv, MO_8, do_vmaddwod_s) TRANS(vmaddwod_w_h, LSX, gvec_vvv, MO_16, do_vmaddwod_s) TRANS(vmaddwod_d_w, LSX, gvec_vvv, MO_32, do_vmaddwod_s) +TRANS(xvmaddwod_h_b, LASX, gvec_xxx, MO_8, do_vmaddwod_s) +TRANS(xvmaddwod_w_h, LASX, gvec_xxx, MO_16, do_vmaddwod_s) +TRANS(xvmaddwod_d_w, LASX, gvec_xxx, MO_32, do_vmaddwod_s) =20 static void gen_vmaddwev_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec= b) { @@ -2917,6 +2954,9 @@ static void do_vmaddwev_u(unsigned vece, uint32_t vd_= ofs, uint32_t vj_ofs, TRANS(vmaddwev_h_bu, LSX, gvec_vvv, MO_8, do_vmaddwev_u) TRANS(vmaddwev_w_hu, LSX, gvec_vvv, MO_16, do_vmaddwev_u) TRANS(vmaddwev_d_wu, LSX, gvec_vvv, MO_32, do_vmaddwev_u) +TRANS(xvmaddwev_h_bu, LASX, gvec_xxx, MO_8, do_vmaddwev_u) +TRANS(xvmaddwev_w_hu, LASX, gvec_xxx, MO_16, do_vmaddwev_u) +TRANS(xvmaddwev_d_wu, LASX, gvec_xxx, MO_32, do_vmaddwev_u) =20 static void gen_vmaddwod_u(unsigned vece, TCGv_vec t, TCGv_vec a, 
TCGv_vec= b) { @@ -2988,6 +3028,9 @@ static void do_vmaddwod_u(unsigned vece, uint32_t vd_= ofs, uint32_t vj_ofs, TRANS(vmaddwod_h_bu, LSX, gvec_vvv, MO_8, do_vmaddwod_u) TRANS(vmaddwod_w_hu, LSX, gvec_vvv, MO_16, do_vmaddwod_u) TRANS(vmaddwod_d_wu, LSX, gvec_vvv, MO_32, do_vmaddwod_u) +TRANS(xvmaddwod_h_bu, LASX, gvec_xxx, MO_8, do_vmaddwod_u) +TRANS(xvmaddwod_w_hu, LASX, gvec_xxx, MO_16, do_vmaddwod_u) +TRANS(xvmaddwod_d_wu, LASX, gvec_xxx, MO_32, do_vmaddwod_u) =20 static void gen_vmaddwev_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_v= ec b) { @@ -3061,6 +3104,9 @@ static void do_vmaddwev_u_s(unsigned vece, uint32_t v= d_ofs, uint32_t vj_ofs, TRANS(vmaddwev_h_bu_b, LSX, gvec_vvv, MO_8, do_vmaddwev_u_s) TRANS(vmaddwev_w_hu_h, LSX, gvec_vvv, MO_16, do_vmaddwev_u_s) TRANS(vmaddwev_d_wu_w, LSX, gvec_vvv, MO_32, do_vmaddwev_u_s) +TRANS(xvmaddwev_h_bu_b, LASX, gvec_xxx, MO_8, do_vmaddwev_u_s) +TRANS(xvmaddwev_w_hu_h, LASX, gvec_xxx, MO_16, do_vmaddwev_u_s) +TRANS(xvmaddwev_d_wu_w, LASX, gvec_xxx, MO_32, do_vmaddwev_u_s) =20 static void gen_vmaddwod_u_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_v= ec b) { @@ -3133,6 +3179,9 @@ static void do_vmaddwod_u_s(unsigned vece, uint32_t v= d_ofs, uint32_t vj_ofs, TRANS(vmaddwod_h_bu_b, LSX, gvec_vvv, MO_8, do_vmaddwod_u_s) TRANS(vmaddwod_w_hu_h, LSX, gvec_vvv, MO_16, do_vmaddwod_u_s) TRANS(vmaddwod_d_wu_w, LSX, gvec_vvv, MO_32, do_vmaddwod_u_s) +TRANS(xvmaddwod_h_bu_b, LASX, gvec_xxx, MO_8, do_vmaddwod_u_s) +TRANS(xvmaddwod_w_hu_h, LASX, gvec_xxx, MO_16, do_vmaddwod_u_s) +TRANS(xvmaddwod_d_wu_w, LASX, gvec_xxx, MO_32, do_vmaddwod_u_s) =20 TRANS(vdiv_b, LSX, gen_vvv, gen_helper_vdiv_b) TRANS(vdiv_h, LSX, gen_vvv, gen_helper_vdiv_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: 
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16946586100461006.0076711771048; Wed, 13 Sep 2023 19:30:10 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5K-0003cF-8W; Wed, 13 Sep 2023 22:28:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4j-0003EO-NC for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:25 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4h-0004vP-NH for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:25 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxTet9bwJljPgmAA--.4351S3; Thu, 14 Sep 2023 10:27:09 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S29; Thu, 14 Sep 2023 10:27:08 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 27/57] target/loongarch: Implement xvdiv/xvmod Date: Thu, 14 Sep 2023 10:26:15 +0800 Message-Id: <20230914022645.1151356-28-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S29 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17;
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658612072100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVDIV.{B/H/W/D}[U]; - XVMOD.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 17 +++++++++++++++++ target/loongarch/disas.c | 17 +++++++++++++++++ target/loongarch/vec_helper.c | 4 +++- target/loongarch/insn_trans/trans_vec.c.inc | 16 ++++++++++++++++ 4 files changed, 53 insertions(+), 1 deletion(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d6fb51ae64..fa25c876b4 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1545,6 +1545,23 @@ xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ...= .. @vvv xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... @vvv xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... @vvv =20 +xvdiv_b 0111 01001110 00000 ..... ..... ..... @vvv +xvdiv_h 0111 01001110 00001 ..... ..... ..... @vvv +xvdiv_w 0111 01001110 00010 ..... ..... ..... @vvv +xvdiv_d 0111 01001110 00011 ..... ..... ..... @vvv +xvmod_b 0111 01001110 00100 ..... ..... ..... @vvv +xvmod_h 0111 01001110 00101 ..... ..... ..... @vvv +xvmod_w 0111 01001110 00110 ..... ..... ..... @vvv +xvmod_d 0111 01001110 00111 ..... ..... ..... @vvv +xvdiv_bu 0111 01001110 01000 ..... ..... ..... 
@vvv +xvdiv_hu 0111 01001110 01001 ..... ..... ..... @vvv +xvdiv_wu 0111 01001110 01010 ..... ..... ..... @vvv +xvdiv_du 0111 01001110 01011 ..... ..... ..... @vvv +xvmod_bu 0111 01001110 01100 ..... ..... ..... @vvv +xvmod_hu 0111 01001110 01101 ..... ..... ..... @vvv +xvmod_wu 0111 01001110 01110 ..... ..... ..... @vvv +xvmod_du 0111 01001110 01111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e4369fd08b..d932318b27 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1962,6 +1962,23 @@ INSN_LASX(xvmaddwod_w_hu_h, vvv) INSN_LASX(xvmaddwod_d_wu_w, vvv) INSN_LASX(xvmaddwod_q_du_d, vvv) =20 +INSN_LASX(xvdiv_b, vvv) +INSN_LASX(xvdiv_h, vvv) +INSN_LASX(xvdiv_w, vvv) +INSN_LASX(xvdiv_d, vvv) +INSN_LASX(xvdiv_bu, vvv) +INSN_LASX(xvdiv_hu, vvv) +INSN_LASX(xvdiv_wu, vvv) +INSN_LASX(xvdiv_du, vvv) +INSN_LASX(xvmod_b, vvv) +INSN_LASX(xvmod_h, vvv) +INSN_LASX(xvmod_w, vvv) +INSN_LASX(xvmod_d, vvv) +INSN_LASX(xvmod_bu, vvv) +INSN_LASX(xvmod_hu, vvv) +INSN_LASX(xvmod_wu, vvv) +INSN_LASX(xvmod_du, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index a800554159..9cf979a4bb 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -653,7 +653,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_= t desc) \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)); \ } \ } diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 370e29ad99..a22ced99f5 
100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3199,6 +3199,22 @@ TRANS(vmod_bu, LSX, gen_vvv, gen_helper_vmod_bu) TRANS(vmod_hu, LSX, gen_vvv, gen_helper_vmod_hu) TRANS(vmod_wu, LSX, gen_vvv, gen_helper_vmod_wu) TRANS(vmod_du, LSX, gen_vvv, gen_helper_vmod_du) +TRANS(xvdiv_b, LASX, gen_xxx, gen_helper_vdiv_b) +TRANS(xvdiv_h, LASX, gen_xxx, gen_helper_vdiv_h) +TRANS(xvdiv_w, LASX, gen_xxx, gen_helper_vdiv_w) +TRANS(xvdiv_d, LASX, gen_xxx, gen_helper_vdiv_d) +TRANS(xvdiv_bu, LASX, gen_xxx, gen_helper_vdiv_bu) +TRANS(xvdiv_hu, LASX, gen_xxx, gen_helper_vdiv_hu) +TRANS(xvdiv_wu, LASX, gen_xxx, gen_helper_vdiv_wu) +TRANS(xvdiv_du, LASX, gen_xxx, gen_helper_vdiv_du) +TRANS(xvmod_b, LASX, gen_xxx, gen_helper_vmod_b) +TRANS(xvmod_h, LASX, gen_xxx, gen_helper_vmod_h) +TRANS(xvmod_w, LASX, gen_xxx, gen_helper_vmod_w) +TRANS(xvmod_d, LASX, gen_xxx, gen_helper_vmod_d) +TRANS(xvmod_bu, LASX, gen_xxx, gen_helper_vmod_bu) +TRANS(xvmod_hu, LASX, gen_xxx, gen_helper_vmod_hu) +TRANS(xvmod_wu, LASX, gen_xxx, gen_helper_vmod_wu) +TRANS(xvmod_du, LASX, gen_xxx, gen_helper_vmod_du) =20 static void gen_vsat_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec max) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658716760775.4936084527893; Wed, 13 Sep 2023 19:31:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6J-0006Dq-0t; Wed, 13 Sep 2023 22:29:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) 
id 1qgc5q-0004mL-6t for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5j-0005FF-33 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:33 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxRuh+bwJlj_gmAA--.21966S3; Thu, 14 Sep 2023 10:27:10 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S30; Thu, 14 Sep 2023 10:27:09 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 28/57] target/loongarch: Implement xvsat Date: Thu, 14 Sep 2023 10:26:16 +0800 Message-Id: <20230914022645.1151356-29-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S30 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658717169100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSAT.{B/H/W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 ++++ target/loongarch/disas.c | 9 ++++ target/loongarch/vec_helper.c | 48 +++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 8 ++++ 4 files changed, 51 insertions(+), 23 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fa25c876b4..e366cf7615 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1562,6 +1562,15 @@ xvmod_hu 0111 01001110 01101 ..... ..... ...= .. @vvv xvmod_wu 0111 01001110 01110 ..... ..... ..... @vvv xvmod_du 0111 01001110 01111 ..... ..... ..... @vvv =20 +xvsat_b 0111 01110010 01000 01 ... ..... ..... @vv_ui3 +xvsat_h 0111 01110010 01000 1 .... ..... ..... @vv_ui4 +xvsat_w 0111 01110010 01001 ..... ..... ..... @vv_ui5 +xvsat_d 0111 01110010 0101 ...... ..... ..... @vv_ui6 +xvsat_bu 0111 01110010 10000 01 ... ..... ..... @vv_ui3 +xvsat_hu 0111 01110010 10000 1 .... ..... ..... @vv_ui4 +xvsat_wu 0111 01110010 10001 ..... ..... ..... @vv_ui5 +xvsat_du 0111 01110010 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d932318b27..4e54dcd08a 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1979,6 +1979,15 @@ INSN_LASX(xvmod_hu, vvv) INSN_LASX(xvmod_wu, vvv) INSN_LASX(xvmod_du, vvv) =20 +INSN_LASX(xvsat_b, vv_i) +INSN_LASX(xvsat_h, vv_i) +INSN_LASX(xvsat_w, vv_i) +INSN_LASX(xvsat_d, vv_i) +INSN_LASX(xvsat_bu, vv_i) +INSN_LASX(xvsat_hu, vv_i) +INSN_LASX(xvsat_wu, vv_i) +INSN_LASX(xvsat_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 9cf979a4bb..f2e19343bf 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -677,18 +677,19 @@ VDIV(vmod_hu, 16, UH, DO_REMU) VDIV(vmod_wu, 32, UW, DO_REMU) VDIV(vmod_du, 64, UD, DO_REMU) =20 -#define VSAT_S(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : \ - Vj->E(i) < (TD)~max ? (TD)~max: Vj->E(i); \ - } \ +#define VSAT_S(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : \ + Vj->E(i) < (TD)~max ? 
(TD)~max: Vj->E(i); \ + } \ } =20 VSAT_S(vsat_b, 8, B) @@ -696,17 +697,18 @@ VSAT_S(vsat_h, 16, H) VSAT_S(vsat_w, 32, W) VSAT_S(vsat_d, 64, D) =20 -#define VSAT_U(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : Vj->E(i); \ - } \ +#define VSAT_U(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t max, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D Vj->E(i) > (TD)max ? (TD)max : Vj->E(i); \ + } \ } =20 VSAT_U(vsat_bu, 8, UB) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index a22ced99f5..a6c6675a94 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3267,6 +3267,10 @@ TRANS(vsat_b, LSX, gvec_vv_i, MO_8, do_vsat_s) TRANS(vsat_h, LSX, gvec_vv_i, MO_16, do_vsat_s) TRANS(vsat_w, LSX, gvec_vv_i, MO_32, do_vsat_s) TRANS(vsat_d, LSX, gvec_vv_i, MO_64, do_vsat_s) +TRANS(xvsat_b, LASX, gvec_xx_i, MO_8, do_vsat_s) +TRANS(xvsat_h, LASX, gvec_xx_i, MO_16, do_vsat_s) +TRANS(xvsat_w, LASX, gvec_xx_i, MO_32, do_vsat_s) +TRANS(xvsat_d, LASX, gvec_xx_i, MO_64, do_vsat_s) =20 static void gen_vsat_u(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec max) { @@ -3316,6 +3320,10 @@ TRANS(vsat_bu, LSX, gvec_vv_i, MO_8, do_vsat_u) TRANS(vsat_hu, LSX, gvec_vv_i, MO_16, do_vsat_u) TRANS(vsat_wu, LSX, gvec_vv_i, MO_32, do_vsat_u) TRANS(vsat_du, LSX, gvec_vv_i, MO_64, do_vsat_u) +TRANS(xvsat_bu, LASX, gvec_xx_i, MO_8, do_vsat_u) +TRANS(xvsat_hu, LASX, gvec_xx_i, MO_16, do_vsat_u) +TRANS(xvsat_wu, LASX, gvec_xx_i, MO_32, do_vsat_u) +TRANS(xvsat_du, LASX, 
gvec_xx_i, MO_64, do_vsat_u) =20 TRANS(vexth_h_b, LSX, gen_vv, gen_helper_vexth_h_b) TRANS(vexth_w_h, LSX, gen_vv, gen_helper_vexth_w_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694659020961386.0755260478054; Wed, 13 Sep 2023 19:37:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6N-0006wb-2B; Wed, 13 Sep 2023 22:29:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5s-0004pW-Aq for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:36 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5h-0005FE-Sb for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:36 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxJuh_bwJlkfgmAA--.16421S3; Thu, 14 Sep 2023 10:27:11 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S31; Thu, 14 Sep 2023 10:27:09 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 29/57] target/loongarch: Implement xvexth Date: Thu, 14 Sep 2023 10:26:17 +0800 Message-Id: <20230914022645.1151356-30-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: 
AQAAf8CxvdxmbwJlJ+UDAA--.7298S31 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659023319100007 Content-Type: text/plain; charset="utf-8" This patch includes: - XVEXTH.{H.B/W.H/D.W/Q.D}; - XVEXTH.{HU.BU/WU.HU/DU.WU/QU.DU}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 ++++++ target/loongarch/disas.c | 9 ++++++ target/loongarch/vec_helper.c | 36 ++++++++++++++------- target/loongarch/insn_trans/trans_vec.c.inc | 21 +++++++++--- 4 files changed, 59 insertions(+), 16 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index e366cf7615..7491f295a5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1571,6 +1571,15 @@ xvsat_hu 0111 01110010 10000 1 .... ..... ..= ... @vv_ui4 xvsat_wu 0111 01110010 10001 ..... ..... ..... @vv_ui5 xvsat_du 0111 01110010 1001 ...... ..... ..... @vv_ui6 =20 +xvexth_h_b 0111 01101001 11101 11000 ..... ..... @vv +xvexth_w_h 0111 01101001 11101 11001 ..... ..... 
@vv +xvexth_d_w 0111 01101001 11101 11010 ..... ..... @vv +xvexth_q_d 0111 01101001 11101 11011 ..... ..... @vv +xvexth_hu_bu 0111 01101001 11101 11100 ..... ..... @vv +xvexth_wu_hu 0111 01101001 11101 11101 ..... ..... @vv +xvexth_du_wu 0111 01101001 11101 11110 ..... ..... @vv +xvexth_qu_du 0111 01101001 11101 11111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 4e54dcd08a..d4bea69b61 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1988,6 +1988,15 @@ INSN_LASX(xvsat_hu, vv_i) INSN_LASX(xvsat_wu, vv_i) INSN_LASX(xvsat_du, vv_i) =20 +INSN_LASX(xvexth_h_b, vv) +INSN_LASX(xvexth_w_h, vv) +INSN_LASX(xvexth_d_w, vv) +INSN_LASX(xvexth_q_d, vv) +INSN_LASX(xvexth_hu_bu, vv) +INSN_LASX(xvexth_wu_hu, vv) +INSN_LASX(xvexth_du_wu, vv) +INSN_LASX(xvexth_qu_du, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index f2e19343bf..2eccbc81a7 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -716,32 +716,44 @@ VSAT_U(vsat_hu, 16, UH) VSAT_U(vsat_wu, 32, UW) VSAT_U(vsat_du, 64, UD) =20 -#define VEXTH(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D Vj->E2(i + LSX_LEN/BIT); \ - } \ +#define VEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + Vd->E1(j + i * ofs) =3D Vj->E2(j + ofs + ofs * 2 * 
i); \ + } \ + } \ } =20 void HELPER(vexth_q_d)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_makes64(Vj->D(1)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_makes64(Vj->D(2 * i + 1)); + } } =20 void HELPER(vexth_qu_du)(void *vd, void *vj, uint32_t desc) { + int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - Vd->Q(0) =3D int128_make64((uint64_t)Vj->D(1)); + for (i =3D 0; i < oprsz / 16; i++) { + Vd->Q(i) =3D int128_make64(Vj->UD(2 * i + 1)); + } } =20 VEXTH(vexth_h_b, 16, H, B) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index a6c6675a94..e002bb05d6 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -141,6 +141,10 @@ static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, gen_helper_gvec_2 *fn) { + if (!check_vec(ctx, oprsz)) { + return true; + } + tcg_gen_gvec_2_ool(vec_full_offset(a->vd), vec_full_offset(a->vj), oprsz, ctx->vl / 8, 0, fn); @@ -149,13 +153,14 @@ static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, u= int32_t oprsz, =20 static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) { - if (!check_vec(ctx, 16)) { - return true; - } - return gen_vv_vl(ctx, a, 16, fn); } =20 +static bool gen_xx(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) +{ + return gen_vv_vl(ctx, a, 32, fn); +} + static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, gen_helper_gvec_2i *fn) { @@ -3333,6 +3338,14 @@ TRANS(vexth_hu_bu, LSX, gen_vv, gen_helper_vexth_hu_= bu) TRANS(vexth_wu_hu, LSX, gen_vv, gen_helper_vexth_wu_hu) TRANS(vexth_du_wu, LSX, gen_vv, gen_helper_vexth_du_wu) TRANS(vexth_qu_du, LSX, gen_vv, gen_helper_vexth_qu_du) +TRANS(xvexth_h_b, LASX, gen_xx, gen_helper_vexth_h_b) 
+TRANS(xvexth_w_h, LASX, gen_xx, gen_helper_vexth_w_h) +TRANS(xvexth_d_w, LASX, gen_xx, gen_helper_vexth_d_w) +TRANS(xvexth_q_d, LASX, gen_xx, gen_helper_vexth_q_d) +TRANS(xvexth_hu_bu, LASX, gen_xx, gen_helper_vexth_hu_bu) +TRANS(xvexth_wu_hu, LASX, gen_xx, gen_helper_vexth_wu_hu) +TRANS(xvexth_du_wu, LASX, gen_xx, gen_helper_vexth_du_wu) +TRANS(xvexth_qu_du, LASX, gen_xx, gen_helper_vexth_qu_du) =20 static void gen_vsigncov(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658534789926.6479902378558; Wed, 13 Sep 2023 19:28:54 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4t-0003Ic-JQ; Wed, 13 Sep 2023 22:27:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4k-0003Em-N8 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:26 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4h-0004vc-SL for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:26 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxRvF_bwJlk_gmAA--.9992S3; Thu, 14 Sep 2023 10:27:11 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S32; Thu, 14 Sep 2023 10:27:10 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 30/57] 
target/loongarch: Implement vext2xv Date: Thu, 14 Sep 2023 10:26:18 +0800 Message-Id: <20230914022645.1151356-31-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S32 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658536355100003 Content-Type: text/plain; charset="utf-8" This patch includes: - VEXT2XV.{H/W/D}.B, VEXT2XV.{HU/WU/DU}.BU; - VEXT2XV.{W/D}.H, VEXT2XV.{WU/DU}.HU; - VEXT2XV.D.W, VEXT2XV.DU.WU.
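For readers unfamiliar with the instruction family listed above, here is a rough, standalone sketch of what the widening is understood to do for the byte-to-halfword pair: the low 16 byte elements of the source are sign-extended (or zero-extended for the unsigned form) into all 16 halfword elements of the 256-bit destination. The XReg union and the sketch_* function names are hypothetical stand-ins for illustration, not QEMU's VReg type or the helpers added by this patch, and host-endianness handling is ignored.

```c
#include <stdint.h>

/* Hypothetical 256-bit register view; not QEMU's VReg. */
typedef union {
    int8_t   b[32];
    uint8_t  ub[32];
    int16_t  h[16];
    uint16_t uh[16];
} XReg;

/* Sketch of VEXT2XV.H.B: sign-extend the low 16 bytes to 16 halfwords. */
static void sketch_vext2xv_h_b(XReg *vd, const XReg *vj)
{
    XReg tmp;

    for (int i = 0; i < 16; i++) {
        tmp.h[i] = vj->b[i];    /* int8_t -> int16_t sign-extends */
    }
    /* Write back through a temporary so vd may alias vj. */
    *vd = tmp;
}

/* Sketch of VEXT2XV.HU.BU: zero-extend the low 16 bytes to 16 halfwords. */
static void sketch_vext2xv_hu_bu(XReg *vd, const XReg *vj)
{
    XReg tmp;

    for (int i = 0; i < 16; i++) {
        tmp.uh[i] = vj->ub[i];  /* uint8_t -> uint16_t zero-extends */
    }
    *vd = tmp;
}
```

The temporary matters because the destination is wider per element than the source: writing halfword i in place would overwrite source bytes 2i and 2i+1 before they are read.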
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 13 ++++++++++ target/loongarch/insns.decode | 13 ++++++++++ target/loongarch/disas.c | 13 ++++++++++ target/loongarch/vec_helper.c | 28 +++++++++++++++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 13 ++++++++++ 5 files changed, 80 insertions(+) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 1abd9e1410..e9c5412267 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -340,6 +340,19 @@ DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(vext2xv_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_w_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_wu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vext2xv_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 7491f295a5..db1a6689f0 100644 --- a/target/loongarch/insns.decode +++ 
b/target/loongarch/insns.decode @@ -1580,6 +1580,19 @@ xvexth_wu_hu 0111 01101001 11101 11101 ..... ...= .. @vv xvexth_du_wu 0111 01101001 11101 11110 ..... ..... @vv xvexth_qu_du 0111 01101001 11101 11111 ..... ..... @vv =20 +vext2xv_h_b 0111 01101001 11110 00100 ..... ..... @vv +vext2xv_w_b 0111 01101001 11110 00101 ..... ..... @vv +vext2xv_d_b 0111 01101001 11110 00110 ..... ..... @vv +vext2xv_w_h 0111 01101001 11110 00111 ..... ..... @vv +vext2xv_d_h 0111 01101001 11110 01000 ..... ..... @vv +vext2xv_d_w 0111 01101001 11110 01001 ..... ..... @vv +vext2xv_hu_bu 0111 01101001 11110 01010 ..... ..... @vv +vext2xv_wu_bu 0111 01101001 11110 01011 ..... ..... @vv +vext2xv_du_bu 0111 01101001 11110 01100 ..... ..... @vv +vext2xv_wu_hu 0111 01101001 11110 01101 ..... ..... @vv +vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @vv +vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d4bea69b61..714b97e238 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1997,6 +1997,19 @@ INSN_LASX(xvexth_wu_hu, vv) INSN_LASX(xvexth_du_wu, vv) INSN_LASX(xvexth_qu_du, vv) =20 +INSN_LASX(vext2xv_h_b, vv) +INSN_LASX(vext2xv_w_b, vv) +INSN_LASX(vext2xv_d_b, vv) +INSN_LASX(vext2xv_w_h, vv) +INSN_LASX(vext2xv_d_h, vv) +INSN_LASX(vext2xv_d_w, vv) +INSN_LASX(vext2xv_hu_bu, vv) +INSN_LASX(vext2xv_wu_bu, vv) +INSN_LASX(vext2xv_du_bu, vv) +INSN_LASX(vext2xv_wu_hu, vv) +INSN_LASX(vext2xv_du_hu, vv) +INSN_LASX(vext2xv_du_wu, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 2eccbc81a7..3dc20243fd 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -763,6 +763,34 @@ VEXTH(vexth_hu_bu, 16, UH, UB) VEXTH(vexth_wu_hu, 32, UW, UH) VEXTH(vexth_du_wu, 64, UD, UW) =20 +#define VEXT2XV(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + temp.E1(i) =3D Vj->E2(i); \ + } \ + *Vd =3D temp; \ +} + +VEXT2XV(vext2xv_h_b, 16, H, B) +VEXT2XV(vext2xv_w_b, 32, W, B) +VEXT2XV(vext2xv_d_b, 64, D, B) +VEXT2XV(vext2xv_w_h, 32, W, H) +VEXT2XV(vext2xv_d_h, 64, D, H) +VEXT2XV(vext2xv_d_w, 64, D, W) +VEXT2XV(vext2xv_hu_bu, 16, UH, UB) +VEXT2XV(vext2xv_wu_bu, 32, UW, UB) +VEXT2XV(vext2xv_du_bu, 64, UD, UB) +VEXT2XV(vext2xv_wu_hu, 32, UW, UH) +VEXT2XV(vext2xv_du_hu, 64, UD, UH) +VEXT2XV(vext2xv_du_wu, 64, UD, UW) + #define DO_SIGNCOV(a, b) (a =3D=3D 0 ? 0 : a < 0 ? 
-b : b) =20 DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index e002bb05d6..8f21f6b1c4 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3347,6 +3347,19 @@ TRANS(xvexth_wu_hu, LASX, gen_xx, gen_helper_vexth_w= u_hu) TRANS(xvexth_du_wu, LASX, gen_xx, gen_helper_vexth_du_wu) TRANS(xvexth_qu_du, LASX, gen_xx, gen_helper_vexth_qu_du) =20 +TRANS(vext2xv_h_b, LASX, gen_xx, gen_helper_vext2xv_h_b) +TRANS(vext2xv_w_b, LASX, gen_xx, gen_helper_vext2xv_w_b) +TRANS(vext2xv_d_b, LASX, gen_xx, gen_helper_vext2xv_d_b) +TRANS(vext2xv_w_h, LASX, gen_xx, gen_helper_vext2xv_w_h) +TRANS(vext2xv_d_h, LASX, gen_xx, gen_helper_vext2xv_d_h) +TRANS(vext2xv_d_w, LASX, gen_xx, gen_helper_vext2xv_d_w) +TRANS(vext2xv_hu_bu, LASX, gen_xx, gen_helper_vext2xv_hu_bu) +TRANS(vext2xv_wu_bu, LASX, gen_xx, gen_helper_vext2xv_wu_bu) +TRANS(vext2xv_du_bu, LASX, gen_xx, gen_helper_vext2xv_du_bu) +TRANS(vext2xv_wu_hu, LASX, gen_xx, gen_helper_vext2xv_wu_hu) +TRANS(vext2xv_du_hu, LASX, gen_xx, gen_helper_vext2xv_du_hu) +TRANS(vext2xv_du_wu, LASX, gen_xx, gen_helper_vext2xv_du_wu) + static void gen_vsigncov(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b) { TCGv_vec t1, zero; --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658716989843.6105153137272; Wed, 13 Sep 2023 19:31:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6K-0006b3-Pz; Wed, 13 Sep 2023 22:29:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by 
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc5r-0004na-H5 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5h-0005FG-TJ for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:28:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_7uAbwJll_gmAA--.16318S3; Thu, 14 Sep 2023 10:27:12 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S33; Thu, 14 Sep 2023 10:27:11 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 31/57] target/loongarch: Implement xvsigncov Date: Thu, 14 Sep 2023 10:26:19 +0800 Message-Id: <20230914022645.1151356-32-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S33 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 
2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658717572100004 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSIGNCOV.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 5 +++++ target/loongarch/insn_trans/trans_vec.c.inc | 4 ++++ 3 files changed, 14 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index db1a6689f0..7bbda1a142 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1593,6 +1593,11 @@ vext2xv_wu_hu 0111 01101001 11110 01101 ..... ...= .. @vv vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @vv vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @vv =20 +xvsigncov_b 0111 01010010 11100 ..... ..... ..... @vvv +xvsigncov_h 0111 01010010 11101 ..... ..... ..... @vvv +xvsigncov_w 0111 01010010 11110 ..... ..... ..... @vvv +xvsigncov_d 0111 01010010 11111 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 714b97e238..1f01ec99d5 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2010,6 +2010,11 @@ INSN_LASX(vext2xv_wu_hu, vv) INSN_LASX(vext2xv_du_hu, vv) INSN_LASX(vext2xv_du_wu, vv) =20 +INSN_LASX(xvsigncov_b, vvv) +INSN_LASX(xvsigncov_h, vvv) +INSN_LASX(xvsigncov_w, vvv) +INSN_LASX(xvsigncov_d, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 8f21f6b1c4..492bbcd4e9 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3412,6 +3412,10 @@ TRANS(vsigncov_b, LSX, gvec_vvv, MO_8, do_vsigncov) TRANS(vsigncov_h, LSX, gvec_vvv, MO_16, do_vsigncov) TRANS(vsigncov_w, LSX, gvec_vvv, MO_32, do_vsigncov) TRANS(vsigncov_d, LSX, gvec_vvv, MO_64, do_vsigncov) +TRANS(xvsigncov_b, LASX, gvec_xxx, MO_8, do_vsigncov) +TRANS(xvsigncov_h, LASX, gvec_xxx, MO_16, do_vsigncov) +TRANS(xvsigncov_w, LASX, gvec_xxx, MO_32, do_vsigncov) +TRANS(xvsigncov_d, LASX, gvec_xxx, MO_64, do_vsigncov) =20 TRANS(vmskltz_b, LSX, gen_vv, gen_helper_vmskltz_b) TRANS(vmskltz_h, LSX, gen_vv, gen_helper_vmskltz_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658575491170.8754905243594; Wed, 13 Sep 2023 19:29:35 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4p-0003H7-Mn; Wed, 13 Sep 2023 22:27:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4l-0003FD-Jj for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:28 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4j-0004vq-8Q for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:27 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxjuuAbwJllfgmAA--.7255S3; Thu, 14 Sep 2023 10:27:12 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S34; Thu, 14 Sep 2023 10:27:12 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 32/57] target/loongarch: Implement xvmskltz/xvmskgez/xvmsknz Date: Thu, 14 Sep 2023 10:26:20 +0800 Message-Id: <20230914022645.1151356-33-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S34 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 
Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658576827100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVMSKLTZ.{B/H/W/D}; - XVMSKGEZ.B; - XVMSKNZ.B. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 7 ++ target/loongarch/disas.c | 7 ++ target/loongarch/vec_helper.c | 78 ++++++++++++++------- target/loongarch/insn_trans/trans_vec.c.inc | 6 ++ 4 files changed, 74 insertions(+), 24 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 7bbda1a142..6a161d6d20 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1598,6 +1598,13 @@ xvsigncov_h 0111 01010010 11101 ..... ..... ...= .. @vvv xvsigncov_w 0111 01010010 11110 ..... ..... ..... @vvv xvsigncov_d 0111 01010010 11111 ..... ..... ..... @vvv =20 +xvmskltz_b 0111 01101001 11000 10000 ..... ..... @vv +xvmskltz_h 0111 01101001 11000 10001 ..... ..... @vv +xvmskltz_w 0111 01101001 11000 10010 ..... ..... @vv +xvmskltz_d 0111 01101001 11000 10011 ..... ..... @vv +xvmskgez_b 0111 01101001 11000 10100 ..... ..... @vv +xvmsknz_b 0111 01101001 11000 11000 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1f01ec99d5..05710098ad 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2015,6 +2015,13 @@ INSN_LASX(xvsigncov_h, vvv) INSN_LASX(xvsigncov_w, vvv) INSN_LASX(xvsigncov_d, vvv) =20 +INSN_LASX(xvmskltz_b, vv) +INSN_LASX(xvmskltz_h, vv) +INSN_LASX(xvmskltz_w, vv) +INSN_LASX(xvmskltz_d, vv) +INSN_LASX(xvmskgez_b, vv) +INSN_LASX(xvmsknz_b, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 3dc20243fd..f749800880 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -810,14 +810,19 @@ static uint64_t do_vmskltz_b(int64_t val) =20 void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_b(Vj->D(0)); - temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_b(Vj->D(2 * i)); + temp |=3D (do_vmskltz_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_h(int64_t val) @@ -831,14 +836,19 @@ static uint64_t do_vmskltz_h(int64_t val) =20 void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_h(Vj->D(0)); - temp |=3D (do_vmskltz_h(Vj->D(1)) << 4); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_h(Vj->D(2 * i)); + temp |=3D (do_vmskltz_h(Vj->D(2 * i + 1)) << 4); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_w(int64_t val) @@ -851,14 +861,19 @@ static uint64_t do_vmskltz_w(int64_t val) =20 
void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_w(Vj->D(0)); - temp |=3D (do_vmskltz_w(Vj->D(1)) << 2); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_w(Vj->D(2 * i)); + temp |=3D (do_vmskltz_w(Vj->D(2 * i + 1)) << 2); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskltz_d(int64_t val) @@ -867,26 +882,36 @@ static uint64_t do_vmskltz_d(int64_t val) } void HELPER(vmskltz_d)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_d(Vj->D(0)); - temp |=3D (do_vmskltz_d(Vj->D(1)) << 1); - Vd->D(0) =3D temp; - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_d(Vj->D(2 * i)); + temp |=3D (do_vmskltz_d(Vj->D(2 * i + 1)) << 1); + Vd->D(2 * i) =3D temp; + Vd->D(2 * i + 1) =3D 0; + } } =20 void HELPER(vmskgez_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskltz_b(Vj->D(0)); - temp |=3D (do_vmskltz_b(Vj->D(1)) << 8); - Vd->D(0) =3D (uint16_t)(~temp); - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskltz_b(Vj->D(2 * i)); + temp |=3D (do_vmskltz_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D (uint16_t)(~temp); + Vd->D(2 * i + 1) =3D 0; + } } =20 static uint64_t do_vmskez_b(uint64_t a) @@ -901,14 +926,19 @@ static uint64_t do_vmskez_b(uint64_t a) =20 void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc) { + int i; uint16_t temp =3D 0; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - temp =3D do_vmskez_b(Vj->D(0)); - temp |=3D (do_vmskez_b(Vj->D(1)) << 8); - 
Vd->D(0) =3D (uint16_t)(~temp); - Vd->D(1) =3D 0; + for (i =3D 0; i < oprsz / 16; i++) { + temp =3D 0; + temp =3D do_vmskez_b(Vj->D(2 * i)); + temp |=3D (do_vmskez_b(Vj->D(2 * i + 1)) << 8); + Vd->D(2 * i) =3D (uint16_t)(~temp); + Vd->D(2 * i + 1) =3D 0; + } } =20 void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 492bbcd4e9..843ec6d4af 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3423,6 +3423,12 @@ TRANS(vmskltz_w, LSX, gen_vv, gen_helper_vmskltz_w) TRANS(vmskltz_d, LSX, gen_vv, gen_helper_vmskltz_d) TRANS(vmskgez_b, LSX, gen_vv, gen_helper_vmskgez_b) TRANS(vmsknz_b, LSX, gen_vv, gen_helper_vmsknz_b) +TRANS(xvmskltz_b, LASX, gen_xx, gen_helper_vmskltz_b) +TRANS(xvmskltz_h, LASX, gen_xx, gen_helper_vmskltz_h) +TRANS(xvmskltz_w, LASX, gen_xx, gen_helper_vmskltz_w) +TRANS(xvmskltz_d, LASX, gen_xx, gen_helper_vmskltz_d) +TRANS(xvmskgez_b, LASX, gen_xx, gen_helper_vmskgez_b) +TRANS(xvmsknz_b, LASX, gen_xx, gen_helper_vmsknz_b) =20 #define EXPAND_BYTE(bit) ((uint64_t)(bit ? 
0xff : 0)) =20 --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658697904286.0528538156999; Wed, 13 Sep 2023 19:31:37 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5Y-0004BC-Nk; Wed, 13 Sep 2023 22:28:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4l-0003FO-P8 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:28 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4j-0004vw-OF for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:27 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxuOiBbwJlmvgmAA--.39690S3; Thu, 14 Sep 2023 10:27:13 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S35; Thu, 14 Sep 2023 10:27:12 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 33/57] target/loongarch: Implement xvldi Date: Thu, 14 Sep 2023 10:26:21 +0800 Message-Id: <20230914022645.1151356-34-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S35 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 
1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658699415100005 Content-Type: text/plain; charset="utf-8" This patch includes: - XVLDI. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 2 ++ target/loongarch/disas.c | 7 +++++++ target/loongarch/insn_trans/trans_vec.c.inc | 13 ++++++------- 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 6a161d6d20..edaa756395 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1605,6 +1605,8 @@ xvmskltz_d 0111 01101001 11000 10011 ..... ....= . @vv xvmskgez_b 0111 01101001 11000 10100 ..... ..... @vv xvmsknz_b 0111 01101001 11000 11000 ..... ..... @vv =20 +xvldi 0111 01111110 00 ............. ..... @v_i13 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 05710098ad..3f6fbeddd7 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##ty= pe * a) \ return true; \ } =20 +static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemon= ic) +{ + output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); +} + static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemo= nic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); @@ -2022,6 +2027,8 @@ INSN_LASX(xvmskltz_d, vv) INSN_LASX(xvmskgez_b, vv) INSN_LASX(xvmsknz_b, vv) =20 +INSN_LASX(xvldi, v_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 843ec6d4af..7ebe971ad9 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3546,16 +3546,12 @@ static uint64_t vldi_get_value(DisasContext *ctx, u= int32_t imm) return data; } =20 -static bool trans_vldi(DisasContext *ctx, arg_vldi *a) +static bool gen_vldi(DisasContext *ctx, arg_vldi *a, uint32_t oprsz) { int sel, vece; uint64_t value; =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 @@ -3569,11 +3565,14 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi = *a) vece =3D (a->imm >> 10) & 0x3; } =20 - tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, ctx->vl/8, + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), oprsz, ctx->vl/8, tcg_constant_i64(value)); return true; } =20 +TRANS(vldi, LSX, gen_vldi, 16) +TRANS(xvldi, LASX, gen_vldi, 32) + TRANS(vand_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_and) TRANS(vor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_or) TRANS(vxor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_xor) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 
Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465903427474.50060452037667; Wed, 13 Sep 2023 19:37:14 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6Z-0000a3-O3; Wed, 13 Sep 2023 22:29:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc6Y-0000KG-5t for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:18 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6V-0005Nl-Tj for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:17 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxnuuBbwJlnPgmAA--.7687S3; Thu, 14 Sep 2023 10:27:13 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S36; Thu, 14 Sep 2023 10:27:13 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 34/57] target/loongarch: Implement LASX logic instructions Date: Thu, 14 Sep 2023 10:26:22 +0800 Message-Id: <20230914022645.1151356-35-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S36 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 
ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659036397100007 Content-Type: text/plain; charset="utf-8" This patch includes: - XV{AND/OR/XOR/NOR/ANDN/ORN}.V; - XV{AND/OR/XOR/NOR}I.B. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 12 +++++++ target/loongarch/disas.c | 12 +++++++ target/loongarch/vec_helper.c | 4 +-- target/loongarch/insn_trans/trans_vec.c.inc | 38 ++++++++++++--------- 4 files changed, 48 insertions(+), 18 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index edaa756395..fb28666577 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1607,6 +1607,18 @@ xvmsknz_b 0111 01101001 11000 11000 ..... ...= .. @vv =20 xvldi 0111 01111110 00 ............. ..... @v_i13 =20 +xvand_v 0111 01010010 01100 ..... ..... ..... @vvv +xvor_v 0111 01010010 01101 ..... ..... ..... @vvv +xvxor_v 0111 01010010 01110 ..... ..... ..... @vvv +xvnor_v 0111 01010010 01111 ..... ..... ..... @vvv +xvandn_v 0111 01010010 10000 ..... ..... ..... @vvv +xvorn_v 0111 01010010 10001 ..... ..... ..... 
@vvv + +xvandi_b 0111 01111101 00 ........ ..... ..... @vv_ui8 +xvori_b 0111 01111101 01 ........ ..... ..... @vv_ui8 +xvxori_b 0111 01111101 10 ........ ..... ..... @vv_ui8 +xvnori_b 0111 01111101 11 ........ ..... ..... @vv_ui8 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 3f6fbeddd7..e9adc017db 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2029,6 +2029,18 @@ INSN_LASX(xvmsknz_b, vv) =20 INSN_LASX(xvldi, v_i) =20 +INSN_LASX(xvand_v, vvv) +INSN_LASX(xvor_v, vvv) +INSN_LASX(xvxor_v, vvv) +INSN_LASX(xvnor_v, vvv) +INSN_LASX(xvandn_v, vvv) +INSN_LASX(xvorn_v, vvv) + +INSN_LASX(xvandi_b, vv_i) +INSN_LASX(xvori_b, vv_i) +INSN_LASX(xvxori_b, vv_i) +INSN_LASX(xvnori_b, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index f749800880..1a602ee548 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -941,13 +941,13 @@ void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t de= sc) } } =20 -void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v) +void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t desc) { int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - for (i =3D 0; i < LSX_LEN/8; i++) { + for (i =3D 0; i < simd_oprsz(desc); i++) { Vd->B(i) =3D ~(Vj->B(i) | (uint8_t)imm); } } diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 7ebe971ad9..5b14d0f894 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3573,20 +3573,11 @@ static bool gen_vldi(DisasContext *ctx, arg_vldi *a= , uint32_t oprsz) TRANS(vldi, LSX, gen_vldi, 16) TRANS(xvldi, LASX, gen_vldi, 32) =20 
-TRANS(vand_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_and) -TRANS(vor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_or) -TRANS(vxor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_xor) -TRANS(vnor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_nor) - -static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a) +static bool gen_vandn_v(DisasContext *ctx, arg_vvv *a, uint32_t oprsz) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 @@ -3594,13 +3585,9 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv= *a) vj_ofs =3D vec_full_offset(a->vj); vk_ofs =3D vec_full_offset(a->vk); =20 - tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, 16, ctx->vl/8); + tcg_gen_gvec_andc(MO_64, vd_ofs, vk_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } -TRANS(vorn_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_orc) -TRANS(vandi_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_andi) -TRANS(vori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_ori) -TRANS(vxori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_xori) =20 static void gen_vnori(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm) { @@ -3633,7 +3620,26 @@ static void do_vnori_b(unsigned vece, uint32_t vd_of= s, uint32_t vj_ofs, tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op); } =20 +TRANS(vand_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_and) +TRANS(vor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_or) +TRANS(vxor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_xor) +TRANS(vnor_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_nor) +TRANS(vandn_v, LSX, gen_vandn_v, 16) +TRANS(vorn_v, LSX, gvec_vvv, MO_64, tcg_gen_gvec_orc) +TRANS(vandi_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_andi) +TRANS(vori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_ori) +TRANS(vxori_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_xori) TRANS(vnori_b, LSX, gvec_vv_i, MO_8, do_vnori_b) +TRANS(xvand_v, LASX, gvec_xxx, MO_64, tcg_gen_gvec_and) +TRANS(xvor_v, LASX, gvec_xxx, MO_64, tcg_gen_gvec_or) +TRANS(xvxor_v, LASX, gvec_xxx, MO_64, tcg_gen_gvec_xor) 
+TRANS(xvnor_v, LASX, gvec_xxx, MO_64, tcg_gen_gvec_nor) +TRANS(xvandn_v, LASX, gen_vandn_v, 32) +TRANS(xvorn_v, LASX, gvec_xxx, MO_64, tcg_gen_gvec_orc) +TRANS(xvandi_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_andi) +TRANS(xvori_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_ori) +TRANS(xvxori_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_xori) +TRANS(xvnori_b, LASX, gvec_xx_i, MO_8, do_vnori_b) =20 TRANS(vsll_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_shlv) TRANS(vsll_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_shlv) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658542369471.1606510995034; Wed, 13 Sep 2023 19:29:02 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5w-0004r9-By; Wed, 13 Sep 2023 22:28:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4n-0003Fn-Kp for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:29 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4l-0004wd-BY for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:29 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxRuiCbwJlnvgmAA--.21968S3; Thu, 14 Sep 2023 10:27:14 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S37; Thu, 14 Sep 2023 10:27:13 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 35/57] 
target/loongarch: Implement xvsll xvsrl xvsra xvrotr
Date: Thu, 14 Sep 2023 10:26:23 +0800
Message-Id: <20230914022645.1151356-36-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSLL[I].{B/H/W/D};
- XVSRL[I].{B/H/W/D};
- XVSRA[I].{B/H/W/D};
- XVROTR[I].{B/H/W/D}.

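Note (illustration only, not part of the patch): the LASX forms listed above apply a per-lane operation across a 256-bit register, i.e. 32 bytes rather than the 16 bytes of LSX. The sketch below shows, in plain C, how a byte-lane rotate right by an immediate (in the spirit of XVROTRI.B) could be modelled. The names `rotr8`, `lanes_rotri_b`, `LSX_BYTES` and `LASX_BYTES` are hypothetical and are not QEMU identifiers; the real translation goes through the TCG gvec expanders named in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: LSX operates on 16 bytes, LASX on 32 bytes. */
enum { LSX_BYTES = 16, LASX_BYTES = 32 };

/* Rotate one 8-bit lane right by n; n is reduced to 0..7 (a ui3 immediate). */
static uint8_t rotr8(uint8_t v, unsigned n)
{
    n &= 7;
    return (uint8_t)((v >> n) | (v << ((8 - n) & 7)));
}

/* Apply the same rotate to every byte lane of an oprsz-byte vector. */
static void lanes_rotri_b(uint8_t *vd, const uint8_t *vj,
                          unsigned imm, size_t oprsz)
{
    for (size_t i = 0; i < oprsz; i++) {
        vd[i] = rotr8(vj[i], imm);
    }
}
```

Calling `lanes_rotri_b` with `LSX_BYTES` or `LASX_BYTES` mirrors how one lane operation serves both vector widths, with only the operation size differing.
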
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 33 +++++++++++++++++++ target/loongarch/disas.c | 36 +++++++++++++++++++++ target/loongarch/insn_trans/trans_vec.c.inc | 32 ++++++++++++++++++ 3 files changed, 101 insertions(+) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fb28666577..fb7bd9fb34 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1619,6 +1619,39 @@ xvori_b 0111 01111101 01 ........ ..... ...= .. @vv_ui8 xvxori_b 0111 01111101 10 ........ ..... ..... @vv_ui8 xvnori_b 0111 01111101 11 ........ ..... ..... @vv_ui8 =20 +xvsll_b 0111 01001110 10000 ..... ..... ..... @vvv +xvsll_h 0111 01001110 10001 ..... ..... ..... @vvv +xvsll_w 0111 01001110 10010 ..... ..... ..... @vvv +xvsll_d 0111 01001110 10011 ..... ..... ..... @vvv +xvslli_b 0111 01110010 11000 01 ... ..... ..... @vv_ui3 +xvslli_h 0111 01110010 11000 1 .... ..... ..... @vv_ui4 +xvslli_w 0111 01110010 11001 ..... ..... ..... @vv_ui5 +xvslli_d 0111 01110010 1101 ...... ..... ..... @vv_ui6 +xvsrl_b 0111 01001110 10100 ..... ..... ..... @vvv +xvsrl_h 0111 01001110 10101 ..... ..... ..... @vvv +xvsrl_w 0111 01001110 10110 ..... ..... ..... @vvv +xvsrl_d 0111 01001110 10111 ..... ..... ..... @vvv +xvsrli_b 0111 01110011 00000 01 ... ..... ..... @vv_ui3 +xvsrli_h 0111 01110011 00000 1 .... ..... ..... @vv_ui4 +xvsrli_w 0111 01110011 00001 ..... ..... ..... @vv_ui5 +xvsrli_d 0111 01110011 0001 ...... ..... ..... @vv_ui6 +xvsra_b 0111 01001110 11000 ..... ..... ..... @vvv +xvsra_h 0111 01001110 11001 ..... ..... ..... @vvv +xvsra_w 0111 01001110 11010 ..... ..... ..... @vvv +xvsra_d 0111 01001110 11011 ..... ..... ..... @vvv +xvsrai_b 0111 01110011 01000 01 ... ..... ..... @vv_ui3 +xvsrai_h 0111 01110011 01000 1 .... ..... ..... @vv_ui4 +xvsrai_w 0111 01110011 01001 ..... ..... ..... @vv_ui5 +xvsrai_d 0111 01110011 0101 ...... ..... ..... @vv_ui6 +xvrotr_b 0111 01001110 11100 ..... ..... ..... 
@vvv +xvrotr_h 0111 01001110 11101 ..... ..... ..... @vvv +xvrotr_w 0111 01001110 11110 ..... ..... ..... @vvv +xvrotr_d 0111 01001110 11111 ..... ..... ..... @vvv +xvrotri_b 0111 01101010 00000 01 ... ..... ..... @vv_ui3 +xvrotri_h 0111 01101010 00000 1 .... ..... ..... @vv_ui4 +xvrotri_w 0111 01101010 00001 ..... ..... ..... @vv_ui5 +xvrotri_d 0111 01101010 0001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e9adc017db..209ae230f4 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2041,6 +2041,42 @@ INSN_LASX(xvori_b, vv_i) INSN_LASX(xvxori_b, vv_i) INSN_LASX(xvnori_b, vv_i) =20 +INSN_LASX(xvsll_b, vvv) +INSN_LASX(xvsll_h, vvv) +INSN_LASX(xvsll_w, vvv) +INSN_LASX(xvsll_d, vvv) +INSN_LASX(xvslli_b, vv_i) +INSN_LASX(xvslli_h, vv_i) +INSN_LASX(xvslli_w, vv_i) +INSN_LASX(xvslli_d, vv_i) + +INSN_LASX(xvsrl_b, vvv) +INSN_LASX(xvsrl_h, vvv) +INSN_LASX(xvsrl_w, vvv) +INSN_LASX(xvsrl_d, vvv) +INSN_LASX(xvsrli_b, vv_i) +INSN_LASX(xvsrli_h, vv_i) +INSN_LASX(xvsrli_w, vv_i) +INSN_LASX(xvsrli_d, vv_i) + +INSN_LASX(xvsra_b, vvv) +INSN_LASX(xvsra_h, vvv) +INSN_LASX(xvsra_w, vvv) +INSN_LASX(xvsra_d, vvv) +INSN_LASX(xvsrai_b, vv_i) +INSN_LASX(xvsrai_h, vv_i) +INSN_LASX(xvsrai_w, vv_i) +INSN_LASX(xvsrai_d, vv_i) + +INSN_LASX(xvrotr_b, vvv) +INSN_LASX(xvrotr_h, vvv) +INSN_LASX(xvrotr_w, vvv) +INSN_LASX(xvrotr_d, vvv) +INSN_LASX(xvrotri_b, vv_i) +INSN_LASX(xvrotri_h, vv_i) +INSN_LASX(xvrotri_w, vv_i) +INSN_LASX(xvrotri_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 5b14d0f894..24743dbc97 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ 
b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3649,6 +3649,14 @@ TRANS(vslli_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_sh= li) TRANS(vslli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shli) TRANS(vslli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shli) TRANS(vslli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shli) +TRANS(xvsll_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_shlv) +TRANS(xvsll_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_shlv) +TRANS(xvsll_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_shlv) +TRANS(xvsll_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_shlv) +TRANS(xvslli_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_shli) +TRANS(xvslli_h, LASX, gvec_xx_i, MO_16, tcg_gen_gvec_shli) +TRANS(xvslli_w, LASX, gvec_xx_i, MO_32, tcg_gen_gvec_shli) +TRANS(xvslli_d, LASX, gvec_xx_i, MO_64, tcg_gen_gvec_shli) =20 TRANS(vsrl_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_shrv) TRANS(vsrl_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_shrv) @@ -3658,6 +3666,14 @@ TRANS(vsrli_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_sh= ri) TRANS(vsrli_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_shri) TRANS(vsrli_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_shri) TRANS(vsrli_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_shri) +TRANS(xvsrl_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_shrv) +TRANS(xvsrl_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_shrv) +TRANS(xvsrl_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_shrv) +TRANS(xvsrl_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_shrv) +TRANS(xvsrli_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_shri) +TRANS(xvsrli_h, LASX, gvec_xx_i, MO_16, tcg_gen_gvec_shri) +TRANS(xvsrli_w, LASX, gvec_xx_i, MO_32, tcg_gen_gvec_shri) +TRANS(xvsrli_d, LASX, gvec_xx_i, MO_64, tcg_gen_gvec_shri) =20 TRANS(vsra_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_sarv) TRANS(vsra_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_sarv) @@ -3667,6 +3683,14 @@ TRANS(vsrai_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_sa= ri) TRANS(vsrai_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_sari) TRANS(vsrai_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_sari) TRANS(vsrai_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_sari) +TRANS(xvsra_b, LASX, gvec_xxx, 
MO_8, tcg_gen_gvec_sarv) +TRANS(xvsra_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_sarv) +TRANS(xvsra_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_sarv) +TRANS(xvsra_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_sarv) +TRANS(xvsrai_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_sari) +TRANS(xvsrai_h, LASX, gvec_xx_i, MO_16, tcg_gen_gvec_sari) +TRANS(xvsrai_w, LASX, gvec_xx_i, MO_32, tcg_gen_gvec_sari) +TRANS(xvsrai_d, LASX, gvec_xx_i, MO_64, tcg_gen_gvec_sari) =20 TRANS(vrotr_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_rotrv) TRANS(vrotr_h, LSX, gvec_vvv, MO_16, tcg_gen_gvec_rotrv) @@ -3676,6 +3700,14 @@ TRANS(vrotri_b, LSX, gvec_vv_i, MO_8, tcg_gen_gvec_r= otri) TRANS(vrotri_h, LSX, gvec_vv_i, MO_16, tcg_gen_gvec_rotri) TRANS(vrotri_w, LSX, gvec_vv_i, MO_32, tcg_gen_gvec_rotri) TRANS(vrotri_d, LSX, gvec_vv_i, MO_64, tcg_gen_gvec_rotri) +TRANS(xvrotr_b, LASX, gvec_xxx, MO_8, tcg_gen_gvec_rotrv) +TRANS(xvrotr_h, LASX, gvec_xxx, MO_16, tcg_gen_gvec_rotrv) +TRANS(xvrotr_w, LASX, gvec_xxx, MO_32, tcg_gen_gvec_rotrv) +TRANS(xvrotr_d, LASX, gvec_xxx, MO_64, tcg_gen_gvec_rotrv) +TRANS(xvrotri_b, LASX, gvec_xx_i, MO_8, tcg_gen_gvec_rotri) +TRANS(xvrotri_h, LASX, gvec_xx_i, MO_16, tcg_gen_gvec_rotri) +TRANS(xvrotri_w, LASX, gvec_xx_i, MO_32, tcg_gen_gvec_rotri) +TRANS(xvrotri_d, LASX, gvec_xx_i, MO_64, tcg_gen_gvec_rotri) =20 TRANS(vsllwil_h_b, LSX, gen_vv_i, gen_helper_vsllwil_h_b) TRANS(vsllwil_w_h, LSX, gen_vv_i, gen_helper_vsllwil_w_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465872506525.51301681039945; Wed, 13 Sep 2023 19:32:05 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1qgc5V-0003u6-Lk; Wed, 13 Sep 2023 22:28:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4t-0003Ix-HH for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4n-0004xA-P4 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dxl+iFbwJlovgmAA--.39396S3; Thu, 14 Sep 2023 10:27:17 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S38; Thu, 14 Sep 2023 10:27:14 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 36/57] target/loongarch: Implement xvsllwil xvextl Date: Thu, 14 Sep 2023 10:26:24 +0800 Message-Id: <20230914022645.1151356-37-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S38 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSLLWIL.{H.B/W.H/D.W};
- XVSLLWIL.{HU.BU/WU.HU/DU.WU};
- XVEXTL.Q.D, XVEXTL.QU.DU.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode | 9 +++++
 target/loongarch/disas.c | 9 +++++
 target/loongarch/vec_helper.c | 45 +++++++++++++--------
 target/loongarch/insn_trans/trans_vec.c.inc | 21 ++++++++--
 4 files changed, 63 insertions(+), 21 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index fb7bd9fb34..8a7933eccc 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1652,6 +1652,15 @@ xvrotri_h 0111 01101010 00000 1 .... ..... ..... @vv_ui4
 xvrotri_w 0111 01101010 00001 ..... ..... ..... @vv_ui5
 xvrotri_d 0111 01101010 0001 ...... ..... ..... @vv_ui6
 
+xvsllwil_h_b 0111 01110000 10000 01 ... ..... ..... @vv_ui3
+xvsllwil_w_h 0111 01110000 10000 1 .... ..... ..... @vv_ui4
+xvsllwil_d_w 0111 01110000 10001 ..... ..... ..... @vv_ui5
+xvextl_q_d 0111 01110000 10010 00000 ..... ..... @vv
+xvsllwil_hu_bu 0111 01110000 11000 01 ... ..... ..... @vv_ui3
+xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..... @vv_ui4
+xvsllwil_du_wu 0111 01110000 11001 ..... ..... ..... @vv_ui5
+xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @vv
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 209ae230f4..d93ecdb60d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2077,6 +2077,15 @@ INSN_LASX(xvrotri_h, vv_i)
 INSN_LASX(xvrotri_w, vv_i)
 INSN_LASX(xvrotri_d, vv_i)
 
+INSN_LASX(xvsllwil_h_b, vv_i)
+INSN_LASX(xvsllwil_w_h, vv_i)
+INSN_LASX(xvsllwil_d_w, vv_i)
+INSN_LASX(xvextl_q_d, vv)
+INSN_LASX(xvsllwil_hu_bu, vv_i)
+INSN_LASX(xvsllwil_wu_hu, vv_i)
+INSN_LASX(xvsllwil_du_wu, vv_i)
+INSN_LASX(xvextl_qu_du, vv)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 1a602ee548..a3376439e3 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -952,37 +952,48 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t desc)
     }
 }
 
-#define VSLLWIL(NAME, BIT, E1, E2) \
-void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
-{ \
-    int i; \
-    VReg temp; \
-    VReg *Vd = (VReg *)vd; \
-    VReg *Vj = (VReg *)vj; \
-    typedef __typeof(temp.E1(0)) TD; \
- \
-    temp.D(0) = 0; \
-    temp.D(1) = 0; \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \
-    } \
-    *Vd = temp; \
+#define VSLLWIL(NAME, BIT, E1, E2) \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
+{ \
+    int i, j, ofs; \
+    VReg temp = {}; \
+    VReg *Vd = (VReg *)vd; \
+    VReg *Vj = (VReg *)vj; \
+    int oprsz = simd_oprsz(desc); \
+    typedef __typeof(temp.E1(0)) TD; \
+ \
+    ofs = LSX_LEN / BIT; \
+    for (i = 0; i < oprsz / 16; i++) { \
+        for (j = 0; j < ofs; j++) { \
+            temp.E1(j + ofs * i) = (TD)Vj->E2(j + ofs * 2 * i) << (imm % BIT); \
+        } \
+    } \
+    *Vd = temp; \
 }
 
+
 void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_makes64(Vj->D(0));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_makes64(Vj->D(2 * i));
+    }
 }
 
 void HELPER(vextl_qu_du)(void *vd, void *vj, uint32_t desc)
 {
+    int i;
     VReg *Vd = (VReg *)vd;
     VReg *Vj = (VReg *)vj;
+    int oprsz = simd_oprsz(desc);
 
-    Vd->Q(0) = int128_make64(Vj->D(0));
+    for (i = 0; i < oprsz / 16; i++) {
+        Vd->Q(i) = int128_make64(Vj->UD(2 * i));
+    }
 }
 
 VSLLWIL(vsllwil_h_b, 16, H, B)
diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc
index 24743dbc97..c35d0a51a7 100644
--- a/target/loongarch/insn_trans/trans_vec.c.inc
+++ b/target/loongarch/insn_trans/trans_vec.c.inc
@@ -164,6 +164,10 @@ static bool gen_xx(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn)
 static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz,
                         gen_helper_gvec_2i *fn)
 {
+    if (!check_vec(ctx, oprsz)) {
+        return true;
+    }
+
     tcg_gen_gvec_2i_ool(vec_full_offset(a->vd),
                         vec_full_offset(a->vj),
                         tcg_constant_i64(a->imm),
@@ -173,13 +177,14 @@ static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz,
 
 static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *fn)
 {
-    if (!check_vec(ctx, 16)) {
-        return true;
-    }
-
     return gen_vv_i_vl(ctx, a, 16, fn);
 }
 
+static bool gen_xx_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *fn)
+{
+    return gen_vv_i_vl(ctx, a, 32, fn);
+}
+
 static bool gen_cv(DisasContext *ctx, arg_cv *a,
                    void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32))
 {
@@ -3717,6 +3722,14 @@ TRANS(vsllwil_hu_bu, LSX, gen_vv_i, gen_helper_vsllwil_hu_bu)
 TRANS(vsllwil_wu_hu, LSX, gen_vv_i, gen_helper_vsllwil_wu_hu)
 TRANS(vsllwil_du_wu, LSX, gen_vv_i, gen_helper_vsllwil_du_wu)
 TRANS(vextl_qu_du, LSX, gen_vv, gen_helper_vextl_qu_du)
+TRANS(xvsllwil_h_b, LASX, gen_xx_i, gen_helper_vsllwil_h_b)
+TRANS(xvsllwil_w_h, LASX, gen_xx_i, gen_helper_vsllwil_w_h)
+TRANS(xvsllwil_d_w, LASX, gen_xx_i, gen_helper_vsllwil_d_w)
+TRANS(xvextl_q_d, LASX, gen_xx, gen_helper_vextl_q_d)
+TRANS(xvsllwil_hu_bu, LASX, gen_xx_i, gen_helper_vsllwil_hu_bu)
+TRANS(xvsllwil_wu_hu, LASX, gen_xx_i, gen_helper_vsllwil_wu_hu)
+TRANS(xvsllwil_du_wu, LASX, gen_xx_i, gen_helper_vsllwil_du_wu)
+TRANS(xvextl_qu_du, LASX, gen_xx, gen_helper_vextl_qu_du)
 
 TRANS(vsrlr_b, LSX, gen_vvv, gen_helper_vsrlr_b)
 TRANS(vsrlr_h, LSX, gen_vvv, gen_helper_vsrlr_h)
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 37/57] target/loongarch: Implement xvsrlr xvsrar
Date: Thu, 14 Sep 2023 10:26:25 +0800
Message-Id: <20230914022645.1151356-38-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSRLR[I].{B/H/W/D};
- XVSRAR[I].{B/H/W/D}.

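Note (illustration only, not part of the patch): the helper loops in this patch change their bound from `LSX_LEN/BIT` to `oprsz / (BIT / 8)`, so one helper can serve both the 16-byte LSX and 32-byte LASX forms. The sketch below models that lane count together with a rounding logical shift right on 16-bit lanes. The rounding rule used here (add back the last bit shifted out) is an assumption about what `do_vsrlr_*` computes, and `srlr_u16`, `lane_count` and `srlr_h` are hypothetical names rather than QEMU functions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rounding logical shift right of one 16-bit lane (assumed semantics):
 * shift right by sh, then add the most significant bit that was shifted out.
 */
static uint16_t srlr_u16(uint16_t s, unsigned sh)
{
    if (sh == 0) {
        return s;
    }
    return (uint16_t)((s >> sh) + ((s >> (sh - 1)) & 1u));
}

/* Number of lanes of width `bits` in an operation covering `oprsz` bytes. */
static size_t lane_count(size_t oprsz, unsigned bits)
{
    return oprsz / (bits / 8);
}

/* Apply the rounding shift to every 16-bit lane; oprsz is 16 or 32 bytes. */
static void srlr_h(uint16_t *vd, const uint16_t *vj, const uint16_t *vk,
                   size_t oprsz)
{
    for (size_t i = 0; i < lane_count(oprsz, 16); i++) {
        vd[i] = srlr_u16(vj[i], vk[i] % 16);
    }
}
```

With `oprsz` equal to 16 the loop covers 8 halfword lanes, and with 32 it covers 16, which is the only difference between the two vector widths in this model.
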
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 17 +++++++++++++++++ target/loongarch/disas.c | 18 ++++++++++++++++++ target/loongarch/vec_helper.c | 12 ++++++++---- target/loongarch/insn_trans/trans_vec.c.inc | 16 ++++++++++++++++ 4 files changed, 59 insertions(+), 4 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 8a7933eccc..ca0951e1cc 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1661,6 +1661,23 @@ xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..= ... @vv_ui4 xvsllwil_du_wu 0111 01110000 11001 ..... ..... ..... @vv_ui5 xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @vv =20 +xvsrlr_b 0111 01001111 00000 ..... ..... ..... @vvv +xvsrlr_h 0111 01001111 00001 ..... ..... ..... @vvv +xvsrlr_w 0111 01001111 00010 ..... ..... ..... @vvv +xvsrlr_d 0111 01001111 00011 ..... ..... ..... @vvv +xvsrlri_b 0111 01101010 01000 01 ... ..... ..... @vv_ui3 +xvsrlri_h 0111 01101010 01000 1 .... ..... ..... @vv_ui4 +xvsrlri_w 0111 01101010 01001 ..... ..... ..... @vv_ui5 +xvsrlri_d 0111 01101010 0101 ...... ..... ..... @vv_ui6 +xvsrar_b 0111 01001111 00100 ..... ..... ..... @vvv +xvsrar_h 0111 01001111 00101 ..... ..... ..... @vvv +xvsrar_w 0111 01001111 00110 ..... ..... ..... @vvv +xvsrar_d 0111 01001111 00111 ..... ..... ..... @vvv +xvsrari_b 0111 01101010 10000 01 ... ..... ..... @vv_ui3 +xvsrari_h 0111 01101010 10000 1 .... ..... ..... @vv_ui4 +xvsrari_w 0111 01101010 10001 ..... ..... ..... @vv_ui5 +xvsrari_d 0111 01101010 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d93ecdb60d..bc5eb82b49 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2086,6 +2086,24 @@ INSN_LASX(xvsllwil_wu_hu, vv_i) INSN_LASX(xvsllwil_du_wu, vv_i) INSN_LASX(xvextl_qu_du, vv) =20 +INSN_LASX(xvsrlr_b, vvv) +INSN_LASX(xvsrlr_h, vvv) +INSN_LASX(xvsrlr_w, vvv) +INSN_LASX(xvsrlr_d, vvv) +INSN_LASX(xvsrlri_b, vv_i) +INSN_LASX(xvsrlri_h, vv_i) +INSN_LASX(xvsrlri_w, vv_i) +INSN_LASX(xvsrlri_d, vv_i) + +INSN_LASX(xvsrar_b, vvv) +INSN_LASX(xvsrar_h, vvv) +INSN_LASX(xvsrar_w, vvv) +INSN_LASX(xvsrar_d, vvv) +INSN_LASX(xvsrari_b, vv_i) +INSN_LASX(xvsrari_h, vv_i) +INSN_LASX(xvsrari_w, vv_i) +INSN_LASX(xvsrari_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index a3376439e3..bb30d24b89 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1025,8 +1025,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint3= 2_t desc) \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ } \ } @@ -1042,8 +1043,9 @@ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, u= int32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrlr_ ## E(Vj->E(i), imm); \ } \ } @@ -1075,8 +1077,9 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint3= 2_t desc) \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) 
=3D do_vsrar_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ } \ } @@ -1092,8 +1095,9 @@ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, u= int32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D do_vsrar_ ## E(Vj->E(i), imm); \ } \ } diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index c35d0a51a7..8be8670259 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3739,6 +3739,14 @@ TRANS(vsrlri_b, LSX, gen_vv_i, gen_helper_vsrlri_b) TRANS(vsrlri_h, LSX, gen_vv_i, gen_helper_vsrlri_h) TRANS(vsrlri_w, LSX, gen_vv_i, gen_helper_vsrlri_w) TRANS(vsrlri_d, LSX, gen_vv_i, gen_helper_vsrlri_d) +TRANS(xvsrlr_b, LASX, gen_xxx, gen_helper_vsrlr_b) +TRANS(xvsrlr_h, LASX, gen_xxx, gen_helper_vsrlr_h) +TRANS(xvsrlr_w, LASX, gen_xxx, gen_helper_vsrlr_w) +TRANS(xvsrlr_d, LASX, gen_xxx, gen_helper_vsrlr_d) +TRANS(xvsrlri_b, LASX, gen_xx_i, gen_helper_vsrlri_b) +TRANS(xvsrlri_h, LASX, gen_xx_i, gen_helper_vsrlri_h) +TRANS(xvsrlri_w, LASX, gen_xx_i, gen_helper_vsrlri_w) +TRANS(xvsrlri_d, LASX, gen_xx_i, gen_helper_vsrlri_d) =20 TRANS(vsrar_b, LSX, gen_vvv, gen_helper_vsrar_b) TRANS(vsrar_h, LSX, gen_vvv, gen_helper_vsrar_h) @@ -3748,6 +3756,14 @@ TRANS(vsrari_b, LSX, gen_vv_i, gen_helper_vsrari_b) TRANS(vsrari_h, LSX, gen_vv_i, gen_helper_vsrari_h) TRANS(vsrari_w, LSX, gen_vv_i, gen_helper_vsrari_w) TRANS(vsrari_d, LSX, gen_vv_i, gen_helper_vsrari_d) +TRANS(xvsrar_b, LASX, gen_xxx, gen_helper_vsrar_b) +TRANS(xvsrar_h, LASX, gen_xxx, gen_helper_vsrar_h) +TRANS(xvsrar_w, LASX, gen_xxx, gen_helper_vsrar_w) +TRANS(xvsrar_d, LASX, gen_xxx, gen_helper_vsrar_d) +TRANS(xvsrari_b, LASX, gen_xx_i, gen_helper_vsrari_b) +TRANS(xvsrari_h, LASX, gen_xx_i, gen_helper_vsrari_h) +TRANS(xvsrari_w, LASX, gen_xx_i, gen_helper_vsrari_w) 
+TRANS(xvsrari_d, LASX, gen_xx_i, gen_helper_vsrari_d) =20 TRANS(vsrln_b_h, LSX, gen_vvv, gen_helper_vsrln_b_h) TRANS(vsrln_h_w, LSX, gen_vvv, gen_helper_vsrln_h_w) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658866238442.3972905215629; Wed, 13 Sep 2023 19:34:26 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5X-00045f-2t; Wed, 13 Sep 2023 22:28:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4r-0003IZ-H1 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:34 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4p-0004xi-3X for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:33 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxnuuGbwJlpvgmAA--.7689S3; Thu, 14 Sep 2023 10:27:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S40; Thu, 14 Sep 2023 10:27:16 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 38/57] target/loongarch: Implement xvsrln xvsran Date: Thu, 14 Sep 2023 10:26:26 +0800 Message-Id: <20230914022645.1151356-39-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S40 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658867338100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSRLN.{B.H/H.W/W.D}; - XVSRAN.{B.H/H.W/W.D}; - XVSRLNI.{B.H/H.W/W.D/D.Q}; - XVSRANI.{B.H/H.W/W.D/D.Q}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 16 ++ target/loongarch/disas.c | 16 ++ target/loongarch/vec_helper.c | 166 +++++++++++--------- target/loongarch/insn_trans/trans_vec.c.inc | 14 ++ 4 files changed, 137 insertions(+), 75 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ca0951e1cc..204dcfa075 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1678,6 +1678,22 @@ xvsrari_h 0111 01101010 10000 1 .... ..... ..= ... @vv_ui4 xvsrari_w 0111 01101010 10001 ..... ..... ..... @vv_ui5 xvsrari_d 0111 01101010 1001 ...... ..... ..... 
@vv_ui6 =20 +xvsrln_b_h 0111 01001111 01001 ..... ..... ..... @vvv +xvsrln_h_w 0111 01001111 01010 ..... ..... ..... @vvv +xvsrln_w_d 0111 01001111 01011 ..... ..... ..... @vvv +xvsran_b_h 0111 01001111 01101 ..... ..... ..... @vvv +xvsran_h_w 0111 01001111 01110 ..... ..... ..... @vvv +xvsran_w_d 0111 01001111 01111 ..... ..... ..... @vvv + +xvsrlni_b_h 0111 01110100 00000 1 .... ..... ..... @vv_ui4 +xvsrlni_h_w 0111 01110100 00001 ..... ..... ..... @vv_ui5 +xvsrlni_w_d 0111 01110100 0001 ...... ..... ..... @vv_ui6 +xvsrlni_d_q 0111 01110100 001 ....... ..... ..... @vv_ui7 +xvsrani_b_h 0111 01110101 10000 1 .... ..... ..... @vv_ui4 +xvsrani_h_w 0111 01110101 10001 ..... ..... ..... @vv_ui5 +xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @vv_ui6 +xvsrani_d_q 0111 01110101 101 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index bc5eb82b49..28e5e16eb2 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2104,6 +2104,22 @@ INSN_LASX(xvsrari_h, vv_i) INSN_LASX(xvsrari_w, vv_i) INSN_LASX(xvsrari_d, vv_i) =20 +INSN_LASX(xvsrln_b_h, vvv) +INSN_LASX(xvsrln_h_w, vvv) +INSN_LASX(xvsrln_w_d, vvv) +INSN_LASX(xvsran_b_h, vvv) +INSN_LASX(xvsran_h_w, vvv) +INSN_LASX(xvsran_w_d, vvv) + +INSN_LASX(xvsrlni_b_h, vv_i) +INSN_LASX(xvsrlni_h_w, vv_i) +INSN_LASX(xvsrlni_w_d, vv_i) +INSN_LASX(xvsrlni_d_q, vv_i) +INSN_LASX(xvsrani_b_h, vv_i) +INSN_LASX(xvsrani_h_w, vv_i) +INSN_LASX(xvsrani_w_d, vv_i) +INSN_LASX(xvsrani_d_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index bb30d24b89..8c405ce32b 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1109,105 +1109,121 @@ VSRARI(vsrari_d, 64, D) =20 
#define R_SHIFT(a, b) (a >> b) =20 -#define VSRLN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D R_SHIFT((T)Vj->E2(i),((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRLN(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), = \ + Vk->E2(j + ofs * i) % BIT); \ + } \ + Vd->D(2 * i + 1) =3D 0; = \ + } \ } =20 -VSRLN(vsrln_b_h, 16, uint16_t, B, H) -VSRLN(vsrln_h_w, 32, uint32_t, H, W) -VSRLN(vsrln_w_d, 64, uint64_t, W, D) +VSRLN(vsrln_b_h, 16, B, UH) +VSRLN(vsrln_h_w, 32, H, UW) +VSRLN(vsrln_w_d, 64, W, UD) =20 -#define VSRAN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRAN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), = \ + Vk->E3(j + ofs * i) % BIT); \ + } \ + Vd->D(2 * i + 1) =3D 0; = \ + } \ } =20 -VSRAN(vsran_b_h, 16, uint16_t, B, H) 
-VSRAN(vsran_h_w, 32, uint32_t, H, W) -VSRAN(vsran_w_d, 64, uint64_t, W, D) +VSRAN(vsran_b_h, 16, B, H, UH) +VSRAN(vsran_h_w, 32, H, W, UW) +VSRAN(vsran_w_d, 64, W, D, UD) =20 -#define VSRLNI(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT((T)Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT((T)Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRLNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), imm)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D R_SHIFT(Vd->E2(j + ofs * i)= , \ + imm); \ + } \ + } \ + *Vd =3D temp; = \ } =20 void HELPER(vsrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - temp.D(0) =3D 0; - temp.D(1) =3D 0; - temp.D(0) =3D int128_getlo(int128_urshift(Vj->Q(0), imm % 128)); - temp.D(1) =3D int128_getlo(int128_urshift(Vd->Q(0), imm % 128)); + for (i =3D 0; i < 2; i++) { + temp.D(2 * i) =3D int128_getlo(int128_urshift(Vj->Q(i), imm % 128)= ); + temp.D(2 * i +1) =3D int128_getlo(int128_urshift(Vd->Q(i), imm % 1= 28)); + } *Vd =3D temp; } =20 -VSRLNI(vsrlni_b_h, 16, uint16_t, B, H) -VSRLNI(vsrlni_h_w, 32, uint32_t, H, W) -VSRLNI(vsrlni_w_d, 64, uint64_t, W, D) +VSRLNI(vsrlni_b_h, 16, B, UH) +VSRLNI(vsrlni_h_w, 32, H, UW) +VSRLNI(vsrlni_w_d, 64, W, UD) =20 -#define VSRANI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void 
*vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D R_SHIFT(Vj->E2(i), imm); \ - temp.E1(i + max) =3D R_SHIFT(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D R_SHIFT(Vj->E2(j + ofs * i), imm)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D R_SHIFT(Vd->E2(j + ofs * i)= , \ + imm); \ + } \ + } \ + *Vd =3D temp; = \ } =20 void HELPER(vsrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - temp.D(0) =3D 0; - temp.D(1) =3D 0; - temp.D(0) =3D int128_getlo(int128_rshift(Vj->Q(0), imm % 128)); - temp.D(1) =3D int128_getlo(int128_rshift(Vd->Q(0), imm % 128)); + for (i =3D 0; i < 2; i++) { + temp.D(2 * i) =3D int128_getlo(int128_rshift(Vj->Q(i), imm % 128)); + temp.D(2 * i + 1) =3D int128_getlo(int128_rshift(Vd->Q(i), imm % 1= 28)); + } *Vd =3D temp; } =20 diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 8be8670259..51a6846c15 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3771,6 +3771,12 @@ TRANS(vsrln_w_d, LSX, gen_vvv, gen_helper_vsrln_w_d) TRANS(vsran_b_h, LSX, gen_vvv, gen_helper_vsran_b_h) TRANS(vsran_h_w, LSX, gen_vvv, gen_helper_vsran_h_w) TRANS(vsran_w_d, LSX, gen_vvv, gen_helper_vsran_w_d) +TRANS(xvsrln_b_h, LASX, gen_xxx, gen_helper_vsrln_b_h) +TRANS(xvsrln_h_w, 
LASX, gen_xxx, gen_helper_vsrln_h_w) +TRANS(xvsrln_w_d, LASX, gen_xxx, gen_helper_vsrln_w_d) +TRANS(xvsran_b_h, LASX, gen_xxx, gen_helper_vsran_b_h) +TRANS(xvsran_h_w, LASX, gen_xxx, gen_helper_vsran_h_w) +TRANS(xvsran_w_d, LASX, gen_xxx, gen_helper_vsran_w_d) =20 TRANS(vsrlni_b_h, LSX, gen_vv_i, gen_helper_vsrlni_b_h) TRANS(vsrlni_h_w, LSX, gen_vv_i, gen_helper_vsrlni_h_w) @@ -3780,6 +3786,14 @@ TRANS(vsrani_b_h, LSX, gen_vv_i, gen_helper_vsrani_b= _h) TRANS(vsrani_h_w, LSX, gen_vv_i, gen_helper_vsrani_h_w) TRANS(vsrani_w_d, LSX, gen_vv_i, gen_helper_vsrani_w_d) TRANS(vsrani_d_q, LSX, gen_vv_i, gen_helper_vsrani_d_q) +TRANS(xvsrlni_b_h, LASX, gen_xx_i, gen_helper_vsrlni_b_h) +TRANS(xvsrlni_h_w, LASX, gen_xx_i, gen_helper_vsrlni_h_w) +TRANS(xvsrlni_w_d, LASX, gen_xx_i, gen_helper_vsrlni_w_d) +TRANS(xvsrlni_d_q, LASX, gen_xx_i, gen_helper_vsrlni_d_q) +TRANS(xvsrani_b_h, LASX, gen_xx_i, gen_helper_vsrani_b_h) +TRANS(xvsrani_h_w, LASX, gen_xx_i, gen_helper_vsrani_h_w) +TRANS(xvsrani_w_d, LASX, gen_xx_i, gen_helper_vsrani_w_d) +TRANS(xvsrani_d_q, LASX, gen_xx_i, gen_helper_vsrani_d_q) =20 TRANS(vsrlrn_b_h, LSX, gen_vvv, gen_helper_vsrlrn_b_h) TRANS(vsrlrn_h_w, LSX, gen_vvv, gen_helper_vsrlrn_h_w) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658813814157.92142429485546; Wed, 13 Sep 2023 19:33:33 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc51-0003JW-9P; Wed, 13 Sep 2023 22:27:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1qgc4r-0003Ib-Vy for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4p-0004xo-9A for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:33 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxHOuGbwJlpPgmAA--.5148S3; Thu, 14 Sep 2023 10:27:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S41; Thu, 14 Sep 2023 10:27:17 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 39/57] target/loongarch: Implement xvsrlrn xvsrarn Date: Thu, 14 Sep 2023 10:26:27 +0800 Message-Id: <20230914022645.1151356-40-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S41 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658814693100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSRLRN.{B.H/H.W/W.D}; - XVSRARN.{B.H/H.W/W.D}; - XVSRLRNI.{B.H/H.W/W.D/D.Q}; - XVSRARNI.{B.H/H.W/W.D/D.Q}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 16 ++ target/loongarch/disas.c | 16 ++ target/loongarch/vec_helper.c | 198 +++++++++++--------- target/loongarch/insn_trans/trans_vec.c.inc | 14 ++ 4 files changed, 159 insertions(+), 85 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 204dcfa075..d7c50b14ca 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1694,6 +1694,22 @@ xvsrani_h_w 0111 01110101 10001 ..... ..... ...= .. @vv_ui5 xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @vv_ui6 xvsrani_d_q 0111 01110101 101 ....... ..... ..... @vv_ui7 =20 +xvsrlrn_b_h 0111 01001111 10001 ..... ..... ..... @vvv +xvsrlrn_h_w 0111 01001111 10010 ..... ..... ..... @vvv +xvsrlrn_w_d 0111 01001111 10011 ..... ..... ..... @vvv +xvsrarn_b_h 0111 01001111 10101 ..... ..... ..... @vvv +xvsrarn_h_w 0111 01001111 10110 ..... ..... ..... @vvv +xvsrarn_w_d 0111 01001111 10111 ..... ..... ..... @vvv + +xvsrlrni_b_h 0111 01110100 01000 1 .... ..... ..... @vv_ui4 +xvsrlrni_h_w 0111 01110100 01001 ..... ..... ..... @vv_ui5 +xvsrlrni_w_d 0111 01110100 0101 ...... ..... ..... @vv_ui6 +xvsrlrni_d_q 0111 01110100 011 ....... ..... ..... @vv_ui7 +xvsrarni_b_h 0111 01110101 11000 1 .... ..... ..... @vv_ui4 +xvsrarni_h_w 0111 01110101 11001 ..... ..... ..... @vv_ui5 +xvsrarni_w_d 0111 01110101 1101 ...... ..... ..... @vv_ui6 +xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... 
@vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 28e5e16eb2..e7b5974eaa 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2120,6 +2120,22 @@ INSN_LASX(xvsrani_h_w, vv_i) INSN_LASX(xvsrani_w_d, vv_i) INSN_LASX(xvsrani_d_q, vv_i) =20 +INSN_LASX(xvsrlrn_b_h, vvv) +INSN_LASX(xvsrlrn_h_w, vvv) +INSN_LASX(xvsrlrn_w_d, vvv) +INSN_LASX(xvsrarn_b_h, vvv) +INSN_LASX(xvsrarn_h_w, vvv) +INSN_LASX(xvsrarn_w_d, vvv) + +INSN_LASX(xvsrlrni_b_h, vv_i) +INSN_LASX(xvsrlrni_h_w, vv_i) +INSN_LASX(xvsrlrni_w_d, vv_i) +INSN_LASX(xvsrlrni_d_q, vv_i) +INSN_LASX(xvsrarni_b_h, vv_i) +INSN_LASX(xvsrarni_h_w, vv_i) +INSN_LASX(xvsrarni_w_d, vv_i) +INSN_LASX(xvsrarni_d_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 8c405ce32b..a3f9b396fa 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1231,76 +1231,95 @@ VSRANI(vsrani_b_h, 16, B, H) VSRANI(vsrani_h_w, 32, H, W) VSRANI(vsrani_w_d, 64, W, D) =20 -#define VSRLRN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRLRN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_vsrlr_ ##E2(Vj->E2(j + ofs * i)= , \ + Vk->E3(j + ofs * i) % BIT);= \ + } = \ + Vd->D(2 
* i + 1) =3D 0; = \ + } = \ } =20 -VSRLRN(vsrlrn_b_h, 16, uint16_t, B, H) -VSRLRN(vsrlrn_h_w, 32, uint32_t, H, W) -VSRLRN(vsrlrn_w_d, 64, uint64_t, W, D) +VSRLRN(vsrlrn_b_h, 16, B, H, UH) +VSRLRN(vsrlrn_h_w, 32, H, W, UW) +VSRLRN(vsrlrn_w_d, 64, W, D, UD) =20 -#define VSRARN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ - } \ - Vd->D(1) =3D 0; \ +#define VSRARN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_vsrar_ ## E2(Vj->E2(j + ofs * i= ), \ + Vk->E3(j + ofs * i) % BIT)= ; \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSRARN(vsrarn_b_h, 16, uint8_t, B, H) -VSRARN(vsrarn_h_w, 32, uint16_t, H, W) -VSRARN(vsrarn_w_d, 64, uint32_t, W, D) - -#define VSRLRNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ - for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrlr_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrlr_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +VSRARN(vsrarn_b_h, 16, B, H, UH) +VSRARN(vsrarn_h_w, 32, H, W, UW) +VSRARN(vsrarn_w_d, 64, W, D, UD) + +#define VSRLRNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg 
*)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_vsrlr_ ## E2(Vj->E2(j + ofs * = i), imm); \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_vsrlr_ ## E2(Vd->E2(j + = ofs * i), \ + imm); = \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vsrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - Int128 r1, r2; - - if (imm =3D=3D 0) { - temp.D(0) =3D int128_getlo(Vj->Q(0)); - temp.D(1) =3D int128_getlo(Vd->Q(0)); - } else { - r1 =3D int128_and(int128_urshift(Vj->Q(0), (imm -1)), int128_one()= ); - r2 =3D int128_and(int128_urshift(Vd->Q(0), (imm -1)), int128_one()= ); + Int128 r[4]; + int oprsz =3D simd_oprsz(desc); =20 - temp.D(0) =3D int128_getlo(int128_add(int128_urshift(Vj->Q(0), imm)= , r1)); - temp.D(1) =3D int128_getlo(int128_add(int128_urshift(Vd->Q(0), imm)= , r2)); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + temp.D(2 * i) =3D int128_getlo(Vj->Q(i)); + temp.D(2 * i + 1) =3D int128_getlo(Vd->Q(i)); + } else { + r[2 * i] =3D int128_and(int128_urshift(Vj->Q(i), (imm - 1)), + int128_one()); + r[2 * i + 1] =3D int128_and(int128_urshift(Vd->Q(i), (imm - 1)= ), + int128_one()); + temp.D(2 * i) =3D int128_getlo(int128_add(int128_urshift(Vj->Q= (i), + imm), r[2 * i])); + temp.D(2 * i + 1) =3D int128_getlo(int128_add(int128_urshift(V= d->Q(i), + imm), r[ 2 * i + 1= ])); + } } *Vd =3D temp; } @@ -1309,40 +1328,49 @@ VSRLRNI(vsrlrni_b_h, 16, B, H) VSRLRNI(vsrlrni_h_w, 32, H, W) VSRLRNI(vsrlrni_w_d, 64, W, D) =20 -#define VSRARNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - temp.D(0) =3D 0; \ - temp.D(1) =3D 0; \ - max =3D LSX_LEN/BIT; \ 
- for (i =3D 0; i < max; i++) { \ - temp.E1(i) =3D do_vsrar_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) =3D do_vsrar_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd =3D temp; \ +#define VSRARNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_vsrar_ ## E2(Vj->E2(j + ofs * = i), imm); \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_vsrar_ ## E2(Vd->E2(j + = ofs * i), \ + imm); = \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vsrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg temp; + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; - Int128 r1, r2; - - if (imm =3D=3D 0) { - temp.D(0) =3D int128_getlo(Vj->Q(0)); - temp.D(1) =3D int128_getlo(Vd->Q(0)); - } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); + Int128 r[4]; + int oprsz =3D simd_oprsz(desc); =20 - temp.D(0) =3D int128_getlo(int128_add(int128_rshift(Vj->Q(0), imm),= r1)); - temp.D(1) =3D int128_getlo(int128_add(int128_rshift(Vd->Q(0), imm),= r2)); + for (i =3D 0; i < oprsz / 16; i++) { + if (imm =3D=3D 0) { + temp.D(2 * i) =3D int128_getlo(Vj->Q(i)); + temp.D(2 * i + 1) =3D int128_getlo(Vd->Q(i)); + } else { + r[2 * i] =3D int128_and(int128_rshift(Vj->Q(i), (imm - 1)), + int128_one()); + r[2 * i + 1] =3D int128_and(int128_rshift(Vd->Q(i), (imm - 1)), + int128_one()); + temp.D(2 * i) =3D int128_getlo(int128_add(int128_rshift(Vj->Q(= i), + imm), r[2 * i])); + temp.D(2 * i + 1) =3D int128_getlo(int128_add(int128_rshift(Vd= ->Q(i), + imm), r[2 * i + 1]= )); + } } *Vd =3D temp; } diff --git a/target/loongarch/insn_trans/trans_vec.c.inc 
b/target/loongarch= /insn_trans/trans_vec.c.inc index 51a6846c15..7f748ad6a8 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3801,6 +3801,12 @@ TRANS(vsrlrn_w_d, LSX, gen_vvv, gen_helper_vsrlrn_w_= d) TRANS(vsrarn_b_h, LSX, gen_vvv, gen_helper_vsrarn_b_h) TRANS(vsrarn_h_w, LSX, gen_vvv, gen_helper_vsrarn_h_w) TRANS(vsrarn_w_d, LSX, gen_vvv, gen_helper_vsrarn_w_d) +TRANS(xvsrlrn_b_h, LASX, gen_xxx, gen_helper_vsrlrn_b_h) +TRANS(xvsrlrn_h_w, LASX, gen_xxx, gen_helper_vsrlrn_h_w) +TRANS(xvsrlrn_w_d, LASX, gen_xxx, gen_helper_vsrlrn_w_d) +TRANS(xvsrarn_b_h, LASX, gen_xxx, gen_helper_vsrarn_b_h) +TRANS(xvsrarn_h_w, LASX, gen_xxx, gen_helper_vsrarn_h_w) +TRANS(xvsrarn_w_d, LASX, gen_xxx, gen_helper_vsrarn_w_d) =20 TRANS(vsrlrni_b_h, LSX, gen_vv_i, gen_helper_vsrlrni_b_h) TRANS(vsrlrni_h_w, LSX, gen_vv_i, gen_helper_vsrlrni_h_w) @@ -3810,6 +3816,14 @@ TRANS(vsrarni_b_h, LSX, gen_vv_i, gen_helper_vsrarni= _b_h) TRANS(vsrarni_h_w, LSX, gen_vv_i, gen_helper_vsrarni_h_w) TRANS(vsrarni_w_d, LSX, gen_vv_i, gen_helper_vsrarni_w_d) TRANS(vsrarni_d_q, LSX, gen_vv_i, gen_helper_vsrarni_d_q) +TRANS(xvsrlrni_b_h, LASX, gen_xx_i, gen_helper_vsrlrni_b_h) +TRANS(xvsrlrni_h_w, LASX, gen_xx_i, gen_helper_vsrlrni_h_w) +TRANS(xvsrlrni_w_d, LASX, gen_xx_i, gen_helper_vsrlrni_w_d) +TRANS(xvsrlrni_d_q, LASX, gen_xx_i, gen_helper_vsrlrni_d_q) +TRANS(xvsrarni_b_h, LASX, gen_xx_i, gen_helper_vsrarni_b_h) +TRANS(xvsrarni_h_w, LASX, gen_xx_i, gen_helper_vsrarni_h_w) +TRANS(xvsrarni_w_d, LASX, gen_xx_i, gen_helper_vsrarni_w_d) +TRANS(xvsrarni_d_q, LASX, gen_xx_i, gen_helper_vsrarni_d_q) =20 TRANS(vssrln_b_h, LSX, gen_vvv, gen_helper_vssrln_b_h) TRANS(vssrln_h_w, LSX, gen_vvv, gen_helper_vssrln_h_w) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16946586819529.538695308726801; Wed, 13 Sep 2023 19:31:21 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6J-0006K4-F4; Wed, 13 Sep 2023 22:29:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4u-0003JI-56 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4q-0004zA-VI for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxJvGIbwJlqPgmAA--.10068S3; Thu, 14 Sep 2023 10:27:20 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S42; Thu, 14 Sep 2023 10:27:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 40/57] target/loongarch: Implement xvssrln xvssran Date: Thu, 14 Sep 2023 10:26:28 +0800 Message-Id: <20230914022645.1151356-41-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S42 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658682610100009 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSSRLN.{B.H/H.W/W.D}; - XVSSRAN.{B.H/H.W/W.D}; - XVSSRLN.{BU.H/HU.W/WU.D}; - XVSSRAN.{BU.H/HU.W/WU.D}; - XVSSRLNI.{B.H/H.W/W.D/D.Q}; - XVSSRANI.{B.H/H.W/W.D/D.Q}; - XVSSRLNI.{BU.H/HU.W/WU.D/DU.Q}; - XVSSRANI.{BU.H/HU.W/WU.D/DU.Q}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 30 ++ target/loongarch/disas.c | 30 ++ target/loongarch/vec_helper.c | 456 ++++++++++++-------- target/loongarch/insn_trans/trans_vec.c.inc | 28 ++ 4 files changed, 353 insertions(+), 191 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d7c50b14ca..022dd9bfd1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1710,6 +1710,36 @@ xvsrarni_h_w 0111 01110101 11001 ..... ..... ...= .. @vv_ui5 xvsrarni_w_d 0111 01110101 1101 ...... ..... ..... @vv_ui6 xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @vv_ui7 =20 +xvssrln_b_h 0111 01001111 11001 ..... ..... ..... @vvv +xvssrln_h_w 0111 01001111 11010 ..... ..... ..... @vvv +xvssrln_w_d 0111 01001111 11011 ..... ..... ..... @vvv +xvssran_b_h 0111 01001111 11101 ..... ..... ..... @vvv +xvssran_h_w 0111 01001111 11110 ..... ..... ..... 
@vvv +xvssran_w_d 0111 01001111 11111 ..... ..... ..... @vvv +xvssrln_bu_h 0111 01010000 01001 ..... ..... ..... @vvv +xvssrln_hu_w 0111 01010000 01010 ..... ..... ..... @vvv +xvssrln_wu_d 0111 01010000 01011 ..... ..... ..... @vvv +xvssran_bu_h 0111 01010000 01101 ..... ..... ..... @vvv +xvssran_hu_w 0111 01010000 01110 ..... ..... ..... @vvv +xvssran_wu_d 0111 01010000 01111 ..... ..... ..... @vvv + +xvssrlni_b_h 0111 01110100 10000 1 .... ..... ..... @vv_ui4 +xvssrlni_h_w 0111 01110100 10001 ..... ..... ..... @vv_ui5 +xvssrlni_w_d 0111 01110100 1001 ...... ..... ..... @vv_ui6 +xvssrlni_d_q 0111 01110100 101 ....... ..... ..... @vv_ui7 +xvssrani_b_h 0111 01110110 00000 1 .... ..... ..... @vv_ui4 +xvssrani_h_w 0111 01110110 00001 ..... ..... ..... @vv_ui5 +xvssrani_w_d 0111 01110110 0001 ...... ..... ..... @vv_ui6 +xvssrani_d_q 0111 01110110 001 ....... ..... ..... @vv_ui7 +xvssrlni_bu_h 0111 01110100 11000 1 .... ..... ..... @vv_ui4 +xvssrlni_hu_w 0111 01110100 11001 ..... ..... ..... @vv_ui5 +xvssrlni_wu_d 0111 01110100 1101 ...... ..... ..... @vv_ui6 +xvssrlni_du_q 0111 01110100 111 ....... ..... ..... @vv_ui7 +xvssrani_bu_h 0111 01110110 01000 1 .... ..... ..... @vv_ui4 +xvssrani_hu_w 0111 01110110 01001 ..... ..... ..... @vv_ui5 +xvssrani_wu_d 0111 01110110 0101 ...... ..... ..... @vv_ui6 +xvssrani_du_q 0111 01110110 011 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e7b5974eaa..c02f31019f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2136,6 +2136,36 @@ INSN_LASX(xvsrarni_h_w, vv_i) INSN_LASX(xvsrarni_w_d, vv_i) INSN_LASX(xvsrarni_d_q, vv_i) =20 +INSN_LASX(xvssrln_b_h, vvv) +INSN_LASX(xvssrln_h_w, vvv) +INSN_LASX(xvssrln_w_d, vvv) +INSN_LASX(xvssran_b_h, vvv) +INSN_LASX(xvssran_h_w, vvv) +INSN_LASX(xvssran_w_d, vvv) +INSN_LASX(xvssrln_bu_h, vvv) +INSN_LASX(xvssrln_hu_w, vvv) +INSN_LASX(xvssrln_wu_d, vvv) +INSN_LASX(xvssran_bu_h, vvv) +INSN_LASX(xvssran_hu_w, vvv) +INSN_LASX(xvssran_wu_d, vvv) + +INSN_LASX(xvssrlni_b_h, vv_i) +INSN_LASX(xvssrlni_h_w, vv_i) +INSN_LASX(xvssrlni_w_d, vv_i) +INSN_LASX(xvssrlni_d_q, vv_i) +INSN_LASX(xvssrani_b_h, vv_i) +INSN_LASX(xvssrani_h_w, vv_i) +INSN_LASX(xvssrani_w_d, vv_i) +INSN_LASX(xvssrani_d_q, vv_i) +INSN_LASX(xvssrlni_bu_h, vv_i) +INSN_LASX(xvssrlni_hu_w, vv_i) +INSN_LASX(xvssrlni_wu_d, vv_i) +INSN_LASX(xvssrlni_du_q, vv_i) +INSN_LASX(xvssrani_bu_h, vv_i) +INSN_LASX(xvssrani_hu_w, vv_i) +INSN_LASX(xvssrani_wu_d, vv_i) +INSN_LASX(xvssrani_du_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index a3f9b396fa..e8dd95eaed 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1401,23 +1401,29 @@ SSRLNS(B, uint16_t, int16_t, uint8_t) SSRLNS(H, uint32_t, int32_t, uint16_t) SSRLNS(W, uint64_t, int64_t, uint32_t) =20 -#define VSSRLN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), (T)Vk->E2(i)% BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLN(NAME, BIT, E1, E2, E3) = \ +void 
HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlns_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLN(vssrln_b_h, 16, uint16_t, B, H) -VSSRLN(vssrln_h_w, 32, uint32_t, H, W) -VSSRLN(vssrln_w_d, 64, uint64_t, W, D) +VSSRLN(vssrln_b_h, 16, B, H, UH) +VSSRLN(vssrln_h_w, 32, H, W, UW) +VSSRLN(vssrln_w_d, 64, W, D, UD) =20 #define SSRANS(E, T1, T2) \ static T1 do_ssrans_ ## E(T1 e2, int sa, int sh) \ @@ -1429,10 +1435,10 @@ static T1 do_ssrans_ ## E(T1 e2, int sa, int sh) \ shft_res =3D e2 >> sa; \ } \ T2 mask; \ - mask =3D (1ll << sh) -1; \ + mask =3D (1ll << sh) - 1; \ if (shft_res > mask) { \ return mask; \ - } else if (shft_res < -(mask +1)) { \ + } else if (shft_res < -(mask + 1)) { \ return ~mask; \ } else { \ return shft_res; \ @@ -1443,23 +1449,29 @@ SSRANS(B, int16_t, int8_t) SSRANS(H, int32_t, int16_t) SSRANS(W, int64_t, int32_t) =20 -#define VSSRAN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 = -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRAN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + 
for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrans_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRAN(vssran_b_h, 16, uint16_t, B, H) -VSSRAN(vssran_h_w, 32, uint32_t, H, W) -VSSRAN(vssran_w_d, 64, uint64_t, W, D) +VSSRAN(vssran_b_h, 16, B, H, UH) +VSSRAN(vssran_h_w, 32, H, W, UW) +VSSRAN(vssran_w_d, 64, W, D, UD) =20 #define SSRLNU(E, T1, T2, T3) \ static T1 do_ssrlnu_ ## E(T3 e2, int sa, int sh) \ @@ -1471,7 +1483,7 @@ static T1 do_ssrlnu_ ## E(T3 e2, int sa, int sh) \ shft_res =3D (((T1)e2) >> sa); \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1483,23 +1495,29 @@ SSRLNU(B, uint16_t, uint8_t, int16_t) SSRLNU(H, uint32_t, uint16_t, int32_t) SSRLNU(W, uint64_t, uint32_t, int64_t) =20 -#define VSSRLNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ - } \ - Vd->D(1) =3D 0; = \ +#define VSSRLNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlnu_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLNU(vssrln_bu_h, 16, uint16_t, B, H) -VSSRLNU(vssrln_hu_w, 32, uint32_t, H, W) -VSSRLNU(vssrln_wu_d, 64, uint64_t, W, D) +VSSRLNU(vssrln_bu_h, 16, B, H, UH) 
+VSSRLNU(vssrln_hu_w, 32, H, W, UW) +VSSRLNU(vssrln_wu_d, 64, W, D, UD) =20 #define SSRANU(E, T1, T2, T3) \ static T1 do_ssranu_ ## E(T3 e2, int sa, int sh) \ @@ -1514,7 +1532,7 @@ static T1 do_ssranu_ ## E(T3 e2, int sa, int sh) \ shft_res =3D 0; \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1526,64 +1544,89 @@ SSRANU(B, uint16_t, uint8_t, int16_t) SSRANU(H, uint32_t, uint16_t, int32_t) SSRANU(W, uint64_t, uint32_t, int64_t) =20 -#define VSSRANU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2)= ; \ - } \ - Vd->D(1) =3D 0; = \ -} - -VSSRANU(vssran_bu_h, 16, uint16_t, B, H) -VSSRANU(vssran_hu_w, 32, uint32_t, H, W) -VSSRANU(vssran_wu_d, 64, uint64_t, W, D) - -#define VSSRLNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlns_ ## E1(Vd->E2(i), imm, BIT/= 2 -1);\ - } = \ - *Vd =3D temp; = \ +#define VSSRANU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssranu_ ## E1(Vj->E2(j + ofs * = i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } 
= \ } =20 -void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) -{ - Int128 shft_res1, shft_res2, mask; - VReg *Vd =3D (VReg *)vd; - VReg *Vj =3D (VReg *)vj; +VSSRANU(vssran_bu_h, 16, B, H, UH) +VSSRANU(vssran_hu_w, 32, H, W, UW) +VSSRANU(vssran_wu_d, 64, W, D, UD) + +#define VSSRLNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlns_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2 - 1); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlns_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2 - = 1); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrlni_q(VReg *Vd, VReg *Vj, + uint64_t imm, int idx, Int128 mask) +{ + Int128 shft_res1, shft_res2; =20 if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); } else { - shft_res1 =3D int128_urshift(Vj->Q(0), imm); - shft_res2 =3D int128_urshift(Vd->Q(0), imm); + shft_res1 =3D int128_urshift(Vj->Q(idx), imm); + shft_res2 =3D int128_urshift(Vd->Q(idx), imm); } - mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); =20 if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); + Vd->D(idx * 2) =3D int128_getlo(mask); }else { - Vd->D(0) =3D int128_getlo(shft_res1); + Vd->D(idx * 2) =3D int128_getlo(shft_res1); } =20 if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask); }else { - Vd->D(1) =3D int128_getlo(shft_res2); + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } +} + +void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask; + VReg *Vd =3D (VReg *)vd; + 
VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrlni_q(Vd, Vj, imm, i, mask); } } =20 @@ -1591,98 +1634,111 @@ VSSRLNI(vssrlni_b_h, 16, B, H) VSSRLNI(vssrlni_h_w, 32, H, W) VSSRLNI(vssrlni_w_d, 64, W, D) =20 -#define VSSRANI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrans_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrans_ ## E1(Vd->E2(i), imm, BIT/= 2 -1); \ - } = \ - *Vd =3D temp; = \ -} - -void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) -{ - Int128 shft_res1, shft_res2, mask, min; - VReg *Vd =3D (VReg *)vd; - VReg *Vj =3D (VReg *)vj; +#define VSSRANI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrans_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2 - 1);= \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrans_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrani_d_q(VReg *Vd, VReg *Vj, + uint64_t imm, int idx, Int128 mask, Int128 min) +{ + Int128 shft_res1, shft_res2; =20 if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); } else { - shft_res1 =3D int128_rshift(Vj->Q(0), imm); - shft_res2 =3D int128_rshift(Vd->Q(0), imm); + shft_res1 =3D int128_rshift(Vj->Q(idx), imm); + shft_res2 =3D 
int128_rshift(Vd->Q(idx), imm); } - mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); - min =3D int128_lshift(int128_one(), 63); =20 - if (int128_gt(shft_res1, mask)) { - Vd->D(0) =3D int128_getlo(mask); + if (int128_gt(shft_res1, mask)) { + Vd->D(idx * 2) =3D int128_getlo(mask); } else if (int128_lt(shft_res1, int128_neg(min))) { - Vd->D(0) =3D int128_getlo(min); + Vd->D(idx * 2) =3D int128_getlo(min); } else { - Vd->D(0) =3D int128_getlo(shft_res1); + Vd->D(idx * 2) =3D int128_getlo(shft_res1); } =20 if (int128_gt(shft_res2, mask)) { - Vd->D(1) =3D int128_getlo(mask); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask); } else if (int128_lt(shft_res2, int128_neg(min))) { - Vd->D(1) =3D int128_getlo(min); + Vd->D(idx * 2 + 1) =3D int128_getlo(min); } else { - Vd->D(1) =3D int128_getlo(shft_res2); + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } +} + +void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask, min; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); + min =3D int128_lshift(int128_one(), 63); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrani_d_q(Vd, Vj, imm, i, mask, min); } } =20 + VSSRANI(vssrani_b_h, 16, B, H) VSSRANI(vssrani_h_w, 32, H, W) VSSRANI(vssrani_w_d, 64, W, D) =20 -#define VSSRLNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlnu_ ## E1(Vd->E2(i), imm, BIT/= 2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRLNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg 
*)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlnu_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlnu_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= ); \ + } = \ + } = \ + *Vd =3D temp; = \ } =20 void HELPER(vssrlni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - Int128 shft_res1, shft_res2, mask; + int i; + Int128 mask; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); - } else { - shft_res1 =3D int128_urshift(Vj->Q(0), imm); - shft_res2 =3D int128_urshift(Vd->Q(0), imm); - } mask =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); =20 - if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); - }else { - Vd->D(0) =3D int128_getlo(shft_res1); - } - - if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); - }else { - Vd->D(1) =3D int128_getlo(shft_res2); + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrlni_q(Vd, Vj, imm, i, mask); } } =20 @@ -1690,55 +1746,73 @@ VSSRLNUI(vssrlni_bu_h, 16, B, H) VSSRLNUI(vssrlni_hu_w, 32, H, W) VSSRLNUI(vssrlni_wu_d, 64, W, D) =20 -#define VSSRANUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssranu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssranu_ ## E1(Vd->E2(i), imm, BIT/= 2); \ - } = \ - *Vd =3D temp; = \ -} - -void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) -{ - Int128 shft_res1, shft_res2, mask; - VReg *Vd =3D (VReg *)vd; - VReg *Vj =3D (VReg *)vj; +#define VSSRANUI(NAME, 
BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssranu_ ## E1(Vj->E2(j + ofs *= i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssranu_ ## E1(Vd->E2(j += ofs * i), \ + imm, BIT / 2= ); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrani_du_q(VReg *Vd, VReg *Vj, + uint64_t imm, int idx, Int128 mask) +{ + Int128 shft_res1, shft_res2; =20 if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); } else { - shft_res1 =3D int128_rshift(Vj->Q(0), imm); - shft_res2 =3D int128_rshift(Vd->Q(0), imm); + shft_res1 =3D int128_rshift(Vj->Q(idx), imm); + shft_res2 =3D int128_rshift(Vd->Q(idx), imm); } =20 - if (int128_lt(Vj->Q(0), int128_zero())) { + if (int128_lt(Vj->Q(idx), int128_zero())) { shft_res1 =3D int128_zero(); } =20 - if (int128_lt(Vd->Q(0), int128_zero())) { + if (int128_lt(Vd->Q(idx), int128_zero())) { shft_res2 =3D int128_zero(); } - - mask =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); - if (int128_ult(mask, shft_res1)) { - Vd->D(0) =3D int128_getlo(mask); + Vd->D(idx * 2) =3D int128_getlo(mask); }else { - Vd->D(0) =3D int128_getlo(shft_res1); + Vd->D(idx * 2) =3D int128_getlo(shft_res1); } =20 if (int128_ult(mask, shft_res2)) { - Vd->D(1) =3D int128_getlo(mask); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask); }else { - Vd->D(1) =3D int128_getlo(shft_res2); + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } + +} + +void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask 
=3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrani_du_q(Vd, Vj, imm, i, mask); } } =20 diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 7f748ad6a8..e734593e4a 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3837,6 +3837,18 @@ TRANS(vssrln_wu_d, LSX, gen_vvv, gen_helper_vssrln_w= u_d) TRANS(vssran_bu_h, LSX, gen_vvv, gen_helper_vssran_bu_h) TRANS(vssran_hu_w, LSX, gen_vvv, gen_helper_vssran_hu_w) TRANS(vssran_wu_d, LSX, gen_vvv, gen_helper_vssran_wu_d) +TRANS(xvssrln_b_h, LASX, gen_xxx, gen_helper_vssrln_b_h) +TRANS(xvssrln_h_w, LASX, gen_xxx, gen_helper_vssrln_h_w) +TRANS(xvssrln_w_d, LASX, gen_xxx, gen_helper_vssrln_w_d) +TRANS(xvssran_b_h, LASX, gen_xxx, gen_helper_vssran_b_h) +TRANS(xvssran_h_w, LASX, gen_xxx, gen_helper_vssran_h_w) +TRANS(xvssran_w_d, LASX, gen_xxx, gen_helper_vssran_w_d) +TRANS(xvssrln_bu_h, LASX, gen_xxx, gen_helper_vssrln_bu_h) +TRANS(xvssrln_hu_w, LASX, gen_xxx, gen_helper_vssrln_hu_w) +TRANS(xvssrln_wu_d, LASX, gen_xxx, gen_helper_vssrln_wu_d) +TRANS(xvssran_bu_h, LASX, gen_xxx, gen_helper_vssran_bu_h) +TRANS(xvssran_hu_w, LASX, gen_xxx, gen_helper_vssran_hu_w) +TRANS(xvssran_wu_d, LASX, gen_xxx, gen_helper_vssran_wu_d) =20 TRANS(vssrlni_b_h, LSX, gen_vv_i, gen_helper_vssrlni_b_h) TRANS(vssrlni_h_w, LSX, gen_vv_i, gen_helper_vssrlni_h_w) @@ -3854,6 +3866,22 @@ TRANS(vssrani_bu_h, LSX, gen_vv_i, gen_helper_vssran= i_bu_h) TRANS(vssrani_hu_w, LSX, gen_vv_i, gen_helper_vssrani_hu_w) TRANS(vssrani_wu_d, LSX, gen_vv_i, gen_helper_vssrani_wu_d) TRANS(vssrani_du_q, LSX, gen_vv_i, gen_helper_vssrani_du_q) +TRANS(xvssrlni_b_h, LASX, gen_xx_i, gen_helper_vssrlni_b_h) +TRANS(xvssrlni_h_w, LASX, gen_xx_i, gen_helper_vssrlni_h_w) +TRANS(xvssrlni_w_d, LASX, gen_xx_i, gen_helper_vssrlni_w_d) +TRANS(xvssrlni_d_q, LASX, gen_xx_i, gen_helper_vssrlni_d_q) 
+TRANS(xvssrani_b_h, LASX, gen_xx_i, gen_helper_vssrani_b_h) +TRANS(xvssrani_h_w, LASX, gen_xx_i, gen_helper_vssrani_h_w) +TRANS(xvssrani_w_d, LASX, gen_xx_i, gen_helper_vssrani_w_d) +TRANS(xvssrani_d_q, LASX, gen_xx_i, gen_helper_vssrani_d_q) +TRANS(xvssrlni_bu_h, LASX, gen_xx_i, gen_helper_vssrlni_bu_h) +TRANS(xvssrlni_hu_w, LASX, gen_xx_i, gen_helper_vssrlni_hu_w) +TRANS(xvssrlni_wu_d, LASX, gen_xx_i, gen_helper_vssrlni_wu_d) +TRANS(xvssrlni_du_q, LASX, gen_xx_i, gen_helper_vssrlni_du_q) +TRANS(xvssrani_bu_h, LASX, gen_xx_i, gen_helper_vssrani_bu_h) +TRANS(xvssrani_hu_w, LASX, gen_xx_i, gen_helper_vssrani_hu_w) +TRANS(xvssrani_wu_d, LASX, gen_xx_i, gen_helper_vssrani_wu_d) +TRANS(xvssrani_du_q, LASX, gen_xx_i, gen_helper_vssrani_du_q) =20 TRANS(vssrlrn_b_h, LSX, gen_vvv, gen_helper_vssrlrn_b_h) TRANS(vssrlrn_h_w, LSX, gen_vvv, gen_helper_vssrlrn_h_w) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658597695762.2875599173548; Wed, 13 Sep 2023 19:29:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc65-0005MM-PM; Wed, 13 Sep 2023 22:28:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4v-0003JU-Uy for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:40 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4r-000509-Q1 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id 
 _____8BxXeuJbwJlrvgmAA--.4563S3; Thu, 14 Sep 2023 10:27:21 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185])
 by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S43;
 Thu, 14 Sep 2023 10:27:19 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 41/57] target/loongarch: Implement xvssrlrn xvssrarn
Date: Thu, 14 Sep 2023 10:26:29 +0800
Message-Id: <20230914022645.1151356-42-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S43
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender)
 client-ip=209.51.188.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1694658598312100001
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSSRLRN.{B.H/H.W/W.D};
- XVSSRARN.{B.H/H.W/W.D};
- XVSSRLRN.{BU.H/HU.W/WU.D};
- XVSSRARN.{BU.H/HU.W/WU.D};
- XVSSRLRNI.{B.H/H.W/W.D/D.Q};
- XVSSRARNI.{B.H/H.W/W.D/D.Q};
- XVSSRLRNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRARNI.{BU.H/HU.W/WU.D/DU.Q}.

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/insns.decode               |  30 ++
 target/loongarch/disas.c                    |  30 ++
 target/loongarch/vec_helper.c               | 489 ++++++++++++--------
 target/loongarch/insn_trans/trans_vec.c.inc |  28 ++
 4 files changed, 378 insertions(+), 199 deletions(-)

diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 022dd9bfd1..dc74bae7a5 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1740,6 +1740,36 @@ xvssrani_hu_w 0111 01110110 01001 ..... ..... ...=
.. @vv_ui5
 xvssrani_wu_d 0111 01110110 0101 ...... ..... ..... @vv_ui6
 xvssrani_du_q 0111 01110110 011 ....... ..... ..... @vv_ui7
=20
+xvssrlrn_b_h 0111 01010000 00001 ..... ..... ..... @vvv
+xvssrlrn_h_w 0111 01010000 00010 ..... ..... ..... @vvv
+xvssrlrn_w_d 0111 01010000 00011 ..... ..... ..... @vvv
+xvssrarn_b_h 0111 01010000 00101 ..... ..... ..... @vvv
+xvssrarn_h_w 0111 01010000 00110 ..... ..... ..... @vvv
+xvssrarn_w_d 0111 01010000 00111 ..... ..... ..... @vvv
+xvssrlrn_bu_h 0111 01010000 10001 ..... ..... ..... @vvv
+xvssrlrn_hu_w 0111 01010000 10010 ..... ..... ..... @vvv
+xvssrlrn_wu_d 0111 01010000 10011 ..... ..... ..... @vvv
+xvssrarn_bu_h 0111 01010000 10101 ..... ..... ..... @vvv
+xvssrarn_hu_w 0111 01010000 10110 ..... ..... ..... @vvv
+xvssrarn_wu_d 0111 01010000 10111 ..... ..... ..... @vvv
+
+xvssrlrni_b_h 0111 01110101 00000 1 .... ..... ..... @vv_ui4
+xvssrlrni_h_w 0111 01110101 00001 ..... ..... ..... @vv_ui5
+xvssrlrni_w_d 0111 01110101 0001 ...... ..... ..... @vv_ui6
+xvssrlrni_d_q 0111 01110101 001 ....... ..... ..... @vv_ui7
+xvssrarni_b_h 0111 01110110 10000 1 .... ..... ..... @vv_ui4
+xvssrarni_h_w 0111 01110110 10001 ..... ..... ..... @vv_ui5
+xvssrarni_w_d 0111 01110110 1001 ...... ..... ..... @vv_ui6
+xvssrarni_d_q 0111 01110110 101 ....... ..... .....
@vv_ui7 +xvssrlrni_bu_h 0111 01110101 01000 1 .... ..... ..... @vv_ui4 +xvssrlrni_hu_w 0111 01110101 01001 ..... ..... ..... @vv_ui5 +xvssrlrni_wu_d 0111 01110101 0101 ...... ..... ..... @vv_ui6 +xvssrlrni_du_q 0111 01110101 011 ....... ..... ..... @vv_ui7 +xvssrarni_bu_h 0111 01110110 11000 1 .... ..... ..... @vv_ui4 +xvssrarni_hu_w 0111 01110110 11001 ..... ..... ..... @vv_ui5 +xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @vv_ui6 +xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @vv_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index c02f31019f..421eecbb71 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2166,6 +2166,36 @@ INSN_LASX(xvssrani_hu_w, vv_i) INSN_LASX(xvssrani_wu_d, vv_i) INSN_LASX(xvssrani_du_q, vv_i) =20 +INSN_LASX(xvssrlrn_b_h, vvv) +INSN_LASX(xvssrlrn_h_w, vvv) +INSN_LASX(xvssrlrn_w_d, vvv) +INSN_LASX(xvssrarn_b_h, vvv) +INSN_LASX(xvssrarn_h_w, vvv) +INSN_LASX(xvssrarn_w_d, vvv) +INSN_LASX(xvssrlrn_bu_h, vvv) +INSN_LASX(xvssrlrn_hu_w, vvv) +INSN_LASX(xvssrlrn_wu_d, vvv) +INSN_LASX(xvssrarn_bu_h, vvv) +INSN_LASX(xvssrarn_hu_w, vvv) +INSN_LASX(xvssrarn_wu_d, vvv) + +INSN_LASX(xvssrlrni_b_h, vv_i) +INSN_LASX(xvssrlrni_h_w, vv_i) +INSN_LASX(xvssrlrni_w_d, vv_i) +INSN_LASX(xvssrlrni_d_q, vv_i) +INSN_LASX(xvssrlrni_bu_h, vv_i) +INSN_LASX(xvssrlrni_hu_w, vv_i) +INSN_LASX(xvssrlrni_wu_d, vv_i) +INSN_LASX(xvssrlrni_du_q, vv_i) +INSN_LASX(xvssrarni_b_h, vv_i) +INSN_LASX(xvssrarni_h_w, vv_i) +INSN_LASX(xvssrarni_w_d, vv_i) +INSN_LASX(xvssrarni_d_q, vv_i) +INSN_LASX(xvssrarni_bu_h, vv_i) +INSN_LASX(xvssrarni_hu_w, vv_i) +INSN_LASX(xvssrarni_wu_d, vv_i) +INSN_LASX(xvssrarni_du_q, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c 
b/target/loongarch/vec_helper.c index e8dd95eaed..68f3667ef4 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -1827,7 +1827,7 @@ static T1 do_ssrlrns_ ## E1(T2 e2, int sa, int sh) \ \ shft_res =3D do_vsrlr_ ## E2(e2, sa); \ T1 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1839,23 +1839,29 @@ SSRLRNS(B, H, uint16_t, int16_t, uint8_t) SSRLRNS(H, W, uint32_t, int32_t, uint16_t) SSRLRNS(W, D, uint64_t, int64_t, uint32_t) =20 -#define VSSRLRN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLRN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlrns_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLRN(vssrlrn_b_h, 16, uint16_t, B, H) -VSSRLRN(vssrlrn_h_w, 32, uint32_t, H, W) -VSSRLRN(vssrlrn_w_d, 64, uint64_t, W, D) +VSSRLRN(vssrlrn_b_h, 16, B, H, UH) +VSSRLRN(vssrlrn_h_w, 32, H, W, UW) +VSSRLRN(vssrlrn_w_d, 64, W, D, UD) =20 #define SSRARNS(E1, E2, T1, T2) \ static T1 do_ssrarns_ ## E1(T1 e2, int sa, int sh) \ @@ -1864,7 +1870,7 @@ static T1 do_ssrarns_ ## E1(T1 e2, int sa, int sh) \ \ shft_res =3D do_vsrar_ ## E2(e2, sa); \ T2 mask; \ - mask =3D (1ll << sh) -1; \ + mask =3D (1ll << sh) - 1; \ if 
(shft_res > mask) { \ return mask; \ } else if (shft_res < -(mask +1)) { \ @@ -1878,23 +1884,29 @@ SSRARNS(B, H, int16_t, int8_t) SSRARNS(H, W, int32_t, int16_t) SSRARNS(W, D, int64_t, int32_t) =20 -#define VSSRARN(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= -1); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRARN(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrarns_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT/ 2 - 1); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRARN(vssrarn_b_h, 16, uint16_t, B, H) -VSSRARN(vssrarn_h_w, 32, uint32_t, H, W) -VSSRARN(vssrarn_w_d, 64, uint64_t, W, D) +VSSRARN(vssrarn_b_h, 16, B, H, UH) +VSSRARN(vssrarn_h_w, 32, H, W, UW) +VSSRARN(vssrarn_w_d, 64, W, D, UD) =20 #define SSRLRNU(E1, E2, T1, T2, T3) \ static T1 do_ssrlrnu_ ## E1(T3 e2, int sa, int sh) \ @@ -1904,7 +1916,7 @@ static T1 do_ssrlrnu_ ## E1(T3 e2, int sa, int sh) \ shft_res =3D do_vsrlr_ ## E2(e2, sa); \ \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1916,23 +1928,29 @@ SSRLRNU(B, H, uint16_t, uint8_t, int16_t) SSRLRNU(H, W, uint32_t, uint16_t, int32_t) SSRLRNU(W, D, uint64_t, uint32_t, int64_t) =20 -#define VSSRLRNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - 
VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRLRNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrlrnu_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % = BIT, \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRLRNU(vssrlrn_bu_h, 16, uint16_t, B, H) -VSSRLRNU(vssrlrn_hu_w, 32, uint32_t, H, W) -VSSRLRNU(vssrlrn_wu_d, 64, uint64_t, W, D) +VSSRLRNU(vssrlrn_bu_h, 16, B, H, UH) +VSSRLRNU(vssrlrn_hu_w, 32, H, W, UW) +VSSRLRNU(vssrlrn_wu_d, 64, W, D, UD) =20 #define SSRARNU(E1, E2, T1, T2, T3) \ static T1 do_ssrarnu_ ## E1(T3 e2, int sa, int sh) \ @@ -1945,7 +1963,7 @@ static T1 do_ssrarnu_ ## E1(T3 e2, int sa, int sh) \ shft_res =3D do_vsrar_ ## E2(e2, sa); \ } \ T2 mask; \ - mask =3D (1ull << sh) -1; \ + mask =3D (1ull << sh) - 1; \ if (shft_res > mask) { \ return mask; \ } else { \ @@ -1957,126 +1975,162 @@ SSRARNU(B, H, uint16_t, uint8_t, int16_t) SSRARNU(H, W, uint32_t, uint16_t, int32_t) SSRARNU(W, D, uint64_t, uint32_t, int64_t) =20 -#define VSSRARNU(NAME, BIT, T, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) = \ -{ = \ - int i; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - VReg *Vk =3D (VReg *)vk; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - Vd->E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2= ); \ - } = \ - Vd->D(1) =3D 0; = \ +#define VSSRARNU(NAME, BIT, E1, E2, E3) = \ +void HELPER(NAME)(void *vd, 
void *vj, void *vk, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + Vd->E1(j + ofs * 2 * i) =3D do_ssrarnu_ ## E1(Vj->E2(j + ofs *= i), \ + Vk->E3(j + ofs * i) % BIT,= \ + BIT / 2); = \ + } = \ + Vd->D(2 * i + 1) =3D 0; = \ + } = \ } =20 -VSSRARNU(vssrarn_bu_h, 16, uint16_t, B, H) -VSSRARNU(vssrarn_hu_w, 32, uint32_t, H, W) -VSSRARNU(vssrarn_wu_d, 64, uint64_t, W, D) +VSSRARNU(vssrarn_bu_h, 16, B, H, UH) +VSSRARNU(vssrarn_hu_w, 32, H, W, UW) +VSSRARNU(vssrarn_wu_d, 64, W, D, UD) + +#define VSSRLRNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlrns_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2 - 1)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlrns_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2 - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrlrni_q(VReg *Vd, VReg * Vj, + uint64_t imm, int idx, Int128 mask) +{ + Int128 shft_res1, shft_res2, r1, r2; + if (imm =3D=3D 0) { + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); + } else { + r1 =3D int128_and(int128_urshift(Vj->Q(idx), (imm - 1)), int128_on= e()); + r2 =3D int128_and(int128_urshift(Vd->Q(idx), (imm - 1)), int128_on= e()); + shft_res1 =3D (int128_add(int128_urshift(Vj->Q(idx), imm), r1)); + shft_res2 =3D (int128_add(int128_urshift(Vd->Q(idx), imm), r2)); + } =20 -#define VSSRLRNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ 
- VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlrns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlrns_ ## E1(Vd->E2(i), imm, BIT= /2 -1);\ - } = \ - *Vd =3D temp; = \ + if (int128_ult(mask, shft_res1)) { + Vd->D(idx * 2) =3D int128_getlo(mask); + } else { + Vd->D(idx * 2) =3D int128_getlo(shft_res1); + } + + if (int128_ult(mask, shft_res2)) { + Vd->D(idx * 2 + 1) =3D int128_getlo(mask); + } else { + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } } =20 -#define VSSRLRNI_Q(NAME, sh) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - Int128 shft_res1, shft_res2, mask, r1, r2; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - if (imm =3D=3D 0) { = \ - shft_res1 =3D Vj->Q(0); = \ - shft_res2 =3D Vd->Q(0); = \ - } else { = \ - r1 =3D int128_and(int128_urshift(Vj->Q(0), (imm -1)), int128_one()= ); \ - r2 =3D int128_and(int128_urshift(Vd->Q(0), (imm -1)), int128_one()= ); \ - = \ - shft_res1 =3D (int128_add(int128_urshift(Vj->Q(0), imm), r1)); = \ - shft_res2 =3D (int128_add(int128_urshift(Vd->Q(0), imm), r2)); = \ - } = \ - = \ - mask =3D int128_sub(int128_lshift(int128_one(), sh), int128_one()); = \ - = \ - if (int128_ult(mask, shft_res1)) { = \ - Vd->D(0) =3D int128_getlo(mask); = \ - }else { = \ - Vd->D(0) =3D int128_getlo(shft_res1); = \ - } = \ - = \ - if (int128_ult(mask, shft_res2)) { = \ - Vd->D(1) =3D int128_getlo(mask); = \ - }else { = \ - Vd->D(1) =3D int128_getlo(shft_res2); = \ - } = \ +void HELPER(vssrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrlrni_q(Vd, Vj, imm, i, mask); + } } =20 VSSRLRNI(vssrlrni_b_h, 16, B, H) 
VSSRLRNI(vssrlrni_h_w, 32, H, W) VSSRLRNI(vssrlrni_w_d, 64, W, D) -VSSRLRNI_Q(vssrlrni_d_q, 63) - -#define VSSRARNI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrarns_ ## E1(Vj->E2(i), imm, BIT/2 -1); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrarns_ ## E1(Vd->E2(i), imm, BIT= /2 -1); \ - } = \ - *Vd =3D temp; = \ -} =20 -void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) -{ - Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd =3D (VReg *)vd; - VReg *Vj =3D (VReg *)vj; +#define VSSRARNI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrarns_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2 - 1)= ; \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrarns_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2 - 1); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrarni_d_q(VReg *Vd, VReg *Vj, + uint64_t imm, int idx, Int128 mask1, Int128 mas= k2) +{ + Int128 shft_res1, shft_res2, r1, r2; =20 if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); - - shft_res1 =3D int128_add(int128_rshift(Vj->Q(0), imm), r1); - shft_res2 =3D int128_add(int128_rshift(Vd->Q(0), imm), r2); + r1 =3D int128_and(int128_rshift(Vj->Q(idx), (imm - 1)), int128_one= ()); + r2 =3D 
int128_and(int128_rshift(Vd->Q(idx), (imm - 1)), int128_one= ()); + shft_res1 =3D int128_add(int128_rshift(Vj->Q(idx), imm), r1); + shft_res2 =3D int128_add(int128_rshift(Vd->Q(idx), imm), r2); } - - mask1 =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); - mask2 =3D int128_lshift(int128_one(), 63); - - if (int128_gt(shft_res1, mask1)) { - Vd->D(0) =3D int128_getlo(mask1); + if (int128_gt(shft_res1, mask1)) { + Vd->D(idx * 2) =3D int128_getlo(mask1); } else if (int128_lt(shft_res1, int128_neg(mask2))) { - Vd->D(0) =3D int128_getlo(mask2); + Vd->D(idx * 2) =3D int128_getlo(mask2); } else { - Vd->D(0) =3D int128_getlo(shft_res1); + Vd->D(idx * 2) =3D int128_getlo(shft_res1); } =20 if (int128_gt(shft_res2, mask1)) { - Vd->D(1) =3D int128_getlo(mask1); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask1); } else if (int128_lt(shft_res2, int128_neg(mask2))) { - Vd->D(1) =3D int128_getlo(mask2); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask2); } else { - Vd->D(1) =3D int128_getlo(shft_res2); + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } +} + +void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask1, mask2; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask1 =3D int128_sub(int128_lshift(int128_one(), 63), int128_one()); + mask2 =3D int128_lshift(int128_one(), 63); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrarni_d_q(Vd, Vj, imm, i, mask1, mask2); } } =20 @@ -2084,82 +2138,119 @@ VSSRARNI(vssrarni_b_h, 16, B, H) VSSRARNI(vssrarni_h_w, 32, H, W) VSSRARNI(vssrarni_w_d, 64, W, D) =20 -#define VSSRLRNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrlrnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrlrnu_ ## E1(Vd->E2(i), imm, 
BIT= /2); \ - } = \ - *Vd =3D temp; = \ +#define VSSRLRNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrlrnu_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrlrnu_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +void HELPER(vssrlrni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrlrni_q(Vd, Vj, imm, i, mask); + } } =20 VSSRLRNUI(vssrlrni_bu_h, 16, B, H) VSSRLRNUI(vssrlrni_hu_w, 32, H, W) VSSRLRNUI(vssrlrni_wu_d, 64, W, D) -VSSRLRNI_Q(vssrlrni_du_q, 64) =20 -#define VSSRARNUI(NAME, BIT, E1, E2) = \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ -{ = \ - int i; = \ - VReg temp; = \ - VReg *Vd =3D (VReg *)vd; = \ - VReg *Vj =3D (VReg *)vj; = \ - = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ - temp.E1(i) =3D do_ssrarnu_ ## E1(Vj->E2(i), imm, BIT/2); = \ - temp.E1(i + LSX_LEN/BIT) =3D do_ssrarnu_ ## E1(Vd->E2(i), imm, BIT= /2); \ - } = \ - *Vd =3D temp; = \ -} - -void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) -{ - Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd =3D (VReg *)vd; - VReg *Vj =3D (VReg *)vj; +#define VSSRARNUI(NAME, BIT, E1, E2) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, ofs; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = 
\ + int oprsz =3D simd_oprsz(desc); = \ + = \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E1(j + ofs * 2 * i) =3D do_ssrarnu_ ## E1(Vj->E2(j + ofs = * i), \ + imm, BIT / 2); = \ + temp.E1(j + ofs * (2 * i + 1)) =3D do_ssrarnu_ ## E1(Vd->E2(j = + ofs * i), \ + imm, BIT / = 2); \ + } = \ + } = \ + *Vd =3D temp; = \ +} + +static void do_vssrarni_du_q(VReg *Vd, VReg *Vj, + uint64_t imm, int idx, Int128 mask1, Int128 m= ask2) +{ + Int128 shft_res1, shft_res2, r1, r2; =20 if (imm =3D=3D 0) { - shft_res1 =3D Vj->Q(0); - shft_res2 =3D Vd->Q(0); + shft_res1 =3D Vj->Q(idx); + shft_res2 =3D Vd->Q(idx); } else { - r1 =3D int128_and(int128_rshift(Vj->Q(0), (imm -1)), int128_one()); - r2 =3D int128_and(int128_rshift(Vd->Q(0), (imm -1)), int128_one()); - - shft_res1 =3D int128_add(int128_rshift(Vj->Q(0), imm), r1); - shft_res2 =3D int128_add(int128_rshift(Vd->Q(0), imm), r2); + r1 =3D int128_and(int128_rshift(Vj->Q(idx), (imm - 1)), int128_one= ()); + r2 =3D int128_and(int128_rshift(Vd->Q(idx), (imm - 1)), int128_one= ()); + shft_res1 =3D int128_add(int128_rshift(Vj->Q(idx), imm), r1); + shft_res2 =3D int128_add(int128_rshift(Vd->Q(idx), imm), r2); } =20 - if (int128_lt(Vj->Q(0), int128_zero())) { + if (int128_lt(Vj->Q(idx), int128_zero())) { shft_res1 =3D int128_zero(); } - if (int128_lt(Vd->Q(0), int128_zero())) { + if (int128_lt(Vd->Q(idx), int128_zero())) { shft_res2 =3D int128_zero(); } =20 - mask1 =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); - mask2 =3D int128_lshift(int128_one(), 64); - if (int128_gt(shft_res1, mask1)) { - Vd->D(0) =3D int128_getlo(mask1); + Vd->D(idx * 2) =3D int128_getlo(mask1); } else if (int128_lt(shft_res1, int128_neg(mask2))) { - Vd->D(0) =3D int128_getlo(mask2); + Vd->D(idx * 2) =3D int128_getlo(mask2); } else { - Vd->D(0) =3D int128_getlo(shft_res1); + Vd->D(idx * 2) =3D int128_getlo(shft_res1); } =20 if (int128_gt(shft_res2, mask1)) { - Vd->D(1) =3D 
int128_getlo(mask1); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask1); } else if (int128_lt(shft_res2, int128_neg(mask2))) { - Vd->D(1) =3D int128_getlo(mask2); + Vd->D(idx * 2 + 1) =3D int128_getlo(mask2); } else { - Vd->D(1) =3D int128_getlo(shft_res2); + Vd->D(idx * 2 + 1) =3D int128_getlo(shft_res2); + } +} + +void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; + Int128 mask1, mask2; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + mask1 =3D int128_sub(int128_lshift(int128_one(), 64), int128_one()); + mask2 =3D int128_lshift(int128_one(), 64); + + for (i =3D 0; i < oprsz / 16; i++) { + do_vssrarni_du_q(Vd, Vj, imm, i, mask1, mask2); } } =20 diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index e734593e4a..817d503490 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3895,6 +3895,18 @@ TRANS(vssrlrn_wu_d, LSX, gen_vvv, gen_helper_vssrlrn= _wu_d) TRANS(vssrarn_bu_h, LSX, gen_vvv, gen_helper_vssrarn_bu_h) TRANS(vssrarn_hu_w, LSX, gen_vvv, gen_helper_vssrarn_hu_w) TRANS(vssrarn_wu_d, LSX, gen_vvv, gen_helper_vssrarn_wu_d) +TRANS(xvssrlrn_b_h, LASX, gen_xxx, gen_helper_vssrlrn_b_h) +TRANS(xvssrlrn_h_w, LASX, gen_xxx, gen_helper_vssrlrn_h_w) +TRANS(xvssrlrn_w_d, LASX, gen_xxx, gen_helper_vssrlrn_w_d) +TRANS(xvssrarn_b_h, LASX, gen_xxx, gen_helper_vssrarn_b_h) +TRANS(xvssrarn_h_w, LASX, gen_xxx, gen_helper_vssrarn_h_w) +TRANS(xvssrarn_w_d, LASX, gen_xxx, gen_helper_vssrarn_w_d) +TRANS(xvssrlrn_bu_h, LASX, gen_xxx, gen_helper_vssrlrn_bu_h) +TRANS(xvssrlrn_hu_w, LASX, gen_xxx, gen_helper_vssrlrn_hu_w) +TRANS(xvssrlrn_wu_d, LASX, gen_xxx, gen_helper_vssrlrn_wu_d) +TRANS(xvssrarn_bu_h, LASX, gen_xxx, gen_helper_vssrarn_bu_h) +TRANS(xvssrarn_hu_w, LASX, gen_xxx, gen_helper_vssrarn_hu_w) +TRANS(xvssrarn_wu_d, LASX, gen_xxx, gen_helper_vssrarn_wu_d) =20 TRANS(vssrlrni_b_h, LSX, 
gen_vv_i, gen_helper_vssrlrni_b_h) TRANS(vssrlrni_h_w, LSX, gen_vv_i, gen_helper_vssrlrni_h_w) @@ -3912,6 +3924,22 @@ TRANS(vssrarni_bu_h, LSX, gen_vv_i, gen_helper_vssra= rni_bu_h) TRANS(vssrarni_hu_w, LSX, gen_vv_i, gen_helper_vssrarni_hu_w) TRANS(vssrarni_wu_d, LSX, gen_vv_i, gen_helper_vssrarni_wu_d) TRANS(vssrarni_du_q, LSX, gen_vv_i, gen_helper_vssrarni_du_q) +TRANS(xvssrlrni_b_h, LASX, gen_xx_i, gen_helper_vssrlrni_b_h) +TRANS(xvssrlrni_h_w, LASX, gen_xx_i, gen_helper_vssrlrni_h_w) +TRANS(xvssrlrni_w_d, LASX, gen_xx_i, gen_helper_vssrlrni_w_d) +TRANS(xvssrlrni_d_q, LASX, gen_xx_i, gen_helper_vssrlrni_d_q) +TRANS(xvssrarni_b_h, LASX, gen_xx_i, gen_helper_vssrarni_b_h) +TRANS(xvssrarni_h_w, LASX, gen_xx_i, gen_helper_vssrarni_h_w) +TRANS(xvssrarni_w_d, LASX, gen_xx_i, gen_helper_vssrarni_w_d) +TRANS(xvssrarni_d_q, LASX, gen_xx_i, gen_helper_vssrarni_d_q) +TRANS(xvssrlrni_bu_h, LASX, gen_xx_i, gen_helper_vssrlrni_bu_h) +TRANS(xvssrlrni_hu_w, LASX, gen_xx_i, gen_helper_vssrlrni_hu_w) +TRANS(xvssrlrni_wu_d, LASX, gen_xx_i, gen_helper_vssrlrni_wu_d) +TRANS(xvssrlrni_du_q, LASX, gen_xx_i, gen_helper_vssrlrni_du_q) +TRANS(xvssrarni_bu_h, LASX, gen_xx_i, gen_helper_vssrarni_bu_h) +TRANS(xvssrarni_hu_w, LASX, gen_xx_i, gen_helper_vssrarni_hu_w) +TRANS(xvssrarni_wu_d, LASX, gen_xx_i, gen_helper_vssrarni_wu_d) +TRANS(xvssrarni_du_q, LASX, gen_xx_i, gen_helper_vssrarni_du_q) =20 TRANS(vclo_b, LSX, gen_vv, gen_helper_vclo_b) TRANS(vclo_h, LSX, gen_vv, gen_helper_vclo_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658571760643.4180168407283; Wed, 13 Sep 2023 19:29:31 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) 
by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5v-0004pq-2y; Wed, 13 Sep 2023 22:28:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4t-0003Io-9A for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4r-0004zs-9N for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:34 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx77uJbwJlq_gmAA--.16313S3; Thu, 14 Sep 2023 10:27:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S44; Thu, 14 Sep 2023 10:27:20 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 42/57] target/loongarch: Implement xvclo xvclz Date: Thu, 14 Sep 2023 10:26:30 +0800 Message-Id: <20230914022645.1151356-43-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S44 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) 
BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658573713100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVCLO.{B/H/W/D}; - XVCLZ.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 9 +++++++++ target/loongarch/disas.c | 9 +++++++++ target/loongarch/vec_helper.c | 3 ++- target/loongarch/insn_trans/trans_vec.c.inc | 8 ++++++++ 4 files changed, 28 insertions(+), 1 deletion(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index dc74bae7a5..3175532045 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1770,6 +1770,15 @@ xvssrarni_hu_w 0111 01110110 11001 ..... ..... ...= .. @vv_ui5 xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @vv_ui6 xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @vv_ui7 =20 +xvclo_b 0111 01101001 11000 00000 ..... ..... @vv +xvclo_h 0111 01101001 11000 00001 ..... ..... @vv +xvclo_w 0111 01101001 11000 00010 ..... ..... @vv +xvclo_d 0111 01101001 11000 00011 ..... ..... @vv +xvclz_b 0111 01101001 11000 00100 ..... ..... @vv +xvclz_h 0111 01101001 11000 00101 ..... ..... @vv +xvclz_w 0111 01101001 11000 00110 ..... ..... @vv +xvclz_d 0111 01101001 11000 00111 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 421eecbb71..bbf530b349 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2196,6 +2196,15 @@ INSN_LASX(xvssrarni_hu_w, vv_i) INSN_LASX(xvssrarni_wu_d, vv_i) INSN_LASX(xvssrarni_du_q, vv_i) =20 +INSN_LASX(xvclo_b, vv) +INSN_LASX(xvclo_h, vv) +INSN_LASX(xvclo_w, vv) +INSN_LASX(xvclo_d, vv) +INSN_LASX(xvclz_b, vv) +INSN_LASX(xvclz_h, vv) +INSN_LASX(xvclz_w, vv) +INSN_LASX(xvclz_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 68f3667ef4..363309b6ea 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2264,8 +2264,9 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) \ { \ Vd->E(i) =3D DO_OP(Vj->E(i)); \ } \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 817d503490..85d0d10355 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3949,6 +3949,14 @@ TRANS(vclz_b, LSX, gen_vv, gen_helper_vclz_b) TRANS(vclz_h, LSX, gen_vv, gen_helper_vclz_h) TRANS(vclz_w, LSX, gen_vv, gen_helper_vclz_w) TRANS(vclz_d, LSX, gen_vv, gen_helper_vclz_d) +TRANS(xvclo_b, LASX, gen_xx, gen_helper_vclo_b) +TRANS(xvclo_h, LASX, gen_xx, gen_helper_vclo_h) +TRANS(xvclo_w, LASX, gen_xx, gen_helper_vclo_w) +TRANS(xvclo_d, LASX, gen_xx, gen_helper_vclo_d) +TRANS(xvclz_b, LASX, gen_xx, gen_helper_vclz_b) +TRANS(xvclz_h, LASX, gen_xx, gen_helper_vclz_h) +TRANS(xvclz_w, LASX, gen_xx, gen_helper_vclz_w) +TRANS(xvclz_d, LASX, gen_xx, gen_helper_vclz_d) =20 TRANS(vpcnt_b, LSX, gen_vv, gen_helper_vpcnt_b) TRANS(vpcnt_h, LSX, gen_vv, gen_helper_vpcnt_h) 
--=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658877400583.0312391415433; Wed, 13 Sep 2023 19:34:37 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5X-00046h-7T; Wed, 13 Sep 2023 22:28:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4t-0003Iw-HI for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4r-0004zu-CJ for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:35 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_+uJbwJlrPgmAA--.8905S3; Thu, 14 Sep 2023 10:27:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S45; Thu, 14 Sep 2023 10:27:20 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 43/57] target/loongarch: Implement xvpcnt Date: Thu, 14 Sep 2023 10:26:31 +0800 Message-Id: <20230914022645.1151356-44-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S45 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 
1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658878067100002 Content-Type: text/plain; charset="utf-8" This patch includes: - XVPCNT.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 +++++ target/loongarch/disas.c | 5 +++++ target/loongarch/vec_helper.c | 3 ++- target/loongarch/insn_trans/trans_vec.c.inc | 4 ++++ 4 files changed, 16 insertions(+), 1 deletion(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 3175532045..d683c6a6ab 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1779,6 +1779,11 @@ xvclz_h 0111 01101001 11000 00101 ..... ...= .. @vv xvclz_w 0111 01101001 11000 00110 ..... ..... @vv xvclz_d 0111 01101001 11000 00111 ..... ..... @vv =20 +xvpcnt_b 0111 01101001 11000 01000 ..... ..... @vv +xvpcnt_h 0111 01101001 11000 01001 ..... ..... @vv +xvpcnt_w 0111 01101001 11000 01010 ..... ..... @vv +xvpcnt_d 0111 01101001 11000 01011 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... 
@vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index bbf530b349..ff7f7a792a 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2205,6 +2205,11 @@ INSN_LASX(xvclz_h, vv) INSN_LASX(xvclz_w, vv) INSN_LASX(xvclz_d, vv) =20 +INSN_LASX(xvpcnt_b, vv) +INSN_LASX(xvpcnt_h, vv) +INSN_LASX(xvpcnt_w, vv) +INSN_LASX(xvpcnt_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 363309b6ea..e529b58419 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2296,8 +2296,9 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) \ { \ Vd->E(i) =3D FN(Vj->E(i)); \ } \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 85d0d10355..94afdf6d70 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -3962,6 +3962,10 @@ TRANS(vpcnt_b, LSX, gen_vv, gen_helper_vpcnt_b) TRANS(vpcnt_h, LSX, gen_vv, gen_helper_vpcnt_h) TRANS(vpcnt_w, LSX, gen_vv, gen_helper_vpcnt_w) TRANS(vpcnt_d, LSX, gen_vv, gen_helper_vpcnt_d) +TRANS(xvpcnt_b, LASX, gen_xx, gen_helper_vpcnt_b) +TRANS(xvpcnt_h, LASX, gen_xx, gen_helper_vpcnt_h) +TRANS(xvpcnt_w, LASX, gen_xx, gen_helper_vpcnt_w) +TRANS(xvpcnt_d, LASX, gen_xx, gen_helper_vpcnt_d) =20 static void do_vbit(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b, void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec)) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of 
gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658972314683.2200353746142; Wed, 13 Sep 2023 19:36:12 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6I-00061I-4s; Wed, 13 Sep 2023 22:29:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4u-0003JP-SF for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4s-00050c-G3 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:36 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEvCKbwJlr_gmAA--.9049S3; Thu, 14 Sep 2023 10:27:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S46; Thu, 14 Sep 2023 10:27:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 44/57] target/loongarch: Implement xvbitclr xvbitset xvbitrev Date: Thu, 14 Sep 2023 10:26:32 +0800 Message-Id: <20230914022645.1151356-45-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S46 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: 
domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658974359100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVBITCLR[I].{B/H/W/D}; - XVBITSET[I].{B/H/W/D}; - XVBITREV[I].{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 27 +++++++++++++ target/loongarch/disas.c | 25 ++++++++++++ target/loongarch/vec_helper.c | 44 +++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 24 +++++++++++ 4 files changed, 99 insertions(+), 21 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d683c6a6ab..cb6db8002a 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1784,6 +1784,33 @@ xvpcnt_h 0111 01101001 11000 01001 ..... ...= .. @vv xvpcnt_w 0111 01101001 11000 01010 ..... ..... @vv xvpcnt_d 0111 01101001 11000 01011 ..... ..... @vv =20 +xvbitclr_b 0111 01010000 11000 ..... ..... ..... @vvv +xvbitclr_h 0111 01010000 11001 ..... ..... ..... @vvv +xvbitclr_w 0111 01010000 11010 ..... ..... ..... @vvv +xvbitclr_d 0111 01010000 11011 ..... ..... ..... @vvv +xvbitclri_b 0111 01110001 00000 01 ... ..... ..... @vv_ui3 +xvbitclri_h 0111 01110001 00000 1 .... ..... ..... @vv_ui4 +xvbitclri_w 0111 01110001 00001 ..... 
..... ..... @vv_ui5 +xvbitclri_d 0111 01110001 0001 ...... ..... ..... @vv_ui6 + +xvbitset_b 0111 01010000 11100 ..... ..... ..... @vvv +xvbitset_h 0111 01010000 11101 ..... ..... ..... @vvv +xvbitset_w 0111 01010000 11110 ..... ..... ..... @vvv +xvbitset_d 0111 01010000 11111 ..... ..... ..... @vvv +xvbitseti_b 0111 01110001 01000 01 ... ..... ..... @vv_ui3 +xvbitseti_h 0111 01110001 01000 1 .... ..... ..... @vv_ui4 +xvbitseti_w 0111 01110001 01001 ..... ..... ..... @vv_ui5 +xvbitseti_d 0111 01110001 0101 ...... ..... ..... @vv_ui6 + +xvbitrev_b 0111 01010001 00000 ..... ..... ..... @vvv +xvbitrev_h 0111 01010001 00001 ..... ..... ..... @vvv +xvbitrev_w 0111 01010001 00010 ..... ..... ..... @vvv +xvbitrev_d 0111 01010001 00011 ..... ..... ..... @vvv +xvbitrevi_b 0111 01110001 10000 01 ... ..... ..... @vv_ui3 +xvbitrevi_h 0111 01110001 10000 1 .... ..... ..... @vv_ui4 +xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @vv_ui5 +xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @vv_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index ff7f7a792a..7f04c912aa 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2210,6 +2210,31 @@ INSN_LASX(xvpcnt_h, vv) INSN_LASX(xvpcnt_w, vv) INSN_LASX(xvpcnt_d, vv) =20 +INSN_LASX(xvbitclr_b, vvv) +INSN_LASX(xvbitclr_h, vvv) +INSN_LASX(xvbitclr_w, vvv) +INSN_LASX(xvbitclr_d, vvv) +INSN_LASX(xvbitclri_b, vv_i) +INSN_LASX(xvbitclri_h, vv_i) +INSN_LASX(xvbitclri_w, vv_i) +INSN_LASX(xvbitclri_d, vv_i) +INSN_LASX(xvbitset_b, vvv) +INSN_LASX(xvbitset_h, vvv) +INSN_LASX(xvbitset_w, vvv) +INSN_LASX(xvbitset_d, vvv) +INSN_LASX(xvbitseti_b, vv_i) +INSN_LASX(xvbitseti_h, vv_i) +INSN_LASX(xvbitseti_w, vv_i) +INSN_LASX(xvbitseti_d, vv_i) +INSN_LASX(xvbitrev_b, vvv) +INSN_LASX(xvbitrev_h, vvv) +INSN_LASX(xvbitrev_w, vvv) +INSN_LASX(xvbitrev_d, vvv) +INSN_LASX(xvbitrevi_b, vv_i) +INSN_LASX(xvbitrevi_h, vv_i) +INSN_LASX(xvbitrevi_w, vv_i) +INSN_LASX(xvbitrevi_d, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index e529b58419..ec63efb428 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2313,17 +2313,18 @@ VPCNT(vpcnt_d, 64, UD, ctpop64) #define DO_BITSET(a, bit) (a | 1ull << bit) #define DO_BITREV(a, bit) (a ^ (1ull << bit)) =20 -#define DO_BIT(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), Vk->E(i)%BIT); \ - } \ +#define DO_BIT(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D 
DO_OP(Vj->E(i), Vk->E(i)%BIT); \ + } \ } =20 DO_BIT(vbitclr_b, 8, UB, DO_BITCLR) @@ -2339,16 +2340,17 @@ DO_BIT(vbitrev_h, 16, UH, DO_BITREV) DO_BIT(vbitrev_w, 32, UW, DO_BITREV) DO_BIT(vbitrev_d, 64, UD, DO_BITREV) =20 -#define DO_BITI(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), imm); \ - } \ +#define DO_BITI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), imm); \ + } \ } =20 DO_BITI(vbitclri_b, 8, UB, DO_BITCLR) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 94afdf6d70..9e7eb30356 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4037,6 +4037,10 @@ TRANS(vbitclr_b, LSX, gvec_vvv, MO_8, do_vbitclr) TRANS(vbitclr_h, LSX, gvec_vvv, MO_16, do_vbitclr) TRANS(vbitclr_w, LSX, gvec_vvv, MO_32, do_vbitclr) TRANS(vbitclr_d, LSX, gvec_vvv, MO_64, do_vbitclr) +TRANS(xvbitclr_b, LASX, gvec_xxx, MO_8, do_vbitclr) +TRANS(xvbitclr_h, LASX, gvec_xxx, MO_16, do_vbitclr) +TRANS(xvbitclr_w, LASX, gvec_xxx, MO_32, do_vbitclr) +TRANS(xvbitclr_d, LASX, gvec_xxx, MO_64, do_vbitclr) =20 static void do_vbiti(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm, void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec)) @@ -4107,6 +4111,10 @@ TRANS(vbitclri_b, LSX, gvec_vv_i, MO_8, do_vbitclri) TRANS(vbitclri_h, LSX, gvec_vv_i, MO_16, do_vbitclri) TRANS(vbitclri_w, LSX, gvec_vv_i, MO_32, do_vbitclri) TRANS(vbitclri_d, LSX, gvec_vv_i, MO_64, do_vbitclri) +TRANS(xvbitclri_b, LASX, gvec_xx_i, MO_8, do_vbitclri) +TRANS(xvbitclri_h, LASX, gvec_xx_i, 
MO_16, do_vbitclri) +TRANS(xvbitclri_w, LASX, gvec_xx_i, MO_32, do_vbitclri) +TRANS(xvbitclri_d, LASX, gvec_xx_i, MO_64, do_vbitclri) =20 static void do_vbitset(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz) @@ -4148,6 +4156,10 @@ TRANS(vbitset_b, LSX, gvec_vvv, MO_8, do_vbitset) TRANS(vbitset_h, LSX, gvec_vvv, MO_16, do_vbitset) TRANS(vbitset_w, LSX, gvec_vvv, MO_32, do_vbitset) TRANS(vbitset_d, LSX, gvec_vvv, MO_64, do_vbitset) +TRANS(xvbitset_b, LASX, gvec_xxx, MO_8, do_vbitset) +TRANS(xvbitset_h, LASX, gvec_xxx, MO_16, do_vbitset) +TRANS(xvbitset_w, LASX, gvec_xxx, MO_32, do_vbitset) +TRANS(xvbitset_d, LASX, gvec_xxx, MO_64, do_vbitset) =20 static void do_vbitseti(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, int64_t imm, uint32_t oprsz, uint32_t maxsz) @@ -4189,6 +4201,10 @@ TRANS(vbitseti_b, LSX, gvec_vv_i, MO_8, do_vbitseti) TRANS(vbitseti_h, LSX, gvec_vv_i, MO_16, do_vbitseti) TRANS(vbitseti_w, LSX, gvec_vv_i, MO_32, do_vbitseti) TRANS(vbitseti_d, LSX, gvec_vv_i, MO_64, do_vbitseti) +TRANS(xvbitseti_b, LASX, gvec_xx_i, MO_8, do_vbitseti) +TRANS(xvbitseti_h, LASX, gvec_xx_i, MO_16, do_vbitseti) +TRANS(xvbitseti_w, LASX, gvec_xx_i, MO_32, do_vbitseti) +TRANS(xvbitseti_d, LASX, gvec_xx_i, MO_64, do_vbitseti) =20 static void do_vbitrev(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz) @@ -4230,6 +4246,10 @@ TRANS(vbitrev_b, LSX, gvec_vvv, MO_8, do_vbitrev) TRANS(vbitrev_h, LSX, gvec_vvv, MO_16, do_vbitrev) TRANS(vbitrev_w, LSX, gvec_vvv, MO_32, do_vbitrev) TRANS(vbitrev_d, LSX, gvec_vvv, MO_64, do_vbitrev) +TRANS(xvbitrev_b, LASX, gvec_xxx, MO_8, do_vbitrev) +TRANS(xvbitrev_h, LASX, gvec_xxx, MO_16, do_vbitrev) +TRANS(xvbitrev_w, LASX, gvec_xxx, MO_32, do_vbitrev) +TRANS(xvbitrev_d, LASX, gvec_xxx, MO_64, do_vbitrev) =20 static void do_vbitrevi(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs, int64_t imm, uint32_t oprsz, uint32_t maxsz) @@ -4271,6 +4291,10 @@ 
TRANS(vbitrevi_b, LSX, gvec_vv_i, MO_8, do_vbitrevi) TRANS(vbitrevi_h, LSX, gvec_vv_i, MO_16, do_vbitrevi) TRANS(vbitrevi_w, LSX, gvec_vv_i, MO_32, do_vbitrevi) TRANS(vbitrevi_d, LSX, gvec_vv_i, MO_64, do_vbitrevi) +TRANS(xvbitrevi_b, LASX, gvec_xx_i, MO_8, do_vbitrevi) +TRANS(xvbitrevi_h, LASX, gvec_xx_i, MO_16, do_vbitrevi) +TRANS(xvbitrevi_w, LASX, gvec_xx_i, MO_32, do_vbitrevi) +TRANS(xvbitrevi_d, LASX, gvec_xx_i, MO_64, do_vbitrevi) =20 TRANS(vfrstp_b, LSX, gen_vvv, gen_helper_vfrstp_b) TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465871641787.38016833419135; Wed, 13 Sep 2023 19:31:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6J-0006IX-9S; Wed, 13 Sep 2023 22:29:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4v-0003JQ-8V for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4s-00050n-V4 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxuOiKbwJltPgmAA--.39692S3; Thu, 14 Sep 2023 10:27:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S47; Thu, 14 Sep 2023 10:27:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: 
richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 45/57] target/loongarch: Implement xvfrstp Date: Thu, 14 Sep 2023 10:26:33 +0800 Message-Id: <20230914022645.1151356-46-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch includes: - XVFRSTP[I].{B/H}.
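The per-lane behaviour that the reworked VFRSTP helper in this patch implements can be sketched in plain C. This is a simplified stand-alone model, not QEMU code: the name `frstp_b_model`, the flat `int8_t` arrays, and the explicit `oprsz` argument are assumptions made for illustration, and only the byte variant is shown.

```c
#include <assert.h>
#include <stdint.h>

#define LANE_BYTES 16 /* one 128-bit lane holds 16 byte elements */

/*
 * Hypothetical model of the byte variant, for illustration only.
 * oprsz is the operand size in bytes: 16 for a 128-bit vector,
 * 32 for a 256-bit vector. For each 128-bit lane:
 *   - j is the index of the first negative byte of vj in that lane,
 *     or 16 if the lane has no negative byte;
 *   - m is the low 4 bits of the first byte of vk in that lane;
 *   - j is stored into byte m of the same lane of vd.
 * All other bytes of vd are left unchanged.
 */
static void frstp_b_model(int8_t *vd, const int8_t *vj, const int8_t *vk,
                          int oprsz)
{
    for (int lane = 0; lane < oprsz / LANE_BYTES; lane++) {
        int base = lane * LANE_BYTES;
        int m = vk[base] & 0xf;
        int j;

        for (j = 0; j < LANE_BYTES; j++) {
            if (vj[base + j] < 0) {
                break;
            }
        }
        vd[base + m] = (int8_t)j;
    }
}
```

Calling the model with `oprsz` set to 16 touches only the first lane, which mirrors how the same helper can serve both the 128-bit and 256-bit instructions once the loop bound depends on the operand size rather than on a fixed vector length.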
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 5 ++++ target/loongarch/disas.c | 5 ++++ target/loongarch/vec_helper.c | 32 +++++++++++++-------- target/loongarch/insn_trans/trans_vec.c.inc | 4 +++ 4 files changed, 34 insertions(+), 12 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index cb6db8002a..6035fe139c 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1811,6 +1811,11 @@ xvbitrevi_h 0111 01110001 10000 1 .... ..... ..= ... @vv_ui4 xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @vv_ui5 xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @vv_ui6 =20 +xvfrstp_b 0111 01010010 10110 ..... ..... ..... @vvv +xvfrstp_h 0111 01010010 10111 ..... ..... ..... @vvv +xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @vv_ui5 +xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr
diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 7f04c912aa..1c4aecaa93 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2235,6 +2235,11 @@ INSN_LASX(xvbitrevi_h, vv_i)
 INSN_LASX(xvbitrevi_w, vv_i)
 INSN_LASX(xvbitrevi_d, vv_i)
 
+INSN_LASX(xvfrstp_b, vvv)
+INSN_LASX(xvfrstp_h, vvv)
+INSN_LASX(xvfrstpi_b, vv_i)
+INSN_LASX(xvfrstpi_h, vv_i)
+
 INSN_LASX(xvreplgr2vr_b, vr)
 INSN_LASX(xvreplgr2vr_h, vr)
 INSN_LASX(xvreplgr2vr_w, vr)
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index ec63efb428..9ddbbc665a 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -2369,18 +2369,22 @@ DO_BITI(vbitrevi_d, 64, UD, DO_BITREV)
 #define VFRSTP(NAME, BIT, MASK, E) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
-    int i, m; \
+    int i, j, m, ofs; \
     VReg *Vd = (VReg *)vd; \
     VReg *Vj = (VReg *)vj; \
     VReg *Vk = (VReg *)vk; \
+    int oprsz = simd_oprsz(desc); \
     \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        if (Vj->E(i) < 0) { \
-            break; \
+    ofs = LSX_LEN / BIT; \
+    for (i = 0; i < oprsz / 16; i++) { \
+        m = Vk->E(i * ofs) & MASK; \
+        for (j = 0; j < ofs; j++) { \
+            if (Vj->E(j + ofs * i) < 0) { \
+                break; \
+            } \
         } \
+        Vd->E(m + i * ofs) = j; \
     } \
-    m = Vk->E(0) & MASK; \
-    Vd->E(m) = i; \
 }
 
 VFRSTP(vfrstp_b, 8, 0xf, B)
@@ -2389,17 +2393,21 @@ VFRSTP(vfrstp_h, 16, 0x7, H)
 #define VFRSTPI(NAME, BIT, E) \
 void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
-    int i, m; \
+    int i, j, m, ofs; \
     VReg *Vd = (VReg *)vd; \
     VReg *Vj = (VReg *)vj; \
+    int oprsz = simd_oprsz(desc); \
     \
-    for (i = 0; i < LSX_LEN/BIT; i++) { \
-        if (Vj->E(i) < 0) { \
-            break; \
+    ofs = LSX_LEN / BIT; \
+    m = imm % ofs; \
+    for (i = 0; i < oprsz / 16; i++) { \
+        for (j = 0; j < ofs; j++) { \
+            if (Vj->E(j + ofs * i) < 0) { \
+                break; \
+            } \
         } \
+        Vd->E(m + i * ofs) = j; \
     } \
-    m = imm % (LSX_LEN/BIT); \
-    Vd->E(m) = i; \
 }
=20 VFRSTPI(vfrstpi_b, 8, B) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 9e7eb30356..c1e7130a73 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4300,6 +4300,10 @@ TRANS(vfrstp_b, LSX, gen_vvv, gen_helper_vfrstp_b) TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h) TRANS(vfrstpi_b, LSX, gen_vv_i, gen_helper_vfrstpi_b) TRANS(vfrstpi_h, LSX, gen_vv_i, gen_helper_vfrstpi_h) +TRANS(xvfrstp_b, LASX, gen_xxx, gen_helper_vfrstp_b) +TRANS(xvfrstp_h, LASX, gen_xxx, gen_helper_vfrstp_h) +TRANS(xvfrstpi_b, LASX, gen_xx_i, gen_helper_vfrstpi_b) +TRANS(xvfrstpi_h, LASX, gen_xx_i, gen_helper_vfrstpi_h) =20 TRANS(vfadd_s, LSX, gen_vvv_ptr, gen_helper_vfadd_s) TRANS(vfadd_d, LSX, gen_vvv_ptr, gen_helper_vfadd_d) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658897206595.112700795371; Wed, 13 Sep 2023 19:34:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6l-0002aH-6p; Wed, 13 Sep 2023 22:29:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc6i-0002J1-O9 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:28 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6g-0005Nm-2u for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_+uLbwJltvgmAA--.8907S3; Thu, 
14 Sep 2023 10:27:23 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S48; Thu, 14 Sep 2023 10:27:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 46/57] target/loongarch: Implement LASX fpu arith instructions Date: Thu, 14 Sep 2023 10:26:34 +0800 Message-Id: <20230914022645.1151356-47-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch includes: - XVF{ADD/SUB/MUL/DIV}.{S/D}; - XVF{MADD/MSUB/NMADD/NMSUB}.{S/D}; - XVF{MAX/MIN}.{S/D}; - XVF{MAXA/MINA}.{S/D}; - XVFLOGB.{S/D}; - XVFCLASS.{S/D}; -
XVF{SQRT/RECIP/RSQRT}.{S/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 41 +++++++++++ target/loongarch/disas.c | 46 +++++++++++++ target/loongarch/vec_helper.c | 12 ++-- target/loongarch/insn_trans/trans_vec.c.inc | 75 +++++++++++++++++---- 4 files changed, 158 insertions(+), 16 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 6035fe139c..4224b0a4b1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1816,6 +1816,47 @@ xvfrstp_h 0111 01010010 10111 ..... ..... ...= .. @vvv xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @vv_ui5 xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @vv_ui5 =20 +xvfadd_s 0111 01010011 00001 ..... ..... ..... @vvv +xvfadd_d 0111 01010011 00010 ..... ..... ..... @vvv +xvfsub_s 0111 01010011 00101 ..... ..... ..... @vvv +xvfsub_d 0111 01010011 00110 ..... ..... ..... @vvv +xvfmul_s 0111 01010011 10001 ..... ..... ..... @vvv +xvfmul_d 0111 01010011 10010 ..... ..... ..... @vvv +xvfdiv_s 0111 01010011 10101 ..... ..... ..... @vvv +xvfdiv_d 0111 01010011 10110 ..... ..... ..... @vvv + +xvfmadd_s 0000 10100001 ..... ..... ..... ..... @vvvv +xvfmadd_d 0000 10100010 ..... ..... ..... ..... @vvvv +xvfmsub_s 0000 10100101 ..... ..... ..... ..... @vvvv +xvfmsub_d 0000 10100110 ..... ..... ..... ..... @vvvv +xvfnmadd_s 0000 10101001 ..... ..... ..... ..... @vvvv +xvfnmadd_d 0000 10101010 ..... ..... ..... ..... @vvvv +xvfnmsub_s 0000 10101101 ..... ..... ..... ..... @vvvv +xvfnmsub_d 0000 10101110 ..... ..... ..... ..... @vvvv + +xvfmax_s 0111 01010011 11001 ..... ..... ..... @vvv +xvfmax_d 0111 01010011 11010 ..... ..... ..... @vvv +xvfmin_s 0111 01010011 11101 ..... ..... ..... @vvv +xvfmin_d 0111 01010011 11110 ..... ..... ..... @vvv + +xvfmaxa_s 0111 01010100 00001 ..... ..... ..... @vvv +xvfmaxa_d 0111 01010100 00010 ..... ..... ..... @vvv +xvfmina_s 0111 01010100 00101 ..... ..... ..... 
@vvv +xvfmina_d 0111 01010100 00110 ..... ..... ..... @vvv + +xvflogb_s 0111 01101001 11001 10001 ..... ..... @vv +xvflogb_d 0111 01101001 11001 10010 ..... ..... @vv + +xvfclass_s 0111 01101001 11001 10101 ..... ..... @vv +xvfclass_d 0111 01101001 11001 10110 ..... ..... @vv + +xvfsqrt_s 0111 01101001 11001 11001 ..... ..... @vv +xvfsqrt_d 0111 01101001 11001 11010 ..... ..... @vv +xvfrecip_s 0111 01101001 11001 11101 ..... ..... @vv +xvfrecip_d 0111 01101001 11001 11110 ..... ..... @vv +xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @vv +xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @vv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1c4aecaa93..1fb9d7eac1 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1708,6 +1708,11 @@ static void output_v_i_x(DisasContext *ctx, arg_v_i = *a, const char *mnemonic) output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); } =20 +static void output_vvvv_x(DisasContext *ctx, arg_vvvv *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "x%d, x%d, x%d, x%d", a->vd, a->vj, a->vk, a->va= ); +} + static void output_vvv_x(DisasContext *ctx, arg_vvv * a, const char *mnemo= nic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->vd, a->vj, a->vk); @@ -2240,6 +2245,47 @@ INSN_LASX(xvfrstp_h, vvv) INSN_LASX(xvfrstpi_b, vv_i) INSN_LASX(xvfrstpi_h, vv_i) =20 +INSN_LASX(xvfadd_s, vvv) +INSN_LASX(xvfadd_d, vvv) +INSN_LASX(xvfsub_s, vvv) +INSN_LASX(xvfsub_d, vvv) +INSN_LASX(xvfmul_s, vvv) +INSN_LASX(xvfmul_d, vvv) +INSN_LASX(xvfdiv_s, vvv) +INSN_LASX(xvfdiv_d, vvv) + +INSN_LASX(xvfmadd_s, vvvv) +INSN_LASX(xvfmadd_d, vvvv) +INSN_LASX(xvfmsub_s, vvvv) +INSN_LASX(xvfmsub_d, vvvv) +INSN_LASX(xvfnmadd_s, vvvv) +INSN_LASX(xvfnmadd_d, vvvv) +INSN_LASX(xvfnmsub_s, vvvv) +INSN_LASX(xvfnmsub_d, vvvv) + +INSN_LASX(xvfmax_s, vvv) 
+INSN_LASX(xvfmax_d, vvv) +INSN_LASX(xvfmin_s, vvv) +INSN_LASX(xvfmin_d, vvv) + +INSN_LASX(xvfmaxa_s, vvv) +INSN_LASX(xvfmaxa_d, vvv) +INSN_LASX(xvfmina_s, vvv) +INSN_LASX(xvfmina_d, vvv) + +INSN_LASX(xvflogb_s, vv) +INSN_LASX(xvflogb_d, vv) + +INSN_LASX(xvfclass_s, vv) +INSN_LASX(xvfclass_d, vv) + +INSN_LASX(xvfsqrt_s, vv) +INSN_LASX(xvfsqrt_d, vv) +INSN_LASX(xvfrecip_s, vv) +INSN_LASX(xvfrecip_d, vv) +INSN_LASX(xvfrsqrt_s, vv) +INSN_LASX(xvfrsqrt_d, vv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 9ddbbc665a..56d13331f6 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2452,9 +2452,10 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, = \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D FN(Vj->E(i), Vk->E(i), &env->fp_status); \ vec_update_fcsr0(env, GETPC()); \ } \ @@ -2486,9 +2487,10 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, void= *va, \ VReg *Vj =3D (VReg *)vj; = \ VReg *Vk =3D (VReg *)vk; = \ VReg *Va =3D (VReg *)va; = \ + int oprsz =3D simd_oprsz(desc); = \ = \ vec_clear_cause(env); = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ Vd->E(i) =3D FN(Vj->E(i), Vk->E(i), Va->E(i), flags, &env->fp_stat= us); \ vec_update_fcsr0(env, GETPC()); = \ } = \ @@ -2512,9 +2514,10 @@ void HELPER(NAME)(void *vd, void *vj, = \ int i; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D FN(env, Vj->E(i)); \ } \ } @@ -2544,8 +2547,9 @@ void HELPER(NAME)(void *vd, void *vj, = \ int i; \ VReg *Vd =3D (VReg 
*)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ Vd->E(i) =3D FN(env, Vj->E(i)); \ } \ } diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index c1e7130a73..a7f6929db3 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -33,6 +33,10 @@ static bool check_vec(DisasContext *ctx, uint32_t oprsz) static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, gen_helper_gvec_4_ptr *fn) { + if (!check_vec(ctx, oprsz)) { + return true; + } + tcg_gen_gvec_4_ptr(vec_full_offset(a->vd), vec_full_offset(a->vj), vec_full_offset(a->vk), @@ -45,13 +49,15 @@ static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv= *a, uint32_t oprsz, static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4_ptr *fn) { - if (!check_vec(ctx, 16)) { - return true; - } - return gen_vvvv_ptr_vl(ctx, a, 16, fn); } =20 +static bool gen_xxxx_ptr(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4_ptr *fn) +{ + return gen_vvvv_ptr_vl(ctx, a, 32, fn); +} + static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, gen_helper_gvec_4 *fn) { @@ -76,6 +82,9 @@ static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, gen_helper_gvec_3_ptr *fn) { + if (!check_vec(ctx, oprsz)) { + return true; + } tcg_gen_gvec_3_ptr(vec_full_offset(a->vd), vec_full_offset(a->vj), vec_full_offset(a->vk), @@ -87,13 +96,15 @@ static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *= a, uint32_t oprsz, static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3_ptr *fn) { - if (!check_vec(ctx, 16)) { - return true; - } - return gen_vvv_ptr_vl(ctx, a, 16, fn); } =20 +static bool gen_xxx_ptr(DisasContext *ctx, arg_vvv *a, + gen_helper_gvec_3_ptr *fn) +{ + return gen_vvv_ptr_vl(ctx, a, 32, 
fn); +} + static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, gen_helper_gvec_3 *fn) { @@ -121,6 +132,10 @@ static bool gen_xxx(DisasContext *ctx, arg_vvv *a, gen= _helper_gvec_3 *fn) static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, gen_helper_gvec_2_ptr *fn) { + if (!check_vec(ctx, oprsz)) { + return true; + } + tcg_gen_gvec_2_ptr(vec_full_offset(a->vd), vec_full_offset(a->vj), cpu_env, @@ -131,13 +146,15 @@ static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *= a, uint32_t oprsz, static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2_ptr *fn) { - if (!check_vec(ctx, 16)) { - return true; - } - return gen_vv_ptr_vl(ctx, a, 16, fn); } =20 +static bool gen_xx_ptr(DisasContext *ctx, arg_vv *a, + gen_helper_gvec_2_ptr *fn) +{ + return gen_vv_ptr_vl(ctx, a, 32, fn); +} + static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, gen_helper_gvec_2 *fn) { @@ -4313,6 +4330,14 @@ TRANS(vfmul_s, LSX, gen_vvv_ptr, gen_helper_vfmul_s) TRANS(vfmul_d, LSX, gen_vvv_ptr, gen_helper_vfmul_d) TRANS(vfdiv_s, LSX, gen_vvv_ptr, gen_helper_vfdiv_s) TRANS(vfdiv_d, LSX, gen_vvv_ptr, gen_helper_vfdiv_d) +TRANS(xvfadd_s, LASX, gen_xxx_ptr, gen_helper_vfadd_s) +TRANS(xvfadd_d, LASX, gen_xxx_ptr, gen_helper_vfadd_d) +TRANS(xvfsub_s, LASX, gen_xxx_ptr, gen_helper_vfsub_s) +TRANS(xvfsub_d, LASX, gen_xxx_ptr, gen_helper_vfsub_d) +TRANS(xvfmul_s, LASX, gen_xxx_ptr, gen_helper_vfmul_s) +TRANS(xvfmul_d, LASX, gen_xxx_ptr, gen_helper_vfmul_d) +TRANS(xvfdiv_s, LASX, gen_xxx_ptr, gen_helper_vfdiv_s) +TRANS(xvfdiv_d, LASX, gen_xxx_ptr, gen_helper_vfdiv_d) =20 TRANS(vfmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfmadd_s) TRANS(vfmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfmadd_d) @@ -4322,22 +4347,42 @@ TRANS(vfnmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfnm= add_s) TRANS(vfnmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_d) TRANS(vfnmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_s) TRANS(vfnmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_d) 
+TRANS(xvfmadd_s, LASX, gen_xxxx_ptr, gen_helper_vfmadd_s) +TRANS(xvfmadd_d, LASX, gen_xxxx_ptr, gen_helper_vfmadd_d) +TRANS(xvfmsub_s, LASX, gen_xxxx_ptr, gen_helper_vfmsub_s) +TRANS(xvfmsub_d, LASX, gen_xxxx_ptr, gen_helper_vfmsub_d) +TRANS(xvfnmadd_s, LASX, gen_xxxx_ptr, gen_helper_vfnmadd_s) +TRANS(xvfnmadd_d, LASX, gen_xxxx_ptr, gen_helper_vfnmadd_d) +TRANS(xvfnmsub_s, LASX, gen_xxxx_ptr, gen_helper_vfnmsub_s) +TRANS(xvfnmsub_d, LASX, gen_xxxx_ptr, gen_helper_vfnmsub_d) =20 TRANS(vfmax_s, LSX, gen_vvv_ptr, gen_helper_vfmax_s) TRANS(vfmax_d, LSX, gen_vvv_ptr, gen_helper_vfmax_d) TRANS(vfmin_s, LSX, gen_vvv_ptr, gen_helper_vfmin_s) TRANS(vfmin_d, LSX, gen_vvv_ptr, gen_helper_vfmin_d) +TRANS(xvfmax_s, LASX, gen_xxx_ptr, gen_helper_vfmax_s) +TRANS(xvfmax_d, LASX, gen_xxx_ptr, gen_helper_vfmax_d) +TRANS(xvfmin_s, LASX, gen_xxx_ptr, gen_helper_vfmin_s) +TRANS(xvfmin_d, LASX, gen_xxx_ptr, gen_helper_vfmin_d) =20 TRANS(vfmaxa_s, LSX, gen_vvv_ptr, gen_helper_vfmaxa_s) TRANS(vfmaxa_d, LSX, gen_vvv_ptr, gen_helper_vfmaxa_d) TRANS(vfmina_s, LSX, gen_vvv_ptr, gen_helper_vfmina_s) TRANS(vfmina_d, LSX, gen_vvv_ptr, gen_helper_vfmina_d) +TRANS(xvfmaxa_s, LASX, gen_xxx_ptr, gen_helper_vfmaxa_s) +TRANS(xvfmaxa_d, LASX, gen_xxx_ptr, gen_helper_vfmaxa_d) +TRANS(xvfmina_s, LASX, gen_xxx_ptr, gen_helper_vfmina_s) +TRANS(xvfmina_d, LASX, gen_xxx_ptr, gen_helper_vfmina_d) =20 TRANS(vflogb_s, LSX, gen_vv_ptr, gen_helper_vflogb_s) TRANS(vflogb_d, LSX, gen_vv_ptr, gen_helper_vflogb_d) +TRANS(xvflogb_s, LASX, gen_xx_ptr, gen_helper_vflogb_s) +TRANS(xvflogb_d, LASX, gen_xx_ptr, gen_helper_vflogb_d) =20 TRANS(vfclass_s, LSX, gen_vv_ptr, gen_helper_vfclass_s) TRANS(vfclass_d, LSX, gen_vv_ptr, gen_helper_vfclass_d) +TRANS(xvfclass_s, LASX, gen_xx_ptr, gen_helper_vfclass_s) +TRANS(xvfclass_d, LASX, gen_xx_ptr, gen_helper_vfclass_d) =20 TRANS(vfsqrt_s, LSX, gen_vv_ptr, gen_helper_vfsqrt_s) TRANS(vfsqrt_d, LSX, gen_vv_ptr, gen_helper_vfsqrt_d) @@ -4345,6 +4390,12 @@ TRANS(vfrecip_s, LSX, 
gen_vv_ptr, gen_helper_vfrecip= _s) TRANS(vfrecip_d, LSX, gen_vv_ptr, gen_helper_vfrecip_d) TRANS(vfrsqrt_s, LSX, gen_vv_ptr, gen_helper_vfrsqrt_s) TRANS(vfrsqrt_d, LSX, gen_vv_ptr, gen_helper_vfrsqrt_d) +TRANS(xvfsqrt_s, LASX, gen_xx_ptr, gen_helper_vfsqrt_s) +TRANS(xvfsqrt_d, LASX, gen_xx_ptr, gen_helper_vfsqrt_d) +TRANS(xvfrecip_s, LASX, gen_xx_ptr, gen_helper_vfrecip_s) +TRANS(xvfrecip_d, LASX, gen_xx_ptr, gen_helper_vfrecip_d) +TRANS(xvfrsqrt_s, LASX, gen_xx_ptr, gen_helper_vfrsqrt_s) +TRANS(xvfrsqrt_d, LASX, gen_xx_ptr, gen_helper_vfrsqrt_d) =20 TRANS(vfcvtl_s_h, LSX, gen_vv_ptr, gen_helper_vfcvtl_s_h) TRANS(vfcvth_s_h, LSX, gen_vv_ptr, gen_helper_vfcvth_s_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465873782528.2594310175632; Wed, 13 Sep 2023 19:32:17 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6K-0006Uj-CB; Wed, 13 Sep 2023 22:29:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc4z-0003Jm-7O for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:42 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4v-00052t-Fc for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:40 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEvCMbwJlufgmAA--.9052S3; Thu, 14 Sep 2023 10:27:24 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id 
AQAAf8CxvdxmbwJlJ+UDAA--.7298S49; Thu, 14 Sep 2023 10:27:23 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 47/57] target/loongarch: Implement LASX fpu fcvt instructions Date: Thu, 14 Sep 2023 10:26:35 +0800 Message-Id: <20230914022645.1151356-48-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch includes: - XVFCVT{L/H}.{S.H/D.S}; - XVFCVT.{H.S/S.D}; - XVFRINT[{RNE/RZ/RP/RM}].{S/D}; - XVFTINT[{RNE/RZ/RP/RM}].{W.S/L.D}; - XVFTINT[RZ].{WU.S/LU.D}; - XVFTINT[{RNE/RZ/RP/RM}].W.D; - XVFTINT[{RNE/RZ/RP/RM}]{L/H}.L.S; - XVFFINT.{S.W/D.L}[U]; - XVFFINT.S.L, XVFFINT{L/H}.D.W.
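As a rough illustration of one of the simpler conversions listed above, the sketch below models XVFFINT.S.W in plain C, assuming (as the mnemonic suggests) that it converts each signed 32-bit element to single precision. It is not QEMU code: the name `ffint_s_w_model`, the flat arrays, and the explicit `oprsz` argument are assumptions for illustration, and the guest rounding mode and floating-point exception flags that a real helper would have to honour are ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical element-wise model of a signed 32-bit integer to
 * single-precision conversion. oprsz is the operand size in bytes
 * (16 for a 128-bit vector, 32 for a 256-bit vector), so the number
 * of elements processed is oprsz / 4. Elements of vd beyond that
 * count are left unchanged.
 */
static void ffint_s_w_model(float *vd, const int32_t *vj, int oprsz)
{
    int n = oprsz / (int)sizeof(int32_t);

    for (int i = 0; i < n; i++) {
        /* Host conversion; exact for the small values used in tests. */
        vd[i] = (float)vj[i];
    }
}
```

Deriving the element count from the operand size is what allows one routine to cover both vector widths, in the same spirit as the `oprsz / (BIT / 8)` loop bounds used by the helpers in this series.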
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 58 +++++ target/loongarch/disas.c | 56 +++++ target/loongarch/vec_helper.c | 235 +++++++++++++------- target/loongarch/insn_trans/trans_vec.c.inc | 52 +++++ 4 files changed, 315 insertions(+), 86 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 4224b0a4b1..ed4f82e7fe 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1857,6 +1857,64 @@ xvfrecip_d 0111 01101001 11001 11110 ..... ...= .. @vv xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @vv xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @vv =20 +xvfcvtl_s_h 0111 01101001 11011 11010 ..... ..... @vv +xvfcvth_s_h 0111 01101001 11011 11011 ..... ..... @vv +xvfcvtl_d_s 0111 01101001 11011 11100 ..... ..... @vv +xvfcvth_d_s 0111 01101001 11011 11101 ..... ..... @vv +xvfcvt_h_s 0111 01010100 01100 ..... ..... ..... @vvv +xvfcvt_s_d 0111 01010100 01101 ..... ..... ..... @vvv + +xvfrintrne_s 0111 01101001 11010 11101 ..... ..... @vv +xvfrintrne_d 0111 01101001 11010 11110 ..... ..... @vv +xvfrintrz_s 0111 01101001 11010 11001 ..... ..... @vv +xvfrintrz_d 0111 01101001 11010 11010 ..... ..... @vv +xvfrintrp_s 0111 01101001 11010 10101 ..... ..... @vv +xvfrintrp_d 0111 01101001 11010 10110 ..... ..... @vv +xvfrintrm_s 0111 01101001 11010 10001 ..... ..... @vv +xvfrintrm_d 0111 01101001 11010 10010 ..... ..... @vv +xvfrint_s 0111 01101001 11010 01101 ..... ..... @vv +xvfrint_d 0111 01101001 11010 01110 ..... ..... @vv + +xvftintrne_w_s 0111 01101001 11100 10100 ..... ..... @vv +xvftintrne_l_d 0111 01101001 11100 10101 ..... ..... @vv +xvftintrz_w_s 0111 01101001 11100 10010 ..... ..... @vv +xvftintrz_l_d 0111 01101001 11100 10011 ..... ..... @vv +xvftintrp_w_s 0111 01101001 11100 10000 ..... ..... @vv +xvftintrp_l_d 0111 01101001 11100 10001 ..... ..... @vv +xvftintrm_w_s 0111 01101001 11100 01110 ..... ..... 
@vv +xvftintrm_l_d 0111 01101001 11100 01111 ..... ..... @vv +xvftint_w_s 0111 01101001 11100 01100 ..... ..... @vv +xvftint_l_d 0111 01101001 11100 01101 ..... ..... @vv +xvftintrz_wu_s 0111 01101001 11100 11100 ..... ..... @vv +xvftintrz_lu_d 0111 01101001 11100 11101 ..... ..... @vv +xvftint_wu_s 0111 01101001 11100 10110 ..... ..... @vv +xvftint_lu_d 0111 01101001 11100 10111 ..... ..... @vv + +xvftintrne_w_d 0111 01010100 10111 ..... ..... ..... @vvv +xvftintrz_w_d 0111 01010100 10110 ..... ..... ..... @vvv +xvftintrp_w_d 0111 01010100 10101 ..... ..... ..... @vvv +xvftintrm_w_d 0111 01010100 10100 ..... ..... ..... @vvv +xvftint_w_d 0111 01010100 10011 ..... ..... ..... @vvv + +xvftintrnel_l_s 0111 01101001 11101 01000 ..... ..... @vv +xvftintrneh_l_s 0111 01101001 11101 01001 ..... ..... @vv +xvftintrzl_l_s 0111 01101001 11101 00110 ..... ..... @vv +xvftintrzh_l_s 0111 01101001 11101 00111 ..... ..... @vv +xvftintrpl_l_s 0111 01101001 11101 00100 ..... ..... @vv +xvftintrph_l_s 0111 01101001 11101 00101 ..... ..... @vv +xvftintrml_l_s 0111 01101001 11101 00010 ..... ..... @vv +xvftintrmh_l_s 0111 01101001 11101 00011 ..... ..... @vv +xvftintl_l_s 0111 01101001 11101 00000 ..... ..... @vv +xvftinth_l_s 0111 01101001 11101 00001 ..... ..... @vv + +xvffint_s_w 0111 01101001 11100 00000 ..... ..... @vv +xvffint_d_l 0111 01101001 11100 00010 ..... ..... @vv +xvffint_s_wu 0111 01101001 11100 00001 ..... ..... @vv +xvffint_d_lu 0111 01101001 11100 00011 ..... ..... @vv +xvffintl_d_w 0111 01101001 11100 00100 ..... ..... @vv +xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @vv +xvffint_s_l 0111 01010100 10000 ..... ..... ..... @vvv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1fb9d7eac1..f1a1321d0d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2286,6 +2286,62 @@ INSN_LASX(xvfrecip_d, vv) INSN_LASX(xvfrsqrt_s, vv) INSN_LASX(xvfrsqrt_d, vv) =20 +INSN_LASX(xvfcvtl_s_h, vv) +INSN_LASX(xvfcvth_s_h, vv) +INSN_LASX(xvfcvtl_d_s, vv) +INSN_LASX(xvfcvth_d_s, vv) +INSN_LASX(xvfcvt_h_s, vvv) +INSN_LASX(xvfcvt_s_d, vvv) + +INSN_LASX(xvfrint_s, vv) +INSN_LASX(xvfrint_d, vv) +INSN_LASX(xvfrintrm_s, vv) +INSN_LASX(xvfrintrm_d, vv) +INSN_LASX(xvfrintrp_s, vv) +INSN_LASX(xvfrintrp_d, vv) +INSN_LASX(xvfrintrz_s, vv) +INSN_LASX(xvfrintrz_d, vv) +INSN_LASX(xvfrintrne_s, vv) +INSN_LASX(xvfrintrne_d, vv) + +INSN_LASX(xvftint_w_s, vv) +INSN_LASX(xvftint_l_d, vv) +INSN_LASX(xvftintrm_w_s, vv) +INSN_LASX(xvftintrm_l_d, vv) +INSN_LASX(xvftintrp_w_s, vv) +INSN_LASX(xvftintrp_l_d, vv) +INSN_LASX(xvftintrz_w_s, vv) +INSN_LASX(xvftintrz_l_d, vv) +INSN_LASX(xvftintrne_w_s, vv) +INSN_LASX(xvftintrne_l_d, vv) +INSN_LASX(xvftint_wu_s, vv) +INSN_LASX(xvftint_lu_d, vv) +INSN_LASX(xvftintrz_wu_s, vv) +INSN_LASX(xvftintrz_lu_d, vv) +INSN_LASX(xvftint_w_d, vvv) +INSN_LASX(xvftintrm_w_d, vvv) +INSN_LASX(xvftintrp_w_d, vvv) +INSN_LASX(xvftintrz_w_d, vvv) +INSN_LASX(xvftintrne_w_d, vvv) +INSN_LASX(xvftintl_l_s, vv) +INSN_LASX(xvftinth_l_s, vv) +INSN_LASX(xvftintrml_l_s, vv) +INSN_LASX(xvftintrmh_l_s, vv) +INSN_LASX(xvftintrpl_l_s, vv) +INSN_LASX(xvftintrph_l_s, vv) +INSN_LASX(xvftintrzl_l_s, vv) +INSN_LASX(xvftintrzh_l_s, vv) +INSN_LASX(xvftintrnel_l_s, vv) +INSN_LASX(xvftintrneh_l_s, vv) + +INSN_LASX(xvffint_s_w, vv) +INSN_LASX(xvffint_s_wu, vv) +INSN_LASX(xvffint_d_l, vv) +INSN_LASX(xvffint_d_lu, vv) +INSN_LASX(xvffintl_d_w, vv) +INSN_LASX(xvffinth_d_w, vv) +INSN_LASX(xvffint_s_l, vvv) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 56d13331f6..20d7675256 
100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2624,14 +2624,19 @@ static uint32_t float64_cvt_float32(uint64_t d, flo= at_status *status) void HELPER(vfcvtl_s_h)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/32; i++) { - temp.UW(i) =3D float16_cvt_float32(Vj->UH(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * i) =3Dfloat16_cvt_float32(Vj->UH(j + ofs * 2= * i), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2640,14 +2645,19 @@ void HELPER(vfcvtl_s_h)(void *vd, void *vj, void HELPER(vfcvtl_d_s)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/64; i++) { - temp.UD(i) =3D float32_cvt_float64(Vj->UW(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UD(j + ofs * i) =3D float32_cvt_float64(Vj->UW(j + ofs * = 2 * i), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2656,14 +2666,19 @@ void HELPER(vfcvtl_d_s)(void *vd, void *vj, void HELPER(vfcvth_s_h)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/32; i++) { - temp.UW(i) =3D float16_cvt_float32(Vj->UH(i + 4), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * i) =3D 
float16_cvt_float32(Vj->UH(j + ofs * = (2 * i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2672,14 +2687,19 @@ void HELPER(vfcvth_s_h)(void *vd, void *vj, void HELPER(vfcvth_d_s)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < LSX_LEN/64; i++) { - temp.UD(i) =3D float32_cvt_float64(Vj->UW(i + 2), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UD(j + ofs * i) =3D float32_cvt_float64(Vj->UW(j + ofs * = (2 * i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2688,16 +2708,22 @@ void HELPER(vfcvth_d_s)(void *vd, void *vj, void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 32; vec_clear_cause(env); - for(i =3D 0; i < LSX_LEN/32; i++) { - temp.UH(i + 4) =3D float32_cvt_float16(Vj->UW(i), &env->fp_status); - temp.UH(i) =3D float32_cvt_float16(Vk->UW(i), &env->fp_status); + for(i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UH(j + ofs * (2 * i + 1)) =3D float32_cvt_float16(Vj->UW(= j + ofs * i), + &env->fp_= status); + temp.UH(j + ofs * 2 * i) =3D float32_cvt_float16(Vk->UW(j + of= s * i), + &env->fp_status= ); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2706,16 +2732,22 @@ void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D 
simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for(i =3D 0; i < LSX_LEN/64; i++) { - temp.UW(i + 2) =3D float64_cvt_float32(Vj->UD(i), &env->fp_status); - temp.UW(i) =3D float64_cvt_float32(Vk->UD(i), &env->fp_status); + for(i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.UW(j + ofs * (2 * i + 1)) =3D float64_cvt_float32(Vj->UD(= j + ofs * i), + &env->fp_= status); + temp.UW(j + ofs * 2 * i) =3D float64_cvt_float32(Vk->UD(j + of= s * i), + &env->fp_status= ); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2727,9 +2759,10 @@ void HELPER(vfrint_s)(void *vd, void *vj, int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 vec_clear_cause(env); - for (i =3D 0; i < 4; i++) { + for (i =3D 0; i < oprsz / 4; i++) { Vd->W(i) =3D float32_round_to_int(Vj->UW(i), &env->fp_status); vec_update_fcsr0(env, GETPC()); } @@ -2741,9 +2774,10 @@ void HELPER(vfrint_d)(void *vd, void *vj, int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { + for (i =3D 0; i < oprsz / 8; i++) { Vd->D(i) =3D float64_round_to_int(Vj->UD(i), &env->fp_status); vec_update_fcsr0(env, GETPC()); } @@ -2756,9 +2790,10 @@ void HELPER(NAME)(void *vd, void *vj, = \ int i; = \ VReg *Vd =3D (VReg *)vd; = \ VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ = \ vec_clear_cause(env); = \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ FloatRoundMode old_mode =3D get_float_rounding_mode(&env->fp_statu= s); \ set_float_rounding_mode(MODE, &env->fp_status); = \ Vd->E(i) =3D float## BIT ## _round_to_int(Vj->E(i), &env->fp_statu= s); \ @@ -2843,22 +2878,26 @@ FTINT(rp_w_d, float64, int32, uint64_t, uint32_t, f= loat_round_up) FTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero) FTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_eve= n) =20 
-#define FTINT_W_D(NAME, FN) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, \ - CPULoongArchState *env, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.W(i + 2) =3D FN(env, Vj->UD(i)); \ - temp.W(i) =3D FN(env, Vk->UD(i)); \ - } \ - *Vd =3D temp; \ +#define FTINT_W_D(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / 64; = \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.W(j + ofs * (2 * i + 1)) =3D FN(env, Vj->UD(j + ofs * i))= ; \ + temp.W(j + ofs * 2 * i) =3D FN(env, Vk->UD(j + ofs * i)); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 FTINT_W_D(vftint_w_d, do_float64_to_int32) @@ -2876,20 +2915,24 @@ FTINT(rph_l_s, float32, int64, uint32_t, uint64_t, = float_round_up) FTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) FTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_ev= en) =20 -#define FTINTL_L_S(NAME, FN) \ -void HELPER(NAME)(void *vd, void *vj, \ - CPULoongArchState *env, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.D(i) =3D FN(env, Vj->UW(i)); \ - } \ - *Vd =3D temp; \ +#define FTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / 64; \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j 
=3D 0; j < ofs; j++) { \ + temp.D(j + ofs * i) =3D FN(env, Vj->UW(j + ofs * 2 * i)); \ + } \ + } \ + *Vd =3D temp; \ } =20 FTINTL_L_S(vftintl_l_s, do_float32_to_int64) @@ -2898,20 +2941,24 @@ FTINTL_L_S(vftintrpl_l_s, do_ftintrpl_l_s) FTINTL_L_S(vftintrzl_l_s, do_ftintrzl_l_s) FTINTL_L_S(vftintrnel_l_s, do_ftintrnel_l_s) =20 -#define FTINTH_L_S(NAME, FN) \ -void HELPER(NAME)(void *vd, void *vj, \ - CPULoongArchState *env, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - vec_clear_cause(env); \ - for (i =3D 0; i < 2; i++) { \ - temp.D(i) =3D FN(env, Vj->UW(i + 2)); \ - } \ - *Vd =3D temp; \ +#define FTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / 64; = \ + vec_clear_cause(env); \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.D(j + ofs * i) =3D FN(env, Vj->UW(j + ofs * (2 * i + 1)))= ; \ + } \ + } \ + *Vd =3D temp; = \ } =20 FTINTH_L_S(vftinth_l_s, do_float32_to_int64) @@ -2943,14 +2990,19 @@ DO_2OP_F(vffint_d_lu, 64, UD, do_ffint_d_lu) void HELPER(vffintl_d_w)(void *vd, void *vj, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.D(i) =3D int32_to_float64(Vj->W(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.D(j + ofs * i) =3D int32_to_float64(Vj->W(j + ofs * 2 * i= ), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2959,14 +3011,19 @@ void HELPER(vffintl_d_w)(void *vd, void *vj, void HELPER(vffinth_d_w)(void *vd, void *vj, CPULoongArchState 
*env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.D(i) =3D int32_to_float64(Vj->W(i + 2), &env->fp_status); + for (i =3D 0; i < oprsz /16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.D(j + ofs * i) =3D int32_to_float64(Vj->W(j + ofs * (2 * = i + 1)), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; @@ -2975,16 +3032,22 @@ void HELPER(vffinth_d_w)(void *vd, void *vj, void HELPER(vffint_s_l)(void *vd, void *vj, void *vk, CPULoongArchState *env, uint32_t desc) { - int i; - VReg temp; + int i, j, ofs; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; + int oprsz =3D simd_oprsz(desc); =20 + ofs =3D LSX_LEN / 64; vec_clear_cause(env); - for (i =3D 0; i < 2; i++) { - temp.W(i + 2) =3D int64_to_float32(Vj->D(i), &env->fp_status); - temp.W(i) =3D int64_to_float32(Vk->D(i), &env->fp_status); + for (i =3D 0; i < oprsz / 16; i++) { + for (j =3D 0; j < ofs; j++) { + temp.W(j + ofs * (2 * i + 1)) =3D int64_to_float32(Vj->D(j + o= fs * i), + &env->fp_stat= us); + temp.W(j + ofs * 2 * i) =3D int64_to_float32(Vk->D(j + ofs * i= ), + &env->fp_status); + } vec_update_fcsr0(env, GETPC()); } *Vd =3D temp; diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index a7f6929db3..9ee3e613fc 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4403,6 +4403,12 @@ TRANS(vfcvtl_d_s, LSX, gen_vv_ptr, gen_helper_vfcvtl= _d_s) TRANS(vfcvth_d_s, LSX, gen_vv_ptr, gen_helper_vfcvth_d_s) TRANS(vfcvt_h_s, LSX, gen_vvv_ptr, gen_helper_vfcvt_h_s) TRANS(vfcvt_s_d, LSX, gen_vvv_ptr, gen_helper_vfcvt_s_d) +TRANS(xvfcvtl_s_h, LASX, gen_xx_ptr, gen_helper_vfcvtl_s_h) +TRANS(xvfcvth_s_h, LASX, gen_xx_ptr, gen_helper_vfcvth_s_h) 
+TRANS(xvfcvtl_d_s, LASX, gen_xx_ptr, gen_helper_vfcvtl_d_s)
+TRANS(xvfcvth_d_s, LASX, gen_xx_ptr, gen_helper_vfcvth_d_s)
+TRANS(xvfcvt_h_s, LASX, gen_xxx_ptr, gen_helper_vfcvt_h_s)
+TRANS(xvfcvt_s_d, LASX, gen_xxx_ptr, gen_helper_vfcvt_s_d)
 
 TRANS(vfrintrne_s, LSX, gen_vv_ptr, gen_helper_vfrintrne_s)
 TRANS(vfrintrne_d, LSX, gen_vv_ptr, gen_helper_vfrintrne_d)
@@ -4414,6 +4420,16 @@ TRANS(vfrintrm_s, LSX, gen_vv_ptr, gen_helper_vfrintrm_s)
 TRANS(vfrintrm_d, LSX, gen_vv_ptr, gen_helper_vfrintrm_d)
 TRANS(vfrint_s, LSX, gen_vv_ptr, gen_helper_vfrint_s)
 TRANS(vfrint_d, LSX, gen_vv_ptr, gen_helper_vfrint_d)
+TRANS(xvfrintrne_s, LASX, gen_xx_ptr, gen_helper_vfrintrne_s)
+TRANS(xvfrintrne_d, LASX, gen_xx_ptr, gen_helper_vfrintrne_d)
+TRANS(xvfrintrz_s, LASX, gen_xx_ptr, gen_helper_vfrintrz_s)
+TRANS(xvfrintrz_d, LASX, gen_xx_ptr, gen_helper_vfrintrz_d)
+TRANS(xvfrintrp_s, LASX, gen_xx_ptr, gen_helper_vfrintrp_s)
+TRANS(xvfrintrp_d, LASX, gen_xx_ptr, gen_helper_vfrintrp_d)
+TRANS(xvfrintrm_s, LASX, gen_xx_ptr, gen_helper_vfrintrm_s)
+TRANS(xvfrintrm_d, LASX, gen_xx_ptr, gen_helper_vfrintrm_d)
+TRANS(xvfrint_s, LASX, gen_xx_ptr, gen_helper_vfrint_s)
+TRANS(xvfrint_d, LASX, gen_xx_ptr, gen_helper_vfrint_d)
 
 TRANS(vftintrne_w_s, LSX, gen_vv_ptr, gen_helper_vftintrne_w_s)
 TRANS(vftintrne_l_d, LSX, gen_vv_ptr, gen_helper_vftintrne_l_d)
@@ -4444,6 +4460,35 @@ TRANS(vftintrml_l_s, LSX, gen_vv_ptr, gen_helper_vftintrml_l_s)
 TRANS(vftintrmh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrmh_l_s)
 TRANS(vftintl_l_s, LSX, gen_vv_ptr, gen_helper_vftintl_l_s)
 TRANS(vftinth_l_s, LSX, gen_vv_ptr, gen_helper_vftinth_l_s)
+TRANS(xvftintrne_w_s, LASX, gen_xx_ptr, gen_helper_vftintrne_w_s)
+TRANS(xvftintrne_l_d, LASX, gen_xx_ptr, gen_helper_vftintrne_l_d)
+TRANS(xvftintrz_w_s, LASX, gen_xx_ptr, gen_helper_vftintrz_w_s)
+TRANS(xvftintrz_l_d, LASX, gen_xx_ptr, gen_helper_vftintrz_l_d)
+TRANS(xvftintrp_w_s, LASX, gen_xx_ptr, gen_helper_vftintrp_w_s)
+TRANS(xvftintrp_l_d, LASX, gen_xx_ptr, gen_helper_vftintrp_l_d)
+TRANS(xvftintrm_w_s, LASX, gen_xx_ptr, gen_helper_vftintrm_w_s)
+TRANS(xvftintrm_l_d, LASX, gen_xx_ptr, gen_helper_vftintrm_l_d)
+TRANS(xvftint_w_s, LASX, gen_xx_ptr, gen_helper_vftint_w_s)
+TRANS(xvftint_l_d, LASX, gen_xx_ptr, gen_helper_vftint_l_d)
+TRANS(xvftintrz_wu_s, LASX, gen_xx_ptr, gen_helper_vftintrz_wu_s)
+TRANS(xvftintrz_lu_d, LASX, gen_xx_ptr, gen_helper_vftintrz_lu_d)
+TRANS(xvftint_wu_s, LASX, gen_xx_ptr, gen_helper_vftint_wu_s)
+TRANS(xvftint_lu_d, LASX, gen_xx_ptr, gen_helper_vftint_lu_d)
+TRANS(xvftintrne_w_d, LASX, gen_xxx_ptr, gen_helper_vftintrne_w_d)
+TRANS(xvftintrz_w_d, LASX, gen_xxx_ptr, gen_helper_vftintrz_w_d)
+TRANS(xvftintrp_w_d, LASX, gen_xxx_ptr, gen_helper_vftintrp_w_d)
+TRANS(xvftintrm_w_d, LASX, gen_xxx_ptr, gen_helper_vftintrm_w_d)
+TRANS(xvftint_w_d, LASX, gen_xxx_ptr, gen_helper_vftint_w_d)
+TRANS(xvftintrnel_l_s, LASX, gen_xx_ptr, gen_helper_vftintrnel_l_s)
+TRANS(xvftintrneh_l_s, LASX, gen_xx_ptr, gen_helper_vftintrneh_l_s)
+TRANS(xvftintrzl_l_s, LASX, gen_xx_ptr, gen_helper_vftintrzl_l_s)
+TRANS(xvftintrzh_l_s, LASX, gen_xx_ptr, gen_helper_vftintrzh_l_s)
+TRANS(xvftintrpl_l_s, LASX, gen_xx_ptr, gen_helper_vftintrpl_l_s)
+TRANS(xvftintrph_l_s, LASX, gen_xx_ptr, gen_helper_vftintrph_l_s)
+TRANS(xvftintrml_l_s, LASX, gen_xx_ptr, gen_helper_vftintrml_l_s)
+TRANS(xvftintrmh_l_s, LASX, gen_xx_ptr, gen_helper_vftintrmh_l_s)
+TRANS(xvftintl_l_s, LASX, gen_xx_ptr, gen_helper_vftintl_l_s)
+TRANS(xvftinth_l_s, LASX, gen_xx_ptr, gen_helper_vftinth_l_s)
 
 TRANS(vffint_s_w, LSX, gen_vv_ptr, gen_helper_vffint_s_w)
 TRANS(vffint_d_l, LSX, gen_vv_ptr, gen_helper_vffint_d_l)
@@ -4452,6 +4497,13 @@ TRANS(vffint_d_lu, LSX, gen_vv_ptr, gen_helper_vffint_d_lu)
 TRANS(vffintl_d_w, LSX, gen_vv_ptr, gen_helper_vffintl_d_w)
 TRANS(vffinth_d_w, LSX, gen_vv_ptr, gen_helper_vffinth_d_w)
 TRANS(vffint_s_l, LSX, gen_vvv_ptr, gen_helper_vffint_s_l)
+TRANS(xvffint_s_w, LASX, gen_xx_ptr, gen_helper_vffint_s_w)
+TRANS(xvffint_d_l, LASX, gen_xx_ptr, gen_helper_vffint_d_l)
+TRANS(xvffint_s_wu, LASX, gen_xx_ptr, gen_helper_vffint_s_wu)
+TRANS(xvffint_d_lu, LASX, gen_xx_ptr, gen_helper_vffint_d_lu)
+TRANS(xvffintl_d_w, LASX, gen_xx_ptr, gen_helper_vffintl_d_w)
+TRANS(xvffinth_d_w, LASX, gen_xx_ptr, gen_helper_vffinth_d_w)
+TRANS(xvffint_s_l, LASX, gen_xxx_ptr, gen_helper_vffint_s_l)
 
 static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond)
 {
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658848609504.1464438972962; Wed, 13 Sep 2023 19:34:08 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5c-0004MO-9I; Wed, 13 Sep 2023 22:28:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc51-0003KX-HV for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:46 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4x-000551-40 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:42 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxueqObwJlu_gmAA--.65456S3; Thu, 14 Sep 2023 10:27:26 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S50; Thu, 14 Sep 2023 10:27:24 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 48/57] target/loongarch: Implement xvseq xvsle xvslt Date: Thu, 14
Sep 2023 10:26:36 +0800 Message-Id: <20230914022645.1151356-49-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S50 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658849321100004 Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVSEQ[I].{B/H/W/D};
- XVSLE[I].{B/H/W/D}[U];
- XVSLT[I].{B/H/W/D}[U].
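The immediate-compare helpers touched by this patch write an all-ones element where the predicate holds and zero elsewhere, looping over oprsz / (BIT / 8) elements. A minimal standalone sketch of that behaviour, assuming byte elements only: cmpi_sle_b and cmpi_slt_bu are hypothetical stand-ins for what the VCMPI macro expands to, and the macros below add parentheses that the versions in the patch do not have.

```c
#include <assert.h>
#include <stdint.h>

#define BIT 8 /* element width in bits for this sketch */

/* Same predicates as in the patch, with defensive parentheses added. */
#define VSLE(a, b) ((a) <= (b) ? -1 : 0)
#define VSLT(a, b) ((a) < (b) ? -1 : 0)

/* Signed "set if less than or equal to immediate" on byte elements. */
static void cmpi_sle_b(int8_t *vd, const int8_t *vj, int64_t imm, int oprsz)
{
    int i;

    assert(oprsz == 16 || oprsz == 32); /* bytes: LSX or LASX */
    for (i = 0; i < oprsz / (BIT / 8); i++) {
        /* the immediate is narrowed to the element type before comparing */
        vd[i] = VSLE(vj[i], (int8_t)imm);
    }
}

/* Unsigned "set if less than immediate" on byte elements. */
static void cmpi_slt_bu(uint8_t *vd, const uint8_t *vj, uint64_t imm, int oprsz)
{
    int i;

    assert(oprsz == 16 || oprsz == 32);
    for (i = 0; i < oprsz / (BIT / 8); i++) {
        /* -1 stored into an unsigned byte becomes 0xFF (all ones) */
        vd[i] = (uint8_t)VSLT(vj[i], (uint8_t)imm);
    }
}
```

The signed and unsigned forms differ only in the element type: the byte 0xFF compares as -1 in cmpi_sle_b and as 255 in cmpi_slt_bu, which is why the patch passes TCG_COND_LE or TCG_COND_LEU (and the LT equivalents) for the otherwise identical instructions.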
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 43 ++++ target/loongarch/disas.c | 43 ++++ target/loongarch/vec_helper.c | 23 +- target/loongarch/insn_trans/trans_vec.c.inc | 257 ++++++++------------ 4 files changed, 201 insertions(+), 165 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ed4f82e7fe..82c26a318b 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1915,6 +1915,49 @@ xvffintl_d_w 0111 01101001 11100 00100 ..... ...= .. @vv xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @vv xvffint_s_l 0111 01010100 10000 ..... ..... ..... @vvv =20 +xvseq_b 0111 01000000 00000 ..... ..... ..... @vvv +xvseq_h 0111 01000000 00001 ..... ..... ..... @vvv +xvseq_w 0111 01000000 00010 ..... ..... ..... @vvv +xvseq_d 0111 01000000 00011 ..... ..... ..... @vvv +xvseqi_b 0111 01101000 00000 ..... ..... ..... @vv_i5 +xvseqi_h 0111 01101000 00001 ..... ..... ..... @vv_i5 +xvseqi_w 0111 01101000 00010 ..... ..... ..... @vv_i5 +xvseqi_d 0111 01101000 00011 ..... ..... ..... @vv_i5 + +xvsle_b 0111 01000000 00100 ..... ..... ..... @vvv +xvsle_h 0111 01000000 00101 ..... ..... ..... @vvv +xvsle_w 0111 01000000 00110 ..... ..... ..... @vvv +xvsle_d 0111 01000000 00111 ..... ..... ..... @vvv +xvslei_b 0111 01101000 00100 ..... ..... ..... @vv_i5 +xvslei_h 0111 01101000 00101 ..... ..... ..... @vv_i5 +xvslei_w 0111 01101000 00110 ..... ..... ..... @vv_i5 +xvslei_d 0111 01101000 00111 ..... ..... ..... @vv_i5 +xvsle_bu 0111 01000000 01000 ..... ..... ..... @vvv +xvsle_hu 0111 01000000 01001 ..... ..... ..... @vvv +xvsle_wu 0111 01000000 01010 ..... ..... ..... @vvv +xvsle_du 0111 01000000 01011 ..... ..... ..... @vvv +xvslei_bu 0111 01101000 01000 ..... ..... ..... @vv_ui5 +xvslei_hu 0111 01101000 01001 ..... ..... ..... @vv_ui5 +xvslei_wu 0111 01101000 01010 ..... ..... ..... @vv_ui5 +xvslei_du 0111 01101000 01011 ..... ..... ..... 
@vv_ui5 + +xvslt_b 0111 01000000 01100 ..... ..... ..... @vvv +xvslt_h 0111 01000000 01101 ..... ..... ..... @vvv +xvslt_w 0111 01000000 01110 ..... ..... ..... @vvv +xvslt_d 0111 01000000 01111 ..... ..... ..... @vvv +xvslti_b 0111 01101000 01100 ..... ..... ..... @vv_i5 +xvslti_h 0111 01101000 01101 ..... ..... ..... @vv_i5 +xvslti_w 0111 01101000 01110 ..... ..... ..... @vv_i5 +xvslti_d 0111 01101000 01111 ..... ..... ..... @vv_i5 +xvslt_bu 0111 01000000 10000 ..... ..... ..... @vvv +xvslt_hu 0111 01000000 10001 ..... ..... ..... @vvv +xvslt_wu 0111 01000000 10010 ..... ..... ..... @vvv +xvslt_du 0111 01000000 10011 ..... ..... ..... @vvv +xvslti_bu 0111 01101000 10000 ..... ..... ..... @vv_ui5 +xvslti_hu 0111 01101000 10001 ..... ..... ..... @vv_ui5 +xvslti_wu 0111 01101000 10010 ..... ..... ..... @vv_ui5 +xvslti_du 0111 01101000 10011 ..... ..... ..... @vv_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index f1a1321d0d..48e0b559f2 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2342,6 +2342,49 @@ INSN_LASX(xvffintl_d_w, vv) INSN_LASX(xvffinth_d_w, vv) INSN_LASX(xvffint_s_l, vvv) =20 +INSN_LASX(xvseq_b, vvv) +INSN_LASX(xvseq_h, vvv) +INSN_LASX(xvseq_w, vvv) +INSN_LASX(xvseq_d, vvv) +INSN_LASX(xvseqi_b, vv_i) +INSN_LASX(xvseqi_h, vv_i) +INSN_LASX(xvseqi_w, vv_i) +INSN_LASX(xvseqi_d, vv_i) + +INSN_LASX(xvsle_b, vvv) +INSN_LASX(xvsle_h, vvv) +INSN_LASX(xvsle_w, vvv) +INSN_LASX(xvsle_d, vvv) +INSN_LASX(xvslei_b, vv_i) +INSN_LASX(xvslei_h, vv_i) +INSN_LASX(xvslei_w, vv_i) +INSN_LASX(xvslei_d, vv_i) +INSN_LASX(xvsle_bu, vvv) +INSN_LASX(xvsle_hu, vvv) +INSN_LASX(xvsle_wu, vvv) +INSN_LASX(xvsle_du, vvv) +INSN_LASX(xvslei_bu, vv_i) +INSN_LASX(xvslei_hu, vv_i) +INSN_LASX(xvslei_wu, vv_i) +INSN_LASX(xvslei_du, vv_i) + +INSN_LASX(xvslt_b, vvv) +INSN_LASX(xvslt_h, vvv) +INSN_LASX(xvslt_w, vvv) +INSN_LASX(xvslt_d, vvv) +INSN_LASX(xvslti_b, vv_i) +INSN_LASX(xvslti_h, vv_i) +INSN_LASX(xvslti_w, vv_i) +INSN_LASX(xvslti_d, vv_i) +INSN_LASX(xvslt_bu, vvv) +INSN_LASX(xvslt_hu, vvv) +INSN_LASX(xvslt_wu, vvv) +INSN_LASX(xvslt_du, vvv) +INSN_LASX(xvslti_bu, vv_i) +INSN_LASX(xvslti_hu, vv_i) +INSN_LASX(xvslti_wu, vv_i) +INSN_LASX(xvslti_du, vv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 20d7675256..960b3d19ec 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3057,17 +3057,18 @@ void HELPER(vffint_s_l)(void *vd, void *vj, void *v= k, #define VSLE(a, b) (a <=3D b ? -1 : 0) #define VSLT(a, b) (a < b ? 
-1 : 0) =20 -#define VCMPI(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ -{ \ - int i; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - typedef __typeof(Vd->E(0)) TD; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ - } \ +#define VCMPI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + typedef __typeof(Vd->E(0)) TD; \ + int oprsz =3D simd_oprsz(desc); \ + \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D DO_OP(Vj->E(i), (TD)imm); \ + } \ } =20 VCMPI(vseqi_b, 8, B, VSEQ) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 9ee3e613fc..63d71e2332 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4505,11 +4505,12 @@ TRANS(xvffintl_d_w, LASX, gen_xx_ptr, gen_helper_vf= fintl_d_w) TRANS(xvffinth_d_w, LASX, gen_xx_ptr, gen_helper_vffinth_d_w) TRANS(xvffint_s_l, LASX, gen_xxx_ptr, gen_helper_vffint_s_l) =20 -static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) +static bool do_cmp_vl(DisasContext *ctx, arg_vvv *a, + uint32_t oprsz, MemOp mop, TCGCond cond) { uint32_t vd_ofs, vj_ofs, vk_ofs; =20 - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 @@ -4517,184 +4518,132 @@ static bool do_cmp(DisasContext *ctx, arg_vvv *a,= MemOp mop, TCGCond cond) vj_ofs =3D vec_full_offset(a->vj); vk_ofs =3D vec_full_offset(a->vk); =20 - tcg_gen_gvec_cmp(cond, mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8); + tcg_gen_gvec_cmp(cond, mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8= ); return true; } =20 -static void do_cmpi_vec(TCGCond cond, - unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm) -{ - tcg_gen_cmp_vec(cond, vece, t, a, tcg_constant_vec_matching(t, vece, i= mm)); -} - -static void 
gen_vseqi_s_vec(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t= imm) -{ - do_cmpi_vec(TCG_COND_EQ, vece, t, a, imm); -} - -static void gen_vslei_s_vec(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t= imm) -{ - do_cmpi_vec(TCG_COND_LE, vece, t, a, imm); -} - -static void gen_vslti_s_vec(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t= imm) -{ - do_cmpi_vec(TCG_COND_LT, vece, t, a, imm); -} - -static void gen_vslei_u_vec(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t= imm) -{ - do_cmpi_vec(TCG_COND_LEU, vece, t, a, imm); -} - -static void gen_vslti_u_vec(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t= imm) -{ - do_cmpi_vec(TCG_COND_LTU, vece, t, a, imm); -} - -#define DO_CMPI_S(NAME) \ -static bool do_## NAME ##_s(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ -{ \ - uint32_t vd_ofs, vj_ofs; \ - \ - if (!check_vec(ctx, 16)) { \ - return true; \ - } \ - \ - static const TCGOpcode vecop_list[] =3D { \ - INDEX_op_cmp_vec, 0 \ - }; \ - static const GVecGen2i op[4] =3D { \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_b, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_8 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_h, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_16 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_w, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_32 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_s_vec, \ - .fnoi =3D gen_helper_## NAME ##_d, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_64 \ - } \ - }; \ - \ - vd_ofs =3D vec_full_offset(a->vd); \ - vj_ofs =3D vec_full_offset(a->vj); \ - \ - tcg_gen_gvec_2i(vd_ofs, vj_ofs, 16, ctx->vl/8, a->imm, &op[mop]); \ - \ - return true; \ -} - -DO_CMPI_S(vseqi) -DO_CMPI_S(vslei) -DO_CMPI_S(vslti) - -#define DO_CMPI_U(NAME) \ -static bool do_## NAME ##_u(DisasContext *ctx, arg_vv_i *a, MemOp mop) \ -{ \ - uint32_t vd_ofs, vj_ofs; \ - \ - if (!check_vec(ctx, 16)) { \ - return true; \ - } \ - \ - static const TCGOpcode 
vecop_list[] =3D { \ - INDEX_op_cmp_vec, 0 \ - }; \ - static const GVecGen2i op[4] =3D { \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_bu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_8 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_hu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_16 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_wu, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_32 \ - }, \ - { \ - .fniv =3D gen_## NAME ##_u_vec, \ - .fnoi =3D gen_helper_## NAME ##_du, \ - .opt_opc =3D vecop_list, \ - .vece =3D MO_64 \ - } \ - }; \ - \ - vd_ofs =3D vec_full_offset(a->vd); \ - vj_ofs =3D vec_full_offset(a->vj); \ - \ - tcg_gen_gvec_2i(vd_ofs, vj_ofs, 16, ctx->vl/8, a->imm, &op[mop]); \ - \ - return true; \ -} - -DO_CMPI_U(vslei) -DO_CMPI_U(vslti) +static bool do_cmp(DisasContext *ctx, arg_vvv *a, + MemOp mop, TCGCond cond) +{ + return do_cmp_vl(ctx, a, 16, mop, cond); +} + +static bool do_xcmp(DisasContext *ctx, arg_vvv *a, + MemOp mop, TCGCond cond) +{ + return do_cmp_vl(ctx, a, 32, mop, cond); +} + +static bool do_cmpi_vl(DisasContext *ctx, arg_vv_i *a, + uint32_t oprsz, MemOp mop, TCGCond cond) +{ + uint32_t vd_ofs, vj_ofs; + + if (!check_vec(ctx, oprsz)) { + return true; + } + + vd_ofs =3D vec_full_offset(a->vd); + vj_ofs =3D vec_full_offset(a->vj); + + tcg_gen_gvec_cmpi(cond, mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / = 8); + return true; +} + +static bool do_cmpi(DisasContext *ctx, arg_vv_i *a, + MemOp mop, TCGCond cond) +{ + return do_cmpi_vl(ctx, a, 16, mop, cond); +} + +static bool do_xcmpi(DisasContext *ctx, arg_vv_i *a, + MemOp mop, TCGCond cond) +{ + return do_cmpi_vl(ctx, a, 32, mop, cond); +} =20 TRANS(vseq_b, LSX, do_cmp, MO_8, TCG_COND_EQ) TRANS(vseq_h, LSX, do_cmp, MO_16, TCG_COND_EQ) TRANS(vseq_w, LSX, do_cmp, MO_32, TCG_COND_EQ) TRANS(vseq_d, LSX, do_cmp, MO_64, TCG_COND_EQ) -TRANS(vseqi_b, LSX, do_vseqi_s, MO_8) -TRANS(vseqi_h, LSX, 
do_vseqi_s, MO_16) -TRANS(vseqi_w, LSX, do_vseqi_s, MO_32) -TRANS(vseqi_d, LSX, do_vseqi_s, MO_64) +TRANS(vseqi_b, LSX, do_cmpi, MO_8, TCG_COND_EQ) +TRANS(vseqi_h, LSX, do_cmpi, MO_16, TCG_COND_EQ) +TRANS(vseqi_w, LSX, do_cmpi, MO_32, TCG_COND_EQ) +TRANS(vseqi_d, LSX, do_cmpi, MO_64, TCG_COND_EQ) +TRANS(xvseq_b, LASX, do_xcmp, MO_8, TCG_COND_EQ) +TRANS(xvseq_h, LASX, do_xcmp, MO_16, TCG_COND_EQ) +TRANS(xvseq_w, LASX, do_xcmp, MO_32, TCG_COND_EQ) +TRANS(xvseq_d, LASX, do_xcmp, MO_64, TCG_COND_EQ) +TRANS(xvseqi_b, LASX, do_xcmpi, MO_8, TCG_COND_EQ) +TRANS(xvseqi_h, LASX, do_xcmpi, MO_16, TCG_COND_EQ) +TRANS(xvseqi_w, LASX, do_xcmpi, MO_32, TCG_COND_EQ) +TRANS(xvseqi_d, LASX, do_xcmpi, MO_64, TCG_COND_EQ) =20 TRANS(vsle_b, LSX, do_cmp, MO_8, TCG_COND_LE) TRANS(vsle_h, LSX, do_cmp, MO_16, TCG_COND_LE) TRANS(vsle_w, LSX, do_cmp, MO_32, TCG_COND_LE) TRANS(vsle_d, LSX, do_cmp, MO_64, TCG_COND_LE) -TRANS(vslei_b, LSX, do_vslei_s, MO_8) -TRANS(vslei_h, LSX, do_vslei_s, MO_16) -TRANS(vslei_w, LSX, do_vslei_s, MO_32) -TRANS(vslei_d, LSX, do_vslei_s, MO_64) +TRANS(vslei_b, LSX, do_cmpi, MO_8, TCG_COND_LE) +TRANS(vslei_h, LSX, do_cmpi, MO_16, TCG_COND_LE) +TRANS(vslei_w, LSX, do_cmpi, MO_32, TCG_COND_LE) +TRANS(vslei_d, LSX, do_cmpi, MO_64, TCG_COND_LE) TRANS(vsle_bu, LSX, do_cmp, MO_8, TCG_COND_LEU) TRANS(vsle_hu, LSX, do_cmp, MO_16, TCG_COND_LEU) TRANS(vsle_wu, LSX, do_cmp, MO_32, TCG_COND_LEU) TRANS(vsle_du, LSX, do_cmp, MO_64, TCG_COND_LEU) -TRANS(vslei_bu, LSX, do_vslei_u, MO_8) -TRANS(vslei_hu, LSX, do_vslei_u, MO_16) -TRANS(vslei_wu, LSX, do_vslei_u, MO_32) -TRANS(vslei_du, LSX, do_vslei_u, MO_64) +TRANS(vslei_bu, LSX, do_cmpi, MO_8, TCG_COND_LEU) +TRANS(vslei_hu, LSX, do_cmpi, MO_16, TCG_COND_LEU) +TRANS(vslei_wu, LSX, do_cmpi, MO_32, TCG_COND_LEU) +TRANS(vslei_du, LSX, do_cmpi, MO_64, TCG_COND_LEU) +TRANS(xvsle_b, LASX, do_xcmp, MO_8, TCG_COND_LE) +TRANS(xvsle_h, LASX, do_xcmp, MO_16, TCG_COND_LE) +TRANS(xvsle_w, LASX, do_xcmp, MO_32, TCG_COND_LE) +TRANS(xvsle_d, LASX, 
do_xcmp, MO_64, TCG_COND_LE) +TRANS(xvslei_b, LASX, do_xcmpi, MO_8, TCG_COND_LE) +TRANS(xvslei_h, LASX, do_xcmpi, MO_16, TCG_COND_LE) +TRANS(xvslei_w, LASX, do_xcmpi, MO_32, TCG_COND_LE) +TRANS(xvslei_d, LASX, do_xcmpi, MO_64, TCG_COND_LE) +TRANS(xvsle_bu, LASX, do_xcmp, MO_8, TCG_COND_LEU) +TRANS(xvsle_hu, LASX, do_xcmp, MO_16, TCG_COND_LEU) +TRANS(xvsle_wu, LASX, do_xcmp, MO_32, TCG_COND_LEU) +TRANS(xvsle_du, LASX, do_xcmp, MO_64, TCG_COND_LEU) +TRANS(xvslei_bu, LASX, do_xcmpi, MO_8, TCG_COND_LEU) +TRANS(xvslei_hu, LASX, do_xcmpi, MO_16, TCG_COND_LEU) +TRANS(xvslei_wu, LASX, do_xcmpi, MO_32, TCG_COND_LEU) +TRANS(xvslei_du, LASX, do_xcmpi, MO_64, TCG_COND_LEU) =20 TRANS(vslt_b, LSX, do_cmp, MO_8, TCG_COND_LT) TRANS(vslt_h, LSX, do_cmp, MO_16, TCG_COND_LT) TRANS(vslt_w, LSX, do_cmp, MO_32, TCG_COND_LT) TRANS(vslt_d, LSX, do_cmp, MO_64, TCG_COND_LT) -TRANS(vslti_b, LSX, do_vslti_s, MO_8) -TRANS(vslti_h, LSX, do_vslti_s, MO_16) -TRANS(vslti_w, LSX, do_vslti_s, MO_32) -TRANS(vslti_d, LSX, do_vslti_s, MO_64) +TRANS(vslti_b, LSX, do_cmpi, MO_8, TCG_COND_LT) +TRANS(vslti_h, LSX, do_cmpi, MO_16, TCG_COND_LT) +TRANS(vslti_w, LSX, do_cmpi, MO_32, TCG_COND_LT) +TRANS(vslti_d, LSX, do_cmpi, MO_64, TCG_COND_LT) TRANS(vslt_bu, LSX, do_cmp, MO_8, TCG_COND_LTU) TRANS(vslt_hu, LSX, do_cmp, MO_16, TCG_COND_LTU) TRANS(vslt_wu, LSX, do_cmp, MO_32, TCG_COND_LTU) TRANS(vslt_du, LSX, do_cmp, MO_64, TCG_COND_LTU) -TRANS(vslti_bu, LSX, do_vslti_u, MO_8) -TRANS(vslti_hu, LSX, do_vslti_u, MO_16) -TRANS(vslti_wu, LSX, do_vslti_u, MO_32) -TRANS(vslti_du, LSX, do_vslti_u, MO_64) +TRANS(vslti_bu, LSX, do_cmpi, MO_8, TCG_COND_LTU) +TRANS(vslti_hu, LSX, do_cmpi, MO_16, TCG_COND_LTU) +TRANS(vslti_wu, LSX, do_cmpi, MO_32, TCG_COND_LTU) +TRANS(vslti_du, LSX, do_cmpi, MO_64, TCG_COND_LTU) +TRANS(xvslt_b, LASX, do_xcmp, MO_8, TCG_COND_LT) +TRANS(xvslt_h, LASX, do_xcmp, MO_16, TCG_COND_LT) +TRANS(xvslt_w, LASX, do_xcmp, MO_32, TCG_COND_LT) +TRANS(xvslt_d, LASX, do_xcmp, MO_64, TCG_COND_LT) 
+TRANS(xvslti_b, LASX, do_xcmpi, MO_8, TCG_COND_LT) +TRANS(xvslti_h, LASX, do_xcmpi, MO_16, TCG_COND_LT) +TRANS(xvslti_w, LASX, do_xcmpi, MO_32, TCG_COND_LT) +TRANS(xvslti_d, LASX, do_xcmpi, MO_64, TCG_COND_LT) +TRANS(xvslt_bu, LASX, do_xcmp, MO_8, TCG_COND_LTU) +TRANS(xvslt_hu, LASX, do_xcmp, MO_16, TCG_COND_LTU) +TRANS(xvslt_wu, LASX, do_xcmp, MO_32, TCG_COND_LTU) +TRANS(xvslt_du, LASX, do_xcmp, MO_64, TCG_COND_LTU) +TRANS(xvslti_bu, LASX, do_xcmpi, MO_8, TCG_COND_LTU) +TRANS(xvslti_hu, LASX, do_xcmpi, MO_16, TCG_COND_LTU) +TRANS(xvslti_wu, LASX, do_xcmpi, MO_32, TCG_COND_LTU) +TRANS(xvslti_du, LASX, do_xcmpi, MO_64, TCG_COND_LTU) =20 static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658817321813.1045761620092; Wed, 13 Sep 2023 19:33:37 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6M-0006pV-88; Wed, 13 Sep 2023 22:29:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc51-0003KY-Hi for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:46 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4x-00054p-3u for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:42 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxbeuObwJlvfgmAA--.7592S3; Thu, 14 Sep 2023 10:27:26 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain 
(Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S51; Thu, 14 Sep 2023 10:27:25 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 49/57] target/loongarch: Implement xvfcmp
Date: Thu, 14 Sep 2023 10:26:37 +0800
Message-Id: <20230914022645.1151356-50-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVFCMP.cond.{S/D}.
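
For readers following the helper change, here is a minimal standalone sketch (not part of the patch) of how the loop bound moves from a fixed LSX_LEN/BIT to oprsz / (BIT / 8). The names vec_elems and toy_fcmp_lt_s are hypothetical, and the comparison uses plain C floats rather than QEMU's softfloat and vfcmp_common, so it only illustrates the element count, not the real flag handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of elements of width `bit` bits that fit
 * in an operand of `oprsz` bytes (16 for LSX, 32 for LASX). */
static inline int vec_elems(uint32_t oprsz, int bit)
{
    assert(bit > 0 && bit % 8 == 0);
    return (int)(oprsz / (uint32_t)(bit / 8));
}

/* Toy model of the per-element loop: each 32-bit result element becomes
 * all-ones when a[i] < b[i] and zero otherwise. Elements beyond the
 * operand size are left untouched. */
static void toy_fcmp_lt_s(uint32_t *d, const float *a, const float *b,
                          uint32_t oprsz)
{
    for (int i = 0; i < vec_elems(oprsz, 32); i++) {
        d[i] = a[i] < b[i] ? UINT32_MAX : 0;
    }
}
```

With oprsz = 16 the loop covers four single-precision elements; with oprsz = 32 it covers eight, which is the only difference between the LSX and LASX paths in this model.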
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 8 +- target/loongarch/insns.decode | 3 + target/loongarch/disas.c | 93 +++++++++++++++++++++ target/loongarch/vec_helper.c | 4 +- target/loongarch/insn_trans/trans_vec.c.inc | 31 ++++--- 5 files changed, 117 insertions(+), 22 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index e9c5412267..b54ce68077 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -652,10 +652,10 @@ DEF_HELPER_FLAGS_4(vslti_hu, TCG_CALL_NO_RWG, void, p= tr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vslti_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vslti_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_5(vfcmp_c_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_s_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_c_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfcmp_s_d, void, env, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_c_s, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_s_s, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_c_d, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32, i32) =20 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 82c26a318b..0d46bd5e5e 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1958,6 +1958,9 @@ xvslti_hu 0111 01101000 10001 ..... ..... ....= . @vv_ui5 xvslti_wu 0111 01101000 10010 ..... ..... ..... @vv_ui5 xvslti_du 0111 01101000 10011 ..... ..... ..... @vv_ui5 =20 +xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... @vvv_fcond +xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @vvv_fcond + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 48e0b559f2..4ab51b712e 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2385,6 +2385,99 @@ INSN_LASX(xvslti_hu, vv_i) INSN_LASX(xvslti_wu, vv_i) INSN_LASX(xvslti_du, vv_i) =20 +#define output_xvfcmp(C, PREFIX, SUFFIX) = \ +{ = \ + (C)->info->fprintf_func((C)->info->stream, "%08x %s%s\tx%d, x%d, x%d"= , \ + (C)->insn, PREFIX, SUFFIX, a->vd, = \ + a->vj, a->vk); = \ +} +static bool output_xxx_fcond(DisasContext *ctx, arg_vvv_fcond * a, + const char *suffix) +{ + bool ret =3D true; + switch (a->fcond) { + case 0x0: + output_xvfcmp(ctx, "xvfcmp_caf_", suffix); + break; + case 0x1: + output_xvfcmp(ctx, "xvfcmp_saf_", suffix); + break; + case 0x2: + output_xvfcmp(ctx, "xvfcmp_clt_", suffix); + break; + case 0x3: + output_xvfcmp(ctx, "xvfcmp_slt_", suffix); + break; + case 0x4: + output_xvfcmp(ctx, "xvfcmp_ceq_", suffix); + break; + case 0x5: + output_xvfcmp(ctx, "xvfcmp_seq_", suffix); + break; + case 0x6: + output_xvfcmp(ctx, "xvfcmp_cle_", suffix); + break; + case 0x7: + output_xvfcmp(ctx, "xvfcmp_sle_", suffix); + break; + case 0x8: + output_xvfcmp(ctx, "xvfcmp_cun_", suffix); + break; + case 0x9: + output_xvfcmp(ctx, "xvfcmp_sun_", suffix); + break; + case 0xA: + output_xvfcmp(ctx, "xvfcmp_cult_", suffix); + break; + case 0xB: + output_xvfcmp(ctx, "xvfcmp_sult_", suffix); + break; + case 0xC: + output_xvfcmp(ctx, "xvfcmp_cueq_", suffix); + break; + case 0xD: + output_xvfcmp(ctx, "xvfcmp_sueq_", suffix); + break; + case 0xE: + output_xvfcmp(ctx, "xvfcmp_cule_", suffix); + break; + case 0xF: + output_xvfcmp(ctx, "xvfcmp_sule_", suffix); + break; + case 0x10: + output_xvfcmp(ctx, "xvfcmp_cne_", suffix); + break; + case 0x11: + output_xvfcmp(ctx, "xvfcmp_sne_", suffix); + break; + case 0x14: + output_xvfcmp(ctx, "xvfcmp_cor_", suffix); + break; + case 0x15: + output_xvfcmp(ctx, "xvfcmp_sor_", suffix); + break; + case 0x18: + output_xvfcmp(ctx, "xvfcmp_cune_", suffix); + 
break; + case 0x19: + output_xvfcmp(ctx, "xvfcmp_sune_", suffix); + break; + default: + ret =3D false; + } + return ret; +} + +#define LASX_FCMP_INSN(suffix) \ +static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \ + arg_vvv_fcond * a) \ +{ \ + return output_xxx_fcond(ctx, a, #suffix); \ +} + +LASX_FCMP_INSN(s) +LASX_FCMP_INSN(d) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 960b3d19ec..5188b7f72d 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3122,7 +3122,7 @@ static uint64_t vfcmp_common(CPULoongArchState *env, } =20 #define VFCMP(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t oprsz, \ uint32_t vd, uint32_t vj, uint32_t vk, uint32_t flags) \ { \ int i; \ @@ -3132,7 +3132,7 @@ void HELPER(NAME)(CPULoongArchState *env, = \ VReg *Vk =3D &(env->fpr[vk].vreg); = \ \ vec_clear_cause(env); \ - for (i =3D 0; i < LSX_LEN/BIT ; i++) { = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ FloatRelation cmp; \ cmp =3D FN(Vj->E(i), Vk->E(i), &env->fp_status); = \ t.E(i) =3D vfcmp_common(env, cmp, flags); = \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 63d71e2332..9b89b81cfb 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -4645,52 +4645,51 @@ TRANS(xvslti_hu, LASX, do_xcmpi, MO_16, TCG_COND_LT= U) TRANS(xvslti_wu, LASX, do_xcmpi, MO_32, TCG_COND_LTU) TRANS(xvslti_du, LASX, do_xcmpi, MO_64, TCG_COND_LTU) =20 -static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a) +static bool do_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a, uint32_t = sz) { uint32_t flags; - void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); + void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); TCGv_i32 vd 
=3D tcg_constant_i32(a->vd); TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 vk =3D tcg_constant_i32(a->vk); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, sz)) { return true; } =20 fn =3D (a->fcond & 1 ? gen_helper_vfcmp_s_s : gen_helper_vfcmp_c_s); flags =3D get_fcmp_flags(a->fcond >> 1); - fn(cpu_env, vd, vj, vk, tcg_constant_i32(flags)); + fn(cpu_env, oprsz, vd, vj, vk, tcg_constant_i32(flags)); =20 return true; } =20 -static bool trans_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a) +static bool do_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a, uint32_t = sz) { uint32_t flags; - void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); + void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32); TCGv_i32 vd =3D tcg_constant_i32(a->vd); TCGv_i32 vj =3D tcg_constant_i32(a->vj); TCGv_i32 vk =3D tcg_constant_i32(a->vk); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, sz)) { return true; } =20 fn =3D (a->fcond & 1 ? 
gen_helper_vfcmp_s_d : gen_helper_vfcmp_c_d); flags =3D get_fcmp_flags(a->fcond >> 1); - fn(cpu_env, vd, vj, vk, tcg_constant_i32(flags)); + fn(cpu_env, oprsz, vd, vj, vk, tcg_constant_i32(flags)); =20 return true; } =20 +TRANS(vfcmp_cond_s, LSX, do_vfcmp_cond_s, 16) +TRANS(vfcmp_cond_d, LSX, do_vfcmp_cond_d, 16) +TRANS(xvfcmp_cond_s, LASX, do_vfcmp_cond_s, 32) +TRANS(xvfcmp_cond_d, LASX, do_vfcmp_cond_d, 32) + static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a) { if (!avail_LSX(ctx)) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658957442174.5536784068695; Wed, 13 Sep 2023 19:35:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5s-0004nV-IR; Wed, 13 Sep 2023 22:28:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc53-0003Kp-B0 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc4z-000560-5H for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:44 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxnuuQbwJlwfgmAA--.7691S3; Thu, 14 Sep 2023 10:27:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S52; Thu, 14 Sep 2023 10:27:26 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 
50/57] target/loongarch: Implement xvbitsel xvset
Date: Thu, 14 Sep 2023 10:26:38 +0800
Message-Id: <20230914022645.1151356-51-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This patch includes:
- XVBITSEL.V;
- XVBITSELI.B;
- XVSET{EQZ/NEZ}.V;
- XVSETANYEQZ.{B/H/W/D};
- XVSETALLNEZ.{B/H/W/D}.
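
As an illustration only (not part of the patch), the following standalone sketch models what the SETANYEQZ/SETALLNEZ helpers compute once oprsz is passed in: the low 128 bits are always checked, and the upper two 64-bit lanes are checked only when oprsz is 32. The patch itself uses do_match2(), copied from target/arm/tcg/sve_helper.c; this sketch swaps in the simpler classic "has zero element" bit trick, and the names dup_one, has_zero_elem, set_any_eqz and set_all_nez are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Replicate the value 1 into every element of (8 << esz) bits, esz in 0..3. */
static uint64_t dup_one(int esz)
{
    static const uint64_t ones[4] = {
        0x0101010101010101ULL, 0x0001000100010001ULL,
        0x0000000100000001ULL, 0x0000000000000001ULL,
    };
    assert(esz >= 0 && esz <= 3);
    return ones[esz];
}

/* True when at least one element of x is zero. */
static bool has_zero_elem(uint64_t x, int esz)
{
    uint64_t ones = dup_one(esz);
    uint64_t signs = ones << ((8 << esz) - 1);
    return ((x - ones) & ~x & signs) != 0;
}

/* d holds the four 64-bit lanes of a register; oprsz is 16 or 32 bytes. */
static bool set_any_eqz(const uint64_t d[4], uint32_t oprsz, int esz)
{
    bool r = has_zero_elem(d[0], esz) || has_zero_elem(d[1], esz);
    if (oprsz == 32) {
        r = r || has_zero_elem(d[2], esz) || has_zero_elem(d[3], esz);
    }
    return r;
}

/* All elements non-zero is the negation of "any element is zero". */
static bool set_all_nez(const uint64_t d[4], uint32_t oprsz, int esz)
{
    return !set_any_eqz(d, oprsz, esz);
}
```

A zero byte that sits only in the upper 128 bits is therefore invisible to the 16-byte (LSX) form and visible to the 32-byte (LASX) form, which is the behaviour the added `if (oprsz == 32)` branches provide.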
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 16 ++-- target/loongarch/insns.decode | 15 ++++ target/loongarch/disas.c | 19 ++++ target/loongarch/vec_helper.c | 42 +++++---- target/loongarch/insn_trans/trans_vec.c.inc | 99 ++++++++++++++++----- 5 files changed, 147 insertions(+), 44 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index b54ce68077..85233586e3 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -659,14 +659,14 @@ DEF_HELPER_6(vfcmp_s_d, void, env, i32, i32, i32, i32= , i32) =20 DEF_HELPER_FLAGS_4(vbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 -DEF_HELPER_3(vsetanyeqz_b, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_h, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_w, void, env, i32, i32) -DEF_HELPER_3(vsetanyeqz_d, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_b, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_h, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_w, void, env, i32, i32) -DEF_HELPER_3(vsetallnez_d, void, env, i32, i32) +DEF_HELPER_4(vsetanyeqz_b, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_h, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_w, void, env, i32, i32, i32) +DEF_HELPER_4(vsetanyeqz_d, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_b, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32) +DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32) =20 DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0d46bd5e5e..ad6751fdfb 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1961,6 +1961,21 @@ xvslti_du 0111 01101000 10011 ..... ..... ...= .. @vv_ui5 xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... 
@vvv_fcond xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @vvv_fcond =20 +xvbitsel_v 0000 11010010 ..... ..... ..... ..... @vvvv + +xvbitseli_b 0111 01111100 01 ........ ..... ..... @vv_ui8 + +xvseteqz_v 0111 01101001 11001 00110 ..... 00 ... @cv +xvsetnez_v 0111 01101001 11001 00111 ..... 00 ... @cv +xvsetanyeqz_b 0111 01101001 11001 01000 ..... 00 ... @cv +xvsetanyeqz_h 0111 01101001 11001 01001 ..... 00 ... @cv +xvsetanyeqz_w 0111 01101001 11001 01010 ..... 00 ... @cv +xvsetanyeqz_d 0111 01101001 11001 01011 ..... 00 ... @cv +xvsetallnez_b 0111 01101001 11001 01100 ..... 00 ... @cv +xvsetallnez_h 0111 01101001 11001 01101 ..... 00 ... @cv +xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cv +xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cv + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 4ab51b712e..abe113b150 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##ty= pe * a) \ return true; \ } =20 +static void output_cv_x(DisasContext *ctx, arg_cv *a, const char *mnemonic) +{ + output(ctx, mnemonic, "fcc%d, x%d", a->cd, a->vj); +} + static void output_v_i_x(DisasContext *ctx, arg_v_i *a, const char *mnemon= ic) { output(ctx, mnemonic, "x%d, 0x%x", a->vd, a->imm); @@ -2478,6 +2483,20 @@ static bool trans_xvfcmp_cond_##suffix(DisasContext = *ctx, \ LASX_FCMP_INSN(s) LASX_FCMP_INSN(d) =20 +INSN_LASX(xvbitsel_v, vvvv) +INSN_LASX(xvbitseli_b, vv_i) + +INSN_LASX(xvseteqz_v, cv) +INSN_LASX(xvsetnez_v, cv) +INSN_LASX(xvsetanyeqz_b, cv) +INSN_LASX(xvsetanyeqz_h, cv) +INSN_LASX(xvsetanyeqz_w, cv) +INSN_LASX(xvsetanyeqz_d, cv) +INSN_LASX(xvsetallnez_b, cv) +INSN_LASX(xvsetallnez_h, cv) +INSN_LASX(xvsetallnez_w, cv) +INSN_LASX(xvsetallnez_d, cv) + INSN_LASX(xvreplgr2vr_b, vr) 
INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 5188b7f72d..11793972f5 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3146,13 +3146,13 @@ VFCMP(vfcmp_s_s, 32, UW, float32_compare) VFCMP(vfcmp_c_d, 64, UD, float64_compare_quiet) VFCMP(vfcmp_s_d, 64, UD, float64_compare) =20 -void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t imm, uint32_t v) +void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t imm, uint32_t desc) { int i; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 - for (i =3D 0; i < 16; i++) { + for (i =3D 0; i < simd_oprsz(desc); i++) { Vd->B(i) =3D (~Vd->B(i) & Vj->B(i)) | (Vd->B(i) & imm); } } @@ -3160,7 +3160,7 @@ void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t= imm, uint32_t v) /* Copy from target/arm/tcg/sve_helper.c */ static inline bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz) { - uint64_t bits =3D 8 << esz; + int bits =3D 8 << esz; uint64_t ones =3D dup_const(esz, 1); uint64_t signs =3D ones << (bits - 1); uint64_t cmp0, cmp1; @@ -3173,25 +3173,37 @@ static inline bool do_match2(uint64_t n, uint64_t m= 0, uint64_t m1, int esz) return (cmp0 | cmp1) & signs; } =20 -#define SETANYEQZ(NAME, MO) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \ -{ \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - env->cf[cd & 0x7] =3D do_match2(0, Vj->D(0), Vj->D(1), MO); \ +#define SETANYEQZ(NAME, MO) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t oprsz, uint32_t cd, uint32_t vj) \ +{ \ + VReg *Vj =3D &(env->fpr[vj].vreg); \ + \ + env->cf[cd & 0x7] =3D do_match2(0, Vj->D(0), Vj->D(1), MO); \ + if (oprsz =3D=3D 32) { \ + env->cf[cd & 0x7] =3D env->cf[cd & 0x7] || \ + do_match2(0, Vj->D(2), Vj->D(3), MO); \ + } \ } + SETANYEQZ(vsetanyeqz_b, MO_8) SETANYEQZ(vsetanyeqz_h, MO_16) SETANYEQZ(vsetanyeqz_w, MO_32) SETANYEQZ(vsetanyeqz_d, MO_64) =20 -#define SETALLNEZ(NAME, MO) \ -void 
HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t vj) \ -{ \ - VReg *Vj =3D &(env->fpr[vj].vreg); \ - \ - env->cf[cd & 0x7]=3D !do_match2(0, Vj->D(0), Vj->D(1), MO); \ +#define SETALLNEZ(NAME, MO) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t oprsz, uint32_t cd, uint32_t vj) \ +{ \ + VReg *Vj =3D &(env->fpr[vj].vreg); \ + \ + env->cf[cd & 0x7]=3D !do_match2(0, Vj->D(0), Vj->D(1), MO); \ + if (oprsz =3D=3D 32) { \ + env->cf[cd & 0x7] =3D env->cf[cd & 0x7] && \ + !do_match2(0, Vj->D(2), Vj->D(3), MO); \ + } \ } + SETALLNEZ(vsetallnez_b, MO_8) SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 9b89b81cfb..0dec3dfffe 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -202,20 +202,33 @@ static bool gen_xx_i(DisasContext *ctx, arg_vv_i *a, = gen_helper_gvec_2i *fn) return gen_vv_i_vl(ctx, a, 32, fn); } =20 -static bool gen_cv(DisasContext *ctx, arg_cv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +static bool gen_cv_vl(DisasContext *ctx, arg_cv *a, uint32_t sz, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) { - TCGv_i32 vj =3D tcg_constant_i32(a->vj); - TCGv_i32 cd =3D tcg_constant_i32(a->cd); - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, sz)) { return true; } =20 - func(cpu_env, cd, vj); + TCGv_i32 vj =3D tcg_constant_i32(a->vj); + TCGv_i32 cd =3D tcg_constant_i32(a->cd); + TCGv_i32 oprsz =3D tcg_constant_i32(sz); + + func(cpu_env, oprsz, cd, vj); return true; } =20 +static bool gen_cv(DisasContext *ctx, arg_cv *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +{ + return gen_cv_vl(ctx, a, 16, func); +} + +static bool gen_cx(DisasContext *ctx, arg_cv *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +{ + return gen_cv_vl(ctx, a, 32, func); +} + static bool gvec_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, MemOp mop, void 
(*func)(unsigned, uint32_t, uint32_t, @@ -4690,28 +4703,27 @@ TRANS(vfcmp_cond_d, LSX, do_vfcmp_cond_d, 16) TRANS(xvfcmp_cond_s, LASX, do_vfcmp_cond_s, 32) TRANS(xvfcmp_cond_d, LASX, do_vfcmp_cond_d, 32) =20 -static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a) +static bool do_vbitsel_v(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz) { - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->vd), vec_full_offset(a->= va), vec_full_offset(a->vk), vec_full_offset(a->vj), - 16, ctx->vl/8); + oprsz, ctx->vl / 8); return true; } =20 +TRANS(vbitsel_v, LSX, do_vbitsel_v, 16) +TRANS(xvbitsel_v, LASX, do_vbitsel_v, 32) + static void gen_vbitseli(unsigned vece, TCGv_vec a, TCGv_vec b, int64_t im= m) { tcg_gen_bitsel_vec(vece, a, a, tcg_constant_vec_matching(a, vece, imm)= , b); } =20 -static bool trans_vbitseli_b(DisasContext *ctx, arg_vv_i *a) +static bool do_vbitseli_b(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) { static const GVecGen2i op =3D { .fniv =3D gen_vbitseli, @@ -4720,19 +4732,18 @@ static bool trans_vbitseli_b(DisasContext *ctx, arg= _vv_i *a) .load_dest =3D true }; =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 tcg_gen_gvec_2i(vec_full_offset(a->vd), vec_full_offset(a->vj), - 16, ctx->vl/8, a->imm, &op); + oprsz, ctx->vl / 8, a->imm , &op); return true; } =20 +TRANS(vbitseli_b, LSX, do_vbitseli_b, 16) +TRANS(xvbitseli_b, LASX, do_vbitseli_b, 32) + #define VSET(NAME, COND) = \ static bool trans_## NAME (DisasContext *ctx, arg_cv *a) = \ { = \ @@ -4772,6 +4783,52 @@ TRANS(vsetallnez_h, LSX, gen_cv, gen_helper_vsetalln= ez_h) TRANS(vsetallnez_w, LSX, gen_cv, gen_helper_vsetallnez_w) TRANS(vsetallnez_d, LSX, gen_cv, gen_helper_vsetallnez_d) =20 +#define XVSET(NAME, COND) = \ +static bool trans_## NAME(DisasContext *ctx, arg_cv * a) = \ +{ = \ + TCGv_i64 t1, 
t2, d[4]; = \ + = \ + d[0] =3D tcg_temp_new_i64(); = \ + d[1] =3D tcg_temp_new_i64(); = \ + d[2] =3D tcg_temp_new_i64(); = \ + d[3] =3D tcg_temp_new_i64(); = \ + t1 =3D tcg_temp_new_i64(); = \ + t2 =3D tcg_temp_new_i64(); = \ + = \ + get_vreg64(d[0], a->vj, 0); = \ + get_vreg64(d[1], a->vj, 1); = \ + get_vreg64(d[2], a->vj, 2); = \ + get_vreg64(d[3], a->vj, 3); = \ + = \ + if (!avail_LASX(ctx)) { = \ + return false; = \ + } = \ + = \ + if (!check_vec(ctx, 32)) { = \ + return true; = \ + } = \ + = \ + tcg_gen_or_i64(t1, d[0], d[1]); = \ + tcg_gen_or_i64(t2, d[2], d[3]); = \ + tcg_gen_or_i64(t1, t2, t1); = \ + tcg_gen_setcondi_i64(COND, t1, t1, 0); = \ + tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7= ])); \ + = \ + return true; = \ +} + +XVSET(xvseteqz_v, TCG_COND_EQ) +XVSET(xvsetnez_v, TCG_COND_NE) + +TRANS(xvsetanyeqz_b, LASX, gen_cx, gen_helper_vsetanyeqz_b) +TRANS(xvsetanyeqz_h, LASX, gen_cx, gen_helper_vsetanyeqz_h) +TRANS(xvsetanyeqz_w, LASX, gen_cx, gen_helper_vsetanyeqz_w) +TRANS(xvsetanyeqz_d, LASX, gen_cx, gen_helper_vsetanyeqz_d) +TRANS(xvsetallnez_b, LASX, gen_cx, gen_helper_vsetallnez_b) +TRANS(xvsetallnez_h, LASX, gen_cx, gen_helper_vsetallnez_h) +TRANS(xvsetallnez_w, LASX, gen_cx, gen_helper_vsetallnez_w) +TRANS(xvsetallnez_d, LASX, gen_cx, gen_helper_vsetallnez_d) + static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a) { TCGv src =3D gpr_src(ctx, a->rj, EXT_NONE); --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658912449639.4716491176504; Wed, 13 Sep 2023 19:35:12 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1qgc6H-0005xl-6K; Wed, 13 Sep 2023 22:29:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc55-0003ON-4K for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:50 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc50-00056G-7K for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:46 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxJuiQbwJlwPgmAA--.16425S3; Thu, 14 Sep 2023 10:27:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S53; Thu, 14 Sep 2023 10:27:27 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 51/57] target/loongarch: Implement xvinsgr2vr xvpickve2gr Date: Thu, 14 Sep 2023 10:26:39 +0800 Message-Id: <20230914022645.1151356-52-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S53 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, 
SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658912994100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVINSGR2VR.{W/D}; - XVPICKVE2GR.{W/D}[U]. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 7 + target/loongarch/disas.c | 17 ++ target/loongarch/translate.c | 13 ++ target/loongarch/insn_trans/trans_vec.c.inc | 208 ++++---------------- 4 files changed, 75 insertions(+), 170 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ad6751fdfb..bb3bb447ae 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1976,6 +1976,13 @@ xvsetallnez_h 0111 01101001 11001 01101 ..... 00 = ... @cv xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cv xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cv =20 +xvinsgr2vr_w 0111 01101110 10111 10 ... ..... ..... @vr_ui3 +xvinsgr2vr_d 0111 01101110 10111 110 .. ..... ..... @vr_ui2 +xvpickve2gr_w 0111 01101110 11111 10 ... ..... ..... @rv_ui3 +xvpickve2gr_d 0111 01101110 11111 110 .. ..... ..... @rv_ui2 +xvpickve2gr_wu 0111 01101111 00111 10 ... ..... ..... @rv_ui3 +xvpickve2gr_du 0111 01101111 00111 110 .. ..... ..... @rv_ui2 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@vr diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index abe113b150..04f9f9fa4b 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1738,6 +1738,16 @@ static void output_vv_x(DisasContext *ctx, arg_vv *a= , const char *mnemonic) output(ctx, mnemonic, "x%d, x%d", a->vd, a->vj); } =20 +static void output_vr_i_x(DisasContext *ctx, arg_vr_i *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "x%d, r%d, 0x%x", a->vd, a->rj, a->imm); +} + +static void output_rv_i_x(DisasContext *ctx, arg_rv_i *a, const char *mnem= onic) +{ + output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -2497,6 +2507,13 @@ INSN_LASX(xvsetallnez_h, cv) INSN_LASX(xvsetallnez_w, cv) INSN_LASX(xvsetallnez_d, cv) =20 +INSN_LASX(xvinsgr2vr_w, vr_i) +INSN_LASX(xvinsgr2vr_d, vr_i) +INSN_LASX(xvpickve2gr_w, rv_i) +INSN_LASX(xvpickve2gr_d, rv_i) +INSN_LASX(xvpickve2gr_wu, rv_i) +INSN_LASX(xvpickve2gr_du, rv_i) + INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index 10e2fe8ff6..4892834d0c 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -37,6 +37,19 @@ static inline int vec_full_offset(int regno) return offsetof(CPULoongArchState, fpr[regno]); } =20 +static inline int vec_reg_offset(int regno, int index, MemOp mop) +{ + const uint8_t size =3D 1 << mop; + int offs =3D index * size; + +#if HOST_BIG_ENDIAN + if (size < 8) { + offs ^=3D (8 - size); + } +#endif + return offs + vec_full_offset(regno); +} + static inline void get_vreg64(TCGv_i64 dest, int regno, int index) { tcg_gen_ld_i64(dest, cpu_env, diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 0dec3dfffe..e1ba54075e 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@
-4829,209 +4829,77 @@ TRANS(xvsetallnez_h, LASX, gen_cx, gen_helper_vset= allnez_h) TRANS(xvsetallnez_w, LASX, gen_cx, gen_helper_vsetallnez_w) TRANS(xvsetallnez_d, LASX, gen_cx, gen_helper_vsetallnez_d) =20 -static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a) +static bool gen_g2v_vl(DisasContext *ctx, arg_vr_i *a, uint32_t oprsz, Mem= Op mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { TCGv src =3D gpr_src(ctx, a->rj, EXT_NONE); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_st8_i64(src, cpu_env, - offsetof(CPULoongArchState, fpr[a->vd].vreg.B(a->imm))= ); - return true; -} - -static bool trans_vinsgr2vr_h(DisasContext *ctx, arg_vr_i *a) -{ - TCGv src =3D gpr_src(ctx, a->rj, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_st16_i64(src, cpu_env, - offsetof(CPULoongArchState, fpr[a->vd].vreg.H(a->imm))= ); - return true; -} - -static bool trans_vinsgr2vr_w(DisasContext *ctx, arg_vr_i *a) -{ - TCGv src =3D gpr_src(ctx, a->rj, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 - tcg_gen_st32_i64(src, cpu_env, - offsetof(CPULoongArchState, fpr[a->vd].vreg.W(a->imm)= )); - return true; -} - -static bool trans_vinsgr2vr_d(DisasContext *ctx, arg_vr_i *a) -{ - TCGv src =3D gpr_src(ctx, a->rj, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } + func(src, cpu_env, vec_reg_offset(a->vd, a->imm, mop)); =20 - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_st_i64(src, cpu_env, - offsetof(CPULoongArchState, fpr[a->vd].vreg.D(a->imm))); return true; } =20 -static bool trans_vpickve2gr_b(DisasContext *ctx, arg_rv_i *a) +static bool gen_g2v(DisasContext *ctx, arg_vr_i *a, MemOp mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - 
- if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld8s_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)= )); - return true; + return gen_g2v_vl(ctx, a, 16, mop, func); } =20 -static bool trans_vpickve2gr_h(DisasContext *ctx, arg_rv_i *a) +static bool gen_g2x(DisasContext *ctx, arg_vr_i *a, MemOp mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld16s_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm= ))); - return true; + return gen_g2v_vl(ctx, a, 32, mop, func); } =20 -static bool trans_vpickve2gr_w(DisasContext *ctx, arg_rv_i *a) -{ - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); +TRANS(vinsgr2vr_b, LSX, gen_g2v, MO_8, tcg_gen_st8_i64) +TRANS(vinsgr2vr_h, LSX, gen_g2v, MO_16, tcg_gen_st16_i64) +TRANS(vinsgr2vr_w, LSX, gen_g2v, MO_32, tcg_gen_st32_i64) +TRANS(vinsgr2vr_d, LSX, gen_g2v, MO_64, tcg_gen_st_i64) +TRANS(xvinsgr2vr_w, LASX, gen_g2x, MO_32, tcg_gen_st32_i64) +TRANS(xvinsgr2vr_d, LASX, gen_g2x, MO_64, tcg_gen_st_i64) =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld32s_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm= ))); - return true; -} - -static bool trans_vpickve2gr_d(DisasContext *ctx, arg_rv_i *a) +static bool gen_v2g_vl(DisasContext *ctx, arg_rv_i *a, uint32_t oprsz, Mem= Op mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 - tcg_gen_ld_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm))); - return true; -} + func(dst, cpu_env, vec_reg_offset(a->vj, a->imm, mop)); =20 -static bool trans_vpickve2gr_bu(DisasContext *ctx, arg_rv_i *a) -{ - TCGv dst 
=3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld8u_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)= )); return true; } =20 -static bool trans_vpickve2gr_hu(DisasContext *ctx, arg_rv_i *a) +static bool gen_v2g(DisasContext *ctx, arg_rv_i *a, MemOp mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld16u_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm= ))); - return true; + return gen_v2g_vl(ctx, a, 16, mop, func); } =20 -static bool trans_vpickve2gr_wu(DisasContext *ctx, arg_rv_i *a) +static bool gen_x2g(DisasContext *ctx, arg_rv_i *a, MemOp mop, + void (*func)(TCGv, TCGv_ptr, tcg_target_long)) { - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld32u_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm= ))); - return true; + return gen_v2g_vl(ctx, a, 32, mop, func); } =20 -static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a) -{ - TCGv dst =3D gpr_dst(ctx, a->rd, EXT_NONE); - - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { - return true; - } - - tcg_gen_ld_i64(dst, cpu_env, - offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm))); - return true; -} +TRANS(vpickve2gr_b, LSX, gen_v2g, MO_8, tcg_gen_ld8s_i64) +TRANS(vpickve2gr_h, LSX, gen_v2g, MO_16, tcg_gen_ld16s_i64) +TRANS(vpickve2gr_w, LSX, gen_v2g, MO_32, tcg_gen_ld32s_i64) +TRANS(vpickve2gr_d, LSX, gen_v2g, MO_64, tcg_gen_ld_i64) +TRANS(vpickve2gr_bu, LSX, gen_v2g, MO_8, tcg_gen_ld8u_i64) +TRANS(vpickve2gr_hu, LSX, gen_v2g, MO_16, tcg_gen_ld16u_i64) +TRANS(vpickve2gr_wu, LSX, gen_v2g, MO_32, tcg_gen_ld32u_i64) +TRANS(vpickve2gr_du, LSX, gen_v2g, 
MO_64, tcg_gen_ld_i64) +TRANS(xvpickve2gr_w, LASX, gen_x2g, MO_32, tcg_gen_ld32s_i64) +TRANS(xvpickve2gr_d, LASX, gen_x2g, MO_64, tcg_gen_ld_i64) +TRANS(xvpickve2gr_wu, LASX, gen_x2g, MO_32, tcg_gen_ld32u_i64) +TRANS(xvpickve2gr_du, LASX, gen_x2g, MO_64, tcg_gen_ld_i64) =20 static bool gvec_dup_vl(DisasContext *ctx, arg_vr *a, uint32_t oprsz, MemOp mop) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658545777404.9422445549335; Wed, 13 Sep 2023 19:29:05 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc5k-0004UD-46; Wed, 13 Sep 2023 22:28:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc55-0003OM-3v for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:50 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc50-000565-80 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:46 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxfeuQbwJlw_gmAA--.7460S3; Thu, 14 Sep 2023 10:27:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S54; Thu, 14 Sep 2023 10:27:28 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 52/57] target/loongarch: Implement xvreplve xvinsve0 xvpickve Date: Thu, 14 Sep 2023 10:26:40 +0800 Message-Id: 
<20230914022645.1151356-53-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S54 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658546670100001 Content-Type: text/plain; charset="utf-8" This patch includes: - XVREPLVE.{B/H/W/D}; - XVREPL128VEI.{B/H/W/D}; - XVREPLVE0.{B/H/W/D/Q}; - XVINSVE0.{W/D}; - XVPICKVE.{W/D}; - XVBSLL.V, XVBSRL.V. 
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 5 + target/loongarch/insns.decode | 25 +++ target/loongarch/disas.c | 29 ++++ target/loongarch/vec_helper.c | 28 ++++ target/loongarch/insn_trans/trans_vec.c.inc | 170 +++++++++++++------- 5 files changed, 200 insertions(+), 57 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 85233586e3..fb489dda2d 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -668,6 +668,11 @@ DEF_HELPER_4(vsetallnez_h, void, env, i32, i32, i32) DEF_HELPER_4(vsetallnez_w, void, env, i32, i32, i32) DEF_HELPER_4(vsetallnez_d, void, env, i32, i32, i32) =20 +DEF_HELPER_FLAGS_4(xvinsve0_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvinsve0_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvpickve_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvpickve_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index bb3bb447ae..74383ba3bc 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1987,3 +1987,28 @@ xvreplgr2vr_b 0111 01101001 11110 00000 ..... ...= .. @vr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @vr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @vr xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @vr + +xvreplve_b 0111 01010010 00100 ..... ..... ..... @vvr +xvreplve_h 0111 01010010 00101 ..... ..... ..... @vvr +xvreplve_w 0111 01010010 00110 ..... ..... ..... @vvr +xvreplve_d 0111 01010010 00111 ..... ..... ..... @vvr + +xvrepl128vei_b 0111 01101111 01111 0 .... ..... ..... @vv_ui4 +xvrepl128vei_h 0111 01101111 01111 10 ... ..... ..... 
@vv_ui3 +xvrepl128vei_w 0111 01101111 01111 110 .. ..... ..... @vv_ui2 +xvrepl128vei_d 0111 01101111 01111 1110 . ..... ..... @vv_ui1 + +xvreplve0_b 0111 01110000 01110 00000 ..... ..... @vv +xvreplve0_h 0111 01110000 01111 00000 ..... ..... @vv +xvreplve0_w 0111 01110000 01111 10000 ..... ..... @vv +xvreplve0_d 0111 01110000 01111 11000 ..... ..... @vv +xvreplve0_q 0111 01110000 01111 11100 ..... ..... @vv + +xvinsve0_w 0111 01101111 11111 10 ... ..... ..... @vv_ui3 +xvinsve0_d 0111 01101111 11111 110 .. ..... ..... @vv_ui2 + +xvpickve_w 0111 01110000 00111 10 ... ..... ..... @vv_ui3 +xvpickve_d 0111 01110000 00111 110 .. ..... ..... @vv_ui2 + +xvbsll_v 0111 01101000 11100 ..... ..... ..... @vv_ui5 +xvbsrl_v 0111 01101000 11101 ..... ..... ..... @vv_ui5 diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 04f9f9fa4b..d091402db6 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1748,6 +1748,11 @@ static void output_rv_i_x(DisasContext *ctx, arg_rv_= i *a, const char *mnemonic) output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->vj, a->imm); } =20 +static void output_vvr_x(DisasContext *ctx, arg_vvr *a, const char *mnemon= ic) +{ + output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -2518,3 +2523,27 @@ INSN_LASX(xvreplgr2vr_b, vr) INSN_LASX(xvreplgr2vr_h, vr) INSN_LASX(xvreplgr2vr_w, vr) INSN_LASX(xvreplgr2vr_d, vr) + +INSN_LASX(xvreplve_b, vvr) +INSN_LASX(xvreplve_h, vvr) +INSN_LASX(xvreplve_w, vvr) +INSN_LASX(xvreplve_d, vvr) +INSN_LASX(xvrepl128vei_b, vv_i) +INSN_LASX(xvrepl128vei_h, vv_i) +INSN_LASX(xvrepl128vei_w, vv_i) +INSN_LASX(xvrepl128vei_d, vv_i) + +INSN_LASX(xvreplve0_b, vv) +INSN_LASX(xvreplve0_h, vv) +INSN_LASX(xvreplve0_w, vv) +INSN_LASX(xvreplve0_d, vv) +INSN_LASX(xvreplve0_q, vv) + +INSN_LASX(xvinsve0_w, vv_i) +INSN_LASX(xvinsve0_d, vv_i) + +INSN_LASX(xvpickve_w, vv_i) +INSN_LASX(xvpickve_d, vv_i) + +INSN_LASX(xvbsll_v, 
vv_i) +INSN_LASX(xvbsrl_v, vv_i) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 11793972f5..26d48ed921 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3209,6 +3209,34 @@ SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) SETALLNEZ(vsetallnez_d, MO_64) =20 +#define XVINSVE0(NAME, E, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + Vd->E(imm & MASK) =3D Vj->E(0); \ +} + +XVINSVE0(xvinsve0_w, W, 0x7) +XVINSVE0(xvinsve0_d, D, 0x3) + +#define XVPICKVE(NAME, E, BIT, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ + \ + Vd->E(0) =3D Vj->E(imm & MASK); \ + for (i =3D 1; i < oprsz / (BIT / 8); i++) { \ + Vd->E(i) =3D 0; \ + } \ +} + +XVPICKVE(xvpickve_w, W, 32, 0x7) +XVPICKVE(xvpickve_d, D, 64, 0x3) + #define VPACKEV(NAME, BIT, E) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index e1ba54075e..dbaf3124fc 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -5000,112 +5000,168 @@ static bool trans_vreplvei_d(DisasContext *ctx, a= rg_vv_i *a) return true; } =20 -static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit, - void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long)) +static bool gen_vreplve_vl(DisasContext *ctx, arg_vvr *a, + uint32_t oprsz, int vece, int bit, + void (*func)(TCGv_i64, TCGv_ptr, tcg_target_lon= g)) { + int i; TCGv_i64 t0 =3D tcg_temp_new_i64(); TCGv_ptr t1 =3D tcg_temp_new_ptr(); TCGv_i64 t2 =3D tcg_temp_new_i64(); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, oprsz)) { return true; } =20 - 
tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN/bit) -1); + tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN / bit) - = 1); tcg_gen_shli_i64(t0, t0, vece); if (HOST_BIG_ENDIAN) { - tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN/bit) -1)); + tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN / bit) - 1)); } =20 tcg_gen_trunc_i64_ptr(t1, t0); tcg_gen_add_ptr(t1, t1, cpu_env); - func(t2, t1, vec_full_offset(a->vj)); - tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd), 16, ctx->vl/8, t2); + + for (i =3D 0; i < oprsz; i +=3D 16) { + func(t2, t1, vec_full_offset(a->vj) + i); + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->vd) + i, 16, 16, t2); + } =20 return true; } =20 +static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit, + void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long)) +{ + return gen_vreplve_vl(ctx, a, 16, vece, bit, func); +} + +static bool gen_xvreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit, + void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long)) +{ + return gen_vreplve_vl(ctx, a, 32, vece, bit, func); +} + TRANS(vreplve_b, LSX, gen_vreplve, MO_8, 8, tcg_gen_ld8u_i64) TRANS(vreplve_h, LSX, gen_vreplve, MO_16, 16, tcg_gen_ld16u_i64) TRANS(vreplve_w, LSX, gen_vreplve, MO_32, 32, tcg_gen_ld32u_i64) TRANS(vreplve_d, LSX, gen_vreplve, MO_64, 64, tcg_gen_ld_i64) +TRANS(xvreplve_b, LASX, gen_xvreplve, MO_8, 8, tcg_gen_ld8u_i64) +TRANS(xvreplve_h, LASX, gen_xvreplve, MO_16, 16, tcg_gen_ld16u_i64) +TRANS(xvreplve_w, LASX, gen_xvreplve, MO_32, 32, tcg_gen_ld32u_i64) +TRANS(xvreplve_d, LASX, gen_xvreplve, MO_64, 64, tcg_gen_ld_i64) =20 -static bool trans_vbsll_v(DisasContext *ctx, arg_vv_i *a) +static bool gen_xvrepl128(DisasContext *ctx, arg_vv_i *a, MemOp mop) { - int ofs; - TCGv_i64 desthigh, destlow, high, low; + int index =3D LSX_LEN / (8 * (1 << mop)); =20 - if (!avail_LSX(ctx)) { - return false; - } - - if (!check_vec(ctx, 16)) { + if (!check_vec(ctx, 32)) { return true; } =20 - desthigh =3D tcg_temp_new_i64(); - destlow =3D 
tcg_temp_new_i64(); - high =3D tcg_temp_new_i64(); - low =3D tcg_temp_new_i64(); + tcg_gen_gvec_dup_mem(mop, vec_reg_offset(a->vd, 0, mop), + vec_reg_offset(a->vj, a->imm, mop), 16, 16); + tcg_gen_gvec_dup_mem(mop, vec_reg_offset(a->vd, index, mop), + vec_reg_offset(a->vj, a->imm + index , mop), 16, = 16); + return true; +} =20 - get_vreg64(low, a->vj, 0); +TRANS(xvrepl128vei_b, LASX, gen_xvrepl128, MO_8) +TRANS(xvrepl128vei_h, LASX, gen_xvrepl128, MO_16) +TRANS(xvrepl128vei_w, LASX, gen_xvrepl128, MO_32) +TRANS(xvrepl128vei_d, LASX, gen_xvrepl128, MO_64) =20 - ofs =3D ((a->imm) & 0xf) * 8; - if (ofs < 64) { - get_vreg64(high, a->vj, 1); - tcg_gen_extract2_i64(desthigh, low, high, 64 - ofs); - tcg_gen_shli_i64(destlow, low, ofs); - } else { - tcg_gen_shli_i64(desthigh, low, ofs - 64); - destlow =3D tcg_constant_i64(0); +static bool gen_xvreplve0(DisasContext *ctx, arg_vv *a, MemOp mop) +{ + if (!check_vec(ctx, 32)) { + return true; } =20 - set_vreg64(desthigh, a->vd, 1); - set_vreg64(destlow, a->vd, 0); - + tcg_gen_gvec_dup_mem(mop, vec_full_offset(a->vd), + vec_full_offset(a->vj), 32, 32); return true; } =20 -static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a) +TRANS(xvreplve0_b, LASX, gen_xvreplve0, MO_8) +TRANS(xvreplve0_h, LASX, gen_xvreplve0, MO_16) +TRANS(xvreplve0_w, LASX, gen_xvreplve0, MO_32) +TRANS(xvreplve0_d, LASX, gen_xvreplve0, MO_64) +TRANS(xvreplve0_q, LASX, gen_xvreplve0, MO_128) + +TRANS(xvinsve0_w, LASX, gen_xx_i, gen_helper_xvinsve0_w) +TRANS(xvinsve0_d, LASX, gen_xx_i, gen_helper_xvinsve0_d) + +TRANS(xvpickve_w, LASX, gen_xx_i, gen_helper_xvpickve_w) +TRANS(xvpickve_d, LASX, gen_xx_i, gen_helper_xvpickve_d) + +static bool do_vbsll_v(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) { - TCGv_i64 desthigh, destlow, high, low; - int ofs; + int i, ofs; =20 - if (!avail_LSX(ctx)) { - return false; + if (!check_vec(ctx, oprsz)) { + return true; } =20 - if (!check_vec(ctx, 16)) { - return true; + for (i =3D 0; i < oprsz / 16; i++) { + TCGv 
desthigh =3D tcg_temp_new_i64(); + TCGv destlow =3D tcg_temp_new_i64(); + TCGv high =3D tcg_temp_new_i64(); + TCGv low =3D tcg_temp_new_i64(); + + get_vreg64(low, a->vj, 2 * i); + + ofs =3D ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_vreg64(high, a->vj, 2 * i + 1); + tcg_gen_extract2_i64(desthigh, low, high, 64 - ofs); + tcg_gen_shli_i64(destlow, low, ofs); + } else { + tcg_gen_shli_i64(desthigh, low, ofs - 64); + destlow =3D tcg_constant_i64(0); + } + set_vreg64(desthigh, a->vd, 2 * i + 1); + set_vreg64(destlow, a->vd, 2 * i); } =20 - desthigh =3D tcg_temp_new_i64(); - destlow =3D tcg_temp_new_i64(); - high =3D tcg_temp_new_i64(); - low =3D tcg_temp_new_i64(); + return true; +} =20 - get_vreg64(high, a->vj, 1); +static bool do_vbsrl_v(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz) +{ + int i, ofs; =20 - ofs =3D ((a->imm) & 0xf) * 8; - if (ofs < 64) { - get_vreg64(low, a->vj, 0); - tcg_gen_extract2_i64(destlow, low, high, ofs); - tcg_gen_shri_i64(desthigh, high, ofs); - } else { - tcg_gen_shri_i64(destlow, high, ofs - 64); - desthigh =3D tcg_constant_i64(0); + if (!check_vec(ctx, oprsz)) { + return true; } =20 - set_vreg64(desthigh, a->vd, 1); - set_vreg64(destlow, a->vd, 0); + for (i =3D 0; i < oprsz / 16; i++) { + TCGv desthigh =3D tcg_temp_new_i64(); + TCGv destlow =3D tcg_temp_new_i64(); + TCGv high =3D tcg_temp_new_i64(); + TCGv low =3D tcg_temp_new_i64(); + get_vreg64(high, a->vj, 2 * i + 1); + + ofs =3D ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_vreg64(low, a->vj, 2 * i); + tcg_gen_extract2_i64(destlow, low, high, ofs); + tcg_gen_shri_i64(desthigh, high, ofs); + } else { + tcg_gen_shri_i64(destlow, high, ofs - 64); + desthigh =3D tcg_constant_i64(0); + } + set_vreg64(desthigh, a->vd, 2 * i + 1); + set_vreg64(destlow, a->vd, 2 * i); + } =20 return true; } =20 +TRANS(vbsll_v, LSX, do_vbsll_v, 16) +TRANS(vbsrl_v, LSX, do_vbsrl_v, 16) +TRANS(xvbsll_v, LASX, do_vbsll_v, 32) +TRANS(xvbsrl_v, LASX, do_vbsrl_v, 32) + TRANS(vpackev_b, LSX, gen_vvv,
gen_helper_vpackev_b) TRANS(vpackev_h, LSX, gen_vvv, gen_helper_vpackev_h) TRANS(vpackev_w, LSX, gen_vvv, gen_helper_vpackev_w) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694659023663317.68895086010116; Wed, 13 Sep 2023 19:37:03 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6N-0006wh-3s; Wed, 13 Sep 2023 22:29:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc53-0003Ko-Ao for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc50-00056K-9p for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:45 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dx_7uRbwJlxPgmAA--.16321S3; Thu, 14 Sep 2023 10:27:29 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S55; Thu, 14 Sep 2023 10:27:28 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 53/57] target/loongarch: Implement xvpack xvpick xvilv{l/h} Date: Thu, 14 Sep 2023 10:26:41 +0800 Message-Id: <20230914022645.1151356-54-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S55 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694659024177100011 Content-Type: text/plain; charset="utf-8" This patch includes: - XVPACK{EV/OD}.{B/H/W/D}; - XVPICK{EV/OD}.{B/H/W/D}; - XVILV{L/H}.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 27 ++++ target/loongarch/disas.c | 27 ++++ target/loongarch/vec_helper.c | 138 +++++++++++--------- target/loongarch/insn_trans/trans_vec.c.inc | 24 ++++ 4 files changed, 156 insertions(+), 60 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 74383ba3bc..a325b861c1 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -2012,3 +2012,30 @@ xvpickve_d 0111 01110000 00111 110 .. ..... ..= ... @vv_ui2 =20 xvbsll_v 0111 01101000 11100 ..... ..... ..... @vv_ui5 xvbsrl_v 0111 01101000 11101 ..... ..... ..... @vv_ui5 + +xvpackev_b 0111 01010001 01100 ..... ..... ..... @vvv +xvpackev_h 0111 01010001 01101 ..... 
..... ..... @vvv +xvpackev_w 0111 01010001 01110 ..... ..... ..... @vvv +xvpackev_d 0111 01010001 01111 ..... ..... ..... @vvv +xvpackod_b 0111 01010001 10000 ..... ..... ..... @vvv +xvpackod_h 0111 01010001 10001 ..... ..... ..... @vvv +xvpackod_w 0111 01010001 10010 ..... ..... ..... @vvv +xvpackod_d 0111 01010001 10011 ..... ..... ..... @vvv + +xvpickev_b 0111 01010001 11100 ..... ..... ..... @vvv +xvpickev_h 0111 01010001 11101 ..... ..... ..... @vvv +xvpickev_w 0111 01010001 11110 ..... ..... ..... @vvv +xvpickev_d 0111 01010001 11111 ..... ..... ..... @vvv +xvpickod_b 0111 01010010 00000 ..... ..... ..... @vvv +xvpickod_h 0111 01010010 00001 ..... ..... ..... @vvv +xvpickod_w 0111 01010010 00010 ..... ..... ..... @vvv +xvpickod_d 0111 01010010 00011 ..... ..... ..... @vvv + +xvilvl_b 0111 01010001 10100 ..... ..... ..... @vvv +xvilvl_h 0111 01010001 10101 ..... ..... ..... @vvv +xvilvl_w 0111 01010001 10110 ..... ..... ..... @vvv +xvilvl_d 0111 01010001 10111 ..... ..... ..... @vvv +xvilvh_b 0111 01010001 11000 ..... ..... ..... @vvv +xvilvh_h 0111 01010001 11001 ..... ..... ..... @vvv +xvilvh_w 0111 01010001 11010 ..... ..... ..... @vvv +xvilvh_d 0111 01010001 11011 ..... ..... ..... 
@vvv diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d091402db6..74ae916a10 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2547,3 +2547,30 @@ INSN_LASX(xvpickve_d, vv_i) =20 INSN_LASX(xvbsll_v, vv_i) INSN_LASX(xvbsrl_v, vv_i) + +INSN_LASX(xvpackev_b, vvv) +INSN_LASX(xvpackev_h, vvv) +INSN_LASX(xvpackev_w, vvv) +INSN_LASX(xvpackev_d, vvv) +INSN_LASX(xvpackod_b, vvv) +INSN_LASX(xvpackod_h, vvv) +INSN_LASX(xvpackod_w, vvv) +INSN_LASX(xvpackod_d, vvv) + +INSN_LASX(xvpickev_b, vvv) +INSN_LASX(xvpickev_h, vvv) +INSN_LASX(xvpickev_w, vvv) +INSN_LASX(xvpickev_d, vvv) +INSN_LASX(xvpickod_b, vvv) +INSN_LASX(xvpickod_h, vvv) +INSN_LASX(xvpickod_w, vvv) +INSN_LASX(xvpickod_d, vvv) + +INSN_LASX(xvilvl_b, vvv) +INSN_LASX(xvilvl_h, vvv) +INSN_LASX(xvilvl_w, vvv) +INSN_LASX(xvilvl_d, vvv) +INSN_LASX(xvilvh_b, vvv) +INSN_LASX(xvilvh_h, vvv) +INSN_LASX(xvilvh_w, vvv) +INSN_LASX(xvilvh_d, vvv) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 26d48ed921..2bbaee628b 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3241,12 +3241,13 @@ XVPICKVE(xvpickve_d, D, 64, 0x3) void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg temp; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ temp.E(2 * i + 1) =3D Vj->E(2 * i); \ temp.E(2 *i) =3D Vk->E(2 * i); \ } \ @@ -3262,12 +3263,13 @@ VPACKEV(vpackev_d, 128, D) void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg temp; \ + VReg temp =3D {}; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { \ temp.E(2 * i + 1) =3D Vj->E(2 * i + 1); \ 
temp.E(2 * i) =3D Vk->E(2 * i + 1); \ } \ @@ -3279,20 +3281,24 @@ VPACKOD(vpackod_h, 32, H) VPACKOD(vpackod_w, 64, W) VPACKOD(vpackod_d, 128, D) =20 -#define VPICKEV(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i); \ - temp.E(i) =3D Vk->E(2 * i); \ - } \ - *Vd =3D temp; \ +#define VPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + temp.E(j + ofs * (2 * i + 1)) =3D Vj->E(2 * (j + ofs * i)); \ + temp.E(j + ofs * 2 * i) =3D Vk->E(2 * (j + ofs * i)); \ + } \ + } \ + *Vd =3D temp; \ } =20 VPICKEV(vpickev_b, 16, B) @@ -3300,20 +3306,24 @@ VPICKEV(vpickev_h, 32, H) VPICKEV(vpickev_w, 64, W) VPICKEV(vpickev_d, 128, D) =20 -#define VPICKOD(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) =3D Vj->E(2 * i + 1); \ - temp.E(i) =3D Vk->E(2 * i + 1); \ - } \ - *Vd =3D temp; \ +#define VPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < ofs; j++) { = \ + temp.E(j + ofs * (2 * i + 1)) =3D Vj->E(2 * (j + ofs * 
i) + 1)= ; \ + temp.E(j + ofs * 2 * i) =3D Vk->E(2 * (j + ofs * i) + 1); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 VPICKOD(vpickod_b, 16, B) @@ -3321,20 +3331,24 @@ VPICKOD(vpickod_h, 32, H) VPICKOD(vpickod_w, 64, W) VPICKOD(vpickod_d, 128, D) =20 -#define VILVL(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i); \ - temp.E(2 * i) =3D Vk->E(i); \ - } \ - *Vd =3D temp; \ +#define VILVL(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + ofs =3D LSX_LEN / BIT; \ + for (i =3D 0; i < oprsz / 16; i++) { \ + for (j =3D 0; j < ofs; j++) { \ + temp.E(2 * (j + ofs * i) + 1) =3D Vj->E(j + ofs * 2 * i); \ + temp.E(2 * (j + ofs * i)) =3D Vk->E(j + ofs * 2 * i); \ + } \ + } \ + *Vd =3D temp; \ } =20 VILVL(vilvl_b, 16, B) @@ -3342,20 +3356,24 @@ VILVL(vilvl_h, 32, H) VILVL(vilvl_w, 64, W) VILVL(vilvl_d, 128, D) =20 -#define VILVH(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) =3D Vj->E(i + LSX_LEN/BIT); \ - temp.E(2 * i) =3D Vk->E(i + LSX_LEN/BIT); \ - } \ - *Vd =3D temp; \ +#define VILVH(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, ofs; \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + VReg *Vk =3D (VReg *)vk; = \ + int oprsz =3D simd_oprsz(desc); = \ + \ + ofs =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / 16; i++) { = \ + for (j =3D 0; j < 
ofs; j++) { = \ + temp.E(2 * (j + ofs * i) + 1) =3D Vj->E(j + ofs * (2 * i + 1))= ; \ + temp.E(2 * (j + ofs * i)) =3D Vk->E(j + ofs * (2 * i + 1)); = \ + } \ + } \ + *Vd =3D temp; = \ } =20 VILVH(vilvh_b, 16, B) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index dbaf3124fc..2b55ce4464 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -5170,6 +5170,14 @@ TRANS(vpackod_b, LSX, gen_vvv, gen_helper_vpackod_b) TRANS(vpackod_h, LSX, gen_vvv, gen_helper_vpackod_h) TRANS(vpackod_w, LSX, gen_vvv, gen_helper_vpackod_w) TRANS(vpackod_d, LSX, gen_vvv, gen_helper_vpackod_d) +TRANS(xvpackev_b, LASX, gen_xxx, gen_helper_vpackev_b) +TRANS(xvpackev_h, LASX, gen_xxx, gen_helper_vpackev_h) +TRANS(xvpackev_w, LASX, gen_xxx, gen_helper_vpackev_w) +TRANS(xvpackev_d, LASX, gen_xxx, gen_helper_vpackev_d) +TRANS(xvpackod_b, LASX, gen_xxx, gen_helper_vpackod_b) +TRANS(xvpackod_h, LASX, gen_xxx, gen_helper_vpackod_h) +TRANS(xvpackod_w, LASX, gen_xxx, gen_helper_vpackod_w) +TRANS(xvpackod_d, LASX, gen_xxx, gen_helper_vpackod_d) =20 TRANS(vpickev_b, LSX, gen_vvv, gen_helper_vpickev_b) TRANS(vpickev_h, LSX, gen_vvv, gen_helper_vpickev_h) @@ -5179,6 +5187,14 @@ TRANS(vpickod_b, LSX, gen_vvv, gen_helper_vpickod_b) TRANS(vpickod_h, LSX, gen_vvv, gen_helper_vpickod_h) TRANS(vpickod_w, LSX, gen_vvv, gen_helper_vpickod_w) TRANS(vpickod_d, LSX, gen_vvv, gen_helper_vpickod_d) +TRANS(xvpickev_b, LASX, gen_xxx, gen_helper_vpickev_b) +TRANS(xvpickev_h, LASX, gen_xxx, gen_helper_vpickev_h) +TRANS(xvpickev_w, LASX, gen_xxx, gen_helper_vpickev_w) +TRANS(xvpickev_d, LASX, gen_xxx, gen_helper_vpickev_d) +TRANS(xvpickod_b, LASX, gen_xxx, gen_helper_vpickod_b) +TRANS(xvpickod_h, LASX, gen_xxx, gen_helper_vpickod_h) +TRANS(xvpickod_w, LASX, gen_xxx, gen_helper_vpickod_w) +TRANS(xvpickod_d, LASX, gen_xxx, gen_helper_vpickod_d) =20 TRANS(vilvl_b, LSX, gen_vvv, gen_helper_vilvl_b) TRANS(vilvl_h, 
LSX, gen_vvv, gen_helper_vilvl_h) @@ -5188,6 +5204,14 @@ TRANS(vilvh_b, LSX, gen_vvv, gen_helper_vilvh_b) TRANS(vilvh_h, LSX, gen_vvv, gen_helper_vilvh_h) TRANS(vilvh_w, LSX, gen_vvv, gen_helper_vilvh_w) TRANS(vilvh_d, LSX, gen_vvv, gen_helper_vilvh_d) +TRANS(xvilvl_b, LASX, gen_xxx, gen_helper_vilvl_b) +TRANS(xvilvl_h, LASX, gen_xxx, gen_helper_vilvl_h) +TRANS(xvilvl_w, LASX, gen_xxx, gen_helper_vilvl_w) +TRANS(xvilvl_d, LASX, gen_xxx, gen_helper_vilvl_d) +TRANS(xvilvh_b, LASX, gen_xxx, gen_helper_vilvh_b) +TRANS(xvilvh_h, LASX, gen_xxx, gen_helper_vilvh_h) +TRANS(xvilvh_w, LASX, gen_xxx, gen_helper_vilvh_w) +TRANS(xvilvh_d, LASX, gen_xxx, gen_helper_vilvh_d) =20 TRANS(vshuf_b, LSX, gen_vvvv, gen_helper_vshuf_b) TRANS(vshuf_h, LSX, gen_vvv, gen_helper_vshuf_h) --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 169465873838622.53380661915014; Wed, 13 Sep 2023 19:32:18 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6H-0005yi-9u; Wed, 13 Sep 2023 22:29:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc55-0003OP-4r for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:50 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc51-00056h-4h for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:27:46 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxuOiSbwJlxvgmAA--.39695S3; Thu, 14 Sep 2023 10:27:30 +0800 (CST) Received: from 
localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S56; Thu, 14 Sep 2023 10:27:29 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 54/57] target/loongarch: Implement xvshuf xvperm{i} xvshuf4i Date: Thu, 14 Sep 2023 10:26:42 +0800 Message-Id: <20230914022645.1151356-55-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S56 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658740550100007 Content-Type: text/plain; charset="utf-8" This patch includes: - XVSHUF.{B/H/W/D}; - XVPERM.W; - XVSHUF4i.{B/H/W/D}; - XVPERMI.{W/D/Q}; - XVEXTRINS.{B/H/W/D}. 
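Note for readers of the series: the 256-bit forms in this patch apply the existing 128-bit index arithmetic once per 128-bit lane. The standalone sketch below models that for XVSHUF4I.W only. It is an illustration under stated assumptions and not part of the patch: `shf_pos` mirrors the `SHF_POS` macro that the patch adds, while `model_xvshuf4i_w` and its plain `uint32_t[8]` arrays are hypothetical stand-ins for the `VReg` accessors used by the real helper.

```c
#include <assert.h>
#include <stdint.h>

/* Same index formula as the SHF_POS macro in the patch: keep the group of
 * four that element i belongs to, then pick one of its four members using
 * the 2-bit field of imm that corresponds to position (i & 3). */
static int shf_pos(int i, unsigned imm)
{
    return (i & 0xfc) + ((imm >> (2 * (i & 0x03))) & 0x03);
}

/* Hypothetical model of XVSHUF4I.W on a 256-bit register viewed as eight
 * 32-bit words. Each 128-bit lane (four words) is shuffled independently
 * with the same immediate, so no element crosses a lane boundary. */
static void model_xvshuf4i_w(uint32_t dst[8], const uint32_t src[8],
                             unsigned imm)
{
    const int per_lane = 4;   /* 128 bits / 32 bits per element */
    uint32_t tmp[8];          /* temporary, so dst may alias src */
    int i;

    assert(imm <= 0xff);      /* the instruction carries an 8-bit immediate */

    for (i = 0; i < 8; i++) {
        int lane = i / per_lane;            /* 0 = low lane, 1 = high lane */
        int in_lane = i - lane * per_lane;  /* position inside the lane */
        tmp[i] = src[shf_pos(in_lane, imm) + lane * per_lane];
    }
    for (i = 0; i < 8; i++) {
        dst[i] = tmp[i];
    }
}
```

With imm = 0x1b the four 2-bit fields select 3, 2, 1, 0, so each lane is reversed in place while the two lanes stay separate.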
Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/helper.h | 3 + target/loongarch/insns.decode | 21 +++ target/loongarch/disas.c | 21 +++ target/loongarch/vec_helper.c | 146 ++++++++++++++------ target/loongarch/insn_trans/trans_vec.c.inc | 30 +++- 5 files changed, 175 insertions(+), 46 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index fb489dda2d..b3b64a0215 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -709,7 +709,10 @@ DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, p= tr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 +DEF_HELPER_FLAGS_4(vperm_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vpermi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vpermi_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index a325b861c1..64b67ee9ac 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -2039,3 +2039,24 @@ xvilvh_b 0111 01010001 11000 ..... ..... ...= .. @vvv xvilvh_h 0111 01010001 11001 ..... ..... ..... @vvv xvilvh_w 0111 01010001 11010 ..... ..... ..... @vvv xvilvh_d 0111 01010001 11011 ..... ..... ..... @vvv + +xvshuf_b 0000 11010110 ..... ..... ..... ..... @vvvv +xvshuf_h 0111 01010111 10101 ..... ..... ..... @vvv +xvshuf_w 0111 01010111 10110 ..... ..... ..... @vvv +xvshuf_d 0111 01010111 10111 ..... ..... ..... @vvv + +xvperm_w 0111 01010111 11010 ..... ..... ..... @vvv + +xvshuf4i_b 0111 01111001 00 ........ ..... ..... @vv_ui8 +xvshuf4i_h 0111 01111001 01 ........ ..... ..... 
@vv_ui8 +xvshuf4i_w 0111 01111001 10 ........ ..... ..... @vv_ui8 +xvshuf4i_d 0111 01111001 11 ........ ..... ..... @vv_ui8 + +xvpermi_w 0111 01111110 01 ........ ..... ..... @vv_ui8 +xvpermi_d 0111 01111110 10 ........ ..... ..... @vv_ui8 +xvpermi_q 0111 01111110 11 ........ ..... ..... @vv_ui8 + +xvextrins_d 0111 01111000 00 ........ ..... ..... @vv_ui8 +xvextrins_w 0111 01111000 01 ........ ..... ..... @vv_ui8 +xvextrins_h 0111 01111000 10 ........ ..... ..... @vv_ui8 +xvextrins_b 0111 01111000 11 ........ ..... ..... @vv_ui8 diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 74ae916a10..1ec8e21e01 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2574,3 +2574,24 @@ INSN_LASX(xvilvh_b, vvv) INSN_LASX(xvilvh_h, vvv) INSN_LASX(xvilvh_w, vvv) INSN_LASX(xvilvh_d, vvv) + +INSN_LASX(xvshuf_b, vvvv) +INSN_LASX(xvshuf_h, vvv) +INSN_LASX(xvshuf_w, vvv) +INSN_LASX(xvshuf_d, vvv) + +INSN_LASX(xvperm_w, vvv) + +INSN_LASX(xvshuf4i_b, vv_i) +INSN_LASX(xvshuf4i_h, vv_i) +INSN_LASX(xvshuf4i_w, vv_i) +INSN_LASX(xvshuf4i_d, vv_i) + +INSN_LASX(xvpermi_w, vv_i) +INSN_LASX(xvpermi_d, vv_i) +INSN_LASX(xvpermi_q, vv_i) + +INSN_LASX(xvextrins_d, vv_i) +INSN_LASX(xvextrins_w, vv_i) +INSN_LASX(xvextrins_h, vv_i) +INSN_LASX(xvextrins_b, vv_i) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 2bbaee628b..6b61a5c447 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -3381,57 +3381,65 @@ VILVH(vilvh_h, 32, H) VILVH(vilvh_w, 64, W) VILVH(vilvh_d, 128, D) =20 +#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0= x03)) + void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) { - int i, m; - VReg temp; + int i, j, m; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; VReg *Vk =3D (VReg *)vk; VReg *Va =3D (VReg *)va; + int oprsz =3D simd_oprsz(desc); =20 - m =3D LSX_LEN/8; - for (i =3D 0; i < m ; i++) { + m =3D LSX_LEN / 8; + for 
(i =3D 0; i < (oprsz / 16) * m; i++) { + j =3D i < m ? 0 : 1; uint64_t k =3D (uint8_t)Va->B(i) % (2 * m); - temp.B(i) =3D k < m ? Vk->B(k) : Vj->B(k - m); + temp.B(i) =3D k < m ? Vk->B(k + j * m): Vj->B(k + (j - 1) * m); } *Vd =3D temp; } =20 -#define VSHUF(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ -{ \ - int i, m; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - VReg *Vk =3D (VReg *)vk; \ - \ - m =3D LSX_LEN/BIT; \ - for (i =3D 0; i < m; i++) { \ - uint64_t k =3D ((uint8_t) Vd->E(i)) % (2 * m); \ - temp.E(i) =3D k < m ? Vk->E(k) : Vj->E(k - m); \ - } \ - *Vd =3D temp; \ +#define VSHUF(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, j, m; \ + VReg temp =3D {}; \ + VReg *Vd =3D (VReg *)vd; \ + VReg *Vj =3D (VReg *)vj; \ + VReg *Vk =3D (VReg *)vk; \ + int oprsz =3D simd_oprsz(desc); \ + \ + m =3D LSX_LEN / BIT; \ + for (i =3D 0; i < (oprsz / 16) * m; i++) { \ + j =3D i < m ? 0 : 1; \ + uint64_t k =3D ((uint8_t)Vd->E(i)) % (2 * m); \ + temp.E(i) =3D k < m ? Vk->E(k + j * m) : Vj->E(k + (j - 1) * m); \ + } \ + *Vd =3D temp; \ } =20 VSHUF(vshuf_h, 16, H) VSHUF(vshuf_w, 32, W) VSHUF(vshuf_d, 64, D) =20 -#define VSHUF4I(NAME, BIT, E) \ -void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd =3D (VReg *)vd; \ - VReg *Vj =3D (VReg *)vj; \ - \ - for (i =3D 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i) =3D Vj->E(((i) & 0xfc) + (((imm) >> \ - (2 * ((i) & 0x03))) & 0x03)); \ - } \ - *Vd =3D temp; \ +#define VSHUF4I(NAME, BIT, E) = \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) = \ +{ = \ + int i, j, max; = \ + VReg temp =3D {}; = \ + VReg *Vd =3D (VReg *)vd; = \ + VReg *Vj =3D (VReg *)vj; = \ + int oprsz =3D simd_oprsz(desc); = \ + = \ + max =3D LSX_LEN / BIT; = \ + for (i =3D 0; i < oprsz / (BIT / 8); i++) { = \ + j =3D i < max ? 
1 : 2; = \ + temp.E(i) =3D Vj->E(SHF_POS(i - ((j -1)* max), imm) + (j - 1) * ma= x); \ + } = \ + *Vd =3D temp; = \ } =20 VSHUF4I(vshuf4i_b, 8, B) @@ -3440,38 +3448,92 @@ VSHUF4I(vshuf4i_w, 32, W) =20 void HELPER(vshuf4i_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) { + int i; + VReg temp =3D {}; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); =20 - VReg temp; - temp.D(0) =3D (imm & 2 ? Vj : Vd)->D(imm & 1); - temp.D(1) =3D (imm & 8 ? Vj : Vd)->D((imm >> 2) & 1); + for (i =3D 0; i < oprsz / 16; i++) { + temp.D(2 * i) =3D (imm & 2 ? Vj : Vd)->D((imm & 1) + 2 * i); + temp.D(2 * i + 1) =3D (imm & 8 ? Vj : Vd)->D(((imm >> 2) & 1) + 2 = * i); + } + *Vd =3D temp; +} + +void HELPER(vperm_w)(void *vd, void *vj, void *vk, uint32_t desc) +{ + int i, m; + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + VReg *Vk =3D (VReg *)vk; + + m =3D LASX_LEN / 32; + for (i =3D 0; i < m ; i++) { + uint64_t k =3D (uint8_t)Vk->W(i) % 8; + temp.W(i) =3D Vj->W(k); + } *Vd =3D temp; } =20 void HELPER(vpermi_w)(void *vd, void *vj, uint64_t imm, uint32_t desc) { + int i; + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + int oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz / 16; i++) { + temp.W(4 * i) =3D Vj->W((imm & 0x3) + 4 * i); + temp.W(4 * i + 1) =3D Vj->W(((imm >> 2) & 0x3) + 4 * i); + temp.W(4 * i + 2) =3D Vd->W(((imm >> 4) & 0x3) + 4 * i); + temp.W(4 * i + 3) =3D Vd->W(((imm >> 6) & 0x3) + 4 * i); + } + *Vd =3D temp; +} + +void HELPER(vpermi_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + VReg temp =3D {}; + VReg *Vd =3D (VReg *)vd; + VReg *Vj =3D (VReg *)vj; + + temp.D(0) =3D Vj->D(imm & 0x3); + temp.D(1) =3D Vj->D((imm >> 2) & 0x3); + temp.D(2) =3D Vj->D((imm >> 4) & 0x3); + temp.D(3) =3D Vj->D((imm >> 6) & 0x3); + *Vd =3D temp; +} + +void HELPER(vpermi_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) +{ + int i; VReg temp; VReg *Vd =3D (VReg *)vd; VReg *Vj =3D (VReg *)vj; =20 
- temp.W(0) =3D Vj->W(imm & 0x3); - temp.W(1) =3D Vj->W((imm >> 2) & 0x3); - temp.W(2) =3D Vd->W((imm >> 4) & 0x3); - temp.W(3) =3D Vd->W((imm >> 6) & 0x3); + for (i =3D 0; i < 2; i++, imm >>=3D 4) { + temp.Q(i) =3D (imm & 2 ? Vd: Vj)->Q(imm & 1); + } *Vd =3D temp; } =20 #define VEXTRINS(NAME, BIT, E, MASK) \ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ - int ins, extr; \ + int i, ins, extr, max; \ VReg *Vd =3D (VReg *)vd; \ VReg *Vj =3D (VReg *)vj; \ + int oprsz =3D simd_oprsz(desc); \ \ + max =3D LSX_LEN / BIT; \ ins =3D (imm >> 4) & MASK; \ extr =3D imm & MASK; \ - Vd->E(ins) =3D Vj->E(extr); \ + for (i =3D 0; i < oprsz / 16; i++) { \ + Vd->E(ins + i * max) =3D Vj->E(extr + i * max); \ + } \ } =20 VEXTRINS(vextrins_b, 8, B, 0xf) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 2b55ce4464..3d0b8bfb74 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -61,6 +61,10 @@ static bool gen_xxxx_ptr(DisasContext *ctx, arg_vvvv *a, static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, gen_helper_gvec_4 *fn) { + if (!check_vec(ctx, oprsz)) { + return true; + } + tcg_gen_gvec_4_ool(vec_full_offset(a->vd), vec_full_offset(a->vj), vec_full_offset(a->vk), @@ -72,13 +76,15 @@ static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a,= uint32_t oprsz, static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4 *fn) { - if (!check_vec(ctx, 16)) { - return true; - } - return gen_vvvv_vl(ctx, a, 16, fn); } =20 +static bool gen_xxxx(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4 *fn) +{ + return gen_vvvv_vl(ctx, a, 32, fn); +} + static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, gen_helper_gvec_3_ptr *fn) { @@ -5217,17 +5223,33 @@ TRANS(vshuf_b, LSX, gen_vvvv, gen_helper_vshuf_b) TRANS(vshuf_h, LSX, gen_vvv, gen_helper_vshuf_h) TRANS(vshuf_w, LSX, gen_vvv, gen_helper_vshuf_w) 
TRANS(vshuf_d, LSX, gen_vvv, gen_helper_vshuf_d) +TRANS(xvshuf_b, LASX, gen_xxxx, gen_helper_vshuf_b) +TRANS(xvshuf_h, LASX, gen_xxx, gen_helper_vshuf_h) +TRANS(xvshuf_w, LASX, gen_xxx, gen_helper_vshuf_w) +TRANS(xvshuf_d, LASX, gen_xxx, gen_helper_vshuf_d) TRANS(vshuf4i_b, LSX, gen_vv_i, gen_helper_vshuf4i_b) TRANS(vshuf4i_h, LSX, gen_vv_i, gen_helper_vshuf4i_h) TRANS(vshuf4i_w, LSX, gen_vv_i, gen_helper_vshuf4i_w) TRANS(vshuf4i_d, LSX, gen_vv_i, gen_helper_vshuf4i_d) +TRANS(xvshuf4i_b, LASX, gen_xx_i, gen_helper_vshuf4i_b) +TRANS(xvshuf4i_h, LASX, gen_xx_i, gen_helper_vshuf4i_h) +TRANS(xvshuf4i_w, LASX, gen_xx_i, gen_helper_vshuf4i_w) +TRANS(xvshuf4i_d, LASX, gen_xx_i, gen_helper_vshuf4i_d) =20 +TRANS(xvperm_w, LASX, gen_xxx, gen_helper_vperm_w) TRANS(vpermi_w, LSX, gen_vv_i, gen_helper_vpermi_w) +TRANS(xvpermi_w, LASX, gen_xx_i, gen_helper_vpermi_w) +TRANS(xvpermi_d, LASX, gen_xx_i, gen_helper_vpermi_d) +TRANS(xvpermi_q, LASX, gen_xx_i, gen_helper_vpermi_q) =20 TRANS(vextrins_b, LSX, gen_vv_i, gen_helper_vextrins_b) TRANS(vextrins_h, LSX, gen_vv_i, gen_helper_vextrins_h) TRANS(vextrins_w, LSX, gen_vv_i, gen_helper_vextrins_w) TRANS(vextrins_d, LSX, gen_vv_i, gen_helper_vextrins_d) +TRANS(xvextrins_b, LASX, gen_xx_i, gen_helper_vextrins_b) +TRANS(xvextrins_h, LASX, gen_xx_i, gen_helper_vextrins_h) +TRANS(xvextrins_w, LASX, gen_xx_i, gen_helper_vextrins_w) +TRANS(xvextrins_d, LASX, gen_xx_i, gen_helper_vextrins_d) =20 static bool trans_vld(DisasContext *ctx, arg_vr_i *a) { --=20 2.39.1 From nobody Wed May 15 16:20:20 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694658941736537.6623213286243; Wed, 13 Sep 2023 19:35:41 -0700 (PDT) Received: from localhost ([::1] 
helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6l-0002ac-7g; Wed, 13 Sep 2023 22:29:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qgc6j-0002NL-AP for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:29 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qgc6g-0005No-70 for qemu-devel@nongnu.org; Wed, 13 Sep 2023 22:29:29 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bx5fCTbwJlyPgmAA--.10238S3; Thu, 14 Sep 2023 10:27:31 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8CxvdxmbwJlJ+UDAA--.7298S57; Thu, 14 Sep 2023 10:27:30 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v6 55/57] target/loongarch: Implement xvld xvst Date: Thu, 14 Sep 2023 10:26:43 +0800 Message-Id: <20230914022645.1151356-56-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn> References: <20230914022645.1151356-1-gaosong@loongson.cn> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: AQAAf8CxvdxmbwJlJ+UDAA--.7298S57 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - 
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZM-MESSAGEID: 1694658943644100003 Content-Type: text/plain; charset="utf-8" This patch includes: - XVLD[X], XVST[X]; - XVLDREPL.{B/H/W/D}; - XVSTELM.{B/H/W/D}. Signed-off-by: Song Gao Reviewed-by: Richard Henderson --- target/loongarch/insns.decode | 18 ++ target/loongarch/disas.c | 24 +++ target/loongarch/insn_trans/trans_vec.c.inc | 212 ++++++++++++++------ 3 files changed, 194 insertions(+), 60 deletions(-) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 64b67ee9ac..64b308f9fb 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -550,6 +550,10 @@ dbcl 0000 00000010 10101 ............... = @i15 @vr_i8i2 .... ........ imm2:2 ........ rj:5 vd:5 &vr_ii imm=3D%i8s2 @vr_i8i3 .... ....... imm2:3 ........ rj:5 vd:5 &vr_ii imm=3D%i8s1 @vr_i8i4 .... ...... imm2:4 imm:s8 rj:5 vd:5 &vr_ii +@vr_i8i2x .... ........ imm2:2 ........ rj:5 vd:5 &vr_ii imm=3D%i8s3 +@vr_i8i3x .... ....... imm2:3 ........ rj:5 vd:5 &vr_ii imm=3D%i8s2 +@vr_i8i4x .... ...... imm2:4 ........ rj:5 vd:5 &vr_ii imm=3D%i8s1 +@vr_i8i5x .... ..... imm2:5 imm:s8 rj:5 vd:5 &vr_ii @vrr .... ........ ..... rk:5 rj:5 vd:5 &vrr @v_i13 .... ........ .. imm:13 vd:5 &v_i =20 @@ -2060,3 +2064,17 @@ xvextrins_d 0111 01111000 00 ........ ..... ...= .. @vv_ui8 xvextrins_w 0111 01111000 01 ........ ..... ..... @vv_ui8 xvextrins_h 0111 01111000 10 ........ ..... ..... @vv_ui8 xvextrins_b 0111 01111000 11 ........ ..... ..... @vv_ui8 + +xvld 0010 110010 ............ ..... ..... @vr_i12 +xvst 0010 110011 ............ ..... ..... 
@vr_i12 +xvldx 0011 10000100 10000 ..... ..... ..... @vrr +xvstx 0011 10000100 11000 ..... ..... ..... @vrr + +xvldrepl_d 0011 00100001 0 ......... ..... ..... @vr_i9 +xvldrepl_w 0011 00100010 .......... ..... ..... @vr_i10 +xvldrepl_h 0011 0010010 ........... ..... ..... @vr_i11 +xvldrepl_b 0011 001010 ............ ..... ..... @vr_i12 +xvstelm_d 0011 00110001 .. ........ ..... ..... @vr_i8i2x +xvstelm_w 0011 0011001 ... ........ ..... ..... @vr_i8i3x +xvstelm_h 0011 001101 .... ........ ..... ..... @vr_i8i4x +xvstelm_b 0011 00111 ..... ........ ..... ..... @vr_i8i5x diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1ec8e21e01..c8a29eac2b 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1753,6 +1753,16 @@ static void output_vvr_x(DisasContext *ctx, arg_vvr = *a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, r%d", a->vd, a->vj, a->rk); } =20 +static void output_vrr_x(DisasContext *ctx, arg_vrr *a, const char *mnemon= ic) +{ + output(ctx, mnemonic, "x%d, r%d, r%d", a->vd, a->rj, a->rk); +} + +static void output_vr_ii_x(DisasContext *ctx, arg_vr_ii *a, const char *mn= emonic) +{ + output(ctx, mnemonic, "x%d, r%d, 0x%x, 0x%x", a->vd, a->rj, a->imm, a-= >imm2); +} + INSN_LASX(xvadd_b, vvv) INSN_LASX(xvadd_h, vvv) INSN_LASX(xvadd_w, vvv) @@ -2595,3 +2605,17 @@ INSN_LASX(xvextrins_d, vv_i) INSN_LASX(xvextrins_w, vv_i) INSN_LASX(xvextrins_h, vv_i) INSN_LASX(xvextrins_b, vv_i) + +INSN_LASX(xvld, vr_i) +INSN_LASX(xvst, vr_i) +INSN_LASX(xvldx, vrr) +INSN_LASX(xvstx, vrr) + +INSN_LASX(xvldrepl_d, vr_i) +INSN_LASX(xvldrepl_w, vr_i) +INSN_LASX(xvldrepl_h, vr_i) +INSN_LASX(xvldrepl_b, vr_i) +INSN_LASX(xvstelm_d, vr_ii) +INSN_LASX(xvstelm_w, vr_ii) +INSN_LASX(xvstelm_h, vr_ii) +INSN_LASX(xvstelm_b, vr_ii) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch= /insn_trans/trans_vec.c.inc index 3d0b8bfb74..b1084e88ca 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ 
b/target/loongarch/insn_trans/trans_vec.c.inc @@ -5367,63 +5367,155 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr = *a) return true; } =20 -#define VLDREPL(NAME, MO) \ -static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a) \ -{ \ - TCGv addr; \ - TCGv_i64 val; \ - \ - if (!avail_LSX(ctx)) { \ - return false; \ - } \ - \ - if (!check_vec(ctx, 16)) { \ - return true; \ - } \ - \ - addr =3D gpr_src(ctx, a->rj, EXT_NONE); = \ - val =3D tcg_temp_new_i64(); = \ - \ - addr =3D make_address_i(ctx, addr, a->imm); = \ - \ - tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, MO); \ - tcg_gen_gvec_dup_i64(MO, vec_full_offset(a->vd), 16, ctx->vl/8, val); \ - \ - return true; \ -} - -VLDREPL(vldrepl_b, MO_8) -VLDREPL(vldrepl_h, MO_16) -VLDREPL(vldrepl_w, MO_32) -VLDREPL(vldrepl_d, MO_64) - -#define VSTELM(NAME, MO, E) = \ -static bool trans_## NAME (DisasContext *ctx, arg_vr_ii *a) = \ -{ = \ - TCGv addr; = \ - TCGv_i64 val; = \ - = \ - if (!avail_LSX(ctx)) { = \ - return false; = \ - } = \ - = \ - if (!check_vec(ctx, 16)) { = \ - return true; = \ - } = \ - = \ - addr =3D gpr_src(ctx, a->rj, EXT_NONE); = \ - val =3D tcg_temp_new_i64(); = \ - = \ - addr =3D make_address_i(ctx, addr, a->imm); = \ - = \ - tcg_gen_ld_i64(val, cpu_env, = \ - offsetof(CPULoongArchState, fpr[a->vd].vreg.E(a->imm2))= ); \ - tcg_gen_qemu_st_i64(val, addr, ctx->mem_idx, MO); = \ - = \ - return true; = \ -} - -VSTELM(vstelm_b, MO_8, B) -VSTELM(vstelm_h, MO_16, H) -VSTELM(vstelm_w, MO_32, W) -VSTELM(vstelm_d, MO_64, D) +static bool do_vldrepl_vl(DisasContext *ctx, arg_vr_i *a, + uint32_t oprsz, MemOp mop) +{ + TCGv addr; + TCGv_i64 val; + + if (!check_vec(ctx, oprsz)) { + return true; + } + + addr =3D gpr_src(ctx, a->rj, EXT_NONE); + val =3D tcg_temp_new_i64(); + + addr =3D make_address_i(ctx, addr, a->imm); + + tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, mop); + tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd), oprsz, ctx->vl / 8, = val); + + return true; +} + +static bool do_vldrepl(DisasContext 
*ctx, arg_vr_i *a, MemOp mop)
+{
+    return do_vldrepl_vl(ctx, a, 16, mop);
+}
+
+static bool do_xvldrepl(DisasContext *ctx, arg_vr_i *a, MemOp mop)
+{
+    return do_vldrepl_vl(ctx, a, 32, mop);
+}
+
+TRANS(vldrepl_b, LSX, do_vldrepl, MO_8)
+TRANS(vldrepl_h, LSX, do_vldrepl, MO_16)
+TRANS(vldrepl_w, LSX, do_vldrepl, MO_32)
+TRANS(vldrepl_d, LSX, do_vldrepl, MO_64)
+TRANS(xvldrepl_b, LASX, do_xvldrepl, MO_8)
+TRANS(xvldrepl_h, LASX, do_xvldrepl, MO_16)
+TRANS(xvldrepl_w, LASX, do_xvldrepl, MO_32)
+TRANS(xvldrepl_d, LASX, do_xvldrepl, MO_64)
+
+static bool do_vstelm_vl(DisasContext *ctx,
+                         arg_vr_ii *a, uint32_t oprsz, MemOp mop)
+{
+    TCGv addr;
+    TCGv_i64 val;
+
+    if (!check_vec(ctx, oprsz)) {
+        return true;
+    }
+
+    addr = gpr_src(ctx, a->rj, EXT_NONE);
+    val = tcg_temp_new_i64();
+
+    addr = make_address_i(ctx, addr, a->imm);
+    tcg_gen_ld_i64(val, cpu_env, vec_reg_offset(a->vd, a->imm2, mop));
+    tcg_gen_qemu_st_i64(val, addr, ctx->mem_idx, mop);
+    return true;
+}
+
+static bool do_vstelm(DisasContext *ctx, arg_vr_ii *a, MemOp mop)
+{
+    return do_vstelm_vl(ctx, a, 16, mop);
+}
+
+static bool do_xvstelm(DisasContext *ctx, arg_vr_ii *a, MemOp mop)
+{
+    return do_vstelm_vl(ctx, a, 32, mop);
+}
+
+TRANS(vstelm_b, LSX, do_vstelm, MO_8)
+TRANS(vstelm_h, LSX, do_vstelm, MO_16)
+TRANS(vstelm_w, LSX, do_vstelm, MO_32)
+TRANS(vstelm_d, LSX, do_vstelm, MO_64)
+TRANS(xvstelm_b, LASX, do_xvstelm, MO_8)
+TRANS(xvstelm_h, LASX, do_xvstelm, MO_16)
+TRANS(xvstelm_w, LASX, do_xvstelm, MO_32)
+TRANS(xvstelm_d, LASX, do_xvstelm, MO_64)
+
+static bool gen_lasx_memory(DisasContext *ctx, arg_vr_i *a,
+                            void (*func)(DisasContext *, int, TCGv))
+{
+    TCGv addr = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv temp = NULL;
+
+    if (!check_vec(ctx, 32)) {
+        return true;
+    }
+
+    if (a->imm) {
+        temp = tcg_temp_new();
+        tcg_gen_addi_tl(temp, addr, a->imm);
+        addr = temp;
+    }
+
+    func(ctx, a->vd, addr);
+    return true;
+}
+
+static void gen_xvld(DisasContext *ctx, int vreg, TCGv addr)
+{
+    int i;
+    TCGv temp = tcg_temp_new();
+    TCGv dest = tcg_temp_new();
+
+    tcg_gen_qemu_ld_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+    set_vreg64(dest, vreg, 0);
+
+    for (i = 1; i < 4; i++) {
+        tcg_gen_addi_tl(temp, addr, 8 * i);
+        tcg_gen_qemu_ld_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+        set_vreg64(dest, vreg, i);
+    }
+}
+
+static void gen_xvst(DisasContext * ctx, int vreg, TCGv addr)
+{
+    int i;
+    TCGv temp = tcg_temp_new();
+    TCGv dest = tcg_temp_new();
+
+    get_vreg64(dest, vreg, 0);
+    tcg_gen_qemu_st_i64(dest, addr, ctx->mem_idx, MO_TEUQ);
+
+    for (i = 1; i < 4; i++) {
+        tcg_gen_addi_tl(temp, addr, 8 * i);
+        get_vreg64(dest, vreg, i);
+        tcg_gen_qemu_st_i64(dest, temp, ctx->mem_idx, MO_TEUQ);
+    }
+}
+
+TRANS(xvld, LASX, gen_lasx_memory, gen_xvld)
+TRANS(xvst, LASX, gen_lasx_memory, gen_xvst)
+
+static bool gen_lasx_memoryx(DisasContext *ctx, arg_vrr *a,
+                             void (*func)(DisasContext*, int, TCGv))
+{
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv src2 = gpr_src(ctx, a->rk, EXT_NONE);
+    TCGv addr = tcg_temp_new();
+
+    if (!check_vec(ctx, 32)) {
+        return true;
+    }
+
+    tcg_gen_add_tl(addr, src1, src2);
+    func(ctx, a->vd, addr);
+
+    return true;
+}
+
+TRANS(xvldx, LASX, gen_lasx_memoryx, gen_xvld)
+TRANS(xvstx, LASX, gen_lasx_memoryx, gen_xvst)
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 56/57] target/loongarch: Move simple DO_XX macros together
Date: Thu, 14 Sep 2023 10:26:44 +0800
Message-Id: <20230914022645.1151356-57-gaosong@loongson.cn>
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/vec.h        | 42 ++++++++++++++++++++++++++++++
 target/loongarch/vec_helper.c | 48 -----------------------------------
 2 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 2f23cae7d7..3c9adf8427 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -30,4 +30,46 @@
 #define Q(x) Q[x]
 #endif /* HOST_BIG_ENDIAN */
 
+#define DO_ADD(a, b) (a + b)
+#define DO_SUB(a, b) (a - b)
+#define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1))
+#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
+#define DO_VABSD(a, b) ((a > b) ? (a -b) : (b-a))
+#define DO_VABS(a) ((a < 0) ? (-a) : (a))
+#define DO_MIN(a, b) (a < b ? a : b)
+#define DO_MAX(a, b) (a > b ? a : b)
+#define DO_MUL(a, b) (a * b)
+#define DO_MADD(a, b, c) (a + b * c)
+#define DO_MSUB(a, b, c) (a - b * c)
+
+#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M)
+#define DO_REMU(N, M) (unlikely(M == 0) ? 0 : N % M)
+#define DO_DIV(N, M) (unlikely(M == 0) ? 0 :\
+        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M)
+#define DO_REM(N, M) (unlikely(M == 0) ? 0 :\
+        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M)
+
+#define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? -b : b)
+
+#define R_SHIFT(a, b) (a >> b)
+
+#define DO_CLO_B(N) (clz32(~N & 0xff) - 24)
+#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16)
+#define DO_CLO_W(N) (clz32(~N))
+#define DO_CLO_D(N) (clz64(~N))
+#define DO_CLZ_B(N) (clz32(N) - 24)
+#define DO_CLZ_H(N) (clz32(N) - 16)
+#define DO_CLZ_W(N) (clz32(N))
+#define DO_CLZ_D(N) (clz64(N))
+
+#define DO_BITCLR(a, bit) (a & ~(1ull << bit))
+#define DO_BITSET(a, bit) (a | 1ull << bit)
+#define DO_BITREV(a, bit) (a ^ (1ull << bit))
+
+#define VSEQ(a, b) (a == b ? -1 : 0)
+#define VSLE(a, b) (a <= b ? -1 : 0)
+#define VSLT(a, b) (a < b ? -1 : 0)
+
+#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0x03))
+
 #endif /* LOONGARCH_VEC_H */
diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c
index 6b61a5c447..3faf52cbc4 100644
--- a/target/loongarch/vec_helper.c
+++ b/target/loongarch/vec_helper.c
@@ -15,9 +15,6 @@
 #include "vec.h"
 #include "tcg/tcg-gvec-desc.h"
 
-#define DO_ADD(a, b) (a + b)
-#define DO_SUB(a, b) (a - b)
-
 #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -347,9 +344,6 @@ DO_ODD_U_S(vaddwod_h_bu_b, 16, H, UH, B, UB, DO_ADD)
 DO_ODD_U_S(vaddwod_w_hu_h, 32, W, UW, H, UH, DO_ADD)
 DO_ODD_U_S(vaddwod_d_wu_w, 64, D, UD, W, UW, DO_ADD)
 
-#define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1))
-#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
-
 #define DO_3OP(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -381,8 +375,6 @@ DO_3OP(vavgr_hu, 16, UH, DO_VAVGR)
 DO_3OP(vavgr_wu, 32, UW, DO_VAVGR)
 DO_3OP(vavgr_du, 64, UD, DO_VAVGR)
 
-#define DO_VABSD(a, b) ((a > b) ? (a -b) : (b-a))
-
 DO_3OP(vabsd_b, 8, B, DO_VABSD)
 DO_3OP(vabsd_h, 16, H, DO_VABSD)
 DO_3OP(vabsd_w, 32, W, DO_VABSD)
@@ -392,8 +384,6 @@ DO_3OP(vabsd_hu, 16, UH, DO_VABSD)
 DO_3OP(vabsd_wu, 32, UW, DO_VABSD)
 DO_3OP(vabsd_du, 64, UD, DO_VABSD)
 
-#define DO_VABS(a) ((a < 0) ? (-a) : (a))
-
 #define DO_VADDA(NAME, BIT, E) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -413,9 +403,6 @@ DO_VADDA(vadda_h, 16, H)
 DO_VADDA(vadda_w, 32, W)
 DO_VADDA(vadda_d, 64, D)
 
-#define DO_MIN(a, b) (a < b ? a : b)
-#define DO_MAX(a, b) (a > b ? a : b)
-
 #define VMINMAXI(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
@@ -500,8 +487,6 @@ DO_VMUH(vmuh_bu, 8, UH, UB, DO_MUH)
 DO_VMUH(vmuh_hu, 16, UW, UH, DO_MUH)
 DO_VMUH(vmuh_wu, 32, UD, UW, DO_MUH)
 
-#define DO_MUL(a, b) (a * b)
-
 DO_EVEN(vmulwev_h_b, 16, H, B, DO_MUL)
 DO_EVEN(vmulwev_w_h, 32, W, H, DO_MUL)
 DO_EVEN(vmulwev_d_w, 64, D, W, DO_MUL)
@@ -526,9 +511,6 @@ DO_ODD_U_S(vmulwod_h_bu_b, 16, H, UH, B, UB, DO_MUL)
 DO_ODD_U_S(vmulwod_w_hu_h, 32, W, UW, H, UH, DO_MUL)
 DO_ODD_U_S(vmulwod_d_wu_w, 64, D, UD, W, UW, DO_MUL)
 
-#define DO_MADD(a, b, c) (a + b * c)
-#define DO_MSUB(a, b, c) (a - b * c)
-
 #define VMADDSUB(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -639,13 +621,6 @@ VMADDWOD_U_S(vmaddwod_h_bu_b, 16, H, UH, B, UB, DO_MUL)
 VMADDWOD_U_S(vmaddwod_w_hu_h, 32, W, UW, H, UH, DO_MUL)
 VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MUL)
 
-#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M)
-#define DO_REMU(N, M) (unlikely(M == 0) ? 0 : N % M)
-#define DO_DIV(N, M) (unlikely(M == 0) ? 0 :\
-        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M)
-#define DO_REM(N, M) (unlikely(M == 0) ? 0 :\
-        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M)
-
 #define VDIV(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -791,8 +766,6 @@ VEXT2XV(vext2xv_wu_hu, 32, UW, UH)
 VEXT2XV(vext2xv_du_hu, 64, UD, UH)
 VEXT2XV(vext2xv_du_wu, 64, UD, UW)
 
-#define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? -b : b)
-
 DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV)
 DO_3OP(vsigncov_h, 16, H, DO_SIGNCOV)
 DO_3OP(vsigncov_w, 32, W, DO_SIGNCOV)
@@ -1107,8 +1080,6 @@ VSRARI(vsrari_h, 16, H)
 VSRARI(vsrari_w, 32, W)
 VSRARI(vsrari_d, 64, D)
 
-#define R_SHIFT(a, b) (a >> b)
-
 #define VSRLN(NAME, BIT, E1, E2) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -2272,15 +2243,6 @@ void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
     } \
 }
 
-#define DO_CLO_B(N) (clz32(~N & 0xff) - 24)
-#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16)
-#define DO_CLO_W(N) (clz32(~N))
-#define DO_CLO_D(N) (clz64(~N))
-#define DO_CLZ_B(N) (clz32(N) - 24)
-#define DO_CLZ_H(N) (clz32(N) - 16)
-#define DO_CLZ_W(N) (clz32(N))
-#define DO_CLZ_D(N) (clz64(N))
-
 DO_2OP(vclo_b, 8, UB, DO_CLO_B)
 DO_2OP(vclo_h, 16, UH, DO_CLO_H)
 DO_2OP(vclo_w, 32, UW, DO_CLO_W)
@@ -2309,10 +2271,6 @@ VPCNT(vpcnt_h, 16, UH, ctpop16)
 VPCNT(vpcnt_w, 32, UW, ctpop32)
 VPCNT(vpcnt_d, 64, UD, ctpop64)
 
-#define DO_BITCLR(a, bit) (a & ~(1ull << bit))
-#define DO_BITSET(a, bit) (a | 1ull << bit)
-#define DO_BITREV(a, bit) (a ^ (1ull << bit))
-
 #define DO_BIT(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \
 { \
@@ -3053,10 +3011,6 @@ void HELPER(vffint_s_l)(void *vd, void *vj, void *vk,
     *Vd = temp;
 }
 
-#define VSEQ(a, b) (a == b ? -1 : 0)
-#define VSLE(a, b) (a <= b ? -1 : 0)
-#define VSLT(a, b) (a < b ? -1 : 0)
-
 #define VCMPI(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \
 { \
@@ -3381,8 +3335,6 @@ VILVH(vilvh_h, 32, H)
 VILVH(vilvh_w, 64, W)
 VILVH(vilvh_d, 128, D)
 
-#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0x03))
-
 void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc)
 {
     int i, j, m;
-- 
2.39.1

From nobody Wed May 15 16:20:20 2024
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v6 57/57] target/loongarch: CPUCFG support LASX
Date: Thu, 14 Sep 2023 10:26:45 +0800
Message-Id: <20230914022645.1151356-58-gaosong@loongson.cn>
In-Reply-To: <20230914022645.1151356-1-gaosong@loongson.cn>
References: <20230914022645.1151356-1-gaosong@loongson.cn>

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index a1d3f680d8..fc7f70fbe5 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -393,6 +393,7 @@ static void loongarch_la464_initfn(Object *obj)
     data = FIELD_DP32(data, CPUCFG2, FP_DP, 1);
     data = FIELD_DP32(data, CPUCFG2, FP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LSX, 1),
+    data = FIELD_DP32(data, CPUCFG2, LASX, 1);
     data = FIELD_DP32(data, CPUCFG2, LLFTP, 1);
     data = FIELD_DP32(data, CPUCFG2, LLFTP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LSPW, 1);
-- 
2.39.1
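
The cpu.c hunk above advertises LASX by depositing a one-bit field into the CPUCFG2 word with FIELD_DP32(). As a rough illustration of what such a deposit does, here is a minimal standalone sketch. It is not QEMU's implementation: deposit32_sketch() and cpucfg2_enable_vectors() are hypothetical names, and the bit positions (LSX at bit 6, LASX at bit 7) are assumptions based on the LoongArch reference manual, not something this patch states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions within CPUCFG2; not taken from this patch. */
#define CPUCFG2_LSX_SHIFT  6
#define CPUCFG2_LASX_SHIFT 7

/*
 * Hypothetical helper: return 'value' with the 'length'-bit field that
 * starts at bit 'start' replaced by 'fieldval'.
 * Requires length >= 1 and start + length <= 32.
 */
static uint32_t deposit32_sketch(uint32_t value, unsigned start,
                                 unsigned length, uint32_t fieldval)
{
    /* Build a mask of 'length' ones, then move it to the field position. */
    uint32_t mask = (uint32_t)(~0ULL >> (64 - length)) << start;

    /* Clear the old field, then OR in the new value, truncated to fit. */
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Mirrors the order of the assignments in the hunk: LSX, then LASX. */
static uint32_t cpucfg2_enable_vectors(uint32_t data)
{
    data = deposit32_sketch(data, CPUCFG2_LSX_SHIFT, 1, 1);
    data = deposit32_sketch(data, CPUCFG2_LASX_SHIFT, 1, 1);
    return data;
}
```

Under these assumptions, calling cpucfg2_enable_vectors() on a zeroed word sets only the two feature bits and leaves every other field untouched, so a guest reading CPUCFG2 could test each bit independently.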