From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, gaosong@loongson.cn
Subject: [RFC PATCH v4 31/44] target/loongarch: Implement vbitclr vbitset vbitrev
Date: Tue, 25 Apr 2023 15:02:35 +0800
Message-Id: <20230425070248.2550028-32-gaosong@loongson.cn>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20230425070248.2550028-1-gaosong@loongson.cn>
References: <20230425070248.2550028-1-gaosong@loongson.cn>

This patch includes:
- VBITCLR[I].{B/H/W/D};
- VBITSET[I].{B/H/W/D};
- VBITREV[I].{B/H/W/D}.

Reviewed-by: Richard Henderson
Signed-off-by: Song Gao
---
 target/loongarch/disas.c                    |  25 ++
 target/loongarch/helper.h                   |  27 ++
 target/loongarch/insn_trans/trans_lsx.c.inc | 305 ++++++++++++++++++++
 target/loongarch/insns.decode               |  25 ++
 target/loongarch/lsx_helper.c               |  55 ++++
 5 files changed, 437 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 0ca51de9d8..48c7ea47a4 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1272,3 +1272,28 @@ INSN_LSX(vpcnt_b, vv)
 INSN_LSX(vpcnt_h, vv)
 INSN_LSX(vpcnt_w, vv)
 INSN_LSX(vpcnt_d, vv)
+
+INSN_LSX(vbitclr_b, vvv)
+INSN_LSX(vbitclr_h, vvv)
+INSN_LSX(vbitclr_w, vvv)
+INSN_LSX(vbitclr_d, vvv)
+INSN_LSX(vbitclri_b, vv_i)
+INSN_LSX(vbitclri_h, vv_i)
+INSN_LSX(vbitclri_w, vv_i)
+INSN_LSX(vbitclri_d, vv_i)
+INSN_LSX(vbitset_b, vvv)
+INSN_LSX(vbitset_h, vvv)
+INSN_LSX(vbitset_w, vvv)
+INSN_LSX(vbitset_d, vvv)
+INSN_LSX(vbitseti_b, vv_i)
+INSN_LSX(vbitseti_h, vv_i)
+INSN_LSX(vbitseti_w, vv_i)
+INSN_LSX(vbitseti_d, vv_i)
+INSN_LSX(vbitrev_b, vvv)
+INSN_LSX(vbitrev_h, vvv)
+INSN_LSX(vbitrev_w, vvv)
+INSN_LSX(vbitrev_d, vvv)
+INSN_LSX(vbitrevi_b, vv_i)
+INSN_LSX(vbitrevi_h, vv_i)
+INSN_LSX(vbitrevi_w, vv_i)
+INSN_LSX(vbitrevi_d, vv_i)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 96b9b16923..75120ca55e 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -485,3 +485,30 @@ DEF_HELPER_3(vpcnt_b, void, env, i32, i32)
 DEF_HELPER_3(vpcnt_h, void, env, i32, i32)
 DEF_HELPER_3(vpcnt_w, void, env, i32, i32)
 DEF_HELPER_3(vpcnt_d, void, env, i32, i32)
+
+DEF_HELPER_FLAGS_4(vbitclr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitclr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitclr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitclr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitclri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitclri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitclri_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitclri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(vbitset_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitset_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitset_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitset_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitseti_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitseti_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitseti_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitseti_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(vbitrev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitrev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitrev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitrev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(vbitrevi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitrevi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitrevi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_lsx.c.inc
index f4ebdca63c..86243b54ba 100644
--- a/target/loongarch/insn_trans/trans_lsx.c.inc
+++ b/target/loongarch/insn_trans/trans_lsx.c.inc
@@ -3111,3 +3111,308 @@ TRANS(vpcnt_b, gen_vv, gen_helper_vpcnt_b)
 TRANS(vpcnt_h, gen_vv, gen_helper_vpcnt_h)
 TRANS(vpcnt_w, gen_vv, gen_helper_vpcnt_w)
 TRANS(vpcnt_d, gen_vv, gen_helper_vpcnt_d)
+
+static void do_vbit(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
+                    void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec))
+{
+    TCGv_vec mask, lsh, t1, one;
+
+    lsh = tcg_temp_new_vec_matching(t);
+    t1 = tcg_temp_new_vec_matching(t);
+    mask = tcg_constant_vec_matching(t, vece, (8 << vece) - 1);
+    one = tcg_constant_vec_matching(t, vece, 1);
+
+    tcg_gen_and_vec(vece, lsh, b, mask);
+    tcg_gen_shlv_vec(vece, t1, one, lsh);
+    func(vece, t, a, t1);
+}
+
+static void gen_vbitclr(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+    do_vbit(vece, t, a, b, tcg_gen_andc_vec);
+}
+
+static void gen_vbitset(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+    do_vbit(vece, t, a, b, tcg_gen_or_vec);
+}
+
+static void gen_vbitrev(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+    do_vbit(vece, t, a, b, tcg_gen_xor_vec);
+}
+
+static void do_vbitclr(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                       uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shlv_vec, INDEX_op_andc_vec, 0
+    };
+    static const GVecGen3 op[4] = {
+        {
+            .fniv = gen_vbitclr,
+            .fno = gen_helper_vbitclr_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitclr,
+            .fno = gen_helper_vbitclr_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitclr,
+            .fno = gen_helper_vbitclr_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitclr,
+            .fno = gen_helper_vbitclr_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
+}
+
+TRANS(vbitclr_b, gvec_vvv, MO_8, do_vbitclr)
+TRANS(vbitclr_h, gvec_vvv, MO_16, do_vbitclr)
+TRANS(vbitclr_w, gvec_vvv, MO_32, do_vbitclr)
+TRANS(vbitclr_d, gvec_vvv, MO_64, do_vbitclr)
+
+static void do_vbiti(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm,
+                     void (*func)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec))
+{
+    int lsh;
+    TCGv_vec t1, one;
+
+    lsh = imm & ((8 << vece) -1);
+    t1 = tcg_temp_new_vec_matching(t);
+    one = tcg_constant_vec_matching(t, vece, 1);
+
+    tcg_gen_shli_vec(vece, t1, one, lsh);
+    func(vece, t, a, t1);
+}
+
+static void gen_vbitclri(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm)
+{
+    do_vbiti(vece, t, a, imm, tcg_gen_andc_vec);
+}
+
+static void gen_vbitseti(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm)
+{
+    do_vbiti(vece, t, a, imm, tcg_gen_or_vec);
+}
+
+static void gen_vbitrevi(unsigned vece, TCGv_vec t, TCGv_vec a, int64_t imm)
+{
+    do_vbiti(vece, t, a, imm, tcg_gen_xor_vec);
+}
+
+static void do_vbitclri(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                        int64_t imm, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shli_vec, INDEX_op_andc_vec, 0
+    };
+    static const GVecGen2i op[4] = {
+        {
+            .fniv = gen_vbitclri,
+            .fnoi = gen_helper_vbitclri_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitclri,
+            .fnoi = gen_helper_vbitclri_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitclri,
+            .fnoi = gen_helper_vbitclri_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitclri,
+            .fnoi = gen_helper_vbitclri_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]);
+}
+
+TRANS(vbitclri_b, gvec_vv_i, MO_8, do_vbitclri)
+TRANS(vbitclri_h, gvec_vv_i, MO_16, do_vbitclri)
+TRANS(vbitclri_w, gvec_vv_i, MO_32, do_vbitclri)
+TRANS(vbitclri_d, gvec_vv_i, MO_64, do_vbitclri)
+
+static void do_vbitset(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                       uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shlv_vec, 0
+    };
+    static const GVecGen3 op[4] = {
+        {
+            .fniv = gen_vbitset,
+            .fno = gen_helper_vbitset_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitset,
+            .fno = gen_helper_vbitset_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitset,
+            .fno = gen_helper_vbitset_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitset,
+            .fno = gen_helper_vbitset_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
+}
+
+TRANS(vbitset_b, gvec_vvv, MO_8, do_vbitset)
+TRANS(vbitset_h, gvec_vvv, MO_16, do_vbitset)
+TRANS(vbitset_w, gvec_vvv, MO_32, do_vbitset)
+TRANS(vbitset_d, gvec_vvv, MO_64, do_vbitset)
+
+static void do_vbitseti(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                        int64_t imm, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shli_vec, 0
+    };
+    static const GVecGen2i op[4] = {
+        {
+            .fniv = gen_vbitseti,
+            .fnoi = gen_helper_vbitseti_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitseti,
+            .fnoi = gen_helper_vbitseti_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitseti,
+            .fnoi = gen_helper_vbitseti_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitseti,
+            .fnoi = gen_helper_vbitseti_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]);
+}
+
+TRANS(vbitseti_b, gvec_vv_i, MO_8, do_vbitseti)
+TRANS(vbitseti_h, gvec_vv_i, MO_16, do_vbitseti)
+TRANS(vbitseti_w, gvec_vv_i, MO_32, do_vbitseti)
+TRANS(vbitseti_d, gvec_vv_i, MO_64, do_vbitseti)
+
+static void do_vbitrev(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                       uint32_t vk_ofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shlv_vec, 0
+    };
+    static const GVecGen3 op[4] = {
+        {
+            .fniv = gen_vbitrev,
+            .fno = gen_helper_vbitrev_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitrev,
+            .fno = gen_helper_vbitrev_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitrev,
+            .fno = gen_helper_vbitrev_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitrev,
+            .fno = gen_helper_vbitrev_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_3(vd_ofs, vj_ofs, vk_ofs, oprsz, maxsz, &op[vece]);
+}
+
+TRANS(vbitrev_b, gvec_vvv, MO_8, do_vbitrev)
+TRANS(vbitrev_h, gvec_vvv, MO_16, do_vbitrev)
+TRANS(vbitrev_w, gvec_vvv, MO_32, do_vbitrev)
+TRANS(vbitrev_d, gvec_vvv, MO_64, do_vbitrev)
+
+static void do_vbitrevi(unsigned vece, uint32_t vd_ofs, uint32_t vj_ofs,
+                        int64_t imm, uint32_t oprsz, uint32_t maxsz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shli_vec, 0
+    };
+    static const GVecGen2i op[4] = {
+        {
+            .fniv = gen_vbitrevi,
+            .fnoi = gen_helper_vbitrevi_b,
+            .opt_opc = vecop_list,
+            .vece = MO_8
+        },
+        {
+            .fniv = gen_vbitrevi,
+            .fnoi = gen_helper_vbitrevi_h,
+            .opt_opc = vecop_list,
+            .vece = MO_16
+        },
+        {
+            .fniv = gen_vbitrevi,
+            .fnoi = gen_helper_vbitrevi_w,
+            .opt_opc = vecop_list,
+            .vece = MO_32
+        },
+        {
+            .fniv = gen_vbitrevi,
+            .fnoi = gen_helper_vbitrevi_d,
+            .opt_opc = vecop_list,
+            .vece = MO_64
+        },
+    };
+
+    tcg_gen_gvec_2i(vd_ofs, vj_ofs, oprsz, maxsz, imm, &op[vece]);
+}
+
+TRANS(vbitrevi_b, gvec_vv_i, MO_8, do_vbitrevi)
+TRANS(vbitrevi_h, gvec_vv_i, MO_16, do_vbitrevi)
+TRANS(vbitrevi_w, gvec_vv_i, MO_32, do_vbitrevi)
+TRANS(vbitrevi_d, gvec_vv_i, MO_64, do_vbitrevi)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index f865e83da5..801c97714e 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -973,3 +973,28 @@ vpcnt_b          0111 00101001 11000 01000 ..... .....    @vv
 vpcnt_h          0111 00101001 11000 01001 ..... .....    @vv
 vpcnt_w          0111 00101001 11000 01010 ..... .....    @vv
 vpcnt_d          0111 00101001 11000 01011 ..... .....    @vv
+
+vbitclr_b        0111 00010000 11000 ..... ..... .....    @vvv
+vbitclr_h        0111 00010000 11001 ..... ..... .....    @vvv
+vbitclr_w        0111 00010000 11010 ..... ..... .....    @vvv
+vbitclr_d        0111 00010000 11011 ..... ..... .....    @vvv
+vbitclri_b       0111 00110001 00000 01 ... ..... .....   @vv_ui3
+vbitclri_h       0111 00110001 00000 1 .... ..... .....   @vv_ui4
+vbitclri_w       0111 00110001 00001 ..... ..... .....    @vv_ui5
+vbitclri_d       0111 00110001 0001 ...... ..... .....    @vv_ui6
+vbitset_b        0111 00010000 11100 ..... ..... .....    @vvv
+vbitset_h        0111 00010000 11101 ..... ..... .....    @vvv
+vbitset_w        0111 00010000 11110 ..... ..... .....    @vvv
+vbitset_d        0111 00010000 11111 ..... ..... .....    @vvv
+vbitseti_b       0111 00110001 01000 01 ... ..... .....   @vv_ui3
+vbitseti_h       0111 00110001 01000 1 .... ..... .....   @vv_ui4
+vbitseti_w       0111 00110001 01001 ..... ..... .....    @vv_ui5
+vbitseti_d       0111 00110001 0101 ...... ..... .....    @vv_ui6
+vbitrev_b        0111 00010001 00000 ..... ..... .....    @vvv
+vbitrev_h        0111 00010001 00001 ..... ..... .....    @vvv
+vbitrev_w        0111 00010001 00010 ..... ..... .....    @vvv
+vbitrev_d        0111 00010001 00011 ..... ..... .....    @vvv
+vbitrevi_b       0111 00110001 10000 01 ... ..... .....   @vv_ui3
+vbitrevi_h       0111 00110001 10000 1 .... ..... .....   @vv_ui4
+vbitrevi_w       0111 00110001 10001 ..... ..... .....    @vv_ui5
+vbitrevi_d       0111 00110001 1001 ...... ..... .....    @vv_ui6
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index f18c4a2978..f160abfd8e 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -1964,3 +1964,58 @@ VPCNT(vpcnt_b, 8, UB, ctpop8)
 VPCNT(vpcnt_h, 16, UH, ctpop16)
 VPCNT(vpcnt_w, 32, UW, ctpop32)
 VPCNT(vpcnt_d, 64, UD, ctpop64)
+
+#define DO_BITCLR(a, bit) (a & ~(1ull << bit))
+#define DO_BITSET(a, bit) (a | 1ull << bit)
+#define DO_BITREV(a, bit) (a ^ (1ull << bit))
+
+#define DO_BIT(NAME, BIT, E, DO_OP)                         \
+void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \
+{                                                           \
+    int i;                                                  \
+    VReg *Vd = (VReg *)vd;                                  \
+    VReg *Vj = (VReg *)vj;                                  \
+    VReg *Vk = (VReg *)vk;                                  \
+                                                            \
+    for (i = 0; i < LSX_LEN/BIT; i++) {                     \
+        Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)%BIT);           \
+    }                                                       \
+}
+
+DO_BIT(vbitclr_b, 8, UB, DO_BITCLR)
+DO_BIT(vbitclr_h, 16, UH, DO_BITCLR)
+DO_BIT(vbitclr_w, 32, UW, DO_BITCLR)
+DO_BIT(vbitclr_d, 64, UD, DO_BITCLR)
+DO_BIT(vbitset_b, 8, UB, DO_BITSET)
+DO_BIT(vbitset_h, 16, UH, DO_BITSET)
+DO_BIT(vbitset_w, 32, UW, DO_BITSET)
+DO_BIT(vbitset_d, 64, UD, DO_BITSET)
+DO_BIT(vbitrev_b, 8, UB, DO_BITREV)
+DO_BIT(vbitrev_h, 16, UH, DO_BITREV)
+DO_BIT(vbitrev_w, 32, UW, DO_BITREV)
+DO_BIT(vbitrev_d, 64, UD, DO_BITREV)
+
+#define DO_BITI(NAME, BIT, E, DO_OP)                            \
+void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \
+{                                                               \
+    int i;                                                      \
+    VReg *Vd = (VReg *)vd;                                      \
+    VReg *Vj = (VReg *)vj;                                      \
+                                                                \
+    for (i = 0; i < LSX_LEN/BIT; i++) {                         \
+        Vd->E(i) = DO_OP(Vj->E(i), imm);                        \
+    }                                                           \
+}
+
+DO_BITI(vbitclri_b, 8, UB, DO_BITCLR)
+DO_BITI(vbitclri_h, 16, UH, DO_BITCLR)
+DO_BITI(vbitclri_w, 32, UW, DO_BITCLR)
+DO_BITI(vbitclri_d, 64, UD, DO_BITCLR)
+DO_BITI(vbitseti_b, 8, UB, DO_BITSET)
+DO_BITI(vbitseti_h, 16, UH, DO_BITSET)
+DO_BITI(vbitseti_w, 32, UW, DO_BITSET)
+DO_BITI(vbitseti_d, 64, UD, DO_BITSET)
+DO_BITI(vbitrevi_b, 8, UB, DO_BITREV)
+DO_BITI(vbitrevi_h, 16, UH, DO_BITREV)
+DO_BITI(vbitrevi_w, 32, UW, DO_BITREV)
+DO_BITI(vbitrevi_d, 64, UD, DO_BITREV)
-- 
2.31.1
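
Reading aid (not part of the patch): the per-element behaviour that the DO_BIT helpers express can be sketched in standalone C. The names bitclr8, bitset8, bitrev8 and apply16 are hypothetical and chosen only for this illustration, and the sketch assumes a 128-bit vector viewed as sixteen 8-bit elements, as in the .B variants. Only the pattern itself (take the bit index modulo the element width, shift 1 by it, then and-not / or / xor) is taken from the patch above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar models of one 8-bit element (the .B variants). */

/* Clear bit (b mod 8) of a. */
uint8_t bitclr8(uint8_t a, uint8_t b)
{
    return a & (uint8_t)~(1u << (b % 8));
}

/* Set bit (b mod 8) of a. */
uint8_t bitset8(uint8_t a, uint8_t b)
{
    return a | (uint8_t)(1u << (b % 8));
}

/* Invert bit (b mod 8) of a. */
uint8_t bitrev8(uint8_t a, uint8_t b)
{
    return a ^ (uint8_t)(1u << (b % 8));
}

/*
 * Apply one of the element operations across sixteen byte elements,
 * mirroring the loop shape of the DO_BIT macro (assumed 128-bit vector).
 */
void apply16(uint8_t *vd, const uint8_t *vj, const uint8_t *vk,
             uint8_t (*op)(uint8_t, uint8_t))
{
    for (size_t i = 0; i < 16; i++) {
        vd[i] = op(vj[i], vk[i]);
    }
}
```

For instance, bitset8(0x00, 11) selects bit 11 mod 8 = 3 and yields 0x08, which is the same wrap-around that the `Vk->E(i)%BIT` expression in DO_BIT and the `(8 << vece) - 1` mask in do_vbit provide.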