From nobody Tue May 7 07:58:55 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1555601969; cv=none; d=zoho.com; s=zohoarc; b=PwweBd4SK0cR9a9lN1sMjON6qUGqWOq5FSrdHySLfEMS+7FieaUxkWOZqyC1FRt5UTp3rdE0PD2S5eCtbJU0PW0nU3mYZsgPTA3U3IBxXGsAz2OFnKME8anqBpIlgKC+D2ZgeT1oJjOQPPiMNcP2on+3+P7iSqUaAHaTA7Vs6ys= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555601969; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=LVetxBdODynw05pplb6xPI9GxBvLkJN0/pYmATT0+DE=; b=ddkTLTKBaz8wWXbPbebHtZacmzPTxSh/TwX/1f4Q/5ux2UNE6diSrn5hAQczTpbr54Pxy/jXGiWDeIkSbFaRCUF/gpHj8d5jdUKxTtnDvxR+aBdz7Fcb97qeity/vQgBHkrLYAFh+LPFJ9AOUfP9QrPFR7l/aGsFBlkasz9AFqs= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1555601969023863.2371739320627; Thu, 18 Apr 2019 08:39:29 -0700 (PDT) Received: from localhost ([127.0.0.1]:43144 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH98K-0002uT-0D for importer@patchew.org; Thu, 18 Apr 2019 11:39:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43724) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH941-0007d8-Py for qemu-devel@nongnu.org; Thu, 18 Apr 
2019 11:35:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH8yW-00033g-NZ for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:22 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36772 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH8yW-00032H-C2 for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:20 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id CBC701A246A; Thu, 18 Apr 2019 17:29:17 +0200 (CEST) Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id A0FC11A2462; Thu, 18 Apr 2019 17:29:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Mateja Marjanovic To: qemu-devel@nongnu.org Date: Thu, 18 Apr 2019 17:29:05 +0200 Message-Id: <1555601350-4176-2-git-send-email-mateja.marjanovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v9 1/6] target/mips: Optimize ILVOD. MSA instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" From: Mateja Marjanovic Optimize set of MSA instructions ILVOD., using directly tcg registers and performing logic on them instead of using helpers. 
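The idea of replacing the per-element helper with a few bitwise operations can be checked on the host with plain integers. The following is only a sketch, not QEMU code: the function names are hypothetical, it models a single 64-bit half of a vector register, and it assumes element i occupies bits 8*i..8*i+7 of that half.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Element-wise model of ILVOD.B on one 64-bit half (hypothetical name):
 * wd[2i] = wt[2i+1], wd[2i+1] = ws[2i+1], with byte i at bits 8*i..8*i+7.
 */
static inline uint64_t ilvod_b_elementwise(uint64_t ws, uint64_t wt)
{
    uint64_t wd = 0;

    for (int i = 0; i < 4; i++) {
        uint64_t t = (wt >> (8 * (2 * i + 1))) & 0xff;
        uint64_t s = (ws >> (8 * (2 * i + 1))) & 0xff;

        wd |= t << (8 * (2 * i));
        wd |= s << (8 * (2 * i + 1));
    }
    return wd;
}

/*
 * Mask-and-shift model (hypothetical name): the odd elements of wt move
 * down by one element, the odd elements of ws stay in place.
 */
static inline uint64_t ilvod_mask_shift(uint64_t ws, uint64_t wt,
                                        uint64_t mask, unsigned shift)
{
    return ((wt & mask) >> shift) | (ws & mask);
}
```

With mask 0xff00ff00ff00ff00 and shift 8 the second function agrees with the element-wise loop for bytes; with mask 0xffff0000ffff0000 and shift 16 it covers halfwords.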
In the following table, the first column is the performance before this
patch. The second represents the performance after converting from
helpers to tcg, but without using tcg_gen_deposit function. The third one
is with the deposit function and with using a uint64_t constant bit mask,
and the fourth is with the deposit function and with a mask which is a
tcg constant. The fourth is implemented in this patch.

Performance measurement is done by executing the instructions 10 million
times on a computer with Intel Core i7-3770 CPU @ 3.40GHz×8.

===================================================================
|| instruction ||     1      ||    2     ||    3     ||    4     ||
===================================================================
|| ilvod.b     || 107.585 ms || 2.717 ms || 2.572 ms || 2.373 ms ||
|| ilvod.h     ||  82.871 ms || 2.420 ms || 2.414 ms || 2.320 ms ||
|| ilvod.w     || 109.722 ms || 2.702 ms || 2.348 ms || 2.303 ms ||
|| ilvod.d     ||  30.813 ms || 2.083 ms || 2.036 ms || 2.036 ms ||
===================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVOD.W.
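To illustrate why the deposit form and the mask form of ILVOD.W compute the same value, here is a small host-side sketch. It is not taken from QEMU: deposit64_model is a hypothetical stand-in for the effect of a deposit operation (replace a bit field of the first operand with the low bits of the second), and both ilvod_w_* functions model only one 64-bit half of the register.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a deposit: return 'base' with the 'len'-bit field
 * starting at bit 'pos' replaced by the low 'len' bits of 'field'.
 * Assumes 0 < len and pos + len <= 64.
 */
static inline uint64_t deposit64_model(uint64_t base, uint64_t field,
                                       unsigned pos, unsigned len)
{
    uint64_t mask = (len == 64) ? ~0ULL : (((1ULL << len) - 1) << pos);

    return (base & ~mask) | ((field << pos) & mask);
}

/* Mask variant: and, shift right, and, or. */
static inline uint64_t ilvod_w_mask(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = 0xffffffff00000000ULL;

    return ((wt & mask) >> 32) | (ws & mask);
}

/* Deposit variant: shift right, then deposit into the low 32 bits of ws. */
static inline uint64_t ilvod_w_deposit(uint64_t ws, uint64_t wt)
{
    return deposit64_model(ws, wt >> 32, 0, 32);
}
```

Both variants keep the upper word of ws and place the upper word of wt in the lower word of the result; the deposit variant simply needs fewer operations and no temporary for the mask.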
No-deposit version of the ILVOD.W implementation:

static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    TCGv_i64 mask = tcg_const_i64(0xffffffff00000000ULL);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(mask);
    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Reviewed-by: Richard Henderson
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  7 ----
 target/mips/translate.c  | 91 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 90 insertions(+), 9 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index a6d687e..d162836 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -865,7 +865,6 @@ DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index c74e3cd..9e52a31 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1206,13 +1206,6 @@ MSA_FN_DF(ilvr_df)
 MSA_FN_DF(ilvev_df)
 #undef MSA_DO
 
-#define MSA_DO(DF)                          \
-    do {                                    \
-        pwx->DF[2*i] = pwt->DF[2*i+1];      \
-        pwx->DF[2*i+1] = pws->DF[2*i+1];    \
-    } while (0)
-MSA_FN_DF(ilvod_df)
-#undef MSA_DO
 #undef
MSA_LOOP_COND =20 #define MSA_LOOP_COND(DF) \ diff --git a/target/mips/translate.c b/target/mips/translate.c index 364bd6d..99bd441 100644 --- a/target/mips/translate.c +++ b/target/mips/translate.c @@ -28001,6 +28001,80 @@ static void gen_msa_bit(CPUMIPSState *env, DisasCo= ntext *ctx) tcg_temp_free_i32(tws); } =20 +/* + * [MSA] ILVOD. wd, ws, wt + * + * Vector Interleave Odd ( data elements) + * + */ +static inline void gen_ilvod_bh(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt, + uint64_t mask, uint32_t shift) +{ + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 t2 =3D tcg_temp_new_i64(); + TCGv_i64 mask_tcg =3D tcg_const_i64(mask); + + tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg); + tcg_gen_shri_i64(t1, t1, shift); + tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg); + tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2); + + tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg); + tcg_gen_shri_i64(t1, t1, shift); + tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg); + tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2); + + tcg_temp_free_i64(mask_tcg); + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); +} + +static inline void gen_ilvod_b(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + gen_ilvod_bh(env, wd, ws, wt, 0xff00ff00ff00ff00ULL, 8); +} + +static inline void gen_ilvod_h(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + gen_ilvod_bh(env, wd, ws, wt, 0xffff0000ffff0000ULL, 16); +} + +/* + * [MSA] ILVOD.W wd, ws, wt + * + * Vector Interleave Odd (word data elements) + * + */ +static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + TCGv_i64 t1 =3D tcg_temp_new_i64(); + + tcg_gen_shri_i64(t1, msa_wr_d[wt * 2], 32); + tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[ws * 2], t1, 0, 32); + + tcg_gen_shri_i64(t1, msa_wr_d[wt * 2 + 1], 32); + tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1], t1, 0,= 32); + + tcg_temp_free_i64(t1); +} + +/* + * [MSA] ILVOD.D wd, ws, wt + * + * Vector 
Interleave Odd (doubleword data elements) + * + */ +static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]); + tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]); +} + static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx) { #define MASK_MSA_3R(op) (MASK_MSA_MINOR(op) | (op & (0x7 << 23))) @@ -28172,7 +28246,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasCon= text *ctx) gen_helper_msa_mod_u_df(cpu_env, tdf, twd, tws, twt); break; case OPC_ILVOD_df: - gen_helper_msa_ilvod_df(cpu_env, tdf, twd, tws, twt); + switch (df) { + case DF_BYTE: + gen_ilvod_b(env, wd, ws, wt); + break; + case DF_HALF: + gen_ilvod_h(env, wd, ws, wt); + break; + case DF_WORD: + gen_ilvod_w(env, wd, ws, wt); + break; + case DF_DOUBLE: + gen_ilvod_d(env, wd, ws, wt); + break; + default: + assert(0); + } break; =20 case OPC_DOTP_S_df: --=20 2.7.4 From nobody Tue May 7 07:58:55 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1555602092; cv=none; d=zoho.com; s=zohoarc; b=l/Osm55Rj3nTygcrdVhSC/0YMUXjKn1vDmwmw3pVT6+UNVxkTrpLDpVwfnKCqBDX1D2MUE4cWsjgQwvKw3kEG55jSjByN5l+FMPOLEAylqn6vxBC2g6UDEEZENaj+wrE2rR85J0tARuWXw466IdVqvZyaMTYBnxe4ullE2OIpRw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555602092; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; 
bh=Vl/bhxCTbkr0aiJortx1rtaRcabvKmkFoJqMtrKekBY=; b=lK7bb0PPHq674SxQzlWwTuknBXa5hGdEAKqtXhRDDxFA7z9+3f23ieiGPkhLQBXcZYQcRR1wgNgO0pwJFq5yAjgwDqimfDQta4/NobZ16M/I9bz2plbFTGWJL9AaELbj9ESL7wNZ+uoaST9UR3ep93sC+GUlCsjtwzwTMZe8vyw= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1555602092991918.2674096848808; Thu, 18 Apr 2019 08:41:32 -0700 (PDT) Received: from localhost ([127.0.0.1]:43213 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH9AK-0004n0-02 for importer@patchew.org; Thu, 18 Apr 2019 11:41:32 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43722) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH942-0007d7-AK for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:35:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH8yW-00033U-Kl for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:22 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36787 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH8yW-00032M-8P for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:20 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id EE1DD1A246B; Thu, 18 Apr 2019 17:29:17 +0200 (CEST) Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id BD1CB1A2435; Thu, 18 Apr 2019 17:29:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Mateja Marjanovic To: qemu-devel@nongnu.org Date: Thu, 18 Apr 2019 17:29:06 +0200 Message-Id: <1555601350-4176-3-git-send-email-mateja.marjanovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: 
<1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v9 2/6] target/mips: Optimize ILVEV. MSA instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" From: Mateja Marjanovic Optimize set of MSA instructions ILVEV., using directly tcg registers and performing logic on them instead of using helpers. In the following table, the first column is the performance before this patch. The second represents the performance after converting from helpers to tcg, but without using tcg_gen_deposit function. The third one is with using the tcg_gen_deposit function and with using a uint64_t constant bit mask, and the fourth is with using the tcg_gen_deposit function and with a mask which is a tcg constant. The fourth is implemented in this patch. Performance measurement is done by executing the instructions 10 million times on a computer with Intel Core i7-3770 CPU @ 3.40GHz=C3=978. 
===================================================================
|| instruction ||     1      ||    2     ||    3     ||    4     ||
===================================================================
|| ilvev.b     || 107.592 ms || 2.432 ms || 2.381 ms || 2.599 ms ||
|| ilvev.h     ||  83.422 ms || 2.352 ms || 2.623 ms || 2.532 ms ||
|| ilvev.w     || 109.300 ms || 2.342 ms || 2.329 ms || 2.266 ms ||
|| ilvev.d     ||  30.915 ms || 1.926 ms || 2.002 ms || 1.976 ms ||
===================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVEV.W.
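For the even-element case the data flow is mirrored: the even elements of wt stay in place and the even elements of ws move up by one element. The sketch below models this on one 64-bit half with plain integers. The function names are hypothetical and the code is an illustration under the assumption that element 0 sits in the least significant bits, not the implementation from this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask-and-shift model of ILVEV on one 64-bit half (hypothetical name):
 * keep the even elements of wt, move the even elements of ws up by one
 * element.
 */
static inline uint64_t ilvev_mask_shift(uint64_t ws, uint64_t wt,
                                        uint64_t mask, unsigned shift)
{
    return (wt & mask) | ((ws & mask) << shift);
}

/*
 * Word case written directly: the result is wt with its upper 32 bits
 * replaced by the lower 32 bits of ws, which is what a deposit at
 * position 32 with length 32 produces.
 */
static inline uint64_t ilvev_w_direct(uint64_t ws, uint64_t wt)
{
    return (wt & 0x00000000ffffffffULL) | (ws << 32);
}
```

Calling ilvev_mask_shift with mask 0x00000000ffffffff and shift 32 gives the same value as ilvev_w_direct, which is why the word variant can be expressed as a single deposit per half.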
No-deposit version of the ILVEV.W implementation:

static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    uint64_t mask = 0x00000000ffffffffULL;

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Reviewed-by: Richard Henderson
Suggested-by: Aleksandar Markovic
Suggested-by: Philippe Mathieu-Daudé
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  9 -----
 target/mips/translate.c  | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index d162836..2f23b0d 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -864,7 +864,6 @@ DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 9e52a31..a500c59 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1197,15 +1197,6 @@ MSA_FN_DF(ilvl_df)
     } while (0)
 MSA_FN_DF(ilvr_df)
 #undef MSA_DO
-
-#define MSA_DO(DF)                      \
-    do {                                \
-        pwx->DF[2*i] = pwt->DF[2*i];    \
-        pwx->DF[2*i+1] = pws->DF[2*i];  \
-    } while (0)
-MSA_FN_DF(ilvev_df) -#undef MSA_DO - #undef MSA_LOOP_COND =20 #define MSA_LOOP_COND(DF) \ diff --git a/target/mips/translate.c b/target/mips/translate.c index 99bd441..930ef3a 100644 --- a/target/mips/translate.c +++ b/target/mips/translate.c @@ -28075,6 +28075,76 @@ static inline void gen_ilvod_d(CPUMIPSState *env, = uint32_t wd, tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]); } =20 + +/* + * [MSA] ILVEV. wd, ws, wt + * + * Vector Interleave Even ( data elements) + * + */ +static inline void gen_ilvev_bh(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt, + uint64_t mask, uint32_t shift) +{ + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 t2 =3D tcg_temp_new_i64(); + TCGv_i64 mask_tcg =3D tcg_const_i64(mask); + + tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg); + tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg); + tcg_gen_shli_i64(t2, t2, shift); + tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2); + + tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg); + tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg); + tcg_gen_shli_i64(t2, t2, shift); + tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2); + + tcg_temp_free_i64(mask_tcg); + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); +} + +static inline void gen_ilvev_b(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + gen_ilvev_bh(env, wd, ws, wt, 0x00ff00ff00ff00ffULL, 8); +} + +static inline void gen_ilvev_h(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + gen_ilvev_bh(env, wd, ws, wt, 0x0000ffff0000ffffULL, 16); +} + +/* + * [MSA] ILVEV.W wd, ws, wt + * + * Vector Interleave Even (word data elements) + * + */ +static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2], + msa_wr_d[ws * 2], 32, 32); + tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[wt * 2 + 1], + msa_wr_d[ws * 2 + 1], 32, 32); +} + +/* + * [MSA] ILVEV.D wd, ws, wt + * + * Vector Interleave Even (Doubleword data elements) + 
* + */ +static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]); + tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]); +} + static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx) { #define MASK_MSA_3R(op) (MASK_MSA_MINOR(op) | (op & (0x7 << 23))) @@ -28231,7 +28301,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasCon= text *ctx) gen_helper_msa_mod_s_df(cpu_env, tdf, twd, tws, twt); break; case OPC_ILVEV_df: - gen_helper_msa_ilvev_df(cpu_env, tdf, twd, tws, twt); + switch (df) { + case DF_BYTE: + gen_ilvev_b(env, wd, ws, wt); + break; + case DF_HALF: + gen_ilvev_h(env, wd, ws, wt); + break; + case DF_WORD: + gen_ilvev_w(env, wd, ws, wt); + break; + case DF_DOUBLE: + gen_ilvev_d(env, wd, ws, wt); + break; + default: + assert(0); + } break; case OPC_BINSR_df: gen_helper_msa_binsr_df(cpu_env, tdf, twd, tws, twt); --=20 2.7.4 From nobody Tue May 7 07:58:55 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1555601982; cv=none; d=zoho.com; s=zohoarc; b=b7wjWASmXrOQpcS05AqSz4Ap/jeuqrjGIN95DomSc55965zBJdG1MPTrt1Oz4pYIYsUBfcZNAt/+T4kyQ4Yhgv3BZyrecCKg3J3Lev1BN0KRX2X1HyojoiwMt0ceuzm78EdhCcMkvCq6I2etUS09CF/bY8aoOK5PrdYtJQmdByM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555601982; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; 
bh=VUF0ZJBJ+yWZWMf4cJvspuyHF3R5x5chQEbqq2Q5jHs=; b=j9VQC+kxa4O/KuYaX6K0YtDosf12vESuylXRcOkJP/Npq9kDAcXjvopxJ6CqqARqBUGr1p4W53JAB6BsKWO1leQBiwnQxmDl3ngvqNuDFumnUBLGB4d8dAvh9Tak5+YXS0q2sDUrXmNVtlIJjt4ocXOLJjKDf7Ae5CbnkyI8Eag= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 155560198240696.75461713409038; Thu, 18 Apr 2019 08:39:42 -0700 (PDT) Received: from localhost ([127.0.0.1]:43148 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH98U-00034F-9W for importer@patchew.org; Thu, 18 Apr 2019 11:39:38 -0400 Received: from eggs.gnu.org ([209.51.188.92]:43722) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH940-0007d7-QT for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:35:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH8yW-00033Z-LC for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:22 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36797 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH8yW-00032V-8x for qemu-devel@nongnu.org; Thu, 18 Apr 2019 11:29:20 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 12BA71A246E; Thu, 18 Apr 2019 17:29:18 +0200 (CEST) Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id DBF931A2462; Thu, 18 Apr 2019 17:29:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Mateja Marjanovic To: qemu-devel@nongnu.org Date: Thu, 18 Apr 2019 17:29:07 +0200 Message-Id: <1555601350-4176-4-git-send-email-mateja.marjanovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: 
<1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v9 3/6] target/mips: Optimize ILVL. MSA instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Type: text/plain; charset="utf-8" From: Mateja Marjanovic Optimize ILVL. instructions, using directly tcg registers and logic performed on them, and instead of shifting the bit mask or assigning a new tcg constant to the bit mask, assign a new (shifted) uint64_t value to the bit mask. Performance measurement is done by executing the instructions 10 million times on a computer with Intel Core i7-3770 CPU @ 3.40GHz=C3=978. 
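The per-element mask stepping described above can be modelled on the host as a loop in which the mask takes a new shifted value for each source byte. This is only a sketch with hypothetical names, assuming byte i of a 64-bit half sits at bits 8*i..8*i+7 and that the "left" half of a register is its upper 64 bits; the patch itself emits the equivalent sequence fully unrolled as TCG operations.

```c
#include <assert.h>
#include <stdint.h>

/* Result of the model: lower and upper 64-bit halves of wd. */
struct ilvl_result {
    uint64_t lo;
    uint64_t hi;
};

/*
 * Interleave four consecutive bytes of the upper halves of wt and ws,
 * starting at byte index 'first' (hypothetical helper).
 */
static inline uint64_t interleave4_bytes(uint64_t ws_hi, uint64_t wt_hi,
                                         unsigned first)
{
    uint64_t out = 0;

    for (unsigned i = 0; i < 4; i++) {
        /* A new shifted mask value for every source element. */
        uint64_t mask = 0xffULL << (8 * (first + i));
        uint64_t t = (wt_hi & mask) >> (8 * (first + i));
        uint64_t s = (ws_hi & mask) >> (8 * (first + i));

        out |= t << (16 * i);      /* wt element goes to the even slot */
        out |= s << (16 * i + 8);  /* ws element goes to the odd slot  */
    }
    return out;
}

/* Model of ILVL.B: only the upper halves of ws and wt are read. */
static inline struct ilvl_result ilvl_b_model(uint64_t ws_hi, uint64_t wt_hi)
{
    struct ilvl_result r;

    r.lo = interleave4_bytes(ws_hi, wt_hi, 0);
    r.hi = interleave4_bytes(ws_hi, wt_hi, 4);
    return r;
}
```

Because both result halves depend only on the upper source halves, the lower half of wd can be written before the upper half is computed even when wd aliases ws or wt.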
==========================================================
|| instruction ||   BEFORE   || LOOP UNROLL ||   TCG    ||
==========================================================
|| ilvl.b      || 107.069 ms ||  55.619 ms  || 7.735 ms ||
|| ilvl.h      ||  83.340 ms ||  31.320 ms  || 3.797 ms ||
|| ilvl.w      || 109.448 ms ||  31.714 ms  || 2.381 ms ||
|| ilvl.d      ||  31.557 ms ||  28.716 ms  || 2.029 ms ||
==========================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   1 -
 target/mips/msa_helper.c |   8 ---
 target/mips/translate.c  | 184 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 183 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 2f23b0d..85c8b17 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index a500c59..f9b85fc 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1184,14 +1184,6 @@ MSA_FN_DF(pckod_df)
 
 #define MSA_DO(DF)                      \
     do {                                \
-        pwx->DF[2*i] = L##DF(pwt, i);   \
-        pwx->DF[2*i+1] = L##DF(pws, i); \
-    }
while (0) -MSA_FN_DF(ilvl_df) -#undef MSA_DO - -#define MSA_DO(DF) \ - do { \ pwx->DF[2*i] =3D R##DF(pwt, i); \ pwx->DF[2*i+1] =3D R##DF(pws, i); \ } while (0) diff --git a/target/mips/translate.c b/target/mips/translate.c index 930ef3a..d9aef77 100644 --- a/target/mips/translate.c +++ b/target/mips/translate.c @@ -28002,6 +28002,173 @@ static void gen_msa_bit(CPUMIPSState *env, DisasC= ontext *ctx) } =20 /* + * [MSA] ILVL.B wd, ws, wt + * + * Vector Interleave Left (byte data elements) + * + */ +static inline void gen_ilvl_b(CPUMIPSState *env, uint32_t wd, + uint32_t ws, uint32_t wt) +{ + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 t2 =3D tcg_temp_new_i64(); + uint64_t mask =3D 0x00000000000000ffULL; + + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 8); + tcg_gen_or_i64(t2, t2, t1); + + mask =3D 0x000000000000ff00ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 8); + tcg_gen_or_i64(t2, t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + + mask =3D 0x0000000000ff0000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 16); + tcg_gen_or_i64(t2, t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 24); + tcg_gen_or_i64(t2, t2, t1); + + mask =3D 0x00000000ff000000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 24); + tcg_gen_or_i64(t2, t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shli_i64(t1, t1, 32); + tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1); + + mask =3D 0x000000ff00000000ULL; + tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 32); + tcg_gen_mov_i64(t2, t1); + tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask); + tcg_gen_shri_i64(t1, t1, 24); + tcg_gen_or_i64(t2, t2, t1); + + mask =3D 
0x0000ff0000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 24);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00ff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xff00000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVL.H wd, ws, wt
+ *
+ * Vector Interleave Left (halfword data elements)
+ *
+ */
+static inline void gen_ilvl_h(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x000000000000ffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ffff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0x0000ffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xffff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVL.W wd, ws, wt
+ *
+ * Vector Interleave Left (word data elements)
+ *
+ */
+static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x00000000ffffffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0xffffffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVL.D wd, ws, wt
+ *
+ * Vector Interleave Left (doubleword data elements)
+ *
+ */
+static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
+}
+
+/*
  * [MSA] ILVOD. wd, ws, wt
  *
  * Vector Interleave Odd ( data elements)
@@ -28265,7 +28432,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_div_s_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVL_df:
-        gen_helper_msa_ilvl_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvl_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvl_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvl_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvl_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BNEG_df:
         gen_helper_msa_bneg_df(cpu_env, tdf, twd, tws, twt);
-- 
2.7.4

From nobody Tue May 7 07:58:55 2024
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 17:29:08 +0200
Message-Id: <1555601350-4176-5-git-send-email-mateja.marjanovic@rt-rk.com>
In-Reply-To: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v9 4/6] target/mips: Optimize ILVR. MSA instructions
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

Optimize the ILVR. instructions by operating directly on TCG registers
instead of calling helpers. Rather than shifting the bit mask or assigning
a new TCG constant to it, assign a new (shifted) uint64_t value to the
bit mask.

Performance was measured by executing each instruction 10 million times
on a computer with an Intel Core i7-3770 CPU @ 3.40GHz×8.

===========================================================
|| instruction ||   BEFORE    || LOOP UNROLL ||   TCG    ||
===========================================================
|| ilvr.b      ||  106.461 ms ||   52.131 ms || 7.813 ms ||
|| ilvr.h      ||   82.962 ms ||   36.222 ms || 3.622 ms ||
|| ilvr.w      ||  109.451 ms ||   33.042 ms || 2.331 ms ||
|| ilvr.d      ||   32.270 ms ||   27.328 ms || 2.025 ms ||
===========================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   1 -
 target/mips/msa_helper.c |   8 ---
 target/mips/translate.c  | 184 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 183 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 85c8b17..c1681da 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index f9b85fc..4cb0929 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1181,14 +1181,6 @@ MSA_FN_DF(pckev_df)
     } while (0)
 MSA_FN_DF(pckod_df)
 #undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = R##DF(pwt, i); \
-        pwx->DF[2*i+1] = R##DF(pws, i); \
-    } while (0)
-MSA_FN_DF(ilvr_df)
-#undef MSA_DO
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index d9aef77..214736c 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28002,6 +28002,173 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
 }
 
 /*
+ * [MSA] ILVR.B wd, ws, wt
+ *
+ * Vector Interleave Right (byte data elements)
+ *
+ */
+static inline void gen_ilvr_b(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x00000000000000ffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x000000000000ff00ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x0000000000ff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 24);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ff000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 24);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0x000000ff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 24);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x0000ff0000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 24);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00ff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xff00000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 8);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.H wd, ws, wt
+ *
+ * Vector Interleave Right (halfword data elements)
+ *
+ */
+static inline void gen_ilvr_h(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x000000000000ffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ffff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0x0000ffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xffff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.W wd, ws, wt
+ *
+ * Vector Interleave Right (word data elements)
+ *
+ */
+static inline void gen_ilvr_w(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x00000000ffffffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0xffffffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.D wd, ws, wt
+ *
+ * Vector Interleave Right (doubleword data elements)
+ *
+ */
+static inline void gen_ilvr_d(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
+/*
  * [MSA] ILVL.B wd, ws, wt
  *
  * Vector Interleave Left (byte data elements)
@@ -28468,7 +28635,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_div_u_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVR_df:
-        gen_helper_msa_ilvr_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvr_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvr_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvr_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvr_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSL_df:
         gen_helper_msa_binsl_df(cpu_env, tdf, twd, tws, twt);
-- 
2.7.4

From nobody Tue May 7 07:58:55 2024
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 17:29:09 +0200
Message-Id: <1555601350-4176-6-git-send-email-mateja.marjanovic@rt-rk.com>
In-Reply-To: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v9 5/6] target/mips: Merge implementation of ILVEV.D and ILVR.D
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

The implementations of the ILVEV.D and ILVR.D instructions are equivalent,
so use a single handler for both of them.

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/translate.c | 29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/target/mips/translate.c b/target/mips/translate.c
index 214736c..019a2c0 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28156,19 +28156,6 @@ static inline void gen_ilvr_w(CPUMIPSState *env, uint32_t wd,
 }
 
 /*
- * [MSA] ILVR.D wd, ws, wt
- *
- * Vector Interleave Right (doubleword data elements)
- *
- */
-static inline void gen_ilvr_d(CPUMIPSState *env, uint32_t wd,
-                              uint32_t ws, uint32_t wt)
-{
-    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
-    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
-}
-
-/*
  * [MSA] ILVL.B wd, ws, wt
  *
  * Vector Interleave Left (byte data elements)
@@ -28469,11 +28456,17 @@ static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
 /*
  * [MSA] ILVEV.D wd, ws, wt
  *
- * Vector Interleave Even (Doubleword data elements)
+ * Vector Interleave Even (doubleword data elements)
+ *
+ * [MSA] ILVR.D wd, ws, wt
+ *
+ * Vector Interleave Right (doubleword data elements)
+ *
+ * These two instructions are functionally equivalent.
  *
  */
-static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
-                               uint32_t ws, uint32_t wt)
+static inline void gen_ilvev_ilvr_d(CPUMIPSState *env, uint32_t wd,
+                                    uint32_t ws, uint32_t wt)
 {
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
     tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
@@ -28646,7 +28639,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvr_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvr_d(env, wd, ws, wt);
+            gen_ilvev_ilvr_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
@@ -28676,7 +28669,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvev_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvev_d(env, wd, ws, wt);
+            gen_ilvev_ilvr_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
-- 
2.7.4

From nobody Tue May 7 07:58:55 2024
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 17:29:10 +0200
Message-Id: <1555601350-4176-7-git-send-email-mateja.marjanovic@rt-rk.com>
In-Reply-To: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555601350-4176-1-git-send-email-mateja.marjanovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v9 6/6] target/mips: Merge implementation of ILVOD.D and ILVL.D
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net

From: Mateja Marjanovic

The implementations of the ILVOD.D and ILVL.D instructions are equivalent,
so use a single handler for both of them.

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/translate.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/mips/translate.c b/target/mips/translate.c
index 019a2c0..020a659 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28310,19 +28310,6 @@ static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd,
 }
 
 /*
- * [MSA] ILVL.D wd, ws, wt
- *
- * Vector Interleave Left (doubleword data elements)
- *
- */
-static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd,
-                              uint32_t ws, uint32_t wt)
-{
-    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
-    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
-}
-
-/*
  * [MSA] ILVOD. wd, ws, wt
  *
  * Vector Interleave Odd ( data elements)
@@ -28388,9 +28375,15 @@ static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
  *
  * Vector Interleave Odd (doubleword data elements)
  *
+ * [MSA] ILVL.D wd, ws, wt
+ *
+ * Vector Interleave Left (doubleword data elements)
+ *
+ * These two instructions are functionally equivalent.
+ *
  */
-static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
-                               uint32_t ws, uint32_t wt)
+static inline void gen_ilvod_ilvl_d(CPUMIPSState *env, uint32_t wd,
+                                    uint32_t ws, uint32_t wt)
 {
     tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
@@ -28603,7 +28596,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvl_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvl_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
@@ -28699,7 +28692,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvod_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvod_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
-- 
2.7.4
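
For illustration only (this is not part of the patch), the equivalence of ILVOD.D and ILVL.D stated above can be checked with a small element-wise model in plain C. The type `vec128`, the constant `NELEM_D`, and all function names below are hypothetical. The index rules are assumptions read off the two handlers being merged: ILVL takes the upper-half elements and ILVOD the odd-indexed ones, with wt supplying the even result slots and ws the odd ones.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 128-bit MSA register as two doublewords;
 * d[0] is the least significant half. This is not QEMU's wr_t.
 */
typedef struct {
    uint64_t d[2];
} vec128;

enum { NELEM_D = 2 }; /* doubleword elements per 128-bit register */

/* ILVL (assumed rule): interleave the left (upper-half) elements. */
static vec128 ilvl_d(vec128 ws, vec128 wt)
{
    vec128 wd;
    for (int i = 0; i < NELEM_D / 2; i++) {
        wd.d[2 * i]     = wt.d[i + NELEM_D / 2];
        wd.d[2 * i + 1] = ws.d[i + NELEM_D / 2];
    }
    return wd;
}

/* ILVOD (assumed rule): interleave the odd-indexed elements. */
static vec128 ilvod_d(vec128 ws, vec128 wt)
{
    vec128 wd;
    for (int i = 0; i < NELEM_D / 2; i++) {
        wd.d[2 * i]     = wt.d[2 * i + 1];
        wd.d[2 * i + 1] = ws.d[2 * i + 1];
    }
    return wd;
}

/* Returns 1 when both models produce the same register contents. */
static int ilvl_ilvod_d_agree(vec128 ws, vec128 wt)
{
    vec128 a = ilvl_d(ws, wt);
    vec128 b = ilvod_d(ws, wt);
    return a.d[0] == b.d[0] && a.d[1] == b.d[1];
}
```

With only two doubleword elements per register, the single upper-half element is also the single odd-indexed element, so both loops reduce to wd[0] = wt[1] and wd[1] = ws[1], which matches the two moves in gen_ilvod_ilvl_d(). The model passes registers by value, so it says nothing about the case where wd aliases ws or wt.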