From nobody Tue May 7 00:32:54 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1555587966; cv=none; d=zoho.com; s=zohoarc; b=Mv3qZO0bEG3EHTiJsL6GTg6kn3iUIUbBEwN8xIUL36/TuNbJICiZU5hqlzEbluQ2i8OWSDWj2mEtwsz+ikH52oiQRYmx+5KwkrIMz+S70hSIW/zl8HaMKchd7L3xnpKAd4wEg5IUNAscAj3qBWFsCO/uN72Os1sieQmgrFIOcBM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555587966; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=11zWZ80+JW4ebIGBKoUD5cJ0TOpCyOllwu3l65786XI=; b=AbJfcvmHkPRiL4nwn/OdOnZsH5WQrDkXE5Yg0dus5xBijXzpYCrQcVicQPlo3nXg2x8H22yx0Zn0sqA4JN744WJ+Y37jCuzq9VUb9hVccjLD+0PPQxYVWQmbudqIeHkcx0Ss5vUmIQZ6H9S4hiMePI0c+8eB6pZn0iK3BgstBJ8= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1555587966727572.4860428031052; Thu, 18 Apr 2019 04:46:06 -0700 (PDT) Received: from localhost ([127.0.0.1]:40019 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5UL-0004CC-5r for importer@patchew.org; Thu, 18 Apr 2019 07:45:57 -0400 Received: from eggs.gnu.org ([209.51.188.92]:53511) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5SV-000374-Oe for qemu-devel@nongnu.org; Thu, 18 Apr 
2019 07:44:05 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH5ST-0006N5-5o for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:03 -0400
Received: from mx2.rt-rk.com ([89.216.37.149]:45474 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH5SP-0004tR-Mp for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:01 -0400
Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id D1DBA1A2436; Thu, 18 Apr 2019 13:42:52 +0200 (CEST)
Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id A66C51A2023; Thu, 18 Apr 2019 13:42:52 +0200 (CEST)
X-Virus-Scanned: amavisd-new at rt-rk.com
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:41 +0200
Message-Id: <1555587766-985-2-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel] [PATCH v8 1/6] target/mips: Optimize ILVOD. MSA instructions
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

Optimize the set of ILVOD MSA instructions (ILVOD.B, ILVOD.H, ILVOD.W
and ILVOD.D) by operating directly on tcg registers and performing the
logic on them, instead of calling helpers.

In the following table, the first column is the performance before this
patch. The second represents the performance after converting from
helpers to tcg, but without using tcg_gen_deposit function. The third one
is with the deposit function and with using a uint64_t constant bit mask,
and the fourth is with the deposit function and with a mask which is a
tcg constant. The fourth is implemented in this patch.

Performance measurement is done by executing the instructions 10 million
times on a computer with Intel Core i7-3770 CPU @ 3.40GHz×8.

==================================================================
|| instruction ||     1     ||    2     ||    3     ||    4     ||
==================================================================
|| ilvod.b     || 117.50 ms || 24.13 ms || 24.45 ms || 23.24 ms ||
|| ilvod.h     ||  93.16 ms || 24.21 ms || 24.28 ms || 23.20 ms ||
|| ilvod.w     || 119.90 ms || 24.15 ms || 23.19 ms || 22.95 ms ||
|| ilvod.d     ||  43.01 ms || 21.17 ms || 23.07 ms || 22.59 ms ||
==================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVOD.W.

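To make the difference between the no-deposit and the deposit variants of
ILVOD.W concrete, the following plain-C sketch models the arithmetic that
the generated tcg ops perform on one 64-bit half of the vector register.
It is an illustration only, not code from this patch: deposit64_model(),
ilvod_w_mask_shift(), ilvod_w_deposit() and ilvod_w_selfcheck() are
hypothetical names, and deposit64_model() assumes the usual deposit
semantics (replace len bits of the base value, starting at bit ofs, with
the low len bits of the field).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical plain-C model of a 64-bit deposit: replace `len` bits of
 * `base`, starting at bit `ofs`, with the low `len` bits of `field`.
 * Assumes 0 < len < 64 and ofs + len <= 64.
 */
static uint64_t deposit64_model(uint64_t base, uint64_t field,
                                unsigned ofs, unsigned len)
{
    uint64_t mask = ((UINT64_C(1) << len) - 1) << ofs;

    return (base & ~mask) | ((field << ofs) & mask);
}

/* No-deposit variant: two ANDs, one shift and one OR per 64-bit half. */
static uint64_t ilvod_w_mask_shift(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = UINT64_C(0xffffffff00000000);

    return ((wt & mask) >> 32) | (ws & mask);
}

/* Deposit variant: one shift and one deposit per 64-bit half. */
static uint64_t ilvod_w_deposit(uint64_t ws, uint64_t wt)
{
    return deposit64_model(ws, wt >> 32, 0, 32);
}

/* Both variants must agree for any pair of inputs. */
static void ilvod_w_selfcheck(uint64_t ws, uint64_t wt)
{
    assert(ilvod_w_mask_shift(ws, wt) == ilvod_w_deposit(ws, wt));
}
```

For ws = 0x1111111122222222 and wt = 0x3333333344444444 both functions
return 0x1111111133333333: the odd word of wt lands in the even slot and
the odd word of ws stays where it is.
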
No-deposit version of the ILVOD.W implementation:

static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    TCGv_i64 mask = tcg_const_i64(0xffffffff00000000ULL);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_shri_i64(t1, t1, 32);
    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(mask);
    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Reviewed-by: Richard Henderson
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  7 ----
 target/mips/translate.c  | 91 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 90 insertions(+), 9 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index a6d687e..d162836 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -865,7 +865,6 @@ DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index c74e3cd..9e52a31 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1206,13 +1206,6 @@ MSA_FN_DF(ilvr_df)
 MSA_FN_DF(ilvev_df)
 #undef MSA_DO
 
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = pwt->DF[2*i+1]; \
-        pwx->DF[2*i+1] = pws->DF[2*i+1]; \
-    } while (0)
-MSA_FN_DF(ilvod_df)
-#undef MSA_DO
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 364bd6d..99bd441 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28001,6 +28001,80 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
     tcg_temp_free_i32(tws);
 }
 
+/*
+ * [MSA] ILVOD. wd, ws, wt
+ *
+ * Vector Interleave Odd ( data elements)
+ *
+ */
+static inline void gen_ilvod_bh(CPUMIPSState *env, uint32_t wd,
+                                uint32_t ws, uint32_t wt,
+                                uint64_t mask, uint32_t shift)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 mask_tcg = tcg_const_i64(mask);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg);
+    tcg_gen_shri_i64(t1, t1, shift);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg);
+    tcg_gen_shri_i64(t1, t1, shift);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
+
+    tcg_temp_free_i64(mask_tcg);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+static inline void gen_ilvod_b(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvod_bh(env, wd, ws, wt, 0xff00ff00ff00ff00ULL, 8);
+}
+
+static inline void gen_ilvod_h(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvod_bh(env, wd, ws, wt, 0xffff0000ffff0000ULL, 16);
+}
+
+/*
+ * [MSA] ILVOD.W wd, ws, wt
+ *
+ * Vector Interleave Odd (word data elements)
+ *
+ */
+static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t1, msa_wr_d[wt * 2], 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[ws * 2], t1, 0, 32);
+
+    tcg_gen_shri_i64(t1, msa_wr_d[wt * 2 + 1], 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1], t1, 0, 32);
+
+    tcg_temp_free_i64(t1);
+}
+
+/*
+ * [MSA] ILVOD.D wd, ws, wt
+ *
+ * Vector Interleave Odd (doubleword data elements)
+ *
+ */
+static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
+}
+
 static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
 {
 #define MASK_MSA_3R(op)    (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
@@ -28172,7 +28246,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_mod_u_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVOD_df:
-        gen_helper_msa_ilvod_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvod_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvod_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvod_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvod_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
 
     case OPC_DOTP_S_df:
-- 
2.7.4

From nobody Tue May 7 00:32:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1555587970; cv=none; d=zoho.com; s=zohoarc; b=gYb6v3rNJIhGD20E+0fCe7IR89hN+1rQxOAXRSzcqAxI4DNlWpQGW7gT4CJZqS025wtZzZCdZzUC26PNMM43aF5SWLHiEWqs9WwYLDhORWGcPHuPC6LSQa8/oQGzwCZw8oxJ+rwRa5LDRbhmAOUdJguafb5y84evt4xSKz+9nZ0=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555587970; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results;
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:42 +0200
Message-Id: <1555587766-985-3-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To:
<1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel] [PATCH v8 2/6] target/mips: Optimize ILVEV. MSA instructions
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

Optimize the set of ILVEV MSA instructions (ILVEV.B, ILVEV.H, ILVEV.W
and ILVEV.D) by operating directly on tcg registers and performing the
logic on them, instead of calling helpers.

In the following table, the first column is the performance before this
patch. The second is the performance after converting from helpers to
tcg, but without using the tcg_gen_deposit function. The third is with
the tcg_gen_deposit function and a uint64_t constant bit mask, and the
fourth is with the tcg_gen_deposit function and a mask that is a tcg
constant. The fourth is what this patch implements.

Performance was measured by executing each instruction 10 million times
on a computer with an Intel Core i7-3770 CPU @ 3.40GHz×8.

==================================================================
|| instruction ||     1     ||    2     ||    3     ||    4     ||
==================================================================
|| ilvev.b     || 126.92 ms || 24.52 ms || 25.19 ms || 23.89 ms ||
|| ilvev.h     ||  93.67 ms || 23.92 ms || 24.76 ms || 24.31 ms ||
|| ilvev.w     || 117.86 ms || 23.83 ms || 21.84 ms || 21.99 ms ||
|| ilvev.d     ||  45.49 ms || 19.74 ms || 20.21 ms || 20.07 ms ||
==================================================================

1 - before
2 - no-deposit-no-mask-as-tcg-constant
3 - with-deposit-no-mask-as-tcg-constant
4 - with-deposit-with-mask-as-tcg-constant (final)

The deposit function is used only in ILVEV.W.

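The relationship between the element-wise helper loop that this patch
removes and the tcg sequences that replace it can be checked with a
plain-C model of one 64-bit half of the register. This is a sketch under
stated assumptions, not code from the patch: ilvev_ref(),
ilvev_mask_shift(), ilvev_w_deposit(), ilvev_selfcheck() and
deposit64_model() are hypothetical names, and element 0 is taken to be
the least significant lane of the 64-bit value, as the masks used for
the byte and halfword cases imply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical plain-C model of a 64-bit deposit: replace `len` bits of
 * `base`, starting at bit `ofs`, with the low `len` bits of `field`.
 * Assumes 0 < len < 64 and ofs + len <= 64.
 */
static uint64_t deposit64_model(uint64_t base, uint64_t field,
                                unsigned ofs, unsigned len)
{
    uint64_t mask = ((UINT64_C(1) << len) - 1) << ofs;

    return (base & ~mask) | ((field << ofs) & mask);
}

/*
 * Element-wise reference for one 64-bit half, following the loop body of
 * the removed helper: wd[2i] = wt[2i], wd[2i+1] = ws[2i].
 * `bits` is the element width: 8, 16 or 32.
 */
static uint64_t ilvev_ref(uint64_t ws, uint64_t wt, unsigned bits)
{
    const uint64_t elem = (UINT64_C(1) << bits) - 1;
    uint64_t out = 0;

    for (unsigned pos = 0; pos < 64; pos += 2 * bits) {
        out |= ((wt >> pos) & elem) << pos;           /* even slot from wt */
        out |= ((ws >> pos) & elem) << (pos + bits);  /* odd slot from ws  */
    }
    return out;
}

/* Mask-and-shift form, as used for byte and halfword elements. */
static uint64_t ilvev_mask_shift(uint64_t ws, uint64_t wt,
                                 uint64_t mask, unsigned shift)
{
    return (wt & mask) | ((ws & mask) << shift);
}

/* Deposit form, as used for word elements. */
static uint64_t ilvev_w_deposit(uint64_t ws, uint64_t wt)
{
    return deposit64_model(wt, ws, 32, 32);
}

/* All three element widths must match the element-wise reference. */
static void ilvev_selfcheck(uint64_t ws, uint64_t wt)
{
    assert(ilvev_mask_shift(ws, wt, UINT64_C(0x00ff00ff00ff00ff), 8)
           == ilvev_ref(ws, wt, 8));
    assert(ilvev_mask_shift(ws, wt, UINT64_C(0x0000ffff0000ffff), 16)
           == ilvev_ref(ws, wt, 16));
    assert(ilvev_w_deposit(ws, wt) == ilvev_ref(ws, wt, 32));
}
```

With ws = 0x0807060504030201 and wt = 0x1817161514131211, the byte case
gives 0x0717051503130111 and the word case gives 0x0403020114131211.
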
No-deposit version of the ILVEV.W implementation:

static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
                               uint32_t ws, uint32_t wt)
{
    TCGv_i64 t1 = tcg_temp_new_i64();
    TCGv_i64 t2 = tcg_temp_new_i64();
    uint64_t mask = 0x00000000ffffffffULL;

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);

    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
    tcg_gen_andi_i64(t2, msa_wr_d[ws * 2 + 1], mask);
    tcg_gen_shli_i64(t2, t2, 32);
    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);

    tcg_temp_free_i64(t1);
    tcg_temp_free_i64(t2);
}

Reviewed-by: Richard Henderson
Suggested-by: Aleksandar Markovic
Suggested-by: Philippe Mathieu-Daudé
Suggested-by: Richard Henderson
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |  1 -
 target/mips/msa_helper.c |  9 -----
 target/mips/translate.c  | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index d162836..2f23b0d 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -864,7 +864,6 @@ DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 9e52a31..a500c59 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1197,15 +1197,6 @@ MSA_FN_DF(ilvl_df)
     } while (0)
 MSA_FN_DF(ilvr_df)
 #undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = pwt->DF[2*i]; \
-        pwx->DF[2*i+1] = pws->DF[2*i]; \
-    } while (0)
-MSA_FN_DF(ilvev_df)
-#undef MSA_DO
-
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 99bd441..930ef3a 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28075,6 +28075,76 @@ static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
 }
 
+
+/*
+ * [MSA] ILVEV. wd, ws, wt
+ *
+ * Vector Interleave Even ( data elements)
+ *
+ */
+static inline void gen_ilvev_bh(CPUMIPSState *env, uint32_t wd,
+                                uint32_t ws, uint32_t wt,
+                                uint64_t mask, uint32_t shift)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 mask_tcg = tcg_const_i64(mask);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask_tcg);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask_tcg);
+    tcg_gen_shli_i64(t2, t2, shift);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
+
+    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask_tcg);
+    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask_tcg);
+    tcg_gen_shli_i64(t2, t2, shift);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
+
+    tcg_temp_free_i64(mask_tcg);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+static inline void gen_ilvev_b(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvev_bh(env, wd, ws, wt, 0x00ff00ff00ff00ffULL, 8);
+}
+
+static inline void gen_ilvev_h(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    gen_ilvev_bh(env, wd, ws, wt, 0x0000ffff0000ffffULL, 16);
+}
+
+/*
+ * [MSA] ILVEV.W wd, ws, wt
+ *
+ * Vector Interleave Even (word data elements)
+ *
+ */
+static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2],
+                        msa_wr_d[ws * 2], 32, 32);
+    tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[wt * 2 + 1],
+                        msa_wr_d[ws * 2 + 1], 32, 32);
+}
+
+/*
+ * [MSA] ILVEV.D wd, ws, wt
+ *
+ * Vector Interleave Even (Doubleword data elements)
+ *
+ */
+static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
+                               uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
 static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
 {
 #define MASK_MSA_3R(op)    (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
@@ -28231,7 +28301,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_mod_s_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVEV_df:
-        gen_helper_msa_ilvev_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_ilvev_b(env, wd, ws, wt);
+            break;
+        case DF_HALF:
+            gen_ilvev_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvev_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvev_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSR_df:
         gen_helper_msa_binsr_df(cpu_env, tdf, twd, tws, twt);
-- 
2.7.4

From nobody Tue May 7 00:32:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1555587974; cv=none; d=zoho.com; s=zohoarc; b=ibvdm75ePivjvVipzicDl1i6+kKLffvKfpZ+XmyI6AsaK/bOU/0OOUHG5o4zPaD6QHJlB6CsieewygO23DoNP9gcHVUAxi69VIfz0+P8jyPrjgL777LkIZkYDKJjsXlKW79i6elmzPUPIxudG4X19RstjnoIUg0G/54cjS3MCf8=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555587974; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results;
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:43 +0200
Message-Id: <1555587766-985-4-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To:
<1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel] [PATCH v8 3/6] target/mips: Optimize ILVL. MSA instructions
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

Optimize the ILVL instructions (ILVL.B, ILVL.H, ILVL.W and ILVL.D) using
a hybrid approach. For byte data elements, use a helper with an unrolled
loop, which performs much better than direct tcg translation. For
halfword, word and doubleword data elements, operate directly on tcg
registers and perform the logic on them.

Performance was measured by executing each instruction 10 million times
on a computer with an Intel Core i7-3770 CPU @ 3.40GHz×8.

============================================================
||instruction||  helper  ||   tcg    ||      hybrid       ||
============================================================
|| ilvl.b    || 59.91 ms || 74.41 ms || 60.50 ms (helper) ||
|| ilvl.h    || 41.33 ms || 33.08 ms || 33.34 ms (tcg)    ||
|| ilvl.w    || 30.99 ms || 22.87 ms || 23.19 ms (tcg)    ||
|| ilvl.d    || 26.40 ms || 19.64 ms || 20.49 ms (tcg)    ||
============================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   3 +-
 target/mips/msa_helper.c |  33 +++++++++++----
 target/mips/translate.c  | 106 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index 2f23b0d..ba2af87 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
@@ -936,6 +935,8 @@ DEF_HELPER_4(msa_pcnt_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nloc_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nlzc_df, void, env, i32, i32, i32)
 
+DEF_HELPER_4(msa_ilvl_b, void, env, i32, i32, i32)
+
 DEF_HELPER_4(msa_fclass_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ftrunc_s_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ftrunc_u_df, void, env, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index a500c59..91beb1a 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1184,14 +1184,6 @@ MSA_FN_DF(pckod_df)
 
 #define MSA_DO(DF) \
     do { \
-        pwx->DF[2*i] = L##DF(pwt, i); \
-        pwx->DF[2*i+1] = L##DF(pws, i); \
-    } while (0)
-MSA_FN_DF(ilvl_df)
-#undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
         pwx->DF[2*i] = R##DF(pwt, i); \
         pwx->DF[2*i+1] = R##DF(pws, i); \
     } while (0)
@@ -1214,6 +1206,31 @@ MSA_FN_DF(vshf_df)
 #undef MSA_LOOP_COND
 #undef MSA_FN_DF
 
+void helper_msa_ilvl_b(CPUMIPSState *env, uint32_t wd,
+                       uint32_t ws, uint32_t wt)
+{
+    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
+    wr_t *pws = &(env->active_fpu.fpr[ws].wr);
+    wr_t *pwt = &(env->active_fpu.fpr[wt].wr);
+
+    pwd->b[0] = pwt->b[8];
+    pwd->b[1] = pws->b[8];
+    pwd->b[2] = pwt->b[9];
+    pwd->b[3] = pws->b[9];
+    pwd->b[4] = pwt->b[10];
+    pwd->b[5] = pws->b[10];
+    pwd->b[6] = pwt->b[11];
+    pwd->b[7] = pws->b[11];
+    pwd->b[8] = pwt->b[12];
+    pwd->b[9] = pws->b[12];
+    pwd->b[10] = pwt->b[13];
+    pwd->b[11] = pws->b[13];
+    pwd->b[12] = pwt->b[14];
+    pwd->b[13] = pws->b[14];
+    pwd->b[14] = pwt->b[15];
+    pwd->b[15] = pws->b[15];
+}
+
 void helper_msa_sldi_df(CPUMIPSState *env, uint32_t df, uint32_t wd,
                         uint32_t ws, uint32_t n)
 {
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 930ef3a..ce5c240 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28002,6 +28002,95 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
 }
 
 /*
+ * [MSA] ILVL.H wd, ws, wt
+ *
+ * Vector Interleave Left (halfword data elements)
+ *
+ */
+static inline void gen_ilvl_h(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x000000000000ffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ffff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0x0000ffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xffff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVL.W wd, ws, wt
+ *
+ * Vector Interleave Left (word data elements)
+ *
+ */
+static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x00000000ffffffffULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    mask = 0xffffffff00000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2 + 1], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVL.D wd, ws, wt
+ *
+ * Vector Interleave Left (doubleword data elements)
+ *
+ */
+static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
+}
+
+/*
  * [MSA] ILVOD. wd, ws, wt
  *
  * Vector Interleave Odd ( data elements)
@@ -28265,7 +28354,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_div_s_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVL_df:
-        gen_helper_msa_ilvl_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_helper_msa_ilvl_b(cpu_env, twd, tws, twt);
+            break;
+        case DF_HALF:
+            gen_ilvl_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvl_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvl_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BNEG_df:
         gen_helper_msa_bneg_df(cpu_env, tdf, twd, tws, twt);
-- 
2.7.4

From nobody Tue May 7 00:32:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1555588124; cv=none; d=zoho.com; s=zohoarc; b=UL56Jesusld78EdU+fqkXsYfbg8X6XictkgtVNfDPGQ6Ljohl3J6JdgfqsGh8Own2bayWy8xUZbMhtcs2v+zZBZOiJZdO10Iw75SIOv+axfsXJH1n3gr2gjC4tYgZjPWsmgmfKenFwFucF0hzi5FRSyANZLa1q4yDOCUKTLwuSo=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555588124; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=XAo6TI9fGblKXLiAvuyszc3kclbdhJBpdo6TlD/SX4M=;
    b=X7q+lyGorJmmkig3Lm0Jh0O0GruNbHWN2moq+6/CMIEIr3Lb0MP8Ft97Nt0DNRjmuxfxaTp3914Fz2mhVJImYgF/htH0mtxdimzMiDD1P3ou5hofCuWFX40BWbGseR1AvTHKmnCR7k+ajrtDzgaKts74n/KWp/NN7qItwd7Ijqg=
ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1555588124796427.0358824026396; Thu, 18 Apr 2019 04:48:44 -0700 (PDT)
Received: from localhost ([127.0.0.1]:40057 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5Wy-0006gM-Pz for importer@patchew.org; Thu, 18 Apr 2019 07:48:40 -0400
Received: from eggs.gnu.org ([209.51.188.92]:53543) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5SZ-00037S-9f for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH5SV-0006QC-GG for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:06 -0400
Received: from mx2.rt-rk.com ([89.216.37.149]:45552 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH5ST-0004tn-2z for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:01 -0400
Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 0195B1A2475; Thu, 18 Apr 2019 13:42:53 +0200 (CEST)
Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id CD73C1A1DF9; Thu, 18 Apr 2019 13:42:52 +0200 (CEST)
X-Virus-Scanned: amavisd-new at rt-rk.com
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:44 +0200
Message-Id: <1555587766-985-5-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References:
    <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel] [PATCH v8 4/6] target/mips: Optimize ILVR. MSA instructions
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

Optimize the ILVR.B, ILVR.H, ILVR.W and ILVR.D instructions using a
hybrid approach. For byte data elements, use a helper with an unrolled
loop, which performs better than direct tcg translation (see the table
below). For halfword, word and doubleword data elements, operate
directly on tcg registers instead of calling a helper.

Performance was measured by executing each instruction 10 million
times on a computer with an Intel Core i7-3770 CPU @ 3.40GHz×8.

==============================================================
|| instruction || helper   || tcg      || hybrid            ||
==============================================================
|| ilvr.b      || 62.87 ms || 74.76 ms || 61.49 ms (helper) ||
|| ilvr.h      || 44.11 ms || 33.00 ms || 33.69 ms (tcg)    ||
|| ilvr.w      || 34.97 ms || 23.06 ms || 23.01 ms (tcg)    ||
|| ilvr.d      || 27.33 ms || 19.87 ms || 19.65 ms (tcg)    ||
==============================================================

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/helper.h     |   2 +-
 target/mips/msa_helper.c |  33 +++++++++++----
 target/mips/translate.c  | 107 ++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 10 deletions(-)

diff --git a/target/mips/helper.h b/target/mips/helper.h
index ba2af87..e8d80b4 100644
--- a/target/mips/helper.h
+++ b/target/mips/helper.h
@@ -862,7 +862,6 @@ DEF_HELPER_5(msa_sld_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_splat_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
-DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
@@ -935,6 +934,7 @@ DEF_HELPER_4(msa_pcnt_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nloc_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nlzc_df, void, env, i32, i32, i32)
 
+DEF_HELPER_4(msa_ilvr_b, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ilvl_b, void, env, i32, i32,
i32)
 
 DEF_HELPER_4(msa_fclass_df, void, env, i32, i32, i32)
diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
index 91beb1a..530eee5 100644
--- a/target/mips/msa_helper.c
+++ b/target/mips/msa_helper.c
@@ -1181,14 +1181,6 @@ MSA_FN_DF(pckev_df)
     } while (0)
 MSA_FN_DF(pckod_df)
 #undef MSA_DO
-
-#define MSA_DO(DF) \
-    do { \
-        pwx->DF[2*i] = R##DF(pwt, i); \
-        pwx->DF[2*i+1] = R##DF(pws, i); \
-    } while (0)
-MSA_FN_DF(ilvr_df)
-#undef MSA_DO
 #undef MSA_LOOP_COND
 
 #define MSA_LOOP_COND(DF) \
@@ -1206,6 +1198,31 @@ MSA_FN_DF(vshf_df)
 #undef MSA_LOOP_COND
 #undef MSA_FN_DF
 
+void helper_msa_ilvr_b(CPUMIPSState *env, uint32_t wd,
+                       uint32_t ws, uint32_t wt)
+{
+    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
+    wr_t *pws = &(env->active_fpu.fpr[ws].wr);
+    wr_t *pwt = &(env->active_fpu.fpr[wt].wr);
+
+    pwd->b[15] = pws->b[7];
+    pwd->b[14] = pwt->b[7];
+    pwd->b[13] = pws->b[6];
+    pwd->b[12] = pwt->b[6];
+    pwd->b[11] = pws->b[5];
+    pwd->b[10] = pwt->b[5];
+    pwd->b[9] = pws->b[4];
+    pwd->b[8] = pwt->b[4];
+    pwd->b[7] = pws->b[3];
+    pwd->b[6] = pwt->b[3];
+    pwd->b[5] = pws->b[2];
+    pwd->b[4] = pwt->b[2];
+    pwd->b[3] = pws->b[1];
+    pwd->b[2] = pwt->b[1];
+    pwd->b[1] = pws->b[0];
+    pwd->b[0] = pwt->b[0];
+}
+
 void helper_msa_ilvl_b(CPUMIPSState *env, uint32_t wd,
                        uint32_t ws, uint32_t wt)
 {
diff --git a/target/mips/translate.c b/target/mips/translate.c
index ce5c240..d4bbfc3 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28002,6 +28002,96 @@ static void gen_msa_bit(CPUMIPSState *env, DisasContext *ctx)
 }
 
 /*
+ * [MSA] ILVR.H wd, ws, wt
+ *
+ * Vector Interleave Right (halfword data elements)
+ * (the upper doubleword is written first so that wd may alias ws or wt)
+ */
+static inline void gen_ilvr_h(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0x0000ffff00000000ULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0xffff000000000000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    mask = 0x000000000000ffffULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+
+    mask = 0x00000000ffff0000ULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 16);
+    tcg_gen_or_i64(t2, t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.W wd, ws, wt
+ *
+ * Vector Interleave Right (word data elements)
+ * (the upper doubleword is written first so that wd may alias ws or wt)
+ */
+static inline void gen_ilvr_w(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    uint64_t mask = 0xffffffff00000000ULL;
+
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_shri_i64(t1, t1, 32);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t2, t1);
+
+    mask = 0x00000000ffffffffULL;
+    tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
+    tcg_gen_mov_i64(t2, t1);
+    tcg_gen_andi_i64(t1, msa_wr_d[ws * 2], mask);
+    tcg_gen_shli_i64(t1, t1, 32);
+    tcg_gen_or_i64(msa_wr_d[wd * 2], t2, t1);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+/*
+ * [MSA] ILVR.D wd, ws, wt
+ *
+ * Vector Interleave Right (doubleword data elements)
+ *
+ */
+static inline void gen_ilvr_d(CPUMIPSState *env, uint32_t wd,
+                              uint32_t ws, uint32_t wt)
+{
+    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
+    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
+}
+
+
+/*
  * [MSA] ILVL.H wd, ws, wt
  *
  * Vector Interleave Left (halfword data elements)
@@ -28390,7 +28480,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
         gen_helper_msa_div_u_df(cpu_env, tdf, twd, tws, twt);
         break;
     case OPC_ILVR_df:
-        gen_helper_msa_ilvr_df(cpu_env, tdf, twd, tws, twt);
+        switch (df) {
+        case DF_BYTE:
+            gen_helper_msa_ilvr_b(cpu_env, twd, tws, twt);
+            break;
+        case DF_HALF:
+            gen_ilvr_h(env, wd, ws, wt);
+            break;
+        case DF_WORD:
+            gen_ilvr_w(env, wd, ws, wt);
+            break;
+        case DF_DOUBLE:
+            gen_ilvr_d(env, wd, ws, wt);
+            break;
+        default:
+            assert(0);
+        }
         break;
     case OPC_BINSL_df:
         gen_helper_msa_binsl_df(cpu_env, tdf, twd, tws, twt);
-- 
2.7.4

From nobody Tue May 7 00:32:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1555588219; cv=none; d=zoho.com; s=zohoarc; b=KtTyt/SvpWzKmNYeE8sVE1TidIdBiIHa0a+2Tt547yZPo24KVSl88EpQZf98IlC505iOag36vO56ivW9W1/DNc9PdTDHADisFSXWbmuHF+0WeUOiqapsYGJkDaeKOWpfDZ7Z76qkTZVT2zf5Wz4BPDo4qDWmXezR8hZ6Zn4ajFc=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555588219; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=gTXI7zwDFlTNHwaOwRPszY/SxxN5mGIwUfVzeuG5iYs=; b=UcKfNIOIPxb80JN9RkBqj1+y4b3Hg2Pv4XPn9OuwSTUL5hhUD9qnq/m3Ugon/U0Zg8KAIztmWEeKURNkWlN3hwIMaLZQ+E4aYzGlUqFNe513CO60odYVz6N/tsigsgU50YuYOXI3LL4Man0hatlqApvka0S3iHPULOmDhjEdw/k=
ARC-Authentication-Results: i=1; mx.zoho.com;
    spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 155558821990768.3533686377449; Thu, 18 Apr 2019 04:50:19 -0700 (PDT)
Received: from localhost ([127.0.0.1]:40067 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5YU-0007cS-0u for importer@patchew.org; Thu, 18 Apr 2019 07:50:14 -0400
Received: from eggs.gnu.org ([209.51.188.92]:53587) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5Sg-0003DM-8E for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH5Se-0006ap-Bv for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:14 -0400
Received: from mx2.rt-rk.com ([89.216.37.149]:50796 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH5Sa-0006NR-KU for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:10 -0400
Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 1BB601A20D0; Thu, 18 Apr 2019 13:42:53 +0200 (CEST)
Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id D98E91A2460; Thu, 18 Apr 2019 13:42:52 +0200 (CEST)
X-Virus-Scanned: amavisd-new at rt-rk.com
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:45 +0200
Message-Id: <1555587766-985-6-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel] [PATCH v8 5/6] target/mips: Merge implementation of
    ILVEV.D and ILVR.D
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

The implementations of the ILVEV.D and ILVR.D instructions are
equivalent, so use a single handler for both.

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/translate.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/target/mips/translate.c b/target/mips/translate.c
index d4bbfc3..c6aa995 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28078,20 +28078,6 @@ static inline void gen_ilvr_w(CPUMIPSState *env, uint32_t wd,
 }
 
 /*
- * [MSA] ILVR.D wd, ws, wt
- *
- * Vector Interleave Right (doubleword data elements)
- *
- */
-static inline void gen_ilvr_d(CPUMIPSState *env, uint32_t wd,
-                              uint32_t ws, uint32_t wt)
-{
-    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
-    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
-}
-
-
-/*
  * [MSA] ILVL.H wd, ws, wt
  *
  * Vector Interleave Left (halfword data elements)
@@ -28314,11 +28300,17 @@ static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
 /*
  * [MSA] ILVEV.D wd, ws, wt
  *
- * Vector Interleave Even (Doubleword data elements)
+ * Vector Interleave Even (doubleword data elements)
+ *
+ * [MSA] ILVR.D wd, ws, wt
+ *
+ * Vector Interleave Right (doubleword data elements)
+ *
+ * These two instructions are functionally equivalent.
  *
  */
-static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
-                               uint32_t ws, uint32_t wt)
+static inline void gen_ilvev_ilvr_d(CPUMIPSState *env, uint32_t wd,
+                                    uint32_t ws, uint32_t wt)
 {
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
     tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
@@ -28491,7 +28483,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvr_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvr_d(env, wd, ws, wt);
+            gen_ilvev_ilvr_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
@@ -28521,7 +28513,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvev_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvev_d(env, wd, ws, wt);
+            gen_ilvev_ilvr_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
-- 
2.7.4

From nobody Tue May 7 00:32:54 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
ARC-Seal: i=1; a=rsa-sha256; t=1555588122; cv=none; d=zoho.com; s=zohoarc; b=Ub9r4GACpXi2iQytXmJ44W/wT4arfGJKTLB8J5OQMykaJhHKCiga5tmuZHOI/9HXv5L/Q3x+dg5LUKTl/XeUbFqb9JM/x1q44hKstmkLg/6yNv+qKemrWUTh+0F1JpkfmjIXNNKfiynysmHRv7hzcsXjl3lYmTwVrEjSYyMEWfM=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1555588122; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=KNr2AyzRIzqkQ+z3Nxc8YYf8ydDGyduwB9pNw2z4JBI=; b=hoUm1u0bAOduGreBhiz9vpvaplPsPUmO29QoEGklbnxXAuMf6qV6A8PnqoBnieKg47WrclszeTD1310sceZTRt+7M5Cz22ngYuzOLCALU53TqCM+f6bX2mAy0cEiaYqGzgVTerHcXlEyasfDUcq4YWOd+o7j3AsZXa4DqmX/bU8=
ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1555588122626148.80730200594076; Thu, 18 Apr 2019 04:48:42 -0700 (PDT)
Received: from localhost ([127.0.0.1]:40055 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5Wx-0006fE-Iu for importer@patchew.org; Thu, 18 Apr 2019 07:48:39 -0400
Received: from eggs.gnu.org ([209.51.188.92]:53568) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hH5Sd-0003An-E5 for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:12 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hH5SY-0006Th-Op for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:10 -0400
Received: from mx2.rt-rk.com ([89.216.37.149]:50792 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hH5SV-0006NO-HP for qemu-devel@nongnu.org; Thu, 18 Apr 2019 07:44:04 -0400
Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 184A01A1DF9; Thu, 18 Apr 2019 13:42:53 +0200 (CEST)
Received: from rtrkw310-lin.domain.local (rtrkw310-lin.domain.local [10.10.13.97]) by mail.rt-rk.com (Postfix) with ESMTPSA id E51901A246A; Thu, 18 Apr 2019 13:42:52 +0200 (CEST)
X-Virus-Scanned: amavisd-new at rt-rk.com
From: Mateja Marjanovic
To: qemu-devel@nongnu.org
Date: Thu, 18 Apr 2019 13:42:46 +0200
Message-Id: <1555587766-985-7-git-send-email-mateja.marjanovic@rt-rk.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
References: <1555587766-985-1-git-send-email-mateja.marjanovic@rt-rk.com>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 89.216.37.149
Subject: [Qemu-devel]
    [PATCH v8 6/6] target/mips: Merge implementation of ILVOD.D and ILVL.D
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: arikalo@wavecomp.com, richard.henderson@linaro.org, philmd@redhat.com, amarkovic@wavecomp.com, aurelien@aurel32.net
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Mateja Marjanovic

The implementations of the ILVOD.D and ILVL.D instructions are
equivalent, so use a single handler for both.

Suggested-by: Aleksandar Markovic
Signed-off-by: Mateja Marjanovic
---
 target/mips/translate.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/mips/translate.c b/target/mips/translate.c
index c6aa995..4837c43 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -28154,19 +28154,6 @@ static inline void gen_ilvl_w(CPUMIPSState *env, uint32_t wd,
 }
 
 /*
- * [MSA] ILVL.D wd, ws, wt
- *
- * Vector Interleave Left (doubleword data elements)
- *
- */
-static inline void gen_ilvl_d(CPUMIPSState *env, uint32_t wd,
-                              uint32_t ws, uint32_t wt)
-{
-    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
-    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
-}
-
-/*
  * [MSA] ILVOD. wd, ws, wt
  *
  * Vector Interleave Odd ( data elements)
@@ -28232,9 +28219,15 @@ static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
  *
  * Vector Interleave Odd (doubleword data elements)
  *
+ * [MSA] ILVL.D wd, ws, wt
+ *
+ * Vector Interleave Left (doubleword data elements)
+ *
+ * These two instructions are functionally equivalent.
+ *
  */
-static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
-                               uint32_t ws, uint32_t wt)
+static inline void gen_ilvod_ilvl_d(CPUMIPSState *env, uint32_t wd,
+                                    uint32_t ws, uint32_t wt)
 {
     tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2 + 1]);
     tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
@@ -28447,7 +28440,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvl_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvl_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
@@ -28543,7 +28536,7 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
             gen_ilvod_w(env, wd, ws, wt);
             break;
         case DF_DOUBLE:
-            gen_ilvod_d(env, wd, ws, wt);
+            gen_ilvod_ilvl_d(env, wd, ws, wt);
             break;
         default:
             assert(0);
-- 
2.7.4
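
The equivalence of ILVOD.D and ILVL.D stated in the last commit message can be checked against the element-wise definitions. The following is a minimal sketch, not QEMU code. The names vreg, ilvod_d, ilvl_d and check_ilv_d_equivalence are hypothetical. The register is modeled as two 64-bit doublewords with index 1 as the upper one, which is assumed to match the msa_wr_d[w * 2 + 1] layout used in the patch. The element-wise formulas are assumed to be the usual MSA definitions of "interleave odd" and "interleave left".

```c
#include <assert.h>
#include <stdint.h>

#define N 2 /* doubleword elements per 128-bit register */

/* Hypothetical stand-in for a 128-bit MSA vector register. */
typedef struct {
    uint64_t d[N];
} vreg;

/* Element-wise "interleave odd": odd-indexed elements of wt and ws. */
static vreg ilvod_d(vreg ws, vreg wt)
{
    vreg wd;
    for (int i = 0; i < N / 2; i++) {
        wd.d[2 * i] = wt.d[2 * i + 1];
        wd.d[2 * i + 1] = ws.d[2 * i + 1];
    }
    return wd;
}

/* Element-wise "interleave left": upper-half elements of wt and ws. */
static vreg ilvl_d(vreg ws, vreg wt)
{
    vreg wd;
    for (int i = 0; i < N / 2; i++) {
        wd.d[2 * i] = wt.d[i + N / 2];
        wd.d[2 * i + 1] = ws.d[i + N / 2];
    }
    return wd;
}

/* With only two elements, both definitions select the same sources. */
static void check_ilv_d_equivalence(void)
{
    vreg ws = { { 0x1111111111111111ULL, 0x2222222222222222ULL } };
    vreg wt = { { 0x3333333333333333ULL, 0x4444444444444444ULL } };
    vreg a = ilvod_d(ws, wt);
    vreg b = ilvl_d(ws, wt);

    assert(a.d[0] == b.d[0] && a.d[1] == b.d[1]);
    /* Same data movement as the two tcg_gen_mov_i64() calls. */
    assert(a.d[0] == wt.d[1] && a.d[1] == ws.d[1]);
}
```

Calling check_ilv_d_equivalence() from a test harness exercises both models. Because the arguments are passed by value, the sketch does not cover the case where wd names the same register as ws or wt; that case would need a separate test against the generated TCG code.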