From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:13 +0200
Message-Id: <1561632985-24866-2-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Subject: [Qemu-devel] [PATCH v4 01/13] target/ppc: Optimize emulation of lvsl and lvsr instructions
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au
Content-Type: text/plain; charset="utf-8"

Add a simple macro that calls the TCG implementation of the appropriate instruction if AltiVec support is active.

Optimization of altivec instruction lvsl (Load Vector for Shift Left).
Place bytes sh:sh+15 of the 32-byte value X = 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in the destination register. sh is calculated by adding the two source registers and taking bits 60-63 of the result, i.e. the low four bits of the EA (bits 28-31 when the EA is 32 bits wide). First, these bits of the EA are placed in variable sh. After that, the bytes are created in the following way:

Bytes sh:(sh+7) of X are created by multiplying sh by 0x0101010101010101 and adding 0x0001020304050607 to the product. The value obtained is placed in the higher doubleword element of vD.

Bytes (sh+8):(sh+15) of X are created by adding 0x08090a0b0c0d0e0f to the result of the previous multiplication. The value obtained is placed in the lower doubleword element of vD.

Optimization of altivec instruction lvsr (Load Vector for Shift Right). Place bytes (16-sh):(31-sh) of the 32-byte value X = 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in the destination register. sh is calculated by adding the two source registers and taking bits 60-63 of the result, i.e. the low four bits of the EA (bits 28-31 when the EA is 32 bits wide). First, these bits of the EA are placed in variable sh. After that, the bytes are created in the following way:

Bytes (16-sh):(23-sh) of X are created by multiplying sh by 0x0101010101010101 and subtracting the product from 0x1011121314151617. The value obtained is placed in the higher doubleword element of vD.

Bytes (24-sh):(31-sh) of X are created by subtracting the result of the previous multiplication from 0x18191a1b1c1d1e1f. The value obtained is placed in the lower doubleword element of vD.
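The arithmetic described above can be checked outside of TCG with plain C. What follows is only a sketch under stated assumptions, not code from this patch: the function names (splat_sh, lvsl_sketch, lvsr_sketch, byte_at, lvs_sketch_matches_loops) are hypothetical, and the two doublewords are returned as ordinary uint64_t values with byte 0 taken to be the most significant byte of the higher doubleword. The trick works because no byte can carry or borrow into its neighbour: the largest byte produced for lvsl is 0x0f + 0x0f = 0x1e, and the smallest byte produced for lvsr is 0x10 - 0x0f = 0x01.

```c
#include <stdint.h>

/* Replicate the low four bits of the EA into every byte of a doubleword. */
static uint64_t splat_sh(uint64_t ea)
{
    return (ea & 0xf) * 0x0101010101010101ULL;
}

/* lvsl: bytes sh..sh+7 go to the higher doubleword, sh+8..sh+15 to the lower. */
static void lvsl_sketch(uint64_t ea, uint64_t *hi, uint64_t *lo)
{
    uint64_t s = splat_sh(ea);

    *hi = s + 0x0001020304050607ULL;
    *lo = s + 0x08090a0b0c0d0e0fULL;
}

/* lvsr: bytes 16-sh..23-sh go to the higher doubleword, 24-sh..31-sh to the lower. */
static void lvsr_sketch(uint64_t ea, uint64_t *hi, uint64_t *lo)
{
    uint64_t s = splat_sh(ea);

    *hi = 0x1011121314151617ULL - s;
    *lo = 0x18191a1b1c1d1e1fULL - s;
}

/* Byte i (0 = leftmost) of the 128-bit value hi:lo. */
static uint8_t byte_at(uint64_t hi, uint64_t lo, int i)
{
    uint64_t dw = (i < 8) ? hi : lo;

    return (uint8_t)(dw >> (56 - 8 * (i & 7)));
}

/*
 * Compare both sketches against the byte-by-byte definition (the same
 * result the removed loop helpers computed). Returns 1 if every byte
 * matches for every tested EA, 0 otherwise.
 */
static int lvs_sketch_matches_loops(void)
{
    for (uint64_t ea = 0; ea < 64; ea++) {
        uint64_t hi, lo;

        lvsl_sketch(ea, &hi, &lo);
        for (int i = 0; i < 16; i++) {
            if (byte_at(hi, lo, i) != (ea & 0xf) + i) {
                return 0;
            }
        }
        lvsr_sketch(ea, &hi, &lo);
        for (int i = 0; i < 16; i++) {
            if (byte_at(hi, lo, i) != 0x10 - (ea & 0xf) + i) {
                return 0;
            }
        }
    }
    return 1;
}
```

For example, with sh = 3 the lvsl sketch yields 0x030405060708090a for the higher doubleword and 0x0b0c0d0e0f101112 for the lower one, while the lvsr sketch yields 0x0d0e0f1011121314 and 0x15161718191a1b1c.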
Signed-off-by: Stefan Brankovic Reviewed-by: Richard Henderson --- target/ppc/helper.h | 2 - target/ppc/int_helper.c | 18 ------ target/ppc/translate/vmx-impl.inc.c | 121 ++++++++++++++++++++++++++------= ---- 3 files changed, 89 insertions(+), 52 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 02b67a3..c82105e 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -189,8 +189,6 @@ DEF_HELPER_2(vprtybw, void, avr, avr) DEF_HELPER_2(vprtybd, void, avr, avr) DEF_HELPER_2(vprtybq, void, avr, avr) DEF_HELPER_3(vsubcuw, void, avr, avr, avr) -DEF_HELPER_2(lvsl, void, avr, tl) -DEF_HELPER_2(lvsr, void, avr, tl) DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 8ce89f2..9505f4c 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -457,24 +457,6 @@ SATCVT(sd, uw, int64_t, uint32_t, 0, UINT32_MAX) #undef SATCVT #undef SATCVTU =20 -void helper_lvsl(ppc_avr_t *r, target_ulong sh) -{ - int i, j =3D (sh & 0xf); - - for (i =3D 0; i < ARRAY_SIZE(r->u8); i++) { - r->VsrB(i) =3D j++; - } -} - -void helper_lvsr(ppc_avr_t *r, target_ulong sh) -{ - int i, j =3D 0x10 - (sh & 0xf); - - for (i =3D 0; i < ARRAY_SIZE(r->u8); i++) { - r->VsrB(i) =3D j++; - } -} - void helper_mtvscr(CPUPPCState *env, uint32_t vscr) { env->vscr =3D vscr & ~(1u << VSCR_SAT); diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx= -impl.inc.c index 663275b..a9fe3c7 100644 --- a/target/ppc/translate/vmx-impl.inc.c +++ b/target/ppc/translate/vmx-impl.inc.c @@ -142,38 +142,6 @@ GEN_VR_STVE(bx, 0x07, 0x04, 1); GEN_VR_STVE(hx, 0x07, 0x05, 2); GEN_VR_STVE(wx, 0x07, 0x06, 4); =20 -static void gen_lvsl(DisasContext *ctx) -{ - TCGv_ptr rd; - TCGv EA; - if (unlikely(!ctx->altivec_enabled)) { - 
gen_exception(ctx, POWERPC_EXCP_VPU); - return; - } - EA =3D tcg_temp_new(); - gen_addr_reg_index(ctx, EA); - rd =3D gen_avr_ptr(rD(ctx->opcode)); - gen_helper_lvsl(rd, EA); - tcg_temp_free(EA); - tcg_temp_free_ptr(rd); -} - -static void gen_lvsr(DisasContext *ctx) -{ - TCGv_ptr rd; - TCGv EA; - if (unlikely(!ctx->altivec_enabled)) { - gen_exception(ctx, POWERPC_EXCP_VPU); - return; - } - EA =3D tcg_temp_new(); - gen_addr_reg_index(ctx, EA); - rd =3D gen_avr_ptr(rD(ctx->opcode)); - gen_helper_lvsr(rd, EA); - tcg_temp_free(EA); - tcg_temp_free_ptr(rd); -} - static void gen_mfvscr(DisasContext *ctx) { TCGv_i32 t; @@ -316,6 +284,16 @@ static void glue(gen_, name)(DisasContext *ctx) = \ tcg_temp_free_ptr(rd); \ } =20 +#define GEN_VXFORM_TRANS(name, opc2, opc3) \ +static void glue(gen_, name)(DisasContext *ctx) \ +{ \ + if (unlikely(!ctx->altivec_enabled)) { \ + gen_exception(ctx, POWERPC_EXCP_VPU); \ + return; \ + } \ + trans_##name(ctx); \ +} + #define GEN_VXFORM_ENV(name, opc2, opc3) \ static void glue(gen_, name)(DisasContext *ctx) \ { \ @@ -515,6 +493,83 @@ static void gen_vmrgow(DisasContext *ctx) tcg_temp_free_i64(avr); } =20 +/* + * lvsl VRT,RA,RB - Load Vector for Shift Left + * + * Let the EA be the sum (rA|0)+(rB). Let sh=3DEA[28=E2=80=9331]. + * Let X be the 32-byte value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F. + * Bytes sh:sh+15 of X are placed into vD. + */ +static void trans_lvsl(DisasContext *ctx) +{ + int VT =3D rD(ctx->opcode); + TCGv_i64 result =3D tcg_temp_new_i64(); + TCGv_i64 sh =3D tcg_temp_new_i64(); + TCGv EA =3D tcg_temp_new(); + + /* Get sh(from description) by anding EA with 0xf. */ + gen_addr_reg_index(ctx, EA); + tcg_gen_extu_tl_i64(sh, EA); + tcg_gen_andi_i64(sh, sh, 0xfULL); + + /* + * Create bytes sh:sh+7 of X(from description) and place them in + * higher doubleword of vD. 
+ */ + tcg_gen_muli_i64(sh, sh, 0x0101010101010101ULL); + tcg_gen_addi_i64(result, sh, 0x0001020304050607ull); + set_avr64(VT, result, true); + /* + * Create bytes sh+8:sh+15 of X(from description) and place them in + * lower doubleword of vD. + */ + tcg_gen_addi_i64(result, sh, 0x08090a0b0c0d0e0fULL); + set_avr64(VT, result, false); + + tcg_temp_free_i64(result); + tcg_temp_free_i64(sh); + tcg_temp_free(EA); +} + +/* + * lvsr VRT,RA,RB - Load Vector for Shift Right + * + * Let the EA be the sum (rA|0)+(rB). Let sh=3DEA[28=E2=80=9331]. + * Let X be the 32-byte value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F. + * Bytes (16-sh):(31-sh) of X are placed into vD. + */ +static void trans_lvsr(DisasContext *ctx) +{ + int VT =3D rD(ctx->opcode); + TCGv_i64 result =3D tcg_temp_new_i64(); + TCGv_i64 sh =3D tcg_temp_new_i64(); + TCGv EA =3D tcg_temp_new(); + + + /* Get sh(from description) by anding EA with 0xf. */ + gen_addr_reg_index(ctx, EA); + tcg_gen_extu_tl_i64(sh, EA); + tcg_gen_andi_i64(sh, sh, 0xfULL); + + /* + * Create bytes (16-sh):(23-sh) of X(from description) and place them = in + * higher doubleword of vD. + */ + tcg_gen_muli_i64(sh, sh, 0x0101010101010101ULL); + tcg_gen_subfi_i64(result, 0x1011121314151617ULL, sh); + set_avr64(VT, result, true); + /* + * Create bytes (24-sh):(31-sh) of X(from description) and place them = in + * lower doubleword of vD. 
+ */ + tcg_gen_subfi_i64(result, 0x18191a1b1c1d1e1fULL, sh); + set_avr64(VT, result, false); + + tcg_temp_free_i64(result); + tcg_temp_free_i64(sh); + tcg_temp_free(EA); +} + GEN_VXFORM(vmuloub, 4, 0); GEN_VXFORM(vmulouh, 4, 1); GEN_VXFORM(vmulouw, 4, 2); @@ -662,6 +717,8 @@ GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207, GEN_VXFORM_HETRO(vextubrx, 6, 28) GEN_VXFORM_HETRO(vextuhrx, 6, 29) GEN_VXFORM_HETRO(vextuwrx, 6, 30) +GEN_VXFORM_TRANS(lvsl, 6, 31) +GEN_VXFORM_TRANS(lvsr, 6, 32) GEN_VXFORM_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, \ vextuwrx, PPC_NONE, PPC2_ISA300) =20 --=20 2.7.4
From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:14 +0200
Message-Id: <1561632985-24866-3-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 02/13] target/ppc: Optimize emulation of vsl and vsr instructions
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Optimization of altivec instructions vsl and vsr (Vector Shift Left/Right). Perform a shift operation (left and right respectively) on the 128-bit value of register vA by the value specified in bits 125-127 of register vB. The lowest 3 bits in each byte element of register vB must be identical or the result is undefined.

For the vsl instruction, the first step is to save bits 125-127 of register vB in variable sh. Then, the highest sh bits of the lower doubleword element of register vA are saved in variable shifted, in order not to lose those bits when the shift operation is performed on the lower doubleword element of register vA, which is the next step. After shifting the lower doubleword element, the shift operation is performed on the higher doubleword element of vA, and the lowest sh bits (which are now 0) are replaced with the bits saved in shifted.

For the vsr instruction, bits 125-127 of register vB are likewise first saved in variable sh. Then, the lowest sh bits of the higher doubleword element of register vA are saved in variable shifted, in order not to lose those bits when the shift operation is performed on the higher doubleword element of register vA, which is the next step. After shifting the higher doubleword element, the shift operation is performed on the lower doubleword element of vA, and the highest sh bits (which are now 0) are replaced with the bits saved in shifted.
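The two-step shift described above can be modelled in plain C as follows. This is a minimal sketch under stated assumptions rather than the code in this patch: the type u128_sketch and the functions vsl_sketch and vsr_sketch are hypothetical names, and the carried bits are extracted with a split shift (32 followed by 32 - sh) instead of a single shift by 64 - sh. The split is a deliberate deviation: for sh == 0 a single shift would use a count of 64, which is undefined in C and, as far as I understand the TCG op documentation, outside the defined range for TCG shift ops as well. The sketch also does not check that the low 3 bits of every byte of vB are identical; it simply takes sh as an argument.

```c
#include <stdint.h>

/* Hypothetical 128-bit container: hi is the higher doubleword of the register. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_sketch;

/* Shift the 128-bit value left by sh (0..7) bits. */
static u128_sketch vsl_sketch(u128_sketch a, unsigned sh)
{
    u128_sketch r;
    uint64_t carry;

    sh &= 7;
    /* Highest sh bits of the lower doubleword; yields 0 when sh == 0. */
    carry = (a.lo >> 32) >> (32 - sh);
    r.lo = a.lo << sh;
    /* The lowest sh bits of the shifted higher doubleword are 0, so OR is safe. */
    r.hi = (a.hi << sh) | carry;
    return r;
}

/* Shift the 128-bit value right by sh (0..7) bits. */
static u128_sketch vsr_sketch(u128_sketch a, unsigned sh)
{
    u128_sketch r;
    uint64_t carry;

    sh &= 7;
    /* Lowest sh bits of the higher doubleword, moved to the top; 0 when sh == 0. */
    carry = (a.hi << 32) << (32 - sh);
    r.hi = a.hi >> sh;
    /* The highest sh bits of the shifted lower doubleword are 0, so OR is safe. */
    r.lo = (a.lo >> sh) | carry;
    return r;
}
```

For example, with hi = 0x0123456789abcdef, lo = 0xfedcba9876543210 and sh = 4, vsl_sketch returns hi = 0x123456789abcdeff, lo = 0xedcba98765432100, and vsr_sketch returns hi = 0x00123456789abcde, lo = 0xffedcba987654321. With sh = 0 both functions return the input unchanged.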
Signed-off-by: Stefan Brankovic Reviewed-by: Richard Henderson --- target/ppc/helper.h | 2 - target/ppc/int_helper.c | 35 ------------- target/ppc/translate/vmx-impl.inc.c | 99 +++++++++++++++++++++++++++++++++= +++- 3 files changed, 97 insertions(+), 39 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index c82105e..33dad6a 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -213,8 +213,6 @@ DEF_HELPER_3(vrlb, void, avr, avr, avr) DEF_HELPER_3(vrlh, void, avr, avr, avr) DEF_HELPER_3(vrlw, void, avr, avr, avr) DEF_HELPER_3(vrld, void, avr, avr, avr) -DEF_HELPER_3(vsl, void, avr, avr, avr) -DEF_HELPER_3(vsr, void, avr, avr, avr) DEF_HELPER_4(vsldoi, void, avr, avr, avr, i32) DEF_HELPER_3(vextractub, void, avr, avr, i32) DEF_HELPER_3(vextractuh, void, avr, avr, i32) diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 9505f4c..a23853e 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -1738,41 +1738,6 @@ VEXTU_X_DO(vextuhrx, 16, 0) VEXTU_X_DO(vextuwrx, 32, 0) #undef VEXTU_X_DO =20 -/* - * The specification says that the results are undefined if all of the - * shift counts are not identical. We check to make sure that they - * are to conform to what real hardware appears to do. 
- */ -#define VSHIFT(suffix, leftp) \ - void helper_vs##suffix(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \ - { \ - int shift =3D b->VsrB(15) & 0x7; \ - int doit =3D 1; \ - int i; \ - \ - for (i =3D 0; i < ARRAY_SIZE(r->u8); i++) { \ - doit =3D doit && ((b->u8[i] & 0x7) =3D=3D shift); = \ - } \ - if (doit) { \ - if (shift =3D=3D 0) { = \ - *r =3D *a; \ - } else if (leftp) { \ - uint64_t carry =3D a->VsrD(1) >> (64 - shift); \ - \ - r->VsrD(0) =3D (a->VsrD(0) << shift) | carry; \ - r->VsrD(1) =3D a->VsrD(1) << shift; \ - } else { \ - uint64_t carry =3D a->VsrD(0) << (64 - shift); \ - \ - r->VsrD(1) =3D (a->VsrD(1) >> shift) | carry; \ - r->VsrD(0) =3D a->VsrD(0) >> shift; \ - } \ - } \ - } -VSHIFT(l, 1) -VSHIFT(r, 0) -#undef VSHIFT - void helper_vslv(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) { int i; diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx= -impl.inc.c index a9fe3c7..62108ca 100644 --- a/target/ppc/translate/vmx-impl.inc.c +++ b/target/ppc/translate/vmx-impl.inc.c @@ -570,6 +570,101 @@ static void trans_lvsr(DisasContext *ctx) tcg_temp_free(EA); } =20 +/* + * vsl VRT,VRA,VRB - Vector Shift Left + * + * Shifting left 128 bit value of vA by value specified in bits 125-127 of= vB. + * Lowest 3 bits in each byte element of register vB must be identical or + * result is undefined. + */ +static void trans_vsl(DisasContext *ctx) +{ + int VT =3D rD(ctx->opcode); + int VA =3D rA(ctx->opcode); + int VB =3D rB(ctx->opcode); + TCGv_i64 avrA =3D tcg_temp_new_i64(); + TCGv_i64 avrB =3D tcg_temp_new_i64(); + TCGv_i64 sh =3D tcg_temp_new_i64(); + TCGv_i64 shifted =3D tcg_temp_new_i64(); + TCGv_i64 tmp =3D tcg_temp_new_i64(); + + /* Place bits 125-127 of vB in sh. */ + get_avr64(avrB, VB, false); + tcg_gen_andi_i64(sh, avrB, 0x07ULL); + + /* + * Save highest sh bits of lower doubleword element of vA in variable + * shifted and perform shift on lower doubleword. 
+ */ + get_avr64(avrA, VA, false); + tcg_gen_subfi_i64(tmp, 64, sh); + tcg_gen_shr_i64(shifted, avrA, tmp); + tcg_gen_shl_i64(avrA, avrA, sh); + set_avr64(VT, avrA, false); + + /* + * Perform shift on higher doubleword element of vA and replace lowest + * sh bits with shifted. + */ + get_avr64(avrA, VA, true); + tcg_gen_shl_i64(avrA, avrA, sh); + tcg_gen_or_i64(avrA, avrA, shifted); + set_avr64(VT, avrA, true); + + tcg_temp_free_i64(avrA); + tcg_temp_free_i64(avrB); + tcg_temp_free_i64(sh); + tcg_temp_free_i64(shifted); + tcg_temp_free_i64(tmp); +} + +/* + * vsr VRT,VRA,VRB - Vector Shift Right + * + * Shifting right 128 bit value of vA by value specified in bits 125-127 o= f vB. + * Lowest 3 bits in each byte element of register vB must be identical or + * result is undefined. + */ +static void trans_vsr(DisasContext *ctx) +{ + int VT =3D rD(ctx->opcode); + int VA =3D rA(ctx->opcode); + int VB =3D rB(ctx->opcode); + TCGv_i64 avrA =3D tcg_temp_new_i64(); + TCGv_i64 avrB =3D tcg_temp_new_i64(); + TCGv_i64 sh =3D tcg_temp_new_i64(); + TCGv_i64 shifted =3D tcg_temp_new_i64(); + TCGv_i64 tmp =3D tcg_temp_new_i64(); + + /* Place bits 125-127 of vB in sh. */ + get_avr64(avrB, VB, false); + tcg_gen_andi_i64(sh, avrB, 0x07ULL); + + /* + * Save lowest sh bits of higher doubleword element of vA in variable + * shifted and perform shift on higher doubleword. + */ + get_avr64(avrA, VA, true); + tcg_gen_subfi_i64(tmp, 64, sh); + tcg_gen_shl_i64(shifted, avrA, tmp); + tcg_gen_shr_i64(avrA, avrA, sh); + set_avr64(VT, avrA, true); + /* + * Perform shift on lower doubleword element of vA and replace highest + * sh bits with shifted. 
+ */ + get_avr64(avrA, VA, false); + tcg_gen_shr_i64(avrA, avrA, sh); + tcg_gen_or_i64(avrA, avrA, shifted); + set_avr64(VT, avrA, false); + + tcg_temp_free_i64(avrA); + tcg_temp_free_i64(avrB); + tcg_temp_free_i64(sh); + tcg_temp_free_i64(shifted); + tcg_temp_free_i64(tmp); +} + GEN_VXFORM(vmuloub, 4, 0); GEN_VXFORM(vmulouh, 4, 1); GEN_VXFORM(vmulouw, 4, 2); @@ -682,11 +777,11 @@ GEN_VXFORM(vrld, 2, 3); GEN_VXFORM(vrldmi, 2, 3); GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \ vrldmi, PPC_NONE, PPC2_ISA300) -GEN_VXFORM(vsl, 2, 7); +GEN_VXFORM_TRANS(vsl, 2, 7); GEN_VXFORM(vrldnm, 2, 7); GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \ vrldnm, PPC_NONE, PPC2_ISA300) -GEN_VXFORM(vsr, 2, 11); +GEN_VXFORM_TRANS(vsr, 2, 11); GEN_VXFORM_ENV(vpkuhum, 7, 0); GEN_VXFORM_ENV(vpkuwum, 7, 1); GEN_VXFORM_ENV(vpkudum, 7, 17); --=20 2.7.4
From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:15 +0200
Message-Id: <1561632985-24866-4-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 03/13] target/ppc: Optimize emulation of vgbbd instruction
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword).

All ith bits (i in range 1 to 8) of each byte of a doubleword element in the source register are concatenated and placed into the ith byte of the corresponding doubleword element in the destination register.

The following solution is applied to both doubleword elements of the source register in parallel, in order to reduce the number of instructions needed (that's why arrays are used):

First, both doubleword elements of source register vB are placed in the corresponding elements of array avr. Bits are gathered in 2x8 iterations (two for loops). In the first iteration bit 1 of byte 1, bit 2 of byte 2, ... bit 8 of byte 8 are already in their final spots, so avr[i], i={0,1} can be and-ed with tcg_mask. For every following iteration, avr[i] and tcg_mask have to be shifted right by 7 and 8 places, respectively, in order to get bit 1 of byte 2, bit 2 of byte 3, ... bit 7 of byte 8 into their final spots, so that the shifted avr values (saved in tmp) can be and-ed with the new value of tcg_mask. After the first 8 iterations (the first loop), all the first bits are in their final places, all second bits except the second bit from the eighth byte are in their places, and so on, down to the eighth bits, of which only the one from the eighth byte is in its place.
In second loop we do all operations symmetrically, in order to get other half of bits in their final spots. Results for first and second doubleword elements are saved in result[0] and result[1] respectively. In the end those results are saved in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic Reviewed-by: Richard Henderson --- target/ppc/helper.h | 1 - target/ppc/int_helper.c | 276 --------------------------------= ---- target/ppc/translate/vmx-impl.inc.c | 77 +++++++++- 3 files changed, 76 insertions(+), 278 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 33dad6a..cf1af51 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -320,7 +320,6 @@ DEF_HELPER_1(vclzlsbb, tl, avr) DEF_HELPER_1(vctzlsbb, tl, avr) DEF_HELPER_3(vbpermd, void, avr, avr, avr) DEF_HELPER_3(vbpermq, void, avr, avr, avr) -DEF_HELPER_2(vgbbd, void, avr, avr) DEF_HELPER_3(vpmsumb, void, avr, avr, avr) DEF_HELPER_3(vpmsumh, void, avr, avr, avr) DEF_HELPER_3(vpmsumw, void, avr, avr, avr) diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index a23853e..87e3062 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -1185,282 +1185,6 @@ void helper_vbpermq(ppc_avr_t *r, ppc_avr_t *a, ppc= _avr_t *b) #undef VBPERMQ_INDEX #undef VBPERMQ_DW =20 -static const uint64_t VGBBD_MASKS[256] =3D { - 0x0000000000000000ull, /* 00 */ - 0x0000000000000080ull, /* 01 */ - 0x0000000000008000ull, /* 02 */ - 0x0000000000008080ull, /* 03 */ - 0x0000000000800000ull, /* 04 */ - 0x0000000000800080ull, /* 05 */ - 0x0000000000808000ull, /* 06 */ - 0x0000000000808080ull, /* 07 */ - 0x0000000080000000ull, /* 08 */ - 0x0000000080000080ull, /* 09 */ - 0x0000000080008000ull, /* 0A */ - 0x0000000080008080ull, /* 0B */ - 0x0000000080800000ull, /* 0C */ - 0x0000000080800080ull, /* 0D */ - 0x0000000080808000ull, /* 0E */ - 0x0000000080808080ull, /* 0F */ - 0x0000008000000000ull, /* 10 */ - 0x0000008000000080ull, /* 11 */ - 
0x0000008000008000ull, /* 12 */ - 0x0000008000008080ull, /* 13 */ - 0x0000008000800000ull, /* 14 */ - 0x0000008000800080ull, /* 15 */ - 0x0000008000808000ull, /* 16 */ - 0x0000008000808080ull, /* 17 */ - 0x0000008080000000ull, /* 18 */ - 0x0000008080000080ull, /* 19 */ - 0x0000008080008000ull, /* 1A */ - 0x0000008080008080ull, /* 1B */ - 0x0000008080800000ull, /* 1C */ - 0x0000008080800080ull, /* 1D */ - 0x0000008080808000ull, /* 1E */ - 0x0000008080808080ull, /* 1F */ - 0x0000800000000000ull, /* 20 */ - 0x0000800000000080ull, /* 21 */ - 0x0000800000008000ull, /* 22 */ - 0x0000800000008080ull, /* 23 */ - 0x0000800000800000ull, /* 24 */ - 0x0000800000800080ull, /* 25 */ - 0x0000800000808000ull, /* 26 */ - 0x0000800000808080ull, /* 27 */ - 0x0000800080000000ull, /* 28 */ - 0x0000800080000080ull, /* 29 */ - 0x0000800080008000ull, /* 2A */ - 0x0000800080008080ull, /* 2B */ - 0x0000800080800000ull, /* 2C */ - 0x0000800080800080ull, /* 2D */ - 0x0000800080808000ull, /* 2E */ - 0x0000800080808080ull, /* 2F */ - 0x0000808000000000ull, /* 30 */ - 0x0000808000000080ull, /* 31 */ - 0x0000808000008000ull, /* 32 */ - 0x0000808000008080ull, /* 33 */ - 0x0000808000800000ull, /* 34 */ - 0x0000808000800080ull, /* 35 */ - 0x0000808000808000ull, /* 36 */ - 0x0000808000808080ull, /* 37 */ - 0x0000808080000000ull, /* 38 */ - 0x0000808080000080ull, /* 39 */ - 0x0000808080008000ull, /* 3A */ - 0x0000808080008080ull, /* 3B */ - 0x0000808080800000ull, /* 3C */ - 0x0000808080800080ull, /* 3D */ - 0x0000808080808000ull, /* 3E */ - 0x0000808080808080ull, /* 3F */ - 0x0080000000000000ull, /* 40 */ - 0x0080000000000080ull, /* 41 */ - 0x0080000000008000ull, /* 42 */ - 0x0080000000008080ull, /* 43 */ - 0x0080000000800000ull, /* 44 */ - 0x0080000000800080ull, /* 45 */ - 0x0080000000808000ull, /* 46 */ - 0x0080000000808080ull, /* 47 */ - 0x0080000080000000ull, /* 48 */ - 0x0080000080000080ull, /* 49 */ - 0x0080000080008000ull, /* 4A */ - 0x0080000080008080ull, /* 4B */ - 0x0080000080800000ull, /* 
4C */ - 0x0080000080800080ull, /* 4D */ - 0x0080000080808000ull, /* 4E */ - 0x0080000080808080ull, /* 4F */ - 0x0080008000000000ull, /* 50 */ - 0x0080008000000080ull, /* 51 */ - 0x0080008000008000ull, /* 52 */ - 0x0080008000008080ull, /* 53 */ - 0x0080008000800000ull, /* 54 */ - 0x0080008000800080ull, /* 55 */ - 0x0080008000808000ull, /* 56 */ - 0x0080008000808080ull, /* 57 */ - 0x0080008080000000ull, /* 58 */ - 0x0080008080000080ull, /* 59 */ - 0x0080008080008000ull, /* 5A */ - 0x0080008080008080ull, /* 5B */ - 0x0080008080800000ull, /* 5C */ - 0x0080008080800080ull, /* 5D */ - 0x0080008080808000ull, /* 5E */ - 0x0080008080808080ull, /* 5F */ - 0x0080800000000000ull, /* 60 */ - 0x0080800000000080ull, /* 61 */ - 0x0080800000008000ull, /* 62 */ - 0x0080800000008080ull, /* 63 */ - 0x0080800000800000ull, /* 64 */ - 0x0080800000800080ull, /* 65 */ - 0x0080800000808000ull, /* 66 */ - 0x0080800000808080ull, /* 67 */ - 0x0080800080000000ull, /* 68 */ - 0x0080800080000080ull, /* 69 */ - 0x0080800080008000ull, /* 6A */ - 0x0080800080008080ull, /* 6B */ - 0x0080800080800000ull, /* 6C */ - 0x0080800080800080ull, /* 6D */ - 0x0080800080808000ull, /* 6E */ - 0x0080800080808080ull, /* 6F */ - 0x0080808000000000ull, /* 70 */ - 0x0080808000000080ull, /* 71 */ - 0x0080808000008000ull, /* 72 */ - 0x0080808000008080ull, /* 73 */ - 0x0080808000800000ull, /* 74 */ - 0x0080808000800080ull, /* 75 */ - 0x0080808000808000ull, /* 76 */ - 0x0080808000808080ull, /* 77 */ - 0x0080808080000000ull, /* 78 */ - 0x0080808080000080ull, /* 79 */ - 0x0080808080008000ull, /* 7A */ - 0x0080808080008080ull, /* 7B */ - 0x0080808080800000ull, /* 7C */ - 0x0080808080800080ull, /* 7D */ - 0x0080808080808000ull, /* 7E */ - 0x0080808080808080ull, /* 7F */ - 0x8000000000000000ull, /* 80 */ - 0x8000000000000080ull, /* 81 */ - 0x8000000000008000ull, /* 82 */ - 0x8000000000008080ull, /* 83 */ - 0x8000000000800000ull, /* 84 */ - 0x8000000000800080ull, /* 85 */ - 0x8000000000808000ull, /* 86 */ - 
0x8000000000808080ull, /* 87 */ - 0x8000000080000000ull, /* 88 */ - 0x8000000080000080ull, /* 89 */ - 0x8000000080008000ull, /* 8A */ - 0x8000000080008080ull, /* 8B */ - 0x8000000080800000ull, /* 8C */ - 0x8000000080800080ull, /* 8D */ - 0x8000000080808000ull, /* 8E */ - 0x8000000080808080ull, /* 8F */ - 0x8000008000000000ull, /* 90 */ - 0x8000008000000080ull, /* 91 */ - 0x8000008000008000ull, /* 92 */ - 0x8000008000008080ull, /* 93 */ - 0x8000008000800000ull, /* 94 */ - 0x8000008000800080ull, /* 95 */ - 0x8000008000808000ull, /* 96 */ - 0x8000008000808080ull, /* 97 */ - 0x8000008080000000ull, /* 98 */ - 0x8000008080000080ull, /* 99 */ - 0x8000008080008000ull, /* 9A */ - 0x8000008080008080ull, /* 9B */ - 0x8000008080800000ull, /* 9C */ - 0x8000008080800080ull, /* 9D */ - 0x8000008080808000ull, /* 9E */ - 0x8000008080808080ull, /* 9F */ - 0x8000800000000000ull, /* A0 */ - 0x8000800000000080ull, /* A1 */ - 0x8000800000008000ull, /* A2 */ - 0x8000800000008080ull, /* A3 */ - 0x8000800000800000ull, /* A4 */ - 0x8000800000800080ull, /* A5 */ - 0x8000800000808000ull, /* A6 */ - 0x8000800000808080ull, /* A7 */ - 0x8000800080000000ull, /* A8 */ - 0x8000800080000080ull, /* A9 */ - 0x8000800080008000ull, /* AA */ - 0x8000800080008080ull, /* AB */ - 0x8000800080800000ull, /* AC */ - 0x8000800080800080ull, /* AD */ - 0x8000800080808000ull, /* AE */ - 0x8000800080808080ull, /* AF */ - 0x8000808000000000ull, /* B0 */ - 0x8000808000000080ull, /* B1 */ - 0x8000808000008000ull, /* B2 */ - 0x8000808000008080ull, /* B3 */ - 0x8000808000800000ull, /* B4 */ - 0x8000808000800080ull, /* B5 */ - 0x8000808000808000ull, /* B6 */ - 0x8000808000808080ull, /* B7 */ - 0x8000808080000000ull, /* B8 */ - 0x8000808080000080ull, /* B9 */ - 0x8000808080008000ull, /* BA */ - 0x8000808080008080ull, /* BB */ - 0x8000808080800000ull, /* BC */ - 0x8000808080800080ull, /* BD */ - 0x8000808080808000ull, /* BE */ - 0x8000808080808080ull, /* BF */ - 0x8080000000000000ull, /* C0 */ - 0x8080000000000080ull, /* 
C1 */ - 0x8080000000008000ull, /* C2 */ - 0x8080000000008080ull, /* C3 */ - 0x8080000000800000ull, /* C4 */ - 0x8080000000800080ull, /* C5 */ - 0x8080000000808000ull, /* C6 */ - 0x8080000000808080ull, /* C7 */ - 0x8080000080000000ull, /* C8 */ - 0x8080000080000080ull, /* C9 */ - 0x8080000080008000ull, /* CA */ - 0x8080000080008080ull, /* CB */ - 0x8080000080800000ull, /* CC */ - 0x8080000080800080ull, /* CD */ - 0x8080000080808000ull, /* CE */ - 0x8080000080808080ull, /* CF */ - 0x8080008000000000ull, /* D0 */ - 0x8080008000000080ull, /* D1 */ - 0x8080008000008000ull, /* D2 */ - 0x8080008000008080ull, /* D3 */ - 0x8080008000800000ull, /* D4 */ - 0x8080008000800080ull, /* D5 */ - 0x8080008000808000ull, /* D6 */ - 0x8080008000808080ull, /* D7 */ - 0x8080008080000000ull, /* D8 */ - 0x8080008080000080ull, /* D9 */ - 0x8080008080008000ull, /* DA */ - 0x8080008080008080ull, /* DB */ - 0x8080008080800000ull, /* DC */ - 0x8080008080800080ull, /* DD */ - 0x8080008080808000ull, /* DE */ - 0x8080008080808080ull, /* DF */ - 0x8080800000000000ull, /* E0 */ - 0x8080800000000080ull, /* E1 */ - 0x8080800000008000ull, /* E2 */ - 0x8080800000008080ull, /* E3 */ - 0x8080800000800000ull, /* E4 */ - 0x8080800000800080ull, /* E5 */ - 0x8080800000808000ull, /* E6 */ - 0x8080800000808080ull, /* E7 */ - 0x8080800080000000ull, /* E8 */ - 0x8080800080000080ull, /* E9 */ - 0x8080800080008000ull, /* EA */ - 0x8080800080008080ull, /* EB */ - 0x8080800080800000ull, /* EC */ - 0x8080800080800080ull, /* ED */ - 0x8080800080808000ull, /* EE */ - 0x8080800080808080ull, /* EF */ - 0x8080808000000000ull, /* F0 */ - 0x8080808000000080ull, /* F1 */ - 0x8080808000008000ull, /* F2 */ - 0x8080808000008080ull, /* F3 */ - 0x8080808000800000ull, /* F4 */ - 0x8080808000800080ull, /* F5 */ - 0x8080808000808000ull, /* F6 */ - 0x8080808000808080ull, /* F7 */ - 0x8080808080000000ull, /* F8 */ - 0x8080808080000080ull, /* F9 */ - 0x8080808080008000ull, /* FA */ - 0x8080808080008080ull, /* FB */ - 
0x8080808080800000ull, /* FC */
-    0x8080808080800080ull, /* FD */
-    0x8080808080808000ull, /* FE */
-    0x8080808080808080ull, /* FF */
-};
-
-void helper_vgbbd(ppc_avr_t *r, ppc_avr_t *b)
-{
-    int i;
-    uint64_t t[2] = { 0, 0 };
-
-    VECTOR_FOR_INORDER_I(i, u8) {
-#if defined(HOST_WORDS_BIGENDIAN)
-        t[i >> 3] |= VGBBD_MASKS[b->u8[i]] >> (i & 7);
-#else
-        t[i >> 3] |= VGBBD_MASKS[b->u8[i]] >> (7 - (i & 7));
-#endif
-    }
-
-    r->u64[0] = t[0];
-    r->u64[1] = t[1];
-}
-
 #define PMSUM(name, srcfld, trgfld, trgtyp)                   \
 void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)  \
 {                                                             \
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 62108ca..d9b346b 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -665,6 +665,81 @@ static void trans_vsr(DisasContext *ctx)
     tcg_temp_free_i64(tmp);
 }
 
+/*
+ * vgbbd VRT,VRB - Vector Gather Bits by Bytes by Doubleword
+ *
+ * All ith bits (i in range 1 to 8) of each byte of doubleword element in source
+ * register are concatenated and placed into ith byte of appropriate doubleword
+ * element in destination register.
+ *
+ * Following solution is done for both doubleword elements of source register
+ * in parallel, in order to reduce the number of instructions needed (that's why
+ * arrays are used):
+ * First, both doubleword elements of source register vB are placed in
+ * appropriate element of array avr. Bits are gathered in 2x8 iterations (2 for
+ * loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of
+ * byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with
+ * tcg_mask. For every following iteration, both avr[i] and tcg_mask variables
+ * have to be shifted right by 7 and 8 places, respectively, in order to get
+ * bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so
+ * shifted avr values (saved in tmp) can be and-ed with new value of tcg_mask...
+ * After first 8 iterations (first loop), all the first bits are in their final
+ * places, all second bits but the one from eighth byte are in their places...
+ * only 1 eighth bit from eighth byte is in its place. In second loop we do all
+ * operations symmetrically, in order to get other half of bits in their final
+ * spots. Results for first and second doubleword elements are saved in
+ * result[0] and result[1] respectively. In the end those results are saved in
+ * appropriate doubleword element of destination register vD.
+ */
+static void trans_vgbbd(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 tmp = tcg_temp_new_i64();
+    uint64_t mask = 0x8040201008040201ULL;
+    int i, j;
+
+    TCGv_i64 result[2];
+    result[0] = tcg_temp_new_i64();
+    result[1] = tcg_temp_new_i64();
+    TCGv_i64 avr[2];
+    avr[0] = tcg_temp_new_i64();
+    avr[1] = tcg_temp_new_i64();
+    TCGv_i64 tcg_mask = tcg_temp_new_i64();
+
+    tcg_gen_movi_i64(tcg_mask, mask);
+    for (j = 0; j < 2; j++) {
+        get_avr64(avr[j], VB, j);
+        tcg_gen_and_i64(result[j], avr[j], tcg_mask);
+    }
+    for (i = 1; i < 8; i++) {
+        tcg_gen_movi_i64(tcg_mask, mask >> (i * 8));
+        for (j = 0; j < 2; j++) {
+            tcg_gen_shri_i64(tmp, avr[j], i * 7);
+            tcg_gen_and_i64(tmp, tmp, tcg_mask);
+            tcg_gen_or_i64(result[j], result[j], tmp);
+        }
+    }
+    for (i = 1; i < 8; i++) {
+        tcg_gen_movi_i64(tcg_mask, mask << (i * 8));
+        for (j = 0; j < 2; j++) {
+            tcg_gen_shli_i64(tmp, avr[j], i * 7);
+            tcg_gen_and_i64(tmp, tmp, tcg_mask);
+            tcg_gen_or_i64(result[j], result[j], tmp);
+        }
+    }
+    for (j = 0; j < 2; j++) {
+        set_avr64(VT, result[j], j);
+    }
+
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i64(tcg_mask);
+    tcg_temp_free_i64(result[0]);
+    tcg_temp_free_i64(result[1]);
+    tcg_temp_free_i64(avr[0]);
+    tcg_temp_free_i64(avr[1]);
+}
+
 GEN_VXFORM(vmuloub, 4, 0);
 GEN_VXFORM(vmulouh, 4, 1);
 GEN_VXFORM(vmulouw, 4, 2);
@@ -1209,7 +1284,7 @@ GEN_VXFORM_DUAL(vclzd, PPC_NONE,
PPC2_ALTIVEC_207, \
                 vpopcntd, PPC_NONE, PPC2_ALTIVEC_207)
 GEN_VXFORM(vbpermd, 6, 23);
 GEN_VXFORM(vbpermq, 6, 21);
-GEN_VXFORM_NOA(vgbbd, 6, 20);
+GEN_VXFORM_TRANS(vgbbd, 6, 20);
 GEN_VXFORM(vpmsumb, 4, 16)
 GEN_VXFORM(vpmsumh, 4, 17)
 GEN_VXFORM(vpmsumw, 4, 18)
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:16 +0200
Message-Id: <1561632985-24866-5-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 04/13] target/ppc: Optimize emulation of vclzd instruction
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword).
This instruction counts the number of leading zeros of each doubleword
element in the source register and places the result in the appropriate
doubleword element of the destination register.

The new implementation uses TCG's count-leading-zeros operation twice (once
for each doubleword element of source register vB) and places each result in
the appropriate doubleword element of destination register vD.

Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
 target/ppc/helper.h                 |  1 -
 target/ppc/int_helper.c             |  3 ---
 target/ppc/translate/vmx-impl.inc.c | 28 +++++++++++++++++++++++++++-
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index cf1af51..57a954c 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -307,7 +307,6 @@ DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)
 DEF_HELPER_2(vclzb, void, avr, avr)
 DEF_HELPER_2(vclzh, void, avr, avr)
 DEF_HELPER_2(vclzw, void, avr, avr)
-DEF_HELPER_2(vclzd, void, avr, avr)
 DEF_HELPER_2(vctzb, void, avr, avr)
 DEF_HELPER_2(vctzh, void, avr, avr)
 DEF_HELPER_2(vctzw, void, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 87e3062..210e8be 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1824,17 +1824,14 @@ VUPK(lsw, s64, s32, UPKLO)
 #define clzb(v) ((v) ? clz32((uint32_t)(v) << 24) : 8)
 #define clzh(v) ((v) ? clz32((uint32_t)(v) << 16) : 16)
 #define clzw(v) clz32((v))
-#define clzd(v) clz64((v))
 
 VGENERIC_DO(clzb, u8)
 VGENERIC_DO(clzh, u16)
 VGENERIC_DO(clzw, u32)
-VGENERIC_DO(clzd, u64)
 
 #undef clzb
 #undef clzh
 #undef clzw
-#undef clzd
 
 #define ctzb(v) ((v) ? ctz32(v) : 8)
 #define ctzh(v) ((v) ? ctz32(v) : 16)
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index d9b346b..50d906b 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -740,6 +740,32 @@ static void trans_vgbbd(DisasContext *ctx)
     tcg_temp_free_i64(avr[1]);
 }
 
+/*
+ * vclzd VRT,VRB - Vector Count Leading Zeros Doubleword
+ *
+ * Counting the number of leading zero bits of each doubleword element in source
+ * register and placing result in appropriate doubleword element of destination
+ * register.
+ */
+static void trans_vclzd(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 avr = tcg_temp_new_i64();
+
+    /* high doubleword */
+    get_avr64(avr, VB, true);
+    tcg_gen_clzi_i64(avr, avr, 64);
+    set_avr64(VT, avr, true);
+
+    /* low doubleword */
+    get_avr64(avr, VB, false);
+    tcg_gen_clzi_i64(avr, avr, 64);
+    set_avr64(VT, avr, false);
+
+    tcg_temp_free_i64(avr);
+}
+
 GEN_VXFORM(vmuloub, 4, 0);
 GEN_VXFORM(vmulouh, 4, 1);
 GEN_VXFORM(vmulouw, 4, 2);
@@ -1256,7 +1282,7 @@ GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
 GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
 GEN_VXFORM_NOA(vclzw, 1, 30)
-GEN_VXFORM_NOA(vclzd, 1, 31)
+GEN_VXFORM_TRANS(vclzd, 1, 31)
 GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
 GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
 GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:17 +0200
Message-Id: <1561632985-24866-6-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 05/13] target/ppc: Optimize emulation of vclzw instruction
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word).
This instruction counts the number of leading zeros of each word element
in the source register and places the result in the appropriate word
element of the destination register.

Counting is performed in four iterations of a for loop (one for each word
element of source register vB). Every iteration consists of loading the
appropriate word element from the source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in the appropriate word
element of the destination register.
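
For reference, the per-element semantics described above can be sketched in
plain host-side C. This is only an illustration and not part of the patch:
the names clz32_ref and ref_vclzw are hypothetical, and the sketch assumes
that a zero word yields 32, matching the value passed as the last argument
of tcg_gen_clzi_i32 in the translation code.

```c
#include <stdint.h>

/*
 * Hypothetical reference helper (not in QEMU): count leading zero bits
 * of a 32-bit value, returning 32 when the value is zero.
 */
static uint32_t clz32_ref(uint32_t v)
{
    uint32_t n = 0;

    if (v == 0) {
        return 32;
    }
    /* Shift left until the most significant bit is set. */
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/*
 * Hypothetical model of vclzw: four independent iterations, one per
 * word element, each doing load / count / store.
 */
static void ref_vclzw(uint32_t dst[4], const uint32_t src[4])
{
    int i;

    for (i = 0; i < 4; i++) {
        dst[i] = clz32_ref(src[i]);
    }
}
```

With the source words { 0x00000000, 0x00000001, 0x00010000, 0x80000000 },
ref_vclzw produces { 32, 31, 15, 0 }. Because every word is handled
independently, the order in which the elements are visited does not affect
the result.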

Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
 target/ppc/helper.h                 |  1 -
 target/ppc/int_helper.c             |  3 ---
 target/ppc/translate/vmx-impl.inc.c | 28 +++++++++++++++++++++++++++-
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 57a954c..4c5c359 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -306,7 +306,6 @@ DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)
 
 DEF_HELPER_2(vclzb, void, avr, avr)
 DEF_HELPER_2(vclzh, void, avr, avr)
-DEF_HELPER_2(vclzw, void, avr, avr)
 DEF_HELPER_2(vctzb, void, avr, avr)
 DEF_HELPER_2(vctzh, void, avr, avr)
 DEF_HELPER_2(vctzw, void, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 210e8be..cd25b66 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1823,15 +1823,12 @@ VUPK(lsw, s64, s32, UPKLO)
 
 #define clzb(v) ((v) ? clz32((uint32_t)(v) << 24) : 8)
 #define clzh(v) ((v) ? clz32((uint32_t)(v) << 16) : 16)
-#define clzw(v) clz32((v))
 
 VGENERIC_DO(clzb, u8)
 VGENERIC_DO(clzh, u16)
-VGENERIC_DO(clzw, u32)
 
 #undef clzb
 #undef clzh
-#undef clzw
 
 #define ctzb(v) ((v) ? ctz32(v) : 8)
 #define ctzh(v) ((v) ? ctz32(v) : 16)
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 50d906b..39c7839 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -741,6 +741,32 @@ static void trans_vgbbd(DisasContext *ctx)
 }
 
 /*
+ * vclzw VRT,VRB - Vector Count Leading Zeros Word
+ *
+ * Counting the number of leading zero bits of each word element in source
+ * register and placing result in appropriate word element of destination
+ * register.
+ */
+static void trans_vclzw(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i32 tmp = tcg_temp_new_i32();
+    int i;
+
+    /* Perform count for every word element using tcg_gen_clzi_i32. */
+    for (i = 0; i < 4; i++) {
+        tcg_gen_ld_i32(tmp, cpu_env,
+            offsetof(CPUPPCState, vsr[32 + VB].u64[0]) + i * 4);
+        tcg_gen_clzi_i32(tmp, tmp, 32);
+        tcg_gen_st_i32(tmp, cpu_env,
+            offsetof(CPUPPCState, vsr[32 + VT].u64[0]) + i * 4);
+    }
+
+    tcg_temp_free_i32(tmp);
+}
+
+/*
  * vclzd VRT,VRB - Vector Count Leading Zeros Doubleword
  *
  * Counting the number of leading zero bits of each doubleword element in source
@@ -1281,7 +1307,7 @@ GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
 
 GEN_VXFORM_NOA(vclzb, 1, 28)
 GEN_VXFORM_NOA(vclzh, 1, 29)
-GEN_VXFORM_NOA(vclzw, 1, 30)
+GEN_VXFORM_TRANS(vclzw, 1, 30)
 GEN_VXFORM_TRANS(vclzd, 1, 31)
 GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
 GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:18 +0200
Message-Id: <1561632985-24866-7-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 06/13] target/ppc: Optimize emulation of vclzh and vclzb instructions
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Optimize Altivec instruction vclzh (Vector Count Leading Zeros Halfword).
This instruction counts the number of leading zeros of each halfword element
in the source register and places the result in the appropriate halfword
element of the destination register.

In each iteration of the outer for loop, the count operation is performed on
one doubleword element of source register vB. In the first iteration, the
higher doubleword element of vB is placed in variable avr, and then counting
for every halfword element is performed using tcg_gen_clzi_i64. Since it
counts leading zeros over a 64-bit length, the ith halfword element has to be
moved to the highest 16 bits of tmp and or-ed with mask (in order to get all
ones in the lowest 48 bits); then tcg_gen_clzi_i64 is performed and its result
is moved into the appropriate halfword element of result. This is done in the
inner for loop. After the operation is finished, the result is saved in the
appropriate doubleword element of destination register vD. The same sequence
of operations is applied again for the lower doubleword element of vB.

Optimize Altivec instruction vclzb (Vector Count Leading Zeros Byte).
This instruction counts the number of leading zeros of each byte element in
the source register and places the result in the appropriate byte element of
the destination register.

In each iteration of the outer for loop, the count operation is performed on
one doubleword element of source register vB. In the first iteration, the
higher doubleword element of vB is placed in variable avr, and then counting
for every byte element is performed using tcg_gen_clzi_i64. Since it counts
leading zeros over a 64-bit length, the ith byte element has to be moved to
the highest 8 bits of variable tmp and or-ed with mask (in order to get all
ones in the lowest 56 bits); then tcg_gen_clzi_i64 is performed and its result
is moved into the appropriate byte element of result. This is done in the
inner for loop. After the operation is finished, the result is saved in the
appropriate doubleword element of destination register vD. The same sequence
of operations is applied again for the lower doubleword element of vB.

Signed-off-by: Stefan Brankovic
---
 target/ppc/helper.h                 |   2 -
 target/ppc/int_helper.c             |   9 ---
 target/ppc/translate/vmx-impl.inc.c | 122 +++++++++++++++++++++++++++++++++++-
 3 files changed, 120 insertions(+), 13 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4c5c359..ac1a5bd 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -304,8 +304,6 @@ DEF_HELPER_4(vcfsx, void, env, avr, avr, i32)
 DEF_HELPER_4(vctuxs, void, env, avr, avr, i32)
 DEF_HELPER_4(vctsxs, void, env, avr, avr, i32)
 
-DEF_HELPER_2(vclzb, void, avr, avr)
-DEF_HELPER_2(vclzh, void, avr, avr)
 DEF_HELPER_2(vctzb, void, avr, avr)
 DEF_HELPER_2(vctzh, void, avr, avr)
 DEF_HELPER_2(vctzw, void, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index cd25b66..3edf334 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1821,15 +1821,6 @@ VUPK(lsw, s64, s32, UPKLO)
         }                                                     \
     }
 
-#define clzb(v) ((v) ? clz32((uint32_t)(v) << 24) : 8)
-#define clzh(v) ((v) ? clz32((uint32_t)(v) << 16) : 16)
-
-VGENERIC_DO(clzb, u8)
-VGENERIC_DO(clzh, u16)
-
-#undef clzb
-#undef clzh
-
 #define ctzb(v) ((v) ? ctz32(v) : 8)
 #define ctzh(v) ((v) ? ctz32(v) : 16)
 #define ctzw(v) ctz32((v))
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 39c7839..fd25b7c 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -741,6 +741,124 @@ static void trans_vgbbd(DisasContext *ctx)
 }
 
 /*
+ * vclzb VRT,VRB - Vector Count Leading Zeros Byte
+ *
+ * Counting the number of leading zero bits of each byte element in source
+ * register and placing result in appropriate byte element of destination
+ * register.
+ */
+static void trans_vclzb(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 avr = tcg_temp_new_i64();
+    TCGv_i64 result = tcg_temp_new_i64();
+    TCGv_i64 tmp = tcg_temp_new_i64();
+    TCGv_i64 mask = tcg_const_i64(0xffffffffffffffULL);
+    int i, j;
+
+    for (i = 0; i < 2; i++) {
+        if (i == 0) {
+            /* Get high doubleword of vB in avr. */
+            get_avr64(avr, VB, true);
+        } else {
+            /* Get low doubleword of vB in avr. */
+            get_avr64(avr, VB, false);
+        }
+        /*
+         * Perform count for every byte element using tcg_gen_clzi_i64.
+         * Since it counts leading zeros on 64 bit length, we have to move
+         * ith byte element to highest 8 bits of tmp, or it with mask (so we get
+         * all ones in lowest 56 bits), then perform tcg_gen_clzi_i64 and move
+         * its result in appropriate byte element of result.
+         */
+        tcg_gen_shli_i64(tmp, avr, 56);
+        tcg_gen_or_i64(tmp, tmp, mask);
+        tcg_gen_clzi_i64(result, tmp, 64);
+        for (j = 1; j < 7; j++) {
+            tcg_gen_shli_i64(tmp, avr, (7 - j) * 8);
+            tcg_gen_or_i64(tmp, tmp, mask);
+            tcg_gen_clzi_i64(tmp, tmp, 64);
+            tcg_gen_deposit_i64(result, result, tmp, j * 8, 8);
+        }
+        tcg_gen_or_i64(tmp, avr, mask);
+        tcg_gen_clzi_i64(tmp, tmp, 64);
+        tcg_gen_deposit_i64(result, result, tmp, 56, 8);
+        if (i == 0) {
+            /* Place result in high doubleword element of vD. */
+            set_avr64(VT, result, true);
+        } else {
+            /* Place result in low doubleword element of vD. */
+            set_avr64(VT, result, false);
+        }
+    }
+
+    tcg_temp_free_i64(avr);
+    tcg_temp_free_i64(result);
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i64(mask);
+}
+
+/*
+ * vclzh VRT,VRB - Vector Count Leading Zeros Halfword
+ *
+ * Counting the number of leading zero bits of each halfword element in source
+ * register and placing result in appropriate halfword element of destination
+ * register.
+ */
+static void trans_vclzh(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 avr = tcg_temp_new_i64();
+    TCGv_i64 result = tcg_temp_new_i64();
+    TCGv_i64 tmp = tcg_temp_new_i64();
+    TCGv_i64 mask = tcg_const_i64(0xffffffffffffULL);
+    int i, j;
+
+    for (i = 0; i < 2; i++) {
+        if (i == 0) {
+            /* Get high doubleword element of vB in avr. */
+            get_avr64(avr, VB, true);
+        } else {
+            /* Get low doubleword element of vB in avr. */
+            get_avr64(avr, VB, false);
+        }
+        /*
+         * Perform count for every halfword element using tcg_gen_clzi_i64.
+         * Since it counts leading zeros on 64 bit length, we have to move the
+         * ith halfword element to highest 16 bits of tmp, or it with mask (so
+         * we get all ones in lowest 48 bits), then perform tcg_gen_clzi_i64
+         * and move its result in appropriate halfword element of result.
+         */
+        tcg_gen_shli_i64(tmp, avr, 48);
+        tcg_gen_or_i64(tmp, tmp, mask);
+        tcg_gen_clzi_i64(result, tmp, 64);
+        for (j = 1; j < 3; j++) {
+            tcg_gen_shli_i64(tmp, avr, (3 - j) * 16);
+            tcg_gen_or_i64(tmp, tmp, mask);
+            tcg_gen_clzi_i64(tmp, tmp, 64);
+            tcg_gen_deposit_i64(result, result, tmp, j * 16, 16);
+        }
+        tcg_gen_or_i64(tmp, avr, mask);
+        tcg_gen_clzi_i64(tmp, tmp, 64);
+        tcg_gen_deposit_i64(result, result, tmp, 48, 16);
+        if (i == 0) {
+            /* Place result in high doubleword element of vD. */
+            set_avr64(VT, result, true);
+        } else {
+            /* Place result in low doubleword element of vD. */
+            set_avr64(VT, result, false);
+        }
+    }
+
+    tcg_temp_free_i64(avr);
+    tcg_temp_free_i64(result);
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i64(mask);
+}
+
+/*
  * vclzw VRT,VRB - Vector Count Leading Zeros Word
  *
  * Counting the number of leading zero bits of each word element in source
@@ -1305,8 +1423,8 @@ GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20)
 GEN_VAFORM_PAIRED(vsel, vperm, 21)
 GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
 
-GEN_VXFORM_NOA(vclzb, 1, 28)
-GEN_VXFORM_NOA(vclzh, 1, 29)
+GEN_VXFORM_TRANS(vclzb, 1, 28)
+GEN_VXFORM_TRANS(vclzh, 1, 29)
 GEN_VXFORM_TRANS(vclzw, 1, 30)
 GEN_VXFORM_TRANS(vclzd, 1, 31)
 GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:19 +0200
Message-Id: <1561632985-24866-8-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 07/13] target/ppc: Refactor emulation of vmrgew and vmrgow instructions
X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Since I found these two instructions already implemented with TCG, I refactored them so they are consistent with the other similar implementations introduced in this patch series. Also, a new dual macro, GEN_VXFORM_TRANS_DUAL, is added. This macro is used when one instruction is realized with direct translation and the other one with a helper. Signed-off-by: Stefan Brankovic --- target/ppc/translate/vmx-impl.inc.c | 66 +++++++++++++++++++++------------= ---- 1 file changed, 37 insertions(+), 29 deletions(-) diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx= -impl.inc.c index fd25b7c..39fb26d 100644 --- a/target/ppc/translate/vmx-impl.inc.c +++ b/target/ppc/translate/vmx-impl.inc.c @@ -350,6 +350,28 @@ static void glue(gen_, name0##_##name1)(DisasContext *= ctx) \ } \ } =20 +/* + * We use this macro if one instruction is realized with direct + * translation, and the second one with a helper. 
+ */ +#define GEN_VXFORM_TRANS_DUAL(name0, flg0, flg2_0, name1, flg1, flg2_1)\ +static void glue(gen_, name0##_##name1)(DisasContext *ctx) \ +{ \ + if ((Rc(ctx->opcode) =3D=3D 0) && = \ + ((ctx->insns_flags & flg0) || (ctx->insns_flags2 & flg2_0))) { \ + if (unlikely(!ctx->altivec_enabled)) { \ + gen_exception(ctx, POWERPC_EXCP_VPU); \ + return; \ + } \ + trans_##name0(ctx); \ + } else if ((Rc(ctx->opcode) =3D=3D 1) && = \ + ((ctx->insns_flags & flg1) || (ctx->insns_flags2 & flg2_1))) { \ + gen_##name1(ctx); \ + } else { \ + gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL); \ + } \ +} + /* Adds support to provide invalid mask */ #define GEN_VXFORM_DUAL_EXT(name0, flg0, flg2_0, inval0, \ name1, flg1, flg2_1, inval1) \ @@ -431,20 +453,13 @@ GEN_VXFORM(vmrglb, 6, 4); GEN_VXFORM(vmrglh, 6, 5); GEN_VXFORM(vmrglw, 6, 6); =20 -static void gen_vmrgew(DisasContext *ctx) +static void trans_vmrgew(DisasContext *ctx) { - TCGv_i64 tmp; - TCGv_i64 avr; - int VT, VA, VB; - if (unlikely(!ctx->altivec_enabled)) { - gen_exception(ctx, POWERPC_EXCP_VPU); - return; - } - VT =3D rD(ctx->opcode); - VA =3D rA(ctx->opcode); - VB =3D rB(ctx->opcode); - tmp =3D tcg_temp_new_i64(); - avr =3D tcg_temp_new_i64(); + int VT =3D rD(ctx->opcode); + int VA =3D rA(ctx->opcode); + int VB =3D rB(ctx->opcode); + TCGv_i64 tmp =3D tcg_temp_new_i64(); + TCGv_i64 avr =3D tcg_temp_new_i64(); =20 get_avr64(avr, VB, true); tcg_gen_shri_i64(tmp, avr, 32); @@ -462,21 +477,14 @@ static void gen_vmrgew(DisasContext *ctx) tcg_temp_free_i64(avr); } =20 -static void gen_vmrgow(DisasContext *ctx) +static void trans_vmrgow(DisasContext *ctx) { - TCGv_i64 t0, t1; - TCGv_i64 avr; - int VT, VA, VB; - if (unlikely(!ctx->altivec_enabled)) { - gen_exception(ctx, POWERPC_EXCP_VPU); - return; - } - VT =3D rD(ctx->opcode); - VA =3D rA(ctx->opcode); - VB =3D rB(ctx->opcode); - t0 =3D tcg_temp_new_i64(); - t1 =3D tcg_temp_new_i64(); - avr =3D tcg_temp_new_i64(); + int VT =3D rD(ctx->opcode); + int VA =3D rA(ctx->opcode); + int 
VB =3D rB(ctx->opcode); + TCGv_i64 t0 =3D tcg_temp_new_i64(); + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 avr =3D tcg_temp_new_i64(); =20 get_avr64(t0, VB, true); get_avr64(t1, VA, true); @@ -1052,14 +1060,14 @@ GEN_VXFORM_ENV(vminfp, 5, 17); GEN_VXFORM_HETRO(vextublx, 6, 24) GEN_VXFORM_HETRO(vextuhlx, 6, 25) GEN_VXFORM_HETRO(vextuwlx, 6, 26) -GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207, +GEN_VXFORM_TRANS_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207, vextuwlx, PPC_NONE, PPC2_ISA300) GEN_VXFORM_HETRO(vextubrx, 6, 28) GEN_VXFORM_HETRO(vextuhrx, 6, 29) GEN_VXFORM_HETRO(vextuwrx, 6, 30) GEN_VXFORM_TRANS(lvsl, 6, 31) GEN_VXFORM_TRANS(lvsr, 6, 32) -GEN_VXFORM_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, \ +GEN_VXFORM_TRANS_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, vextuwrx, PPC_NONE, PPC2_ISA300) =20 #define GEN_VXRFORM1(opname, name, str, opc2, opc3) \ --=20 2.7.4 From nobody Sat May 18 18:02:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1561634490; cv=none; d=zoho.com; s=zohoarc; b=Vg4y/F9VpSXHw2gzqn/Je+E2CmXSmY+Lqv7z1kkCTbl8VlFp7+dEIIsVaVywFVEIFLtwSJxU50Qy9g3K+O+A9pmoWeeRGx1d6YFGLBHQ3uRlu9qf/lL5b5PGE7fMOTL22x3I5+Wpf+JCZmN4pPjZjcIvKUUXZWIyTAmb3zZRBHc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1561634490; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=l3Q/1ozdDQ1/D9U73HjR1TBLZY/vfw2g9XX3BXpuBEg=; 
b=bY/uQ251Qw1Oib2kC2I3cXW3dNxmMhEemj7U0HkDWR/ufqix9MZGT7HIqe6s7J4L8ydEIqGrg95TBpAkIPMbxjDQBRMJOfVXa3E1qje00xC6N3U616lMweS/y81CBpLRrrkV6uA7dGSE0JAhZ0JM0EWW1gE0D/G+pSclSM4Znyk= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1561634490798714.2034681861502; Thu, 27 Jun 2019 04:21:30 -0700 (PDT) Received: from localhost ([::1]:48774 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgST3-0006Us-Pw for importer@patchew.org; Thu, 27 Jun 2019 07:21:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55857) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgS5y-0001zs-4f for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:58:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hgS5v-0001hj-Km for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:38 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36495 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hgS5v-0001fl-85 for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:35 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 280661A45A6; Thu, 27 Jun 2019 12:56:30 +0200 (CEST) Received: from rtrkw870-lin.domain.local (rtrkw870-lin.domain.local [10.10.13.132]) by mail.rt-rk.com (Postfix) with ESMTPSA id E2BA91A4539; Thu, 27 Jun 2019 12:56:29 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Stefan Brankovic To: qemu-devel@nongnu.org Date: Thu, 27 Jun 2019 12:56:20 +0200 Message-Id: <1561632985-24866-9-git-send-email-stefan.brankovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> References: 
<1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v4 08/13] tcg: Add opcodes for vector vmrgh instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Stefan Brankovic --- accel/tcg/tcg-runtime-gvec.c | 42 ++++++++++++++++++++++++++++++++++++++++= ++ accel/tcg/tcg-runtime.h | 4 ++++ tcg/i386/tcg-target.h | 1 + tcg/tcg-op-gvec.c | 23 +++++++++++++++++++++++ tcg/tcg-op-gvec.h | 3 +++ tcg/tcg-op-vec.c | 5 +++++ tcg/tcg-op.h | 2 ++ tcg/tcg-opc.h | 2 ++ tcg/tcg.c | 2 ++ tcg/tcg.h | 1 + 10 files changed, 85 insertions(+) diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c index 51cb29c..28173ae 100644 --- a/accel/tcg/tcg-runtime-gvec.c +++ b/accel/tcg/tcg-runtime-gvec.c @@ -1458,3 +1458,45 @@ void HELPER(gvec_bitsel)(void *d, void *a, void *b, = void *c, uint32_t desc) } clear_high(d, oprsz, desc); } + +void HELPER(gvec_vmrgh8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz =3D simd_oprsz(desc); + intptr_t i; + + for (i =3D 0; i < (oprsz / 2); i +=3D sizeof(uint8_t)) { + uint8_t aa =3D *(uint8_t *)(a + 8 * sizeof(uint8_t) + i); + uint8_t bb =3D *(uint8_t *)(b + 8 * sizeof(uint8_t) + i); + *(uint8_t *)(d + 2 * i) =3D bb; + *(uint8_t *)(d + 2 * i + sizeof(uint8_t)) =3D aa; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_vmrgh16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz =3D simd_oprsz(desc); + intptr_t i; + + for (i =3D 0; i < (oprsz / 2); i +=3D 
sizeof(uint16_t)) { + uint16_t aa =3D *(uint16_t *)(a + 4 * sizeof(uint16_t) + i); + uint16_t bb =3D *(uint16_t *)(b + 4 * sizeof(uint16_t) + i); + *(uint16_t *)(d + 2 * i) =3D bb; + *(uint16_t *)(d + 2 * i + sizeof(uint16_t)) =3D aa; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_vmrgh32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz =3D simd_oprsz(desc); + intptr_t i; + + for (i =3D 0; i < (oprsz / 2); i +=3D sizeof(uint32_t)) { + uint32_t aa =3D *(uint32_t *)(a + 2 * sizeof(uint32_t) + i); + uint32_t bb =3D *(uint32_t *)(b + 2 * sizeof(uint32_t) + i); + *(uint32_t *)(d + 2 * i) =3D bb; + *(uint32_t *)(d + 2 * i + sizeof(uint32_t)) =3D aa; + } + clear_high(d, oprsz, desc); +} diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index 4fa61b4..089956f 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -305,3 +305,7 @@ DEF_HELPER_FLAGS_4(gvec_leu32, TCG_CALL_NO_RWG, void, p= tr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_leu64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_5(gvec_bitsel, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) + +DEF_HELPER_FLAGS_4(gvec_vmrgh8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_vmrgh16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_vmrgh32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 928e8b8..e11b22d 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -192,6 +192,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec 0 #define TCG_TARGET_HAS_cmpsel_vec -1 +#define TCG_TARGET_HAS_vmrgh_vec 0 =20 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) =3D=3D 0 && (len) =3D=3D 8) || ((ofs) =3D=3D 8 && (len) =3D=3D= 8) || \ diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 17679b6..2560fb6 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2102,6 +2102,29 @@ void tcg_gen_gvec_umax(unsigned 
vece, uint32_t dofs,= uint32_t aofs, tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); } =20 +static const TCGOpcode vecop_list_vmrgh[] =3D { INDEX_op_vmrgh_vec, 0 }; + +void tcg_gen_gvec_vmrgh(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[3] =3D { + { .fniv =3D tcg_gen_vmrgh_vec, + .fno =3D gen_helper_gvec_vmrgh8, + .opt_opc =3D vecop_list_vmrgh, + .vece =3D MO_8 }, + { .fniv =3D tcg_gen_vmrgh_vec, + .fno =3D gen_helper_gvec_vmrgh16, + .opt_opc =3D vecop_list_vmrgh, + .vece =3D MO_16 }, + { .fniv =3D tcg_gen_vmrgh_vec, + .fno =3D gen_helper_gvec_vmrgh32, + .opt_opc =3D vecop_list_vmrgh, + .vece =3D MO_32 } + }; + tcg_debug_assert(vece <=3D MO_32); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + /* Perform a vector negation using normal negation and a mask. Compare gen_subv_mask above. */ static void gen_negv_mask(TCGv_i64 d, TCGv_i64 b, TCGv_i64 m) diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index 830d68f..8c04d71 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -272,6 +272,9 @@ void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, ui= nt32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +/* Vector merge. 
*/ +void tcg_gen_gvec_vmrgh(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); =20 void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index c8fdc24..fb0b83e 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -663,6 +663,11 @@ void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_= vec a, TCGv_vec b) do_minmax(vece, r, a, b, INDEX_op_umax_vec, TCG_COND_GTU); } =20 +void tcg_gen_vmrgh_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_vmrgh_vec); +} + void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { do_op3_nofail(vece, r, a, b, INDEX_op_shlv_vec); diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 2d4dd5c..d8de022 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -985,6 +985,8 @@ void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_v= ec a, TCGv_vec b); void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); =20 +void tcg_gen_vmrgh_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); + void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_sari_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 242d608..2bc3bdf 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -235,6 +235,8 @@ DEF(umin_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_mi= nmax_vec)) DEF(smax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec)) DEF(umax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec)) =20 +DEF(vmrgh_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_vmrgh_vec)) + DEF(and_vec, 1, 2, 0, IMPLVEC) DEF(or_vec, 1, 2, 0, IMPLVEC) DEF(xor_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/tcg.c b/tcg/tcg.c index 02a2680..fed9a6f 100644 
--- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1646,6 +1646,8 @@ bool tcg_op_supported(TCGOpcode op) case INDEX_op_smax_vec: case INDEX_op_umax_vec: return have_vec && TCG_TARGET_HAS_minmax_vec; + case INDEX_op_vmrgh_vec: + return have_vec && TCG_TARGET_HAS_vmrgh_vec; case INDEX_op_bitsel_vec: return have_vec && TCG_TARGET_HAS_bitsel_vec; case INDEX_op_cmpsel_vec: diff --git a/tcg/tcg.h b/tcg/tcg.h index b411e17..05b9b51 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -186,6 +186,7 @@ typedef uint64_t TCGRegSet; #define TCG_TARGET_HAS_mul_vec 0 #define TCG_TARGET_HAS_sat_vec 0 #define TCG_TARGET_HAS_minmax_vec 0 +#define TCG_TARGET_HAS_vmrgh_vec 0 #define TCG_TARGET_HAS_bitsel_vec 0 #define TCG_TARGET_HAS_cmpsel_vec 0 #else --=20 2.7.4 From nobody Sat May 18 18:02:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1561634115; cv=none; d=zoho.com; s=zohoarc; b=l4gkKG6CZzfQMWgc7/YlmzfcH2S9loK+F70Yzsw7XoHAOa9gkf0gLgb/gCT+fp5KwD5Vz/yUjWlExz3SeKNpW03+I6ZSMBkTr0GuJCcXKc6utV3lFKKnZOnyhCNxOcCsY1RztSA/zsSAGrQalzhzK/aFlH3VGpKnq8hGf1ntsDo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1561634115; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=ufYcKl9dkVsb4dSjuONMAweakIFlE062mNXIUYlsR4E=; b=krKPuZXynb9npXy0GgGtbJWbcgjULwqvyTCBkbFMRV3XXJCaftQ7Lan7HZXO7pXyFPGeOWmdBHbf1Q+rErpj4h0SVv2OMAmu2H0/SRz2jJ+hdfniRHqC+S5Lo3jMkVXLCkAO9F4q5kFlwqouRFzq3PQWpZNv6Ip2+hFCyG9Pp/8= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass 
(zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1561634115618890.3854984070614; Thu, 27 Jun 2019 04:15:15 -0700 (PDT) Received: from localhost ([::1]:48690 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgSN0-0007jN-Ly for importer@patchew.org; Thu, 27 Jun 2019 07:15:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55828) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgS5x-0001yl-7h for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hgS5v-0001hL-Hm for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:37 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36500 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hgS5v-0001fs-8J for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:35 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 2DA471A45AE; Thu, 27 Jun 2019 12:56:30 +0200 (CEST) Received: from rtrkw870-lin.domain.local (rtrkw870-lin.domain.local [10.10.13.132]) by mail.rt-rk.com (Postfix) with ESMTPSA id EEA2D1A4583; Thu, 27 Jun 2019 12:56:29 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Stefan Brankovic To: qemu-devel@nongnu.org Date: Thu, 27 Jun 2019 12:56:21 +0200 Message-Id: <1561632985-24866-10-git-send-email-stefan.brankovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v4 09/13] tcg/i386: Implement vector vmrgh 
instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Stefan Brankovic --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index e11b22d..daae35f 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -192,7 +192,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_HAS_bitsel_vec 0 #define TCG_TARGET_HAS_cmpsel_vec -1 -#define TCG_TARGET_HAS_vmrgh_vec 0 +#define TCG_TARGET_HAS_vmrgh_vec 1 =20 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) =3D=3D 0 && (len) =3D=3D 8) || ((ofs) =3D=3D 8 && (len) =3D=3D= 8) || \ diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 6ddeebf..31e1b2b 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -2823,6 +2823,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode o= pc, case INDEX_op_umax_vec: insn =3D umax_insn[vece]; goto gen_simd; + case INDEX_op_vmrgh_vec: + insn =3D punpckh_insn[vece]; + goto gen_simd; case INDEX_op_shlv_vec: insn =3D shlv_insn[vece]; goto gen_simd; @@ -3223,6 +3226,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpc= ode op) case INDEX_op_umin_vec: case INDEX_op_smax_vec: case INDEX_op_umax_vec: + case INDEX_op_vmrgh_vec: case INDEX_op_shlv_vec: case INDEX_op_shrv_vec: case INDEX_op_sarv_vec: @@ -3321,6 +3325,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, = unsigned vece) case INDEX_op_umax_vec: case INDEX_op_abs_vec: return vece <=3D 
MO_32; + case INDEX_op_vmrgh_vec: + return vece <=3D MO_32 ? -1 : 0; =20 default: return 0; @@ -3614,6 +3620,14 @@ static void expand_vec_cmpsel(TCGType type, unsigned= vece, TCGv_vec v0, tcg_temp_free_vec(t); } =20 +static void expand_vec_vmrg(TCGOpcode opc, TCGType type, unsigned vece, + TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) +{ + vec_gen_3(opc, type, vece, + tcgv_vec_arg(v0), tcgv_vec_arg(v2), + tcgv_vec_arg(v1)); +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { @@ -3653,6 +3667,11 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, = unsigned vece, expand_vec_cmpsel(type, vece, v0, v1, v2, v3, v4, va_arg(va, TCGAr= g)); break; =20 + case INDEX_op_vmrgh_vec: + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_vmrg(opc, type, vece, v0, v1, v2); + break; + default: break; } --=20 2.7.4 From nobody Sat May 18 18:02:05 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1561633889; cv=none; d=zoho.com; s=zohoarc; b=Ed0h/wikagHZ9DRfQggocXhwQXeWGl0Y2IMTyD40bfG7GbSYur7G6lCX/9VOTFmqSjH0w6Yjy990MgoxHWPfxFL2QUiiyfQoz8c+QQcCwBNzSjequz3W5psHyGrZemIpgcFSs4eTe/tIRydIMO+4/GV+Es1sdcQsRi8gAA8wxmw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1561633889; h=Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=GdzGwzE0okcimw4hYq0qFB1mywmXMgGhg/vJJaMic2k=; 
b=kflwFpkI8BLJjuZlPJz636uxhM2LZjyUv63ZmqORyID3+cJJp/xIHaju8H5XXxVjVdt2W7WrOlnP/zb9gQXkRmGKg5tr4htURtjOGotqkDpTjRjyEEJFNOVYd/lcUFO5vTKbndpS5d5jd0zJiwlUJTvrAk25ExK33Srgn+qoQdY= ARC-Authentication-Results: i=1; mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1561633889316259.35460547538025; Thu, 27 Jun 2019 04:11:29 -0700 (PDT) Received: from localhost ([::1]:48664 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgSJL-0003zS-24 for importer@patchew.org; Thu, 27 Jun 2019 07:11:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55830) by lists.gnu.org with esmtp (Exim 4.86_2) (envelope-from ) id 1hgS5x-0001ym-8c for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hgS5v-0001hJ-Hc for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:37 -0400 Received: from mx2.rt-rk.com ([89.216.37.149]:36504 helo=mail.rt-rk.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hgS5v-0001fw-7d for qemu-devel@nongnu.org; Thu, 27 Jun 2019 06:57:35 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 3EC001A4539; Thu, 27 Jun 2019 12:56:30 +0200 (CEST) Received: from rtrkw870-lin.domain.local (rtrkw870-lin.domain.local [10.10.13.132]) by mail.rt-rk.com (Postfix) with ESMTPSA id 034331A457F; Thu, 27 Jun 2019 12:56:30 +0200 (CEST) X-Virus-Scanned: amavisd-new at rt-rk.com From: Stefan Brankovic To: qemu-devel@nongnu.org Date: Thu, 27 Jun 2019 12:56:22 +0200 Message-Id: <1561632985-24866-11-git-send-email-stefan.brankovic@rt-rk.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> 
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 89.216.37.149 Subject: [Qemu-devel] [PATCH v4 10/13] target/ppc: convert vmrgh instructions to vector operations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Stefan Brankovic --- target/ppc/helper.h | 3 --- target/ppc/int_helper.c | 2 +- target/ppc/translate/vmx-impl.inc.c | 6 +++--- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index ac1a5bd..9a7721f 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -164,9 +164,6 @@ DEF_HELPER_4(vcmpbfp_dot, void, env, avr, avr, avr) DEF_HELPER_3(vmrglb, void, avr, avr, avr) DEF_HELPER_3(vmrglh, void, avr, avr, avr) DEF_HELPER_3(vmrglw, void, avr, avr, avr) -DEF_HELPER_3(vmrghb, void, avr, avr, avr) -DEF_HELPER_3(vmrghh, void, avr, avr, avr) -DEF_HELPER_3(vmrghw, void, avr, avr, avr) DEF_HELPER_3(vmulesb, void, avr, avr, avr) DEF_HELPER_3(vmulesh, void, avr, avr, avr) DEF_HELPER_3(vmulesw, void, avr, avr, avr) diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 3edf334..00e6e02 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -948,7 +948,7 @@ void helper_vmladduhm(ppc_avr_t *r, ppc_avr_t *a, ppc_a= vr_t *b, ppc_avr_t *c) =20 #define VMRG(suffix, element, access) \ VMRG_DO(mrgl##suffix, element, access, half) \ - VMRG_DO(mrgh##suffix, element, access, 0) + VMRG(b, u8, VsrB) VMRG(h, u16, VsrH) VMRG(w, u32, VsrW) diff --git 
a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 39fb26d..e02390f 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -446,9 +446,9 @@ GEN_VXFORM_DUAL(vavguw, PPC_ALTIVEC, PPC_NONE, \
 GEN_VXFORM(vavgsb, 1, 20);
 GEN_VXFORM(vavgsh, 1, 21);
 GEN_VXFORM(vavgsw, 1, 22);
-GEN_VXFORM(vmrghb, 6, 0);
-GEN_VXFORM(vmrghh, 6, 1);
-GEN_VXFORM(vmrghw, 6, 2);
+GEN_VXFORM_V(vmrghb, MO_8, tcg_gen_gvec_vmrgh, 6, 0);
+GEN_VXFORM_V(vmrghh, MO_16, tcg_gen_gvec_vmrgh, 6, 1);
+GEN_VXFORM_V(vmrghw, MO_32, tcg_gen_gvec_vmrgh, 6, 2);
 GEN_VXFORM(vmrglb, 6, 4);
 GEN_VXFORM(vmrglh, 6, 5);
 GEN_VXFORM(vmrglw, 6, 6);
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:23 +0200
Message-Id: <1561632985-24866-12-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 11/13] tcg: Add opcodes for vector vmrgl instructions
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Signed-off-by: Stefan Brankovic
---
 accel/tcg/tcg-runtime-gvec.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 accel/tcg/tcg-runtime.h      |  4 ++++
 tcg/i386/tcg-target.h        |  1 +
 tcg/tcg-op-gvec.c            | 24 ++++++++++++++++++++++++
 tcg/tcg-op-gvec.h            |  2 ++
 tcg/tcg-op-vec.c             |  5 +++++
 tcg/tcg-op.h                 |  1 +
 tcg/tcg-opc.h                |  1 +
 tcg/tcg.c                    |  2 ++
 tcg/tcg.h                    |  1 +
 10 files changed, 83 insertions(+)

diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
index 28173ae..152f277 100644
--- a/accel/tcg/tcg-runtime-gvec.c
+++ b/accel/tcg/tcg-runtime-gvec.c
@@ -1500,3 +1500,45 @@ void HELPER(gvec_vmrgh32)(void *d, void *a, void *b, uint32_t desc)
     }
     clear_high(d, oprsz, desc);
 }
+
+void HELPER(gvec_vmrgl8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < (oprsz / 2); i += sizeof(uint8_t)) {
+        uint8_t aa = *(uint8_t *)(a + i);
+        uint8_t bb = *(uint8_t *)(b + i);
+        *(uint8_t *)(d + 2 * i) = bb;
+        *(uint8_t *)(d + 2 * i + sizeof(uint8_t)) = aa;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_vmrgl16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < (oprsz / 2); i += sizeof(uint16_t)) {
+        uint16_t aa = *(uint16_t *)(a + i);
+        uint16_t bb = *(uint16_t *)(b + i);
+        *(uint16_t *)(d + 2 * i) = bb;
+        *(uint16_t *)(d + 2 * i + sizeof(uint16_t)) = aa;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_vmrgl32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < (oprsz / 2); i += sizeof(uint32_t)) {
+        uint32_t aa = *(uint32_t *)(a + i);
+        uint32_t bb = *(uint32_t *)(b + i);
+        *(uint32_t *)(d + 2 * i) = bb;
+        *(uint32_t *)(d + 2 * i + sizeof(uint32_t)) = aa;
+    }
+    clear_high(d, oprsz, desc);
+}
diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 089956f..fd0ba1e 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -309,3 +309,7 @@ DEF_HELPER_FLAGS_5(gvec_bitsel, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vmrgh8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vmrgh16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_vmrgh32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_vmrgl8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_vmrgl16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_vmrgl32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index daae35f..e825324 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -193,6 +193,7 @@ extern bool have_avx2;
 #define TCG_TARGET_HAS_bitsel_vec 0
 #define TCG_TARGET_HAS_cmpsel_vec -1
 #define TCG_TARGET_HAS_vmrgh_vec 1
+#define TCG_TARGET_HAS_vmrgl_vec 0
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) \
     (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 2560fb6..da1d272 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -2125,6 +2125,30 @@ void tcg_gen_gvec_vmrgh(unsigned vece, uint32_t dofs, uint32_t aofs,
     tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
 }
 
+static const TCGOpcode vecop_list_vmrgl[] = { INDEX_op_vmrgl_vec, 0 };
+
+void tcg_gen_gvec_vmrgl(unsigned vece, uint32_t dofs, uint32_t aofs,
+                        uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g[3] = {
+        { .fniv = tcg_gen_vmrgl_vec,
+          .fno = gen_helper_gvec_vmrgl8,
+          .opt_opc = vecop_list_vmrgl,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_vmrgl_vec,
+          .fno = gen_helper_gvec_vmrgl16,
+          .opt_opc = vecop_list_vmrgl,
+          .vece = MO_16 },
+        {
+          .fniv = tcg_gen_vmrgl_vec,
+          .fno = gen_helper_gvec_vmrgl32,
+          .opt_opc = vecop_list_vmrgl,
+          .vece = MO_32 }
+    };
+    tcg_debug_assert(vece <= MO_32);
+    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
+}
+
 /* Perform a vector negation using normal negation and a mask.
    Compare gen_subv_mask above. */
 static void gen_negv_mask(TCGv_i64 d, TCGv_i64 b, TCGv_i64 m)
diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index 8c04d71..a2eb45c 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -275,6 +275,8 @@ void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs,
 /* Vector merge. */
 void tcg_gen_gvec_vmrgh(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_vmrgl(unsigned vece, uint32_t dofs, uint32_t aofs,
+                        uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 
 void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs,
                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index fb0b83e..ab22335 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -668,6 +668,11 @@ void tcg_gen_vmrgh_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
     do_op3(vece, r, a, b, INDEX_op_vmrgh_vec);
 }
 
+void tcg_gen_vmrgl_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_vmrgl_vec);
+}
+
 void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
 {
     do_op3_nofail(vece, r, a, b, INDEX_op_shlv_vec);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index d8de022..c101170 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -986,6 +986,7 @@ void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 
 void tcg_gen_vmrgh_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_vmrgl_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 
 void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
 void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 2bc3bdf..d99131a 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -236,6 +236,7 @@ DEF(smax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
 DEF(umax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
 
 DEF(vmrgh_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_vmrgh_vec))
+DEF(vmrgl_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_vmrgl_vec))
 
 DEF(and_vec, 1, 2, 0, IMPLVEC)
 DEF(or_vec, 1, 2, 0, IMPLVEC)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index fed9a6f..01245d5 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1648,6 +1648,8 @@ bool tcg_op_supported(TCGOpcode op)
         return have_vec && TCG_TARGET_HAS_minmax_vec;
     case INDEX_op_vmrgh_vec:
         return have_vec && TCG_TARGET_HAS_vmrgh_vec;
+    case INDEX_op_vmrgl_vec:
+        return have_vec && TCG_TARGET_HAS_vmrgl_vec;
     case INDEX_op_bitsel_vec:
         return have_vec && TCG_TARGET_HAS_bitsel_vec;
     case INDEX_op_cmpsel_vec:
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 05b9b51..6f9f333 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -187,6 +187,7 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_sat_vec 0
 #define TCG_TARGET_HAS_minmax_vec 0
 #define TCG_TARGET_HAS_vmrgh_vec 0
+#define TCG_TARGET_HAS_vmrgl_vec 0
 #define TCG_TARGET_HAS_bitsel_vec 0
 #define TCG_TARGET_HAS_cmpsel_vec 0
 #else
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:24 +0200
Message-Id: <1561632985-24866-13-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 12/13] tcg/i386: Implement vector vmrgl instructions
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Signed-off-by: Stefan Brankovic
---
 tcg/i386/tcg-target.h     |  2 +-
 tcg/i386/tcg-target.inc.c | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index e825324..d20d08f 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -193,7 +193,7 @@ extern bool have_avx2;
 #define TCG_TARGET_HAS_bitsel_vec 0
 #define TCG_TARGET_HAS_cmpsel_vec -1
 #define TCG_TARGET_HAS_vmrgh_vec 1
-#define TCG_TARGET_HAS_vmrgl_vec 0
+#define TCG_TARGET_HAS_vmrgl_vec 1
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) \
     (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 31e1b2b..dc3cd65 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2826,6 +2826,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_vmrgh_vec:
         insn = punpckh_insn[vece];
         goto gen_simd;
+    case INDEX_op_vmrgl_vec:
+        insn = punpckl_insn[vece];
+        goto gen_simd;
     case INDEX_op_shlv_vec:
         insn = shlv_insn[vece];
         goto gen_simd;
@@ -3227,6 +3230,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_smax_vec:
     case INDEX_op_umax_vec:
     case INDEX_op_vmrgh_vec:
+    case INDEX_op_vmrgl_vec:
     case INDEX_op_shlv_vec:
     case INDEX_op_shrv_vec:
     case INDEX_op_sarv_vec:
@@ -3327,6 +3331,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
         return vece <= MO_32;
     case INDEX_op_vmrgh_vec:
         return vece <= MO_32 ? -1 : 0;
+    case INDEX_op_vmrgl_vec:
+        return vece <= MO_32 ? -1 : 0;
 
     default:
         return 0;
@@ -3671,6 +3677,10 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
         v2 = temp_tcgv_vec(arg_temp(a2));
         expand_vec_vmrg(opc, type, vece, v0, v1, v2);
         break;
+    case INDEX_op_vmrgl_vec:
+        v2 = temp_tcgv_vec(arg_temp(a2));
+        expand_vec_vmrg(opc, type, vece, v0, v1, v2);
+        break;
 
     default:
         break;
-- 
2.7.4

From nobody Sat May 18 18:02:05 2024
From: Stefan Brankovic
To: qemu-devel@nongnu.org
Date: Thu, 27 Jun 2019 12:56:25 +0200
Message-Id: <1561632985-24866-14-git-send-email-stefan.brankovic@rt-rk.com>
In-Reply-To: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
References: <1561632985-24866-1-git-send-email-stefan.brankovic@rt-rk.com>
Subject: [Qemu-devel] [PATCH v4 13/13] target/ppc: convert vmrgl instructions to vector operations
Cc: stefan.brankovic@rt-rk.com, hsp.cat7@gmail.com, richard.henderson@linaro.org, david@gibson.dropbear.id.au

Signed-off-by: Stefan Brankovic
---
 target/ppc/helper.h                 | 3 ---
 target/ppc/int_helper.c             | 9 ---------
 target/ppc/translate/vmx-impl.inc.c | 6 +++---
 3 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9a7721f..0f10657 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -161,9 +161,6 @@ DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpbfp_dot, void, env, avr, avr, avr)
-DEF_HELPER_3(vmrglb, void, avr, avr, avr)
-DEF_HELPER_3(vmrglh, void, avr, avr, avr)
-DEF_HELPER_3(vmrglw, void, avr, avr, avr)
 DEF_HELPER_3(vmulesb, void, avr, avr, avr)
 DEF_HELPER_3(vmulesh, void, avr, avr, avr)
 DEF_HELPER_3(vmulesw, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 00e6e02..4b6e074 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -946,15 +946,6 @@ void helper_vmladduhm(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
         *r = result; \
     }
 
-#define VMRG(suffix, element, access) \
-    VMRG_DO(mrgl##suffix, element, access, half) \
-
-VMRG(b, u8, VsrB)
-VMRG(h, u16, VsrH)
-VMRG(w, u32, VsrW)
-#undef VMRG_DO
-#undef VMRG
-
 void helper_vmsummbm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
                      ppc_avr_t *b, ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index e02390f..12f41af 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -449,9 +449,9 @@ GEN_VXFORM(vavgsw, 1, 22);
 GEN_VXFORM_V(vmrghb, MO_8, tcg_gen_gvec_vmrgh, 6, 0);
 GEN_VXFORM_V(vmrghh, MO_16, tcg_gen_gvec_vmrgh, 6, 1);
 GEN_VXFORM_V(vmrghw, MO_32, tcg_gen_gvec_vmrgh, 6, 2);
-GEN_VXFORM(vmrglb, 6, 4);
-GEN_VXFORM(vmrglh, 6, 5);
-GEN_VXFORM(vmrglw, 6, 6);
+GEN_VXFORM_V(vmrglb, MO_8, tcg_gen_gvec_vmrgl, 6, 4);
+GEN_VXFORM_V(vmrglh, MO_16, tcg_gen_gvec_vmrgl, 6, 5);
+GEN_VXFORM_V(vmrglw, MO_32, tcg_gen_gvec_vmrgl, 6, 6);
 
 static void trans_vmrgew(DisasContext *ctx)
 {
-- 
2.7.4
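
For readers checking the out-of-line helpers from patch 11/13, the following is a minimal standalone sketch of the byte-wise interleave that gvec_vmrgl8 performs over the first half of the operand. The name merge_low_u8 and its signature are hypothetical and chosen only for illustration; this is not QEMU code. It assumes the destination does not overlap either source and leaves out the clear_high() step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a byte-wise "merge low" helper (not a QEMU API).
 * For every index i in the first half of the operand:
 *   d[2 * i]     = b[i]
 *   d[2 * i + 1] = a[i]
 * Assumption: d does not alias a or b, and oprsz is even.
 */
static void merge_low_u8(uint8_t *d, const uint8_t *a, const uint8_t *b,
                         size_t oprsz)
{
    assert(oprsz % 2 == 0);

    /* Only oprsz / 2 source elements are read; 2 * (oprsz / 2) are written. */
    for (size_t i = 0; i < oprsz / 2; i++) {
        d[2 * i] = b[i];
        d[2 * i + 1] = a[i];
    }
}
```

The loop bound is the point of interest: each iteration stores two elements, so iterating over half the operand fills exactly oprsz bytes of the destination.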