From nobody Sat May 18 07:09:01 2024
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 1/6] target/arm: Create gen_gvec_[us]sra
Date: Fri, 1 May 2020 14:13:40 -0700
Message-Id: <20200501211345.30410-2-richard.henderson@linaro.org>
In-Reply-To: <20200501211345.30410-1-richard.henderson@linaro.org>
References: <20200501211345.30410-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 15 +--- target/arm/translate.c | 161 ++++++++++++++++++++++--------------- target/arm/vec_helper.c | 25 ++++++ 5 files changed, 139 insertions(+), 79 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 5817626b20..9bc162345c 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -691,6 +691,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void= , ptr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 98b319f3f6..a39cf22666 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -299,6 +297,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void 
gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); =20 +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 010b36633e..03f4dc5805 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10205,19 +10205,8 @@ static void handle_vec_simd_shri(DisasContext *s, = bool is_q, bool is_u, =20 switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift =3D=3D 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. = */ - if (shift =3D=3D 8 << size) { - shift -=3D 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. 
*/ diff --git a/target/arm/translate.c b/target/arm/translate.c index a96899549b..04114906d7 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4146,33 +4146,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d,= TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } =20 -static const TCGOpcode vecop_list_ssra[] =3D { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] =3D { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] =3D { + { .fni8 =3D gen_ssra8_i64, + .fniv =3D gen_ssra_vec, + .fno =3D gen_helper_gvec_ssra_b, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_8 }, + { .fni8 =3D gen_ssra16_i64, + .fniv =3D gen_ssra_vec, + .fno =3D gen_helper_gvec_ssra_h, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_16 }, + { .fni4 =3D gen_ssra32_i32, + .fniv =3D gen_ssra_vec, + .fno =3D gen_helper_gvec_ssra_s, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_32 }, + { .fni8 =3D gen_ssra64_i64, + .fniv =3D gen_ssra_vec, + .fno =3D gen_helper_gvec_ssra_d, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + .opt_opc =3D vecop_list, + .load_dest =3D true, + .vece =3D MO_64 }, + }; =20 -const GVecGen2i ssra_op[4] =3D { - { .fni8 =3D gen_ssra8_i64, - .fniv =3D gen_ssra_vec, - .load_dest =3D true, - .opt_opc =3D vecop_list_ssra, - .vece =3D MO_8 }, - { .fni8 =3D gen_ssra16_i64, - .fniv =3D gen_ssra_vec, - .load_dest =3D true, - .opt_opc =3D vecop_list_ssra, - .vece =3D MO_16 }, - { .fni4 =3D gen_ssra32_i32, - .fniv =3D gen_ssra_vec, - .load_dest =3D true, - .opt_opc =3D vecop_list_ssra, - .vece =3D MO_32 }, - { .fni8 =3D gen_ssra64_i64, - .fniv =3D gen_ssra_vec, - .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, - .opt_opc =3D vecop_list_ssra, - .load_dest =3D true, - .vece =3D MO_64 }, -}; + /* tszimm encoding produces immediates 
in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <=3D (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift =3D MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} =20 static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4204,33 +4222,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d,= TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } =20 -static const TCGOpcode vecop_list_usra[] =3D { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] =3D { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] =3D { + { .fni8 =3D gen_usra8_i64, + .fniv =3D gen_usra_vec, + .fno =3D gen_helper_gvec_usra_b, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_8, }, + { .fni8 =3D gen_usra16_i64, + .fniv =3D gen_usra_vec, + .fno =3D gen_helper_gvec_usra_h, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_16, }, + { .fni4 =3D gen_usra32_i32, + .fniv =3D gen_usra_vec, + .fno =3D gen_helper_gvec_usra_s, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_32, }, + { .fni8 =3D gen_usra64_i64, + .fniv =3D gen_usra_vec, + .fno =3D gen_helper_gvec_usra_d, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + .load_dest =3D true, + .opt_opc =3D vecop_list, + .vece =3D MO_64, }, + }; =20 -const GVecGen2i usra_op[4] =3D { - { .fni8 =3D gen_usra8_i64, - .fniv =3D gen_usra_vec, - .load_dest =3D true, - .opt_opc =3D vecop_list_usra, - .vece =3D MO_8, }, - { .fni8 =3D gen_usra16_i64, - .fniv =3D gen_usra_vec, - .load_dest =3D true, - .opt_opc =3D vecop_list_usra, - .vece =3D MO_16, }, - { .fni4 =3D gen_usra32_i32, - .fniv =3D gen_usra_vec, - .load_dest =3D true, - .opt_opc =3D 
vecop_list_usra, - .vece =3D MO_32, }, - { .fni8 =3D gen_usra64_i64, - .fniv =3D gen_usra_vec, - .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, - .load_dest =3D true, - .opt_opc =3D vecop_list_usra, - .vece =3D MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <=3D (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} =20 static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5596,19 +5636,12 @@ static int disas_neon_data_insn(DisasContext *s, ui= nt32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift =3D -shift; - /* Shifts larger than the element size are architectur= ally - * valid. Unsigned results in all zeros; signed resul= ts - * in all sign bits. 
- */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >=3D 8 << size) { - /* rd +=3D 0 */ + if (u) { + gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; =20 diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 3d534188a8..230085b35e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } =20 + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + int shift =3D simd_data(desc); \ + TYPE *d =3D vd, *n =3D vn; \ + for (i =3D 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] +=3D n[i] >> shift; \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRA(gvec_ssra_b, int8_t) +DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
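
Reviewer note on the shift == esize handling in gen_gvec_ssra and gen_gvec_usra: the per-lane behaviour can be modelled in plain C. This is a minimal sketch, not QEMU code. The names ssra32_model and usra32_model are hypothetical, and it assumes that right-shifting a negative signed value is an arithmetic shift, which QEMU itself relies on.

```c
#include <stdint.h>

/*
 * Hypothetical model of one 32-bit SSRA lane: d += n >> shift.
 * A shift equal to the element size is clamped to esize - 1, which
 * yields all sign bits (0 or -1) as the addend.
 */
static int32_t ssra32_model(int32_t d, int32_t n, int shift)
{
    if (shift > 31) {
        shift = 31;
    }
    /* Add in unsigned arithmetic so that wraparound is well defined. */
    return (int32_t)((uint32_t)d + (uint32_t)(n >> shift));
}

/*
 * Hypothetical model of one 32-bit USRA lane.
 * A shift equal to the element size contributes zero, so the
 * accumulator is returned unchanged.
 */
static uint32_t usra32_model(uint32_t d, uint32_t n, int shift)
{
    if (shift >= 32) {
        return d;
    }
    return d + (n >> shift);
}
```

As far as can be read from the diff, the expanders resolve the shift == esize case before calling tcg_gen_gvec_2i, so the DO_SRA helpers only ever see shift < esize. That matters for the 64-bit lanes, where a C shift by the full type width would be undefined.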
--=20
2.20.1

From nobody Sat May 18 07:09:01 2024
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 2/6] target/arm: Create gen_gvec_{u,s}{rshr,rsra}
Date: Fri, 1 May 2020 14:13:41 -0700
Message-Id: <20200501211345.30410-3-richard.henderson@linaro.org>
In-Reply-To: <20200501211345.30410-1-richard.henderson@linaro.org>
References: <20200501211345.30410-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE.
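
The shift+round decomposition used by the expanders in this patch, (x >> sh) plus bit (sh - 1) of x, can be checked against a widened reference that adds the rounding constant before shifting. This is a minimal sketch in plain C, not QEMU code. All four function names are hypothetical, and it assumes an arithmetic right shift for negative signed values.

```c
#include <stdint.h>

/*
 * Hypothetical model of one 8-bit SRSHR lane, 1 <= sh <= 8.
 * The rounding bit is the last bit shifted out. For sh == 8 the
 * result is always zero, as the comment in gen_gvec_srshr states.
 */
static int8_t srshr8_model(int8_t x, int sh)
{
    if (sh == 8) {
        return 0;
    }
    return (int8_t)((x >> sh) + ((x >> (sh - 1)) & 1));
}

/*
 * Hypothetical model of one 8-bit URSHR lane, 1 <= sh <= 8.
 * For sh == 8 only the rounding bit survives, i.e. a copy of the
 * most significant bit of the input.
 */
static uint8_t urshr8_model(uint8_t x, int sh)
{
    if (sh == 8) {
        return x >> 7;
    }
    return (uint8_t)((x >> sh) + ((x >> (sh - 1)) & 1));
}

/* Widened references: add half of the divisor, then shift. */
static int srshr8_ref(int8_t x, int sh)
{
    return (x + (1 << (sh - 1))) >> sh;
}

static int urshr8_ref(uint8_t x, int sh)
{
    return (x + (1 << (sh - 1))) >> sh;
}
```

Because the operands are promoted to int, the references cannot overflow for 8-bit lanes, so an exhaustive comparison over every input and every shift in [1..8] is cheap.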
Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate.h | 9 + target/arm/translate-a64.c | 11 +- target/arm/translate.c | 461 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 50 ++++ 5 files changed, 525 insertions(+), 26 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 9bc162345c..aeb1f52455 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -701,6 +701,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index a39cf22666..823821f82c 100644 --- 
a/target/arm/translate.h +++ b/target/arm/translate.h @@ -302,6 +302,15 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uin= t32_t rm_ofs, void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); =20 +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 03f4dc5805..1ef05d5ce1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10235,10 +10235,15 @@ static void handle_vec_simd_shri(DisasContext *s,= bool is_q, bool is_u, return; =20 case 0x04: /* SRSHR / URSHR (rounding) */ - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_urshr : gen_gvec_srshr, size); + return; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ - accumulate =3D true; - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_ursra : gen_gvec_srsra, size); + return; + default: g_assert_not_reached(); } diff --git a/target/arm/translate.c b/target/arm/translate.c index 04114906d7..d724022cb6 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4272,6 +4272,422 @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, = uint32_t rm_ofs, } } =20 +/* + * Shift one less than the requested amount, and the low bit is + * the rounding bit. For the 8 and 16-bit operations, because we + * mask the low bit, we can perform a normal integer shift instead + * of a vector shift. 
+ */ +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_sar8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_sar16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t =3D tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_sari_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_sari_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t s= h) +{ + TCGv_vec t =3D tcg_temp_new_vec_matching(d); + TCGv_vec ones =3D tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, sh - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_sari_vec(vece, d, a, sh); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] =3D { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] =3D { + { .fni8 =3D gen_srshr8_i64, + .fniv =3D gen_srshr_vec, + .fno =3D gen_helper_gvec_srshr_b, + .opt_opc =3D vecop_list, + .vece =3D MO_8 }, + { .fni8 =3D gen_srshr16_i64, + .fniv =3D gen_srshr_vec, + .fno =3D gen_helper_gvec_srshr_h, + .opt_opc =3D vecop_list, + 
.vece =3D MO_16 }, + { .fni4 =3D gen_srshr32_i32, + .fniv =3D gen_srshr_vec, + .fno =3D gen_helper_gvec_srshr_s, + .opt_opc =3D vecop_list, + .vece =3D MO_32 }, + { .fni8 =3D gen_srshr64_i64, + .fniv =3D gen_srshr_vec, + .fno =3D gen_helper_gvec_srshr_d, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + .opt_opc =3D vecop_list, + .vece =3D MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <=3D (8 << vece)); + + if (shift =3D=3D (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 =3D=3D 0, or (0 + 1) >> 1 =3D=3D 0. + * I.e. always zero. + */ + tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + gen_srshr8_i64(t, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + gen_srshr16_i64(t, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t =3D tcg_temp_new_i32(); + + gen_srshr32_i32(t, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + + gen_srshr64_i64(t, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t s= h) +{ + TCGv_vec t =3D tcg_temp_new_vec_matching(d); + + gen_srshr_vec(vece, t, a, sh); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, 
uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] =3D { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] =3D { + { .fni8 =3D gen_srsra8_i64, + .fniv =3D gen_srsra_vec, + .fno =3D gen_helper_gvec_srsra_b, + .opt_opc =3D vecop_list, + .load_dest =3D true, + .vece =3D MO_8 }, + { .fni8 =3D gen_srsra16_i64, + .fniv =3D gen_srsra_vec, + .fno =3D gen_helper_gvec_srsra_h, + .opt_opc =3D vecop_list, + .load_dest =3D true, + .vece =3D MO_16 }, + { .fni4 =3D gen_srsra32_i32, + .fniv =3D gen_srsra_vec, + .fno =3D gen_helper_gvec_srsra_s, + .opt_opc =3D vecop_list, + .load_dest =3D true, + .vece =3D MO_32 }, + { .fni8 =3D gen_srsra64_i64, + .fniv =3D gen_srsra_vec, + .fno =3D gen_helper_gvec_srsra_d, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + .opt_opc =3D vecop_list, + .load_dest =3D true, + .vece =3D MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <=3D (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 =3D=3D 0, or (0 + 1) >> 1 =3D=3D 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift =3D=3D (8 << vece)) { + /* Nop, but we do need to clear the tail. 
*/
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_shr8i_i64(d, a, sh);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(t, a, sh - 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_shr16i_i64(d, a, sh);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_extract_i32(t, a, sh - 1, 1);
+    tcg_gen_shri_i32(d, a, sh);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_extract_i64(t, a, sh - 1, 1);
+    tcg_gen_shri_i64(d, a, sh);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, t, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, t, ones);
+    tcg_gen_shri_vec(vece, d, a, shift);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_urshr8_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urshr16_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urshr32_i32,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_urshr64_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Unsigned results in zero.  With rounding, this produces a
+         * copy of the most significant bit.
+         */
+        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 8) {
+        tcg_gen_vec_shr8i_i64(t, a, 7);
+    } else {
+        gen_urshr8_i64(t, a, sh);
+    }
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 16) {
+        tcg_gen_vec_shr16i_i64(t, a, 15);
+    } else {
+        gen_urshr16_i64(t, a, sh);
+    }
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    if (sh == 32) {
+        tcg_gen_shri_i32(t, a, 31);
+    } else {
+        gen_urshr32_i32(t, a, sh);
+    }
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (sh == 64) {
+        tcg_gen_shri_i64(t, a, 63);
+    } else {
+        gen_urshr64_i64(t, a, sh);
+    }
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    if (sh == (8 << vece)) {
+        tcg_gen_shri_vec(vece, t, a, sh - 1);
+    } else {
+        gen_urshr_vec(vece, t, a, sh);
+    }
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                    int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_ursra8_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_ursra16_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_ursra32_i32,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_ursra64_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+}
+
 static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
     uint64_t mask = dup_const(MO_8, 0xff >> shift);
@@ -5645,6 +6061,28 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 return 0;
 
+            case 2: /* VRSHR */
+                /* Right shift comes here negative.  */
+                shift = -shift;
+                if (u) {
+                    gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
+                                   vec_size, vec_size);
+                } else {
+                    gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
+                                   vec_size, vec_size);
+                }
+                return 0;
+
+            case 3: /* VRSRA */
+                if (u) {
+                    gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
+                                   vec_size, vec_size);
+                } else {
+                    gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
+                                   vec_size, vec_size);
+                }
+                return 0;
+
             case 4: /* VSRI */
                 if (!u) {
                     return 1;
@@ -5696,13 +6134,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         neon_load_reg64(cpu_V0, rm + pass);
                         tcg_gen_movi_i64(cpu_V1, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            if (u)
-                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1);
-                            else
-                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1);
-                            break;
                         case 6: /* VQSHLU */
                             gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
                                                       cpu_V0, cpu_V1);
@@ -5719,11 +6150,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         default:
                             g_assert_not_reached();
                         }
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            neon_load_reg64(cpu_V1, rd + pass);
-                            tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1);
-                        }
                         neon_store_reg64(cpu_V0, rd + pass);
                     } else { /* size < 3 */
                         /* Operands in T0 and T1.  */
@@ -5731,10 +6157,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         tmp2 = tcg_temp_new_i32();
                         tcg_gen_movi_i32(tmp2, imm);
                         switch (op) {
-                        case 2: /* VRSHR */
-                        case 3: /* VRSRA */
-                            GEN_NEON_INTEGER_OP(rshl);
-                            break;
                         case 6: /* VQSHLU */
                             switch (size) {
                             case 0:
@@ -5760,13 +6182,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             g_assert_not_reached();
                         }
                         tcg_temp_free_i32(tmp2);
-
-                        if (op == 3) {
-                            /* Accumulate.  */
-                            tmp2 = neon_load_reg(rd, pass);
-                            gen_neon_add(size, tmp, tmp2);
-                            tcg_temp_free_i32(tmp2);
-                        }
                         neon_store_reg(rd, pass, tmp);
                     }
                 } /* for pass */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 230085b35e..fd8b2bff49 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t)
 
 #undef DO_SRA
 
+#define DO_RSHR(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] = (tmp >> 1) + (tmp & 1);                  \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSHR(gvec_srshr_b, int8_t)
+DO_RSHR(gvec_srshr_h, int16_t)
+DO_RSHR(gvec_srshr_s, int32_t)
+DO_RSHR(gvec_srshr_d, int64_t)
+
+DO_RSHR(gvec_urshr_b, uint8_t)
+DO_RSHR(gvec_urshr_h, uint16_t)
+DO_RSHR(gvec_urshr_s, uint32_t)
+DO_RSHR(gvec_urshr_d, uint64_t)
+
+#undef DO_RSHR
+
+#define DO_RSRA(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] += (tmp >> 1) + (tmp & 1);                 \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSRA(gvec_srsra_b, int8_t)
+DO_RSRA(gvec_srsra_h, int16_t)
+DO_RSRA(gvec_srsra_s, int32_t)
+DO_RSRA(gvec_srsra_d, int64_t)
+
+DO_RSRA(gvec_ursra_b, uint8_t)
+DO_RSRA(gvec_ursra_h, uint16_t)
+DO_RSRA(gvec_ursra_s, uint32_t)
+DO_RSRA(gvec_ursra_d, uint64_t)
+
+#undef DO_RSRA
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
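
For readers checking the rounding arithmetic in the DO_RSHR and DO_RSRA helpers above, here is a minimal scalar sketch for 8-bit elements. The function names urshr8, srshr8 and ursra8 are hypothetical and exist only for this illustration; they are not part of the patch. The sketch assumes that right-shifting a negative value is an arithmetic shift, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned rounding right shift of one 8-bit element, shift in [1, 8]. */
static uint8_t urshr8(uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    /* Keep one extra low bit, then add it back after the final shift. */
    uint8_t tmp = n >> (shift - 1);
    return (uint8_t)((tmp >> 1) + (tmp & 1));
}

/* Signed variant; assumes >> on a negative value is an arithmetic shift. */
static int8_t srshr8(int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    int8_t tmp = n >> (shift - 1);
    return (int8_t)((tmp >> 1) + (tmp & 1));
}

/* Rounding shift and accumulate: d += urshr8(n, shift), wrapping mod 256. */
static uint8_t ursra8(uint8_t d, uint8_t n, int shift)
{
    return (uint8_t)(d + urshr8(n, shift));
}
```

With shift equal to the element size, urshr8 returns the most significant bit of the input (urshr8(0xff, 8) is 1, urshr8(0x7f, 8) is 0) and srshr8 returns 0 for every input, which matches the special cases the inline expanders handle.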
-- 
2.20.1


From nobody Sat May 18 07:09:01 2024
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 3/6] target/arm: Create gen_gvec_{sri,sli}
Date: Fri, 1 May 2020 14:13:42 -0700
Message-Id: <20200501211345.30410-4-richard.henderson@linaro.org>
In-Reply-To: <20200501211345.30410-1-richard.henderson@linaro.org>
References: <20200501211345.30410-1-richard.henderson@linaro.org>

The functions eliminate duplication of the special cases for
this operation.  They match up with the GVecGen2iFn typedef.
Add out-of-line helpers.  We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index aeb1f52455..33c76192d2 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -721,6 +721,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 823821f82c..7a2008f0dd 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
-extern const GVecGen2i sri_op[4];
-extern const GVecGen2i sli_op[4];
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
@@ -311,6 +309,11 @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                     int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1ef05d5ce1..bc326dadda 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -602,16 +602,6 @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
                    is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
 }
 
-/* Expand a 2-operand + immediate AdvSIMD vector operation using
- * an op descriptor.
- */
-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
-                          int rn, int64_t imm, const GVecGen2i *gvec_op)
-{
-    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor.  */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -10208,12 +10198,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         gen_gvec_fn2i(s, is_q, rd, rn, shift,
                       is_u ? gen_gvec_usra : gen_gvec_ssra, size);
         return;
+
     case 0x08: /* SRI */
-        /* Shift count same as element size is valid but does nothing.  */
-        if (shift == 8 << size) {
-            goto done;
-        }
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
         return;
 
     case 0x00: /* SSHR / USHR */
@@ -10264,7 +10251,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     }
     tcg_temp_free_i64(tcg_round);
 
- done:
     clear_vec_high(s, is_q, rd);
 }
 
@@ -10289,7 +10275,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
     }
 
     if (insert) {
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size);
     } else {
         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index d724022cb6..f730eb5b75 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4726,47 +4726,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
-        tcg_gen_shri_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
+    tcg_gen_shri_vec(vece, t, a, sh);
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
+void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shr8_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shr16_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shr32_ins_i32,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shr64_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sri_op[4] = {
-    { .fni8 = gen_shr8_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_8 },
-    { .fni8 = gen_shr16_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_16 },
-    { .fni4 = gen_shr32_ins_i32,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_32 },
-    { .fni8 = gen_shr64_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [1..esize]. */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /* Shift of esize leaves destination unchanged. */
+    if (shift < (8 << vece)) {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    } else {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    }
+}
 
 static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -4804,47 +4819,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
-        tcg_gen_shli_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_shli_vec(vece, t, a, sh);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
+void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                  int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shl8_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shl16_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shl32_ins_i32,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shl64_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sli_op[4] = {
-    { .fni8 = gen_shl8_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_8 },
-    { .fni8 = gen_shl16_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_16 },
-    { .fni4 = gen_shl32_ins_i32,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_32 },
-    { .fni8 = gen_shl64_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [0..esize-1]. */
+    tcg_debug_assert(shift >= 0);
+    tcg_debug_assert(shift < (8 << vece));
+
+    if (shift == 0) {
+        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
 
 static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
@@ -6089,20 +6117,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 /* Right shift comes here negative.  */
                 shift = -shift;
-                /* Shift out of range leaves destination unchanged.  */
-                if (shift < 8 << size) {
-                    tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
-                                    shift, &sri_op[size]);
-                }
+                gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
+                             vec_size, vec_size);
                 return 0;
 
             case 5: /* VSHL, VSLI */
                 if (u) { /* VSLI */
-                    /* Shift out of range leaves destination unchanged.  */
-                    if (shift < 8 << size) {
-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
-                                        vec_size, shift, &sli_op[size]);
-                    }
+                    gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
+                                 vec_size, vec_size);
                 } else { /* VSHL */
                     /* Shifts larger than the element size are
                      * architecturally valid and results in zero.
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index fd8b2bff49..096fea67ef 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t)
 
 #undef DO_RSRA
 
+#define DO_SRI(NAME, TYPE)                                                   \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)                         \
+{                                                                            \
+    intptr_t i, oprsz = simd_oprsz(desc);                                    \
+    int shift = simd_data(desc);                                             \
+    TYPE *d = vd, *n = vn;                                                   \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {                             \
+        d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift);  \
+    }                                                                        \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                  \
+}
+
+DO_SRI(gvec_sri_b, uint8_t)
+DO_SRI(gvec_sri_h, uint16_t)
+DO_SRI(gvec_sri_s, uint32_t)
+DO_SRI(gvec_sri_d, uint64_t)
+
+#undef DO_SRI
+
+#define DO_SLI(NAME, TYPE)                                                   \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)                         \
+{                                                                            \
+    intptr_t i, oprsz = simd_oprsz(desc);                                    \
+    int shift = simd_data(desc);                                             \
+    TYPE *d = vd, *n = vn;                                                   \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {                             \
+        d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]);       \
+    }                                                                        \
+    clear_tail(d, oprsz, simd_maxsz(desc));                                  \
+}
+
+DO_SLI(gvec_sli_b, uint8_t)
+DO_SLI(gvec_sli_h, uint16_t)
+DO_SLI(gvec_sli_s, uint32_t)
+DO_SLI(gvec_sli_d, uint64_t)
+
+#undef DO_SLI
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
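
As a rough illustration of what the SRI and SLI expanders and helpers above compute per element, the following scalar sketch models a single 8-bit lane. The names sri8 and sli8 are hypothetical and are not part of the patch; the masks are written out by hand rather than through deposit64 so that the shift == esize case for SRI can be shown as well.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift right and insert, one 8-bit element, shift in [1, 8].
 * The top `shift` bits of d are preserved; the rest come from n >> shift.
 */
static uint8_t sri8(uint8_t d, uint8_t n, unsigned shift)
{
    assert(shift >= 1 && shift <= 8);
    uint8_t mask = (uint8_t)(0xffu >> shift);   /* bits taken from n */
    return (uint8_t)((d & ~mask) | ((n >> shift) & mask));
}

/*
 * Shift left and insert, one 8-bit element, shift in [0, 7].
 * The low `shift` bits of d are preserved; the rest come from n << shift.
 */
static uint8_t sli8(uint8_t d, uint8_t n, unsigned shift)
{
    assert(shift <= 7);
    uint8_t mask = (uint8_t)(0xffu << shift);   /* bits taken from n */
    return (uint8_t)((d & ~mask) | ((n << shift) & mask));
}
```

In this model sri8(d, n, 8) returns d unchanged and sli8(d, n, 0) returns n, which corresponds to the two tcg_gen_gvec_mov special cases in gen_gvec_sri and gen_gvec_sli.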
-- 
2.20.1


From nobody Sat May 18 07:09:01 2024
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org
Subject: [PATCH 4/6] target/arm: Remove unnecessary range check for VSHL
Date: Fri, 1 May 2020 14:13:43 -0700
Message-Id: <20200501211345.30410-5-richard.henderson@linaro.org>
In-Reply-To: <20200501211345.30410-1-richard.henderson@linaro.org>
References: <20200501211345.30410-1-richard.henderson@linaro.org>

In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count.  This was unnecessary because the encoding
of the shift count produces 0 to the element size - 1.

Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index f730eb5b75..f082384117 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -6126,16 +6126,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
                                  vec_size, vec_size);
                 } else { /* VSHL */
-                    /* Shifts larger than the element size are
-                     * architecturally valid and results in zero.
-                     */
-                    if (shift >= 8 << size) {
-                        tcg_gen_gvec_dup_imm(size, rd_ofs,
-                                             vec_size, vec_size, 0);
-                    } else {
-                        tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
-                                          vec_size, vec_size);
-                    }
+                    tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
+                                      vec_size, vec_size);
                 }
                 return 0;
             }
-- 
2.20.1


From nobody Sat May 18 07:09:01 2024
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 5/6] target/arm: Tidy handle_vec_simd_shri
Date: Fri, 1 May 2020 14:13:44 -0700
Message-Id: <20200501211345.30410-6-richard.henderson@linaro.org>
In-Reply-To: 
<20200501211345.30410-1-richard.henderson@linaro.org> References: <20200501211345.30410-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: Error: [-] PROGRAM ABORT : Malformed IPv6 address (bad octet value). Location : parse_addr6(), p0f-client.c:67 X-Received-From: 2607:f8b0:4864:20::542 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @linaro.org) Content-Type: text/plain; charset="utf-8"

Now that we've converted all cases to gvec, there is quite a bit of
dead code at the end of the function.  Remove it.

Sink the call to gen_gvec_fn2i to the end, loading a function pointer
within the switch statement.

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index bc326dadda..5937069992 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10172,16 +10172,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;
 
     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -10195,13 +10186,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
 
     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_usra : gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra;
+        break;
 
     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size);
-        return;
+        gvec_fn = gen_gvec_sri;
+        break;
 
     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -10209,49 +10199,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero. */
                 tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd),
                                      is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign. */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_urshr : gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr;
+        break;
 
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? gen_gvec_ursra : gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra;
+        break;
 
     default:
         g_assert_not_reached();
     }
 
-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }
 
 /* SHL/SLI - Vector shift left */
-- 
2.20.1

From nobody Sat May 18 07:09:01 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linaro.org ARC-Seal: i=1; a=rsa-sha256; t=1588367970; cv=none; d=zohomail.com; s=zohoarc; b=CZ1VfChVi4/nv3hnKuO9xl3JQU0zDwfAdQhBVSHNGm2C9iDeZhP8eSheNn3gDXODb1XBKKts1UNq3pVS1K51ka08x4LmtzgKx9mkxWbbaqgj4Tq5JreesxaxWfjNmQusIwZ6nzQDPkoo0MCG8klMxzcMg16sYSoP47xlxHLuEG0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1588367970; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; b=QAkPvRrrQeApwmfEyqWQRxx04KEBl3O4ZT3SGWt5+RLkZrUbavpvT8bHftc3IpSPkKF42yLEJojjnObGfsyJsuiMh8DEre+6/bXUU1PiVxxT7KiALN5m2X7IrDhqbGXWxfue2lNOxuWi7q8U2goq3TgW2RQPUi3BopN/dYEiUjI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) header.from=
Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1588367970256390.32559676446783; Fri, 1 May 2020 14:19:30 -0700 (PDT) Received: from localhost ([::1]:35858 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jUd4C-0007zc-RQ for importer@patchew.org; Fri, 01 May 2020 17:19:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41450) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jUcys-0008Eb-Co for qemu-devel@nongnu.org; Fri, 01 May 2020 17:13:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.90_1) (envelope-from ) id 1jUcyq-000631-Rq for qemu-devel@nongnu.org; Fri, 01 May 2020 17:13:58 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:55534) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jUcyq-0005v5-Cf for qemu-devel@nongnu.org; Fri, 01 May 2020 17:13:56 -0400 Received: by mail-pj1-x1041.google.com with SMTP id a32so386533pje.5 for ; Fri, 01 May 2020 14:13:55 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id g22sm514552pju.21.2020.05.01.14.13.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 01 May 2020 14:13:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; b=AlD18TF5qt5s/oylIka25LDHRgIo2XGU59T8q9VKU/8CymLpqkXej2Nj1G88do7Rlb j467TDPpB36DS7pCNNZyAGmz/eL8G2XnyNf6zTq3BPK3+0js47vv7LrRzXIsIlMSt7Fn yEh6FOUpLV+hH6SPlu6prSrieZIknZHkO4/3d+8uX0gSwWFbL9Xnt8w4xy3Kee2xEaeO 3gfnvuYu9edeoh00waZjjsCN0KEH5x4/vcoCgqK980ZoPIIlgqf99ShzlWeQIfp1w9gl 9aFH3kZR3iLbrhtLPFQ1f1+dnBhSEjTmRU54vSuBV/f3ORkzii5/UysCmBhU/GjujQDV w9PQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=z1GrNs/INZtuwm/NWd3uanX5FvDBMbWqe89R1ET5KkI=; b=no5feZZ2hMNCKYp5WQrzyGWAtkjG07sUOYFqy5KjwWAb2Sjy0pZE78HB2ZeskuHmiA rvJu4a00wysfz+aqA15EZy1j18QO9X3pujn7Zud++q4dWwmv/8YRgEJIzkPTca9Es3i9 k7UQwsr4+TJfEENavTHEsZg0MXOOzujT7yxBRr1nO9CSs/fiHRxDOdWHhRuat/EVFErA OBG0uzU4tMHpyGmGmh/AKxt2jfdIOVuUrFqHVyb7b4822qrAZabXr3IYj8aPul2rQcAE /+Amc3FeNqiFzy2pQH89Tmc+66wqgmiBU34GmIQrIXQ/ezpG1oCAdi0IPMsLJGGOc4kG SFkA== X-Gm-Message-State: AGi0Pua+/SJgCVQAfvBzx7wX7x1XjZCt9NrzyTY1Bjo0BqOUETVj6Dh2 v+8LS/Lj127pt9of7aJVNYQTljYsPs4= X-Google-Smtp-Source: APiQypIaFBLY7mkFT6cEZMTkja0rr5zQ8aOat0ks6fV2h4UMQT+ZNKz+XryA95q0VYI0jQRd60iZ2Q== X-Received: by 2002:a17:90a:e2d0:: with SMTP id fr16mr1862614pjb.146.1588367634444; Fri, 01 May 2020 14:13:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 6/6] target/arm: Wrap vector compare zero GVecGen2 in GVecGen2Fn Date: Fri, 1 May 2020 14:13:45 -0700 Message-Id: <20200501211345.30410-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: 
<20200501211345.30410-1-richard.henderson@linaro.org> References: <20200501211345.30410-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: Error: [-] PROGRAM ABORT : Malformed IPv6 address (bad octet value). Location : parse_addr6(), p0f-client.c:67 X-Received-From: 2607:f8b0:4864:20::1041 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @linaro.org) Content-Type: text/plain; charset="utf-8"

Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.

Macro-ize the 5 nearly identical comparisons.

Signed-off-by: Richard Henderson
---
 target/arm/translate.h     |  16 ++-
 target/arm/translate-a64.c |  22 ++--
 target/arm/translate.c     | 254 ++++++++-----------------------------
 3 files changed, 74 insertions(+), 218 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 7a2008f0dd..20ec9cedd7 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -275,11 +275,17 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex)
 uint64_t vfp_expand_imm(int size, uint8_t imm8);
 
 /* Vector operations shared between ARM and AArch64. */
-extern const GVecGen2 ceq0_op[4];
-extern const GVecGen2 clt0_op[4];
-extern const GVecGen2 cgt0_op[4];
-extern const GVecGen2 cle0_op[4];
-extern const GVecGen2 cge0_op[4];
+void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                   uint32_t opr_sz, uint32_t max_sz);
+
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 5937069992..8208651394 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -594,14 +594,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */
-static void gen_gvec_op2(DisasContext *s, bool is_q, int rd,
-                         int rn, const GVecGen2 *gvec_op)
-{
-    tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                   is_q ? 16 : 8, vec_full_reg_size(s), gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -12327,13 +12319,21 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
         }
         break;
     case 0x8: /* CMGT, CMGE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size);
+        }
         return;
     case 0x9: /* CMEQ, CMLE */
-        gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]);
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size);
+        }
         return;
     case 0xa: /* CMLT */
-        gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]);
+        gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size);
         return;
     case 0xb:
         if (u) { /* ABS, NEG */
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f082384117..b08c4a2527 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3917,204 +3917,59 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
-static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
+#define GEN_CMP0(NAME, COND)                                            \
+    static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
+        tcg_gen_neg_i32(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
+    {                                                                   \
+        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
+        tcg_gen_neg_i64(d, d);                                          \
+    }                                                                   \
+    static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
+    {                                                                   \
+        TCGv_vec zero = tcg_const_zeros_vec_matching(d);                \
+        tcg_gen_cmp_vec(COND, vece, d, a, zero);                        \
+        tcg_temp_free_vec(zero);                                        \
+    }                                                                   \
+    void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m,      \
+                            uint32_t opr_sz, uint32_t max_sz)           \
+    {                                                                   \
+        const GVecGen2 op[4] = {                                        \
+            { .fno = gen_helper_gvec_##NAME##0_b,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_8 },                                           \
+            { .fno = gen_helper_gvec_##NAME##0_h,                       \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_16 },                                          \
+            { .fni4 = gen_##NAME##0_i32,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .vece = MO_32 },                                          \
+            { .fni8 = gen_##NAME##0_i64,                                \
+              .fniv = gen_##NAME##0_vec,                                \
+              .opt_opc = vecop_list_cmp,                                \
+              .prefer_i64 = TCG_TARGET_REG_BITS == 64,                  \
+              .vece = MO_64 },                                          \
+        };                                                              \
+        tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]);                \
+    }
 
 static const TCGOpcode vecop_list_cmp[] = {
     INDEX_op_cmp_vec, 0
 };
 
-const GVecGen2 ceq0_op[4] = {
-    { .fno = gen_helper_gvec_ceq0_b,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_ceq0_h,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_ceq0_i32,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_ceq0_i64,
-      .fniv = gen_ceq0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+GEN_CMP0(ceq, TCG_COND_EQ)
+GEN_CMP0(cle, TCG_COND_LE)
+GEN_CMP0(cge, TCG_COND_GE)
+GEN_CMP0(clt, TCG_COND_LT)
+GEN_CMP0(cgt, TCG_COND_GT)
 
-static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cle0_op[4] = {
-    { .fno = gen_helper_gvec_cle0_b,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cle0_h,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cle0_i32,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cle0_i64,
-      .fniv = gen_cle0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cge0_op[4] = {
-    { .fno = gen_helper_gvec_cge0_b,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cge0_h,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cge0_i32,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cge0_i64,
-      .fniv = gen_cge0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 clt0_op[4] = {
-    { .fno = gen_helper_gvec_clt0_b,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_clt0_h,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_clt0_i32,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_clt0_i64,
-      .fniv = gen_clt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
-
-static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a)
-{
-    tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i32(d, d);
-}
-
-static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a)
-{
-    tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0);
-    tcg_gen_neg_i64(d, d);
-}
-
-static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a)
-{
-    TCGv_vec zero = tcg_const_zeros_vec_matching(d);
-    tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero);
-    tcg_temp_free_vec(zero);
-}
-
-const GVecGen2 cgt0_op[4] = {
-    { .fno = gen_helper_gvec_cgt0_b,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_8 },
-    { .fno = gen_helper_gvec_cgt0_h,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_16 },
-    { .fni4 = gen_cgt0_i32,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .vece = MO_32 },
-    { .fni8 = gen_cgt0_i64,
-      .fniv = gen_cgt0_vec,
-      .opt_opc = vecop_list_cmp,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .vece = MO_64 },
-};
+#undef GEN_CMP0
 
 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -7146,24 +7001,19 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     break;
 
                 case NEON_2RM_VCEQ0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &ceq0_op[size]);
+                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cgt0_op[size]);
+                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cle0_op[size]);
+                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCGE0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &cge0_op[size]);
+                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
                 case NEON_2RM_VCLT0:
-                    tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size,
-                                   vec_size, &clt0_op[size]);
+                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
                     break;
 
                 default:
-- 
2.20.1
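
The per-element effect of the compare-against-zero expanders in patch 6/6 can be illustrated with a scalar model. This is a sketch only, not QEMU code: MODEL_CMP0, the model_*0_i32 functions and model_cmp0_self_check are hypothetical names, and plain C comparisons stand in for the setcond and neg TCG operations (evaluate the condition to 0 or 1, then negate to get 0 or all ones).

```c
#include <assert.h>
#include <stdint.h>

/*
 * One macro stamps out the five nearly identical functions, in the same
 * spirit as GEN_CMP0: the comparison yields 0 or 1, and negating it
 * turns a true result into all ones (-1) while false stays 0.
 */
#define MODEL_CMP0(NAME, OP)                                            \
    static int32_t model_##NAME##0_i32(int32_t a)                       \
    {                                                                   \
        return -(int32_t)(a OP 0);                                      \
    }

MODEL_CMP0(ceq, ==)
MODEL_CMP0(cle, <=)
MODEL_CMP0(cge, >=)
MODEL_CMP0(clt, <)
MODEL_CMP0(cgt, >)

#undef MODEL_CMP0

/* Spot-check each generated function on a negative, zero and positive input. */
void model_cmp0_self_check(void)
{
    assert(model_ceq0_i32(0) == -1 && model_ceq0_i32(7) == 0);
    assert(model_cle0_i32(-7) == -1 && model_cle0_i32(7) == 0);
    assert(model_cge0_i32(0) == -1 && model_cge0_i32(-7) == 0);
    assert(model_clt0_i32(-7) == -1 && model_clt0_i32(0) == 0);
    assert(model_cgt0_i32(7) == -1 && model_cgt0_i32(0) == 0);
}
```

Generating the family from one macro keeps the five variants in step: a change to the shared body applies to every comparison at once, which is the maintenance benefit the commit message points at.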