From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:27 -1000
Message-Id: <20180613015641.5667-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 04/18] target/arm: Implement SVE Permute - Interleaving Group
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 15 ++++++++
 target/arm/sve_helper.c    | 72 ++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 10 +++++
 4 files changed, 172 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index ff958fcebd..bab20345c6 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -445,6 +445,21 @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index f4d49d4aff..f114e9ab63 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1964,3 +1964,75 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc)
         }
     }
 }
+
+#define DO_ZIP(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)         \
+{                                                                      \
+    intptr_t oprsz = simd_oprsz(desc);                                 \
+    intptr_t i, oprsz_2 = oprsz / 2;                                   \
+    ARMVectorReg tmp_n, tmp_m;                                         \
+    /* We produce output faster than we consume input.                 \
+       Therefore we must be mindful of possible overlap.  */           \
+    if (unlikely((vn - vd) < (uintptr_t)oprsz)) {                      \
+        vn = memcpy(&tmp_n, vn, oprsz_2);                              \
+    }                                                                  \
+    if (unlikely((vm - vd) < (uintptr_t)oprsz)) {                      \
+        vm = memcpy(&tmp_m, vm, oprsz_2);                              \
+    }                                                                  \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) {                      \
+        *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i));           \
+        *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \
+    }                                                                  \
+}
+
+DO_ZIP(sve_zip_b, uint8_t, H1)
+DO_ZIP(sve_zip_h, uint16_t, H1_2)
+DO_ZIP(sve_zip_s, uint32_t, H1_4)
+DO_ZIP(sve_zip_d, uint64_t, )
+
+#define DO_UZP(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)         \
+{                                                                      \
+    intptr_t oprsz = simd_oprsz(desc);                                 \
+    intptr_t oprsz_2 = oprsz / 2;                                      \
+    intptr_t odd_ofs = simd_data(desc);                                \
+    intptr_t i;                                                        \
+    ARMVectorReg tmp_m;                                                \
+    if (unlikely((vm - vd) < (uintptr_t)oprsz)) {                      \
+        vm = memcpy(&tmp_m, vm, oprsz);                                \
+    }                                                                  \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) {                      \
+        *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs));     \
+    }                                                                  \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) {                      \
+        *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \
+    }                                                                  \
+}
+
+DO_UZP(sve_uzp_b, uint8_t, H1)
+DO_UZP(sve_uzp_h, uint16_t, H1_2)
+DO_UZP(sve_uzp_s, uint32_t, H1_4)
+DO_UZP(sve_uzp_d, uint64_t, )
+
+#define DO_TRN(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc)         \
+{                                                                      \
+    intptr_t oprsz = simd_oprsz(desc);                                 \
+    intptr_t odd_ofs = simd_data(desc);                                \
+    intptr_t i;                                                        \
+    for (i = 0; i < oprsz; i += 2 * sizeof(TYPE)) {                    \
+        TYPE ae = *(TYPE *)(vn + H(i + odd_ofs));                      \
+        TYPE be = *(TYPE *)(vm + H(i + odd_ofs));                      \
+        *(TYPE *)(vd + H(i + 0)) = ae;                                 \
+        *(TYPE *)(vd + H(i + sizeof(TYPE))) = be;                      \
+    }                                                                  \
+}
+
+DO_TRN(sve_trn_b, uint8_t, H1)
+DO_TRN(sve_trn_h, uint16_t, H1_2)
+DO_TRN(sve_trn_s, uint32_t, H1_4)
+DO_TRN(sve_trn_d, uint64_t, )
+
+#undef DO_ZIP
+#undef DO_UZP
+#undef DO_TRN
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 0160d06915..21319518d7 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2209,6 +2209,81 @@ static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn)
     return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p);
 }
 
+/*
+ *** SVE Permute - Interleaving Group
+ */
+
+static bool do_zip(DisasContext *s, arg_rrr_esz *a, bool high)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_zip_b, gen_helper_sve_zip_h,
+        gen_helper_sve_zip_s, gen_helper_sve_zip_d,
+    };
+
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        unsigned high_ofs = high ? vsz / 2 : 0;
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn) + high_ofs,
+                           vec_full_reg_offset(s, a->rm) + high_ofs,
+                           vsz, vsz, 0, fns[a->esz]);
+    }
+    return true;
+}
+
+static bool do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data,
+                            gen_helper_gvec_3 *fn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           vsz, vsz, data, fn);
+    }
+    return true;
+}
+
+static bool trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zip(s, a, false);
+}
+
+static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zip(s, a, true);
+}
+
+static gen_helper_gvec_3 * const uzp_fns[4] = {
+    gen_helper_sve_uzp_b, gen_helper_sve_uzp_h,
+    gen_helper_sve_uzp_s, gen_helper_sve_uzp_d,
+};
+
+static bool trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]);
+}
+
+static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]);
+}
+
+static gen_helper_gvec_3 * const trn_fns[4] = {
+    gen_helper_sve_trn_b, gen_helper_sve_trn_h,
+    gen_helper_sve_trn_s, gen_helper_sve_trn_d,
+};
+
+static bool trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 0, trn_fns[a->esz]);
+}
+
+static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 26fe1608c4..df2b94dc0a 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -414,6 +414,16 @@ REV_p           00000101 .. 11 0100 010 000 0 .... 0 ....       @pd_pn
 PUNPKLO         00000101 00 11 0000 010 000 0 .... 0 ....       @pd_pn_e0
 PUNPKHI         00000101 00 11 0001 010 000 0 .... 0 ....       @pd_pn_e0
 
+### SVE Permute - Interleaving Group
+
+# SVE permute vector elements
+ZIP1_z          00000101 .. 1 ..... 011 000 ..... .....         @rd_rn_rm
+ZIP2_z          00000101 .. 1 ..... 011 001 ..... .....         @rd_rn_rm
+UZP1_z          00000101 .. 1 ..... 011 010 ..... .....         @rd_rn_rm
+UZP2_z          00000101 .. 1 ..... 011 011 ..... .....         @rd_rn_rm
+TRN1_z          00000101 .. 1 ..... 011 100 ..... .....         @rd_rn_rm
+TRN2_z          00000101 .. 1 ..... 011 101 ..... .....         @rd_rn_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.17.1
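
Illustrative note, not part of the patch: the helpers above are easier to
review against a plain-array model of what each permute should produce.
The sketch below is such a model for byte elements only. All names in it
(ref_zip, ref_uzp, ref_trn, REF_MAX_ELEMS) are hypothetical and do not
exist in QEMU, n is an element count rather than a byte size, and the
host-endian H() index adjustment is deliberately left out. Where the patch
copies an overlapping source into a temporary first, the model simply
builds every result in a scratch buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model, not QEMU code.  n is the number of
   elements in a full vector and must be even. */
enum { REF_MAX_ELEMS = 256 };

/* ZIP1 (high = 0) / ZIP2 (high = 1): interleave the low or high half
   of a and b. */
static void ref_zip(uint8_t *d, const uint8_t *a, const uint8_t *b,
                    size_t n, int high)
{
    uint8_t tmp[REF_MAX_ELEMS];
    size_t half = n / 2, base = high ? half : 0;

    assert(n % 2 == 0 && n <= REF_MAX_ELEMS);
    for (size_t i = 0; i < half; i++) {
        tmp[2 * i] = a[base + i];
        tmp[2 * i + 1] = b[base + i];
    }
    /* Building the result in tmp keeps it correct if d aliases a or b. */
    memcpy(d, tmp, n);
}

/* UZP1 (odd = 0) / UZP2 (odd = 1): even or odd elements of a, then of b. */
static void ref_uzp(uint8_t *d, const uint8_t *a, const uint8_t *b,
                    size_t n, int odd)
{
    uint8_t tmp[REF_MAX_ELEMS];
    size_t half = n / 2;

    assert(n % 2 == 0 && n <= REF_MAX_ELEMS);
    for (size_t i = 0; i < half; i++) {
        tmp[i] = a[2 * i + odd];
        tmp[half + i] = b[2 * i + odd];
    }
    memcpy(d, tmp, n);
}

/* TRN1 (odd = 0) / TRN2 (odd = 1): pair up even or odd elements of a and b. */
static void ref_trn(uint8_t *d, const uint8_t *a, const uint8_t *b,
                    size_t n, int odd)
{
    uint8_t tmp[REF_MAX_ELEMS];

    assert(n % 2 == 0 && n <= REF_MAX_ELEMS);
    for (size_t i = 0; i < n; i += 2) {
        tmp[i] = a[i + odd];
        tmp[i + 1] = b[i + odd];
    }
    memcpy(d, tmp, n);
}
```

With a = {0..7} and b = {10..17}, ref_zip(d, a, b, 8, 0) yields
{0, 10, 1, 11, 2, 12, 3, 13} and ref_uzp(d, a, b, 8, 1) yields
{1, 3, 5, 7, 11, 13, 15, 17}. Passing the same array as destination and
first source exercises the overlap case that the comment in DO_ZIP
warns about.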