From nobody Sun Nov 2 11:46:59 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1527704262774773.1015107117181; Wed, 30 May 2018 11:17:42 -0700 (PDT) Received: from localhost ([::1]:40142 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5fK-0005v8-0y for importer@patchew.org; Wed, 30 May 2018 14:17:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33648) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Px-00025y-18 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pv-0004ll-2E for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:49 -0400 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:34865) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pu-0004kw-Mp for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:46 -0400 Received: by mail-pf0-x243.google.com with SMTP id x9-v6so9409864pfm.2 for ; Wed, 30 May 2018 11:01:46 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.43 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Ld2xrQU82H6bGici9pXYUtbus0NOEmAaNaAWCDL2/k4=; b=UwC3TtjRNT8CSMvhw0uP/p/Zu772c6Jv5L5Y82fNkoNrhexKLA+exq4rGep2lpIQjB ddASwHK6V+PeEfSPnHPMtE3kfyBARG+vvUzmdiJt+dlfmeCh2uar6lhQzTJOPCgO85l4 Y1dHkWbfl+bKck43VhVjiSaf7zml2JmBayH6s= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Ld2xrQU82H6bGici9pXYUtbus0NOEmAaNaAWCDL2/k4=; b=W3Xq/5WNR6enpxMjDNOUlhFfKc8jKeQdDXr7hpmc7QLgoVeqMr+PQPg1HHcefRzh62 juoEIvwtkslQcnd0eO2q4ntazo0RL81vXpkNjTalofWJ0ljt5tIbh37YIIaNXwBT5h6F m0tDb4tUF7VQkUz7KfLJ+sf0MSZ++8fRAjlahBx3hB6Dp+ub951DmPkBhqRsM3MPyEDr dT+zpScZ6hBB50SEGAtXtAnsdXcZyA8+er6oAATp05bkyI6UhKdmuieArp/EvBu6O7r3 2627vcoT0E/gtjs+V99ayoxQ6f19A4FyPLQgBEP5EC4XGNpsvFqGLzpaMUoU1SnahC7/ iJOA== X-Gm-Message-State: ALKqPwfSBUUJ4D2LxCAarK86i8A1/5FcTIC1mRSogzy6btlwyLu/7HZE 3263uN1jRjAvCn7/th7O41eRZwtqCsM= X-Google-Smtp-Source: ADUXVKJ98Bx3t7ERhrpIehu9SM2NP6O2jw0FHLP6p4OVLu8uN5ROI33fVdzOx2djwzNUmW3BfGZ97g== X-Received: by 2002:a62:1656:: with SMTP id 83-v6mr3686176pfw.61.1527703305005; Wed, 30 May 2018 11:01:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:16 -0700 Message-Id: <20180530180120.13355-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v3b 14/18] target/arm: Implement SVE Predicate Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 14 ++++ target/arm/translate-sve.c | 132 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 ++++++++ 4 files changed, 175 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f0a3ed3414..dd4f8f754d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -676,3 +676,5 @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, p= tr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 409ec0963f..8d1631ea3c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2722,3 +2722,17 @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void = *vg, uint32_t pred_desc) return do_zero(vd, oprsz); } } + +uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz =3D extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t *n =3D vn, *g =3D vg, sum =3D 0, mask =3D pred_esz_masks[esz]; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t t =3D n[i] & g[i] & mask; + sum +=3D ctpop64(t); + } + return sum; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d5f33cbfe9..e4815e4912 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -34,6 +34,9 @@ #include "translate-a64.h" =20 =20 +typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, + TCGv_i64, uint32_t, uint32_t); + typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, @@ -2949,6 +2952,135 @@ static bool trans_BRKN(DisasContext *s, arg_rpr_s *= a, uint32_t insn) return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); } =20 +/* + *** SVE Predicate Count Group + */ + +static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg) +{ + unsigned psz =3D pred_full_reg_size(s); + + if (psz <=3D 8) { + uint64_t psz_mask; + + tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn)); + if (pn !=3D pg) { + TCGv_i64 g =3D tcg_temp_new_i64(); + tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_and_i64(val, val, g); + tcg_temp_free_i64(g); + } + + /* Reduce the pred_esz_masks value simply to reduce the + size of the code generated here. */ + psz_mask =3D deposit64(0, 0, psz * 8, -1); + tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask); + + tcg_gen_ctpop_i64(val, val); + } else { + TCGv_ptr t_pn =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + unsigned desc; + TCGv_i32 t_desc; + + desc =3D psz - 2; + desc =3D deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + t_desc =3D tcg_const_i32(desc); + + gen_helper_sve_cntp(val, t_pn, t_pg, t_desc); + tcg_temp_free_ptr(t_pn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + } +} + +static bool trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg); + } + return true; +} + +static bool trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + if (a->d) { + tcg_gen_sub_i64(reg, reg, val); + } else { + tcg_gen_add_i64(reg, reg, val); + } + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz =3D=3D 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i64 val =3D tcg_temp_new_i64(); + GVecGen2sFn *gvec_fn =3D a->d ? tcg_gen_gvec_subs : tcg_gen_gvec_a= dds; + + do_cntp(s, val, a->esz, a->pg, a->pg); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), val, vsz, vsz); + } + return true; +} + +static bool trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_32(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_64(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz =3D=3D 0) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 val =3D tcg_temp_new_i64(); + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 66e1ee6b5c..62d51c252b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -67,6 +67,8 @@ &ptrue rd esz pat s &incdec_cnt rd pat esz imm d u &incdec2_cnt rd rn pat esz imm d u +&incdec_pred rd pg esz d u +&incdec2_pred rd rn pg esz d u =20 ########################################################################### # Named instruction formats. These are generally used to @@ -113,6 +115,7 @@ =20 # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz =20 # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -153,6 +156,12 @@ @incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ &incdec2_cnt imm=3D%imm4_16_p1 rn=3D%reg_movprfx =20 +# One register, predicate. +# User must fill in U and D. +@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pr= ed +@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ + &incdec2_pred rn=3D%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -579,6 +588,24 @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 ...= . @pd_pg_pn_s # SVE propagate break to next partition BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s =20 +### SVE Predicate Count Group + +# SVE predicate count +CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn + +# SVE inc/dec register by predicate count +INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred= u=3D1 + +# SVE inc/dec vector by predicate count +INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred= u=3D1 + +# SVE saturating inc/dec register by predicate count +SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pr= ed +SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pr= ed + +# SVE saturating inc/dec vector by predicate count +SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_p= red + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.17.0