From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Date: Sat, 17 Feb 2018 10:22:17 -0800
Message-Id: <20180217182323.25885-2-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 01/67] target/arm: Enable SVE for aarch64-linux-user

Enable ARM_FEATURE_SVE for the generic "any" cpu.

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/cpu.c   | 7 +++++++
 target/arm/cpu64.c | 1 +
 2 files changed, 8 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 1b3ae62db6..10843994c3 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -150,6 +150,13 @@ static void arm_cpu_reset(CPUState *s)
         env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
         /* and to the FP/Neon instructions */
         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
+        /* and to the SVE instructions */
+        env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
+        env->cp15.cptr_el[3] |= CPTR_EZ;
+        /* with maximum vector length */
+        env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
+        env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
+        env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
 #else
         /* Reset into the highest available EL */
         if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index efc519b49b..36ef9e9d9d 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -231,6 +231,7 @@ static void aarch64_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
     set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+    set_feature(&cpu->env, ARM_FEATURE_SVE);
     cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
     cpu->dcz_blocksize = 7; /* 512 bytes */
 }
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Date: Sat, 17 Feb 2018 10:22:18 -0800
Message-Id: <20180217182323.25885-3-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 02/67] target/arm: Introduce translate-a64.h

Move some stuff that will be common to both translate-a64.c
and translate-sve.c.

Signed-off-by: Richard Henderson
Reviewed-by: Alex Bennée
Reviewed-by: Peter Maydell
---
 target/arm/translate-a64.h | 110 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c | 101 ++++++-----------------------------------
 2 files changed, 123 insertions(+), 88 deletions(-)
 create mode 100644 target/arm/translate-a64.h

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
new file mode 100644
index 0000000000..e519aee314
--- /dev/null
+++ b/target/arm/translate-a64.h
@@ -0,0 +1,110 @@
+/*
+ * AArch64 translation, common definitions.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#ifndef TARGET_ARM_TRANSLATE_A64_H
+#define TARGET_ARM_TRANSLATE_A64_H
+
+void unallocated_encoding(DisasContext *s);
+
+#define unsupported_encoding(s, insn)                                    \
+    do {                                                                 \
+        qemu_log_mask(LOG_UNIMP,                                         \
+                      "%s:%d: unsupported instruction encoding 0x%08x "  \
+                      "at pc=%016" PRIx64 "\n",                          \
+                      __FILE__, __LINE__, insn, s->pc - 4);              \
+        unallocated_encoding(s);                                         \
+    } while (0)
+
+TCGv_i64 new_tmp_a64(DisasContext *s);
+TCGv_i64 new_tmp_a64_zero(DisasContext *s);
+TCGv_i64 cpu_reg(DisasContext *s, int reg);
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg);
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf);
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf);
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v);
+TCGv_ptr get_fpstatus_ptr(bool);
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
+                            unsigned int imms, unsigned int immr);
+uint64_t vfp_expand_imm(int size, uint8_t imm8);
+
+/* We should have at some point before trying to access an FP register
+ * done the necessary access check, so assert that
+ * (a) we did the check and
+ * (b) we didn't then just plough ahead anyway if it failed.
+ * Print the instruction pattern in the abort message so we can figure
+ * out what we need to fix if a user encounters this problem in the wild.
+ */
+static inline void assert_fp_access_checked(DisasContext *s)
+{
+#ifdef CONFIG_DEBUG_TCG
+    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
+        fprintf(stderr, "target-arm: FP access check missing for "
+                "instruction 0x%08x\n", s->insn);
+        abort();
+    }
+#endif
+}
+
+/* Return the offset into CPUARMState of an element of specified
+ * size, 'element' places in from the least significant end of
+ * the FP/vector register Qn.
+ */
+static inline int vec_reg_offset(DisasContext *s, int regno,
+                                 int element, TCGMemOp size)
+{
+    int offs = 0;
+#ifdef HOST_WORDS_BIGENDIAN
+    /* This is complicated slightly because vfp.zregs[n].d[0] is
+     * still the low half and vfp.zregs[n].d[1] the high half
+     * of the 128 bit vector, even on big endian systems.
+     * Calculate the offset assuming a fully bigendian 128 bits,
+     * then XOR to account for the order of the two 64 bit halves.
+     */
+    offs += (16 - ((element + 1) * (1 << size)));
+    offs ^= 8;
+#else
+    offs += element * (1 << size);
+#endif
+    offs += offsetof(CPUARMState, vfp.zregs[regno]);
+    assert_fp_access_checked(s);
+    return offs;
+}
+
+/* Return the offset info CPUARMState of the "whole" vector register Qn.  */
+static inline int vec_full_reg_offset(DisasContext *s, int regno)
+{
+    assert_fp_access_checked(s);
+    return offsetof(CPUARMState, vfp.zregs[regno]);
+}
+
+/* Return a newly allocated pointer to the vector register.  */
+static inline TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
+{
+    TCGv_ptr ret = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
+    return ret;
+}
+
+/* Return the byte size of the "whole" vector register, VL / 8.  */
+static inline int vec_full_reg_size(DisasContext *s)
+{
+    return s->sve_len;
+}
+
+bool disas_sve(DisasContext *, uint32_t);
+
+#endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 032cbfa17d..e0e7ebf68c 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -36,13 +36,13 @@
 #include "exec/log.h"
 
 #include "trace-tcg.h"
+#include "translate-a64.h"
 
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
 
 /* Load/store exclusive handling */
 static TCGv_i64 cpu_exclusive_high;
-static TCGv_i64 cpu_reg(DisasContext *s, int reg);
 
 static const char *regnames[] = {
     "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
@@ -392,22 +392,13 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
     }
 }
 
-static void unallocated_encoding(DisasContext *s)
+void unallocated_encoding(DisasContext *s)
 {
     /* Unallocated and reserved encodings are uncategorized */
     gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
-#define unsupported_encoding(s, insn)                                    \
-    do {                                                                 \
-        qemu_log_mask(LOG_UNIMP,                                         \
-                      "%s:%d: unsupported instruction encoding 0x%08x "  \
-                      "at pc=%016" PRIx64 "\n",                          \
-                      __FILE__, __LINE__, insn, s->pc - 4);              \
-        unallocated_encoding(s);                                         \
-    } while (0)
-
 static void init_tmp_a64_array(DisasContext *s)
 {
 #ifdef CONFIG_DEBUG_TCG
@@ -425,13 +416,13 @@ static void free_tmp_a64(DisasContext *s)
     init_tmp_a64_array(s);
 }
 
-static TCGv_i64 new_tmp_a64(DisasContext *s)
+TCGv_i64 new_tmp_a64(DisasContext *s)
 {
     assert(s->tmp_a64_count < TMP_A64_MAX);
     return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
 }
 
-static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
+TCGv_i64 new_tmp_a64_zero(DisasContext *s)
 {
     TCGv_i64 t = new_tmp_a64(s);
     tcg_gen_movi_i64(t, 0);
@@ -453,7 +444,7 @@ static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
  * to cpu_X[31] and ZR accesses to a temporary which can be discarded.
  * This is the point of the _sp forms.
  */
-static TCGv_i64 cpu_reg(DisasContext *s, int reg)
+TCGv_i64 cpu_reg(DisasContext *s, int reg)
 {
     if (reg == 31) {
         return new_tmp_a64_zero(s);
@@ -463,7 +454,7 @@ static TCGv_i64 cpu_reg(DisasContext *s, int reg)
 }
 
 /* register access for when 31 == SP */
-static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
 {
     return cpu_X[reg];
 }
@@ -472,7 +463,7 @@ static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
  * representing the register contents. This TCGv is an auto-freed
  * temporary so it need not be explicitly freed, and may be modified.
  */
-static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (reg != 31) {
@@ -487,7 +478,7 @@ static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
     return v;
 }
 
-static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (sf) {
@@ -498,72 +489,6 @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
     return v;
 }
 
-/* We should have at some point before trying to access an FP register
- * done the necessary access check, so assert that
- * (a) we did the check and
- * (b) we didn't then just plough ahead anyway if it failed.
- * Print the instruction pattern in the abort message so we can figure
- * out what we need to fix if a user encounters this problem in the wild.
- */
-static inline void assert_fp_access_checked(DisasContext *s)
-{
-#ifdef CONFIG_DEBUG_TCG
-    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
-        fprintf(stderr, "target-arm: FP access check missing for "
-                "instruction 0x%08x\n", s->insn);
-        abort();
-    }
-#endif
-}
-
-/* Return the offset into CPUARMState of an element of specified
- * size, 'element' places in from the least significant end of
- * the FP/vector register Qn.
- */
-static inline int vec_reg_offset(DisasContext *s, int regno,
-                                 int element, TCGMemOp size)
-{
-    int offs = 0;
-#ifdef HOST_WORDS_BIGENDIAN
-    /* This is complicated slightly because vfp.zregs[n].d[0] is
-     * still the low half and vfp.zregs[n].d[1] the high half
-     * of the 128 bit vector, even on big endian systems.
-     * Calculate the offset assuming a fully bigendian 128 bits,
-     * then XOR to account for the order of the two 64 bit halves.
-     */
-    offs += (16 - ((element + 1) * (1 << size)));
-    offs ^= 8;
-#else
-    offs += element * (1 << size);
-#endif
-    offs += offsetof(CPUARMState, vfp.zregs[regno]);
-    assert_fp_access_checked(s);
-    return offs;
-}
-
-/* Return the offset info CPUARMState of the "whole" vector register Qn.  */
-static inline int vec_full_reg_offset(DisasContext *s, int regno)
-{
-    assert_fp_access_checked(s);
-    return offsetof(CPUARMState, vfp.zregs[regno]);
-}
-
-/* Return a newly allocated pointer to the vector register.  */
-static TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
-{
-    TCGv_ptr ret = tcg_temp_new_ptr();
-    tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
-    return ret;
-}
-
-/* Return the byte size of the "whole" vector register, VL / 8.  */
-static inline int vec_full_reg_size(DisasContext *s)
-{
-    /* FIXME SVE: We should put the composite ZCR_EL* value into tb->flags.
-       In the meantime this is just the AdvSIMD length of 128.  */
-    return 128 / 8;
-}
-
 /* Return the offset into CPUARMState of a slice (from
  * the least significant end) of FP register Qn (ie
  * Dn, Sn, Hn or Bn).
@@ -620,7 +545,7 @@ static void clear_vec_high(DisasContext *s, bool is_q, int rd)
     }
 }
 
-static void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
 {
     unsigned ofs = fp_reg_offset(s, reg, MO_64);
 
@@ -637,7 +562,7 @@ static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v)
     tcg_temp_free_i64(tmp);
 }
 
-static TCGv_ptr get_fpstatus_ptr(bool is_f16)
+TCGv_ptr get_fpstatus_ptr(bool is_f16)
 {
     TCGv_ptr statusptr = tcg_temp_new_ptr();
     int offset;
@@ -3130,8 +3055,8 @@ static inline uint64_t bitmask64(unsigned int length)
  * value (ie should cause a guest UNDEF exception), and true if they are
  * valid, in which case the decoded bit pattern is written to result.
  */
-static bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
-                                   unsigned int imms, unsigned int immr)
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
+                            unsigned int imms, unsigned int immr)
 {
     uint64_t mask;
     unsigned e, levels, s, r;
@@ -5164,7 +5089,7 @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
  * the range 01....1xx to 10....0xx, and the most significant 4 bits of
  * the mantissa; see VFPExpandImm() in the v8 ARM ARM.
  */
-static uint64_t vfp_expand_imm(int size, uint8_t imm8)
+uint64_t vfp_expand_imm(int size, uint8_t imm8)
 {
     uint64_t imm;
 
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Date: Sat, 17 Feb 2018 10:22:19 -0800
Message-Id: <20180217182323.25885-4-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 03/67] target/arm: Add SVE decode skeleton

Including only 4, as-yet unimplemented, instruction patterns
so that the whole thing compiles.

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/translate-a64.c | 11 +++++++-
 target/arm/translate-sve.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++
 .gitignore                 |  1 +
 target/arm/Makefile.objs   | 10 ++++++++
 target/arm/sve.decode      | 45 +++++++++++++++++++++++++++++++++
 5 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/translate-sve.c
 create mode 100644 target/arm/sve.decode

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e0e7ebf68c..a50fef98af 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12772,9 +12772,18 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
     s->fp_access_checked = false;
 
     switch (extract32(insn, 25, 4)) {
-    case 0x0: case 0x1: case 0x2: case 0x3: /* UNALLOCATED */
+    case 0x0: case 0x1: case 0x3: /* UNALLOCATED */
         unallocated_encoding(s);
         break;
+    case 0x2:
+        if (!arm_dc_feature(s, ARM_FEATURE_SVE)) {
+            unallocated_encoding(s);
+        } else if (!sve_access_check(s) || !fp_access_check(s)) {
+            /* exception raised */
+        } else if (!disas_sve(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
     case 0x8: case 0x9: /* Data processing - immediate */
         disas_data_proc_imm(s, insn);
         break;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
new file mode 100644
index 0000000000..2c9e4733cb
--- /dev/null
+++ b/target/arm/translate-sve.c
@@ -0,0 +1,63 @@
+/*
+ * AArch64 SVE translation
+ *
+ * Copyright (c) 2018 Linaro, Ltd
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "tcg-op.h"
+#include "tcg-op-gvec.h"
+#include "qemu/log.h"
+#include "arm_ldst.h"
+#include "translate.h"
+#include "internals.h"
+#include "exec/helper-proto.h"
+#include "exec/helper-gen.h"
+#include "exec/log.h"
+#include "trace-tcg.h"
+#include "translate-a64.h"
+
+/*
+ * Include the generated decoder.
+ */
+
+#include "decode-sve.inc.c"
+
+/*
+ * Implement all of the translator functions referenced by the decoder.
+ */
+
+static void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
diff --git a/.gitignore b/.gitignore
index 704b22285d..abe2b81a26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,3 +140,4 @@ trace-dtrace-root.h
 trace-dtrace-root.dtrace
 trace-ust-all.h
 trace-ust-all.c
+/target/arm/decode-sve.inc.c
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index 847fb52ee0..9934cf1d4d 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -10,3 +10,13 @@ obj-y += gdbstub.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
 obj-$(CONFIG_SOFTMMU) += arm-powerctl.o
+
+DECODETREE = $(SRC_PATH)/scripts/decodetree.py
+
+target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
+	  "GEN", $(TARGET_DIR)$@)
+
+target/arm/translate-sve.o: target/arm/decode-sve.inc.c
+obj-$(TARGET_AARCH64) += translate-sve.o
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
new file mode 100644
index 0000000000..2c13a6024a
--- /dev/null
+++ b/target/arm/sve.decode
@@ -0,0 +1,45 @@
+# AArch64 SVE instruction descriptions
+#
+# Copyright (c) 2017 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see .
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+###########################################################################
+# Named attribute sets. These are used to make nice(er) names
+# when creating helpers common to those for the individual
+# instruction patterns.
+
+&rrr_esz        rd rn rm esz
+
+###########################################################################
+# Named instruction formats. These are generally used to
+# reduce the amount of duplication between instruction patterns.
+
+# Three operand with unused vector element size
+@rd_rn_rm_e0    ........ ... rm:5 ... ... rn:5 rd:5     &rrr_esz esz=0
+
+###########################################################################
+# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
+
+### SVE Logical - Unpredicated Group
+
+# SVE bitwise logical operations (unpredicated)
+AND_zzz         00000100 00 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+ORR_zzz         00000100 01 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+EOR_zzz         00000100 10 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+BIC_zzz         00000100 11 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Date: Sat, 17 Feb 2018 10:22:20 -0800
Message-Id: <20180217182323.25885-5-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 04/67] target/arm: Implement SVE Bitwise Logical - Unpredicated Group

These were the instructions that were stubbed out when
introducing the decode skeleton.

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/translate-sve.c | 50 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 2c9e4733cb..50cf2a1fdd 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -32,6 +32,10 @@
 #include "trace-tcg.h"
 #include "translate-a64.h"
 
+typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
+typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
+
 /*
  * Include the generated decoder.
  */
@@ -42,22 +46,54 @@
  * Implement all of the translator functions referenced by the decoder.
  */
 
-static void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
+/* Invoke a vector expander on two Zregs.  */
+static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
+                         int esz, int rd, int rn)
 {
-    unsupported_encoding(s, insn);
+    unsigned vsz = vec_full_reg_size(s);
+    gvec_fn(esz, vec_full_reg_offset(s, rd),
+            vec_full_reg_offset(s, rn), vsz, vsz);
 }
 
-static void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
+/* Invoke a vector expander on three Zregs.
*/ +static void do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn, + int esz, int rd, int rn, int rm) { - unsupported_encoding(s, insn); + unsigned vsz =3D vec_full_reg_size(s); + gvec_fn(esz, vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), vsz, vsz); } =20 -static void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn) +/* Invoke a vector move on two Zregs. */ +static void do_mov_z(DisasContext *s, int rd, int rn) { - unsupported_encoding(s, insn); + do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); +} + +/* + *** SVE Logical - Unpredicated Group + */ + +static void trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm); +} + +static void trans_ORR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (a->rn =3D=3D a->rm) { /* MOV */ + do_mov_z(s, a->rd, a->rn); + } else { + do_vector3_z(s, tcg_gen_gvec_or, 0, a->rd, a->rn, a->rm); + } +} + +static void trans_EOR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_xor, 0, a->rd, a->rn, a->rm); } =20 static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) { - unsupported_encoding(s, insn); + do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892142362283.1203950415861; Sat, 17 Feb 2018 
10:29:02 -0800 (PST) Received: from localhost ([::1]:48086 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7EL-0004bH-FX for importer@patchew.org; Sat, 17 Feb 2018 13:29:01 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39565) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en797-0000AP-Md for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:40 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en796-0001UD-8s for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:37 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:36664) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en796-0001Tu-0K for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:36 -0500 Received: by mail-pl0-x244.google.com with SMTP id v3so3443614plg.3 for ; Sat, 17 Feb 2018 10:23:35 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.33 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Rzb3iILp0N9VPoO2HC7Ijv9hijbpXzSFKAfXOuc/y24=; b=A1yiZ6hgTpl96c/Xw/00JNme4ci8k9OSBXcTPbWr/oWLOlErI/SFubj5rIv9LBNmiZ /9vQuRapzkFu4XompCryPs9N5vg77Mw9NA+ae2K+g9r+hnZPYatky+Rp2Kd9I/NlXsSC iXJO0CgS4kZz/27Fm2pUSB1p4qfpYC3Mc4tKI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Rzb3iILp0N9VPoO2HC7Ijv9hijbpXzSFKAfXOuc/y24=; b=K1ApJ0ATp1rHEZ84dcrZbzbfx09o/Ouvr9QBkkYTuZON4rfyqvOz1VtBAYF8+oOs76 R6LsyboN1c4WX6hygH3AlwLkDOpH3q2QXk5jm99WBt+3o8tHc6DSHFWMhyvKt2WS0yem jhqXlfsIxufxhvN61yRU8O4bOHSlqiyRO/nmrS6fWh5n8yPpx/jaxv/d2Doungkk4oBc 
wTrt6YHmr5mp2GR8AUf8gP6ef7MmH7UVd7LvSDDsfBf4NNKY2YAUIj5u9Oqx35UtflYH spE/8O5ZoSiExqOK27nYec4Gk3c7sSKXa5pzD+O3rwhSbd+otsDIBRvdXQrKnvrCyTw7 u7vg== X-Gm-Message-State: APf1xPBJ4AGNlbNLJcvnWKeMewgpzNQTZnPz5BtbAcN6TfBG5Roo5nq+ MtjdEq4s3ahp8TzWOfVzCTXBTRhBC54= X-Google-Smtp-Source: AH8x224qGSLNlvTfwZH1+AxcII1OrRlp17wJ2fAMRZVZiFYKALZQFghYGQ7uzeLIeE4NXDozRFjKSQ== X-Received: by 2002:a17:902:6bcb:: with SMTP id m11-v6mr2324350plt.326.1518891814662; Sat, 17 Feb 2018 10:23:34 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:21 -0800 Message-Id: <20180217182323.25885-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 05/67] target/arm: Implement SVE load vector/predicate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 132 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 22 +++++++- 2 files changed, 153 insertions(+), 1 deletion(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 50cf2a1fdd..c0cccfda6f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -46,6 +46,19 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, * Implement all of the translator functions referenced by the 
decoder. */ =20 +/* Return the offset into CPUARMState of the predicate vector register Pn. + * Note for this purpose, FFR is P16. */ +static inline int pred_full_reg_offset(DisasContext *s, int regno) +{ + return offsetof(CPUARMState, vfp.pregs[regno]); +} + +/* Return the byte size of the whole predicate register, VL / 64. */ +static inline int pred_full_reg_size(DisasContext *s) +{ + return s->sve_len >> 3; +} + /* Invoke a vector expander on two Zregs. */ static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -97,3 +110,122 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz= *a, uint32_t insn) { do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } + +/* + *** SVE Memory - 32-bit Gather and Unsized Contiguous Group + */ + +/* Subroutine loading a vector register at VOFS of LEN bytes. + * The load should begin at the address Rn + IMM. + */ + +#if UINTPTR_MAX =3D=3D UINT32_MAX +# define ptr i32 +#else +# define ptr i64 +#endif + +static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ + uint32_t len_align =3D QEMU_ALIGN_DOWN(len, 8); + uint32_t len_remain =3D len % 8; + uint32_t nparts =3D len / 8 + ctpop8(len_remain); + int midx =3D get_mem_index(s); + TCGv_i64 addr, t0, t1; + + addr =3D tcg_temp_new_i64(); + t0 =3D tcg_temp_new_i64(); + + /* Note that unpredicated load/store of vector/predicate registers + * are defined as a stream of bytes, which equates to little-endian + * operations on larger quantities. There is no nice way to force + * a little-endian load for aarch64_be-linux-user out of line. + * + * Attempt to keep code expansion to a minimum by limiting the + * amount of unrolling done.
+ */ + if (nparts <=3D 4) { + int i; + + for (i =3D 0; i < len_align; i +=3D 8) { + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i); + tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ); + tcg_gen_st_i64(t0, cpu_env, vofs + i); + } + } else { + TCGLabel *loop =3D gen_new_label(); + TCGv_ptr i =3D TCGV_NAT_TO_PTR(glue(tcg_const_local_, ptr)(0)); + TCGv_ptr dest; + + gen_set_label(loop); + + /* Minimize the number of local temps that must be re-read from + * the stack each iteration. Instead, re-compute values other + * than the loop counter. + */ + dest =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(dest, i, imm); +#if UINTPTR_MAX =3D=3D UINT32_MAX + tcg_gen_extu_i32_i64(addr, TCGV_PTR_TO_NAT(dest)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn)); +#else + tcg_gen_add_i64(addr, TCGV_PTR_TO_NAT(dest), cpu_reg_sp(s, rn)); +#endif + + tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ); + + tcg_gen_add_ptr(dest, cpu_env, i); + tcg_gen_addi_ptr(i, i, 8); + tcg_gen_st_i64(t0, dest, vofs); + tcg_temp_free_ptr(dest); + + glue(tcg_gen_brcondi_, ptr)(TCG_COND_LTU, TCGV_PTR_TO_NAT(i), + len_align, loop); + tcg_temp_free_ptr(i); + } + + /* Predicate register loads can be any multiple of 2. + * Note that we still store the entire 64-bit unit into cpu_env. 
+ */ + if (len_remain) { + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align); + + switch (len_remain) { + case 2: + case 4: + case 8: + tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain)); + break; + + case 6: + t1 =3D tcg_temp_new_i64(); + tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL); + tcg_gen_addi_i64(addr, addr, 4); + tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW); + tcg_gen_deposit_i64(t0, t0, t1, 32, 32); + tcg_temp_free_i64(t1); + break; + + default: + g_assert_not_reached(); + } + tcg_gen_st_i64(t0, cpu_env, vofs + len_align); + } + tcg_temp_free_i64(addr); + tcg_temp_free_i64(t0); +} + +#undef ptr + +static void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + int size =3D vec_full_reg_size(s); + do_ldr(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} + +static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + int size =3D pred_full_reg_size(s); + do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2c13a6024a..0c6a7ba34d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -19,11 +19,17 @@ # This file is processed by scripts/decodetree.py # =20 +########################################################################### +# Named fields. These are primarily for disjoint fields. + +%imm9_16_10 16:s6 10:3 + ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual # instruction patterns. =20 +&rri rd rn imm &rrr_esz rd rn rm esz =20 ########################################################################### @@ -31,7 +37,13 @@ # reduce the amount of duplication between instruction patterns. =20 # Three operand with unused vector element size -@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 +@rd_rn_rm_e0 ........ ... rm:5 ... ... 
rn:5 rd:5 &rrr_esz esz=3D0 + +# Basic Load/Store with 9-bit immediate offset +@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ + &rri imm=3D%imm9_16_10 +@rd_rn_i9 ........ ........ ...... rn:5 rd:5 \ + &rri imm=3D%imm9_16_10 =20 ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -43,3 +55,11 @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn= _rm_e0 ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 + +### SVE Memory - 32-bit Gather and Unsized Contiguous Group + +# SVE load predicate register +LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 + +# SVE load vector register +LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892336597168.87540743034253; Sat, 17 Feb 2018 10:32:16 -0800 (PST) Received: from localhost ([::1]:48110 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7HT-0007KE-Nh for importer@patchew.org; Sat, 17 Feb 2018 13:32:15 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39598) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79A-0000Dw-Ee for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:42 -0500 
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en797-0001Ut-Uh for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:40 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:42012) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en797-0001UV-MW for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:37 -0500 Received: by mail-pg0-x243.google.com with SMTP id y8so4342796pgr.9 for ; Sat, 17 Feb 2018 10:23:37 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.34 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=V7b6SuF1cXACZqT7BVM+onphPKImPG8640eKQWeQ1Es=; b=isafiBsGufYRWtE6E74zecgqEc7cmIFrRYymxQ/7Ihet5OZvDEV9X6p5P/UrjjTCHU CXQFMKEP9znCSSejSmhtfxrz1JrwjfXsqK+G1RxK2U/2erNyhzs5KtAXvsrGYPVp1JfI oONQ0ahslfyxq9n5viitrI4EyzG7RCXiU8uxE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=V7b6SuF1cXACZqT7BVM+onphPKImPG8640eKQWeQ1Es=; b=eD9sQ4C+vgVnEC/wn1bEX5soueGEiSfLxBEDm/6liz9G5qM1o/5gAAagoWrCcxQXdb E/WqoqTvTYBXsKTiNDpdByjgeh8vVIV6PgyDQNzHO/Xp7/LeB8K1Fd/h0C/5Os8t71Td tnaUdQmpmr4mIlYS9+6vpg7IHQcnFzPanMBwBiHmBDCQ6xmA5LZ0j18HavsvfLWKdBr+ VaErc/l4bGSBO69OeWx1flbZZQkfTVILypqv+Zt1NLKHfqrOkK9SEC+TOaiakvZ6Gkq3 wu3ADI8SZes9rNWY7QXLUh6z/VMVWxKFdMf038thmZwvW9aM3xFGra0AeGSozkm5QRGU 8p6w== X-Gm-Message-State: APf1xPC70gace214Goa6BxqHFXj2sdXisFXy/O3fvV4HrLq1ivdkC4Uf Xm+DJ1FbJ7AjZs2keBPM9s1gJNINEdg= X-Google-Smtp-Source: AH8x225bhtDybKet6SQNZXSlVPZOu4UO8w9NpJnGzHxfR6Umv69Upb90XebtEQHV+6p648UriYwMBw== X-Received: by 10.99.114.86 with SMTP id c22mr8196301pgn.41.1518891816308; Sat, 17 Feb 2018 
10:23:36 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:22 -0800 Message-Id: <20180217182323.25885-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 06/67] target/arm: Implement SVE predicate test X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 21 +++++++++++++ target/arm/helper.h | 1 + target/arm/sve_helper.c | 77 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 62 +++++++++++++++++++++++++++++++++++++ target/arm/Makefile.objs | 2 +- target/arm/sve.decode | 5 +++ 6 files changed, 167 insertions(+), 1 deletion(-) create mode 100644 target/arm/helper-sve.h create mode 100644 target/arm/sve_helper.c diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h new file mode 100644 index 0000000000..b6e91539ae --- /dev/null +++ b/target/arm/helper-sve.h @@ -0,0 +1,21 @@ +/* + * AArch64 SVE specific helper definitions + * + * Copyright (c) 2018 Linaro, Ltd + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your 
option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) +DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) diff --git a/target/arm/helper.h b/target/arm/helper.h index 6dd8504ec3..be3c2fcdc0 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -567,4 +567,5 @@ DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE= , i64, i64, i64) =20 #ifdef TARGET_AARCH64 #include "helper-a64.h" +#include "helper-sve.h" #endif diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c new file mode 100644 index 0000000000..7d13fd40ed --- /dev/null +++ b/target/arm/sve_helper.c @@ -0,0 +1,77 @@ +/* + * ARM SVE Operations + * + * Copyright (c) 2018 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . 
+ */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/cpu_ldst.h" +#include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" + + +/* Return a value for NZCV as per the ARM PredTest pseudofunction. + * + * The return value has bit 31 set if N is set, bit 1 set if Z is clear, + * and bit 0 set if C is set. + * + * This is an iterative function, called for each Pd and Pg word + * moving forward. + */ + +/* For no G bits set, NZCV =3D C. */ +#define PREDTEST_INIT 1 + +static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) +{ + if (g) { + /* Compute N from first D & G. + Use bit 2 to signal first G bit seen. */ + if (!(flags & 4)) { + flags |=3D ((d & (g & -g)) !=3D 0) << 31; + flags |=3D 4; + } + + /* Accumulate Z from each D & G. */ + flags |=3D ((d & g) !=3D 0) << 1; + + /* Compute C from last !(D & G). Replace previous. */ + flags =3D deposit32(flags, 0, 1, (d & pow2floor(g)) =3D=3D 0); + } + return flags; +} + +/* The same for a single word predicate. */ +uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g) +{ + return iter_predtest_fwd(d, g, PREDTEST_INIT); +} + +/* The same for a multi-word predicate. */ +uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words) +{ + uint32_t flags =3D PREDTEST_INIT; + uint64_t *d =3D vd, *g =3D vg; + uintptr_t i =3D 0; + + do { + flags =3D iter_predtest_fwd(d[i], g[i], flags); + } while (++i < words); + + return flags; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c0cccfda6f..c2e7fac938 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -83,6 +83,43 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } =20 +/* Set the cpu flags as per a return from an SVE helper. 
*/ +static void do_pred_flags(TCGv_i32 t) +{ + tcg_gen_mov_i32(cpu_NF, t); + tcg_gen_andi_i32(cpu_ZF, t, 2); + tcg_gen_andi_i32(cpu_CF, t, 1); + tcg_gen_movi_i32(cpu_VF, 0); +} + +/* Subroutines computing the ARM PredTest pseudofunction. */ +static void do_predtest1(TCGv_i64 d, TCGv_i64 g) +{ + TCGv_i32 t =3D tcg_temp_new_i32(); + + gen_helper_sve_predtest1(t, d, g); + do_pred_flags(t); + tcg_temp_free_i32(t); +} + +static void do_predtest(DisasContext *s, int dofs, int gofs, int words) +{ + TCGv_ptr dptr =3D tcg_temp_new_ptr(); + TCGv_ptr gptr =3D tcg_temp_new_ptr(); + TCGv_i32 t; + + tcg_gen_addi_ptr(dptr, cpu_env, dofs); + tcg_gen_addi_ptr(gptr, cpu_env, gofs); + t =3D tcg_const_i32(words); + + gen_helper_sve_predtest(t, dptr, gptr, t); + tcg_temp_free_ptr(dptr); + tcg_temp_free_ptr(gptr); + + do_pred_flags(t); + tcg_temp_free_i32(t); +} + /* *** SVE Logical - Unpredicated Group */ @@ -111,6 +148,31 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz= *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } =20 +/* + *** SVE Predicate Misc Group + */ + +void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) +{ + int nofs =3D pred_full_reg_offset(s, a->rn); + int gofs =3D pred_full_reg_offset(s, a->pg); + int words =3D DIV_ROUND_UP(pred_full_reg_size(s), 8); + + if (words =3D=3D 1) { + TCGv_i64 pn =3D tcg_temp_new_i64(); + TCGv_i64 pg =3D tcg_temp_new_i64(); + + tcg_gen_ld_i64(pn, cpu_env, nofs); + tcg_gen_ld_i64(pg, cpu_env, gofs); + do_predtest1(pn, pg); + + tcg_temp_free_i64(pn); + tcg_temp_free_i64(pg); + } else { + do_predtest(s, nofs, gofs, words); + } +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs index 9934cf1d4d..452ac6f453 100644 --- a/target/arm/Makefile.objs +++ b/target/arm/Makefile.objs @@ -19,4 +19,4 @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.d= ecode $(DECODETREE) "GEN", $(TARGET_DIR)$@) =20
target/arm/translate-sve.o: target/arm/decode-sve.inc.c -obj-$(TARGET_AARCH64) +=3D translate-sve.o +obj-$(TARGET_AARCH64) +=3D translate-sve.o sve_helper.o diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0c6a7ba34d..7efaa8fe8e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -56,6 +56,11 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn= _rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 =20 +### SVE Predicate Misc Group + +# SVE predicate test +PTEST 00100101 01010000 11 pg:4 0 rn:4 00000 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892207925635.6210469015367; Sat, 17 Feb 2018 10:30:07 -0800 (PST) Received: from localhost ([::1]:48090 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7FM-0005Sd-VV for importer@patchew.org; Sat, 17 Feb 2018 13:30:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39623) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79C-0000GF-7v for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:45 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en799-0001We-NK for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:42 -0500 
Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:43776) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en799-0001WB-EU for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:39 -0500 Received: by mail-pg0-x244.google.com with SMTP id f6so4342749pgs.10 for ; Sat, 17 Feb 2018 10:23:39 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.36 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6uhmGKy5nLmfDGSIQVraRklHXj0OvYQ5hNxaPRxQJrw=; b=DMFUbaz6cwZJthSX6wD4LrR8bAVHTO2CF348HaX0jJAiwN8KLlrpbB1A2etO1Pgx5L PMGUZIAVQdb/oCwuq59Ha/x3QDj1yzF87M5SM8ipL/F+WZMxRhYuqTtwEEFybiUrcZs2 Ao2NwnmrRjdBicpkn5zbdHzHb6E1TAy8L0qzg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6uhmGKy5nLmfDGSIQVraRklHXj0OvYQ5hNxaPRxQJrw=; b=VMzln/IxFE+kcQuUPWBovzPu1RCxXUB3k/Ugm+zKdjIFR8Z38WbcEyexgMNG8ieiUa YxofGXINx+OqV6mbKZsSFhHw5gndyQPNal4o524DAh0IqwUJrNn5Duxy/sgElgjxsHt+ kjGVJAYjiGQdpaXsWzh0AzNYRIXNS4ASz+w4GfjQ/93xJiacwqt/quY4lyRdMYbX2dpP 3e1FGG4eP5ZJBi7iIzjnRY/wbuu3bQCNqHN3vg0uwFOJ4AXVwNCAiBrWi48KZ78aYvwv mRsCey1/X09fGGcOuMRUOPT/lskVh+N+W8WdXc9EFFdXh5aRoOhnTis53aAsSr7RnBS2 rTbA== X-Gm-Message-State: APf1xPCRwTSQOnCi4mP3yLPJyh051O/zcVHvMqMY1Py7XQld4o/vvwEi 1X7a2//C6CfhN2P0KPwXJ8tMhg2lqI8= X-Google-Smtp-Source: AH8x224bWQkBz8JNgRZ5U5Xjn0aF/YcvlDKBuswgjzfhAIEuLKE17bXVKeR4MV6/Y6Q4CcEJmr64/A== X-Received: by 10.99.125.13 with SMTP id y13mr8121205pgc.282.1518891817849; Sat, 17 Feb 2018 10:23:37 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:23 -0800 Message-Id: 
<20180217182323.25885-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v2 07/67] target/arm: Implement SVE Predicate Logical Operations Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/cpu.h | 4 +- target/arm/helper-sve.h | 10 ++ target/arm/sve_helper.c | 39 ++++++ target/arm/translate-sve.c | 338 +++++++++++++++++++++++++++++++++++++++++= +++- target/arm/sve.decode | 16 +++ 5 files changed, 405 insertions(+), 2 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 70e05f00fe..8befe43a01 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -527,6 +527,8 @@ typedef struct CPUARMState { #ifdef TARGET_AARCH64 /* Store FFR as pregs[16] to make it easier to treat as any other.= */ ARMPredicateReg pregs[17]; + /* Scratch space for aa64 sve predicate temporary. */ + ARMPredicateReg preg_tmp; #endif =20 uint32_t xregs[16]; @@ -534,7 +536,7 @@ typedef struct CPUARMState { int vec_len; int vec_stride; =20 - /* scratch space when Tn are not sufficient. */ + /* Scratch space for aa32 neon expansion. 
*/ uint32_t scratch[8]; =20 /* There are a number of distinct float control structures: diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b6e91539ae..57adc4d912 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -19,3 +19,13 @@ =20 DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_sel_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orr_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7d13fd40ed..b63e7cc90e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -75,3 +75,42 @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32= _t words) =20 return flags; } + +#define LOGICAL_PPPP(NAME, FUNC) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + uintptr_t opr_sz =3D simd_oprsz(desc); = \ + uint64_t *d =3D vd, *n =3D vn, *m =3D vm, *g =3D vg; = \ + uintptr_t i; \ + for (i =3D 0; i < opr_sz / 8; ++i) { = \ + d[i] =3D FUNC(n[i], m[i], g[i]); = \ + } \ +} + +#define DO_AND(N, M, G) (((N) & (M)) & (G)) +#define DO_BIC(N, M, G) (((N) & ~(M)) & (G)) +#define DO_EOR(N, M, G) (((N) ^ (M)) & (G)) +#define DO_ORR(N, M, G) (((N) | (M)) & (G)) +#define DO_ORN(N, M, G) (((N) | ~(M)) & (G)) +#define DO_NOR(N, M, G) (~((N) | (M)) & (G)) +#define DO_NAND(N, M, G) (~((N) & (M)) & (G)) 
+#define DO_SEL(N, M, G) (((N) & (G)) | ((M) & ~(G))) + +LOGICAL_PPPP(sve_and_pppp, DO_AND) +LOGICAL_PPPP(sve_bic_pppp, DO_BIC) +LOGICAL_PPPP(sve_eor_pppp, DO_EOR) +LOGICAL_PPPP(sve_sel_pppp, DO_SEL) +LOGICAL_PPPP(sve_orr_pppp, DO_ORR) +LOGICAL_PPPP(sve_orn_pppp, DO_ORN) +LOGICAL_PPPP(sve_nor_pppp, DO_NOR) +LOGICAL_PPPP(sve_nand_pppp, DO_NAND) + +#undef DO_AND +#undef DO_BIC +#undef DO_EOR +#undef DO_ORR +#undef DO_ORN +#undef DO_NOR +#undef DO_NAND +#undef DO_SEL +#undef LOGICAL_PPPP diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c2e7fac938..405f9397a1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -59,6 +59,24 @@ static inline int pred_full_reg_size(DisasContext *s) return s->sve_len >> 3; } =20 +/* Round up the size of a predicate register to a size allowed by + * the tcg vector infrastructure. Any operation which uses this + * size may assume that the bits above pred_full_reg_size are zero, + * and must leave them the same way. + * + * Note that this is not needed for the vector registers as they + * are always properly sized for tcg vectors. + */ +static int pred_gvec_reg_size(DisasContext *s) +{ + int size =3D pred_full_reg_size(s); + if (size <=3D 8) { + return 8; + } else { + return QEMU_ALIGN_UP(size, 16); + } +} + /* Invoke a vector expander on two Zregs. */ static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -83,6 +101,40 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } =20 +/* Invoke a vector expander on two Pregs. */ +static void do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn, + int esz, int rd, int rn) +{ + unsigned psz =3D pred_gvec_reg_size(s); + gvec_fn(esz, pred_full_reg_offset(s, rd), + pred_full_reg_offset(s, rn), psz, psz); +} + +/* Invoke a vector expander on three Pregs.
*/ +static void do_vector3_p(DisasContext *s, GVecGen3Fn *gvec_fn, + int esz, int rd, int rn, int rm) +{ + unsigned psz =3D pred_gvec_reg_size(s); + gvec_fn(esz, pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn), + pred_full_reg_offset(s, rm), psz, psz); +} + +/* Invoke a vector operation on four Pregs. */ +static void do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op, + int rd, int rn, int rm, int rg) +{ + unsigned psz =3D pred_gvec_reg_size(s); + tcg_gen_gvec_4(pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn= ), + pred_full_reg_offset(s, rm), pred_full_reg_offset(s, rg= ), + psz, psz, gvec_op); +} + +/* Invoke a vector move on two Pregs. */ +static void do_mov_p(DisasContext *s, int rd, int rn) +{ + do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); +} + /* Set the cpu flags as per a return from an SVE helper. */ static void do_pred_flags(TCGv_i32 t) { @@ -148,11 +200,295 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_z= zz *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } =20 +/* + *** SVE Predicate Logical Operations Group + */ + +static void do_pppp_flags(DisasContext *s, arg_rprr_s *a, + const GVecGen4 *gvec_op) +{ + unsigned psz =3D pred_gvec_reg_size(s); + int dofs =3D pred_full_reg_offset(s, a->rd); + int nofs =3D pred_full_reg_offset(s, a->rn); + int mofs =3D pred_full_reg_offset(s, a->rm); + int gofs =3D pred_full_reg_offset(s, a->pg); + + if (psz =3D=3D 8) { + /* Do the operation and the flags generation in temps. 
*/ + TCGv_i64 pd =3D tcg_temp_new_i64(); + TCGv_i64 pn =3D tcg_temp_new_i64(); + TCGv_i64 pm =3D tcg_temp_new_i64(); + TCGv_i64 pg =3D tcg_temp_new_i64(); + + tcg_gen_ld_i64(pn, cpu_env, nofs); + tcg_gen_ld_i64(pm, cpu_env, mofs); + tcg_gen_ld_i64(pg, cpu_env, gofs); + + gvec_op->fni8(pd, pn, pm, pg); + tcg_gen_st_i64(pd, cpu_env, dofs); + + do_predtest1(pd, pg); + + tcg_temp_free_i64(pd); + tcg_temp_free_i64(pn); + tcg_temp_free_i64(pm); + tcg_temp_free_i64(pg); + } else { + /* The operation and flags generation is large. The computation + * of the flags depends on the original contents of the guarding + * predicate. If the destination overwrites the guarding predicat= e, + * then the easiest way to get this right is to save a copy. + */ + int tofs =3D gofs; + if (a->rd =3D=3D a->pg) { + tofs =3D offsetof(CPUARMState, vfp.preg_tmp); + tcg_gen_gvec_mov(0, tofs, gofs, psz, psz); + } + + tcg_gen_gvec_4(dofs, nofs, mofs, gofs, psz, psz, gvec_op); + do_predtest(s, dofs, tofs, psz / 8); + } +} + +static void gen_and_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_and_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_AND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_and_pg_i64, + .fniv =3D gen_and_pg_vec, + .fno =3D gen_helper_sve_and_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg =3D=3D a->rn && a->rn =3D=3D a->rm) { + do_mov_p(s, a->rd, a->rn); + } else if (a->pg =3D=3D a->rn || a->pg =3D=3D a->rm) { + do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_bic_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + 
tcg_gen_andc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_bic_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_andc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_BIC_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_bic_pg_i64, + .fniv =3D gen_bic_pg_vec, + .fno =3D gen_helper_sve_bic_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg =3D=3D a->rn) { + do_vector3_p(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_eor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_xor_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_eor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_xor_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_EOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_eor_pg_i64, + .fniv =3D gen_eor_pg_vec, + .fno =3D gen_helper_sve_eor_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_and_i64(pn, pn, pg); + tcg_gen_andc_i64(pm, pm, pg); + tcg_gen_or_i64(pd, pn, pm); +} + +static void gen_sel_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pn, pn, pg); + tcg_gen_andc_vec(vece, pm, pm, pg); + tcg_gen_or_vec(vece, pd, pn, pm); +} + +static void trans_SEL_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_sel_pg_i64, + .fniv =3D gen_sel_pg_vec, + .fno =3D 
gen_helper_sve_sel_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + unallocated_encoding(s); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orr_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_ORR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_orr_pg_i64, + .fniv =3D gen_orr_pg_vec, + .fno =3D gen_helper_sve_orr_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg =3D=3D a->rn && a->rn =3D=3D a->rm) { + do_mov_p(s, a->rd, a->rn); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_orn_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_orc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orn_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_orc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_ORN_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_orn_pg_i64, + .fniv =3D gen_orn_pg_vec, + .fno =3D gen_helper_sve_orn_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_nor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_andc_vec(vece, pd, pg, pd); 
+} + +static void trans_NOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_nor_pg_i64, + .fniv =3D gen_nor_pg_vec, + .fno =3D gen_helper_sve_nor_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_nand_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i6= 4 pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nand_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + tcg_gen_andc_vec(vece, pd, pg, pd); +} + +static void trans_NAND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op =3D { + .fni8 =3D gen_nand_pg_i64, + .fniv =3D gen_nand_pg_vec, + .fno =3D gen_helper_sve_nand_pppp, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + /* *** SVE Predicate Misc Group */ =20 -void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) +static void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) { int nofs =3D pred_full_reg_offset(s, a->rn); int gofs =3D pred_full_reg_offset(s, a->pg); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7efaa8fe8e..d92886127a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -31,6 +31,7 @@ =20 &rri rd rn imm &rrr_esz rd rn rm esz +&rprr_s rd pg rn rm s =20 ########################################################################### # Named instruction formats. These are generally used to @@ -39,6 +40,9 @@ # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 +# Three predicate operand, with governing predicate, flag setting +@pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 .
rd:4 &rprr_s + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=3D%imm9_16_10 @@ -56,6 +60,18 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn= _rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 =20 +### SVE Predicate Logical Operations Group + +# SVE predicate logical operations +AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s +BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s +EOR_pppp 00100101 0. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s +SEL_pppp 00100101 0. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s +ORR_pppp 00100101 1. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s +ORN_pppp 00100101 1. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s +NOR_pppp 00100101 1. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s +NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s + ### SVE Predicate Misc Group =20 # SVE predicate test --=20 2.14.3
From nobody Fri Oct 24 09:33:45 2025 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:24 -0800 Message-Id: <20180217182323.25885-9-richard.henderson@linaro.org> In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> Subject: [Qemu-devel] [PATCH v2 08/67] target/arm: Implement SVE Predicate Misc Group Cc: qemu-arm@nongnu.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8"
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/cpu.h | 3 + target/arm/helper-sve.h | 3 + target/arm/sve_helper.c | 86 +++++++++++++++++++++++- target/arm/translate-sve.c | 163 +++++++++++++++++++++++++++++++++++++++++= +++- target/arm/sve.decode | 41 ++++++++++++ 5 files changed, 293 insertions(+), 3 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8befe43a01..27f395183b 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -2915,4 +2915,7 @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *en= v, unsigned regno) return &env->vfp.zregs[regno].d[0]; } =20 +/* Shared between translate-sve.c and sve_helper.c.
*/ +extern const uint64_t pred_esz_masks[4]; + #endif diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 57adc4d912..0c04afff8c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -20,6 +20,9 @@ DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b63e7cc90e..cee7d9bcf6 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -39,7 +39,7 @@ =20 static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) { - if (g) { + if (likely(g)) { /* Compute N from first D & G. Use bit 2 to signal first G bit seen. */ if (!(flags & 4)) { @@ -114,3 +114,87 @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND) #undef DO_NAND #undef DO_SEL #undef LOGICAL_PPPP + +/* Similar to the ARM LastActiveElement pseudocode function, except the + result is multiplied by the element size. This includes the not found + indication; e.g. not found for esz=3D3 is -8. 
*/ +static intptr_t last_active_element(uint64_t *g, intptr_t words, intptr_t = esz) +{ + uint64_t mask =3D pred_esz_masks[esz]; + intptr_t i =3D words; + + do { + uint64_t this_g =3D g[--i] & mask; + if (this_g) { + return i * 64 + (63 - clz64(this_g)); + } + } while (i > 0); + return (intptr_t)-1 << esz; +} + +uint32_t HELPER(sve_pfirst)(void *vd, void *vg, uint32_t words) +{ + uint32_t flags =3D PREDTEST_INIT; + uint64_t *d =3D vd, *g =3D vg; + intptr_t i =3D 0; + + do { + uint64_t this_d =3D d[i]; + uint64_t this_g =3D g[i]; + + if (this_g) { + if (!(flags & 4)) { + /* Set in D the first bit of G. */ + this_d |=3D this_g & -this_g; + d[i] =3D this_d; + } + flags =3D iter_predtest_fwd(this_d, this_g, flags); + } + } while (++i < words); + + return flags; +} + +uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc) +{ + intptr_t words =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS); + intptr_t esz =3D extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint32_t flags =3D PREDTEST_INIT; + uint64_t *d =3D vd, *g =3D vg, esz_mask; + intptr_t i, next; + + next =3D last_active_element(vd, words, esz) + (1 << esz); + esz_mask =3D pred_esz_masks[esz]; + + /* Similar to the pseudocode for pnext, but scaled by ESZ + so that we find the correct bit. 
*/ + if (next < words * 64) { + uint64_t mask =3D -1; + + if (next & 63) { + mask =3D ~((1ull << (next & 63)) - 1); + next &=3D -64; + } + do { + uint64_t this_g =3D g[next / 64] & esz_mask & mask; + if (this_g !=3D 0) { + next =3D (next & -64) + ctz64(this_g); + break; + } + next +=3D 64; + mask =3D -1; + } while (next < words * 64); + } + + i =3D 0; + do { + uint64_t this_d =3D 0; + if (i =3D=3D next / 64) { + this_d =3D 1ull << (next & 63); + } + d[i] =3D this_d; + flags =3D iter_predtest_fwd(this_d, g[i] & esz_mask, flags); + } while (++i < words); + + return flags; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 405f9397a1..a9b6ae046d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -22,6 +22,7 @@ #include "exec/exec-all.h" #include "tcg-op.h" #include "tcg-op-gvec.h" +#include "tcg-gvec-desc.h" #include "qemu/log.h" #include "arm_ldst.h" #include "translate.h" @@ -67,9 +68,8 @@ static inline int pred_full_reg_size(DisasContext *s) * Note that this is not needed for the vector registers as they * are always properly sized for tcg vectors. */ -static int pred_gvec_reg_size(DisasContext *s) +static int size_for_gvec(int size) { - int size =3D pred_full_reg_size(s); if (size <=3D 8) { return 8; } else { @@ -77,6 +77,11 @@ static int pred_gvec_reg_size(DisasContext *s) } } =20 +static int pred_gvec_reg_size(DisasContext *s) +{ + return size_for_gvec(pred_full_reg_size(s)); +} + /* Invoke a vector expander on two Zregs. */ static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -172,6 +177,12 @@ static void do_predtest(DisasContext *s, int dofs, int= gofs, int words) tcg_temp_free_i32(t); } =20 +/* For each element size, the bits within a predicate word that are active= . 
*/ +const uint64_t pred_esz_masks[4] =3D { + 0xffffffffffffffffull, 0x5555555555555555ull, + 0x1111111111111111ull, 0x0101010101010101ull +}; + /* *** SVE Logical - Unpredicated Group */ @@ -509,6 +520,154 @@ static void trans_PTEST(DisasContext *s, arg_PTEST *a= , uint32_t insn) } } =20 +/* See the ARM pseudocode DecodePredCount. */ +static unsigned decode_pred_count(unsigned fullsz, int pattern, int esz) +{ + unsigned elements =3D fullsz >> esz; + unsigned bound; + + switch (pattern) { + case 0x0: /* POW2 */ + return pow2floor(elements); + case 0x1: /* VL1 */ + case 0x2: /* VL2 */ + case 0x3: /* VL3 */ + case 0x4: /* VL4 */ + case 0x5: /* VL5 */ + case 0x6: /* VL6 */ + case 0x7: /* VL7 */ + case 0x8: /* VL8 */ + bound =3D pattern; + break; + case 0x9: /* VL16 */ + case 0xa: /* VL32 */ + case 0xb: /* VL64 */ + case 0xc: /* VL128 */ + case 0xd: /* VL256 */ + bound =3D 16 << (pattern - 9); + break; + case 0x1d: /* MUL4 */ + return elements - elements % 4; + case 0x1e: /* MUL3 */ + return elements - elements % 3; + case 0x1f: /* ALL */ + return elements; + default: /* #uimm5 */ + return 0; + } + return elements >=3D bound ? bound : 0; +} + +static void trans_PTRUE(DisasContext *s, arg_PTRUE *a, uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned ofs =3D pred_full_reg_offset(s, a->rd); + unsigned numelem, setsz, i; + uint64_t word, lastword; + TCGv_i64 t; + + numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + + /* Determine what we must store into each bit, and how many. 
*/ + if (numelem =3D=3D 0) { + lastword =3D word =3D 0; + setsz =3D fullsz; + } else { + setsz =3D numelem << a->esz; + lastword =3D word =3D pred_esz_masks[a->esz]; + if (setsz % 64) { + lastword &=3D ~(-1ull << (setsz % 64)); + } + } + + t =3D tcg_temp_new_i64(); + if (fullsz <=3D 64) { + tcg_gen_movi_i64(t, lastword); + tcg_gen_st_i64(t, cpu_env, ofs); + goto done; + } + + if (word =3D=3D lastword) { + unsigned maxsz =3D size_for_gvec(fullsz / 8); + unsigned oprsz =3D size_for_gvec(setsz / 8); + + if (oprsz * 8 =3D=3D setsz) { + tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); + goto done; + } + if (oprsz * 8 =3D=3D setsz + 8) { + tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); + tcg_gen_movi_i64(t, 0); + tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8); + goto done; + } + } + + setsz /=3D 8; + fullsz /=3D 8; + + tcg_gen_movi_i64(t, word); + for (i =3D 0; i < setsz; i +=3D 8) { + tcg_gen_st_i64(t, cpu_env, ofs + i); + } + if (lastword !=3D word) { + tcg_gen_movi_i64(t, lastword); + tcg_gen_st_i64(t, cpu_env, ofs + i); + i +=3D 8; + } + if (i < fullsz) { + tcg_gen_movi_i64(t, 0); + for (; i < fullsz; i +=3D 8) { + tcg_gen_st_i64(t, cpu_env, ofs + i); + } + } + + done: + tcg_temp_free_i64(t); + + /* PTRUES */ + if (a->s) { + tcg_gen_movi_i32(cpu_NF, -(word !=3D 0)); + tcg_gen_movi_i32(cpu_CF, word =3D=3D 0); + tcg_gen_movi_i32(cpu_VF, 0); + tcg_gen_mov_i32(cpu_ZF, cpu_NF); + } +} + +static void do_pfirst_pnext(DisasContext *s, arg_rr_esz *a, + void (*gen_fn)(TCGv_i32, TCGv_ptr, + TCGv_ptr, TCGv_i32)) +{ + TCGv_ptr t_pd =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + TCGv_i32 t; + unsigned desc; + + desc =3D DIV_ROUND_UP(pred_full_reg_size(s), 8); + desc =3D deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + + tcg_gen_addi_ptr(t_pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->rn)); + t =3D tcg_const_i32(desc); + + gen_fn(t, t_pd, t_pg, t); + tcg_temp_free_ptr(t_pd); + tcg_temp_free_ptr(t_pg); + + 
do_pred_flags(t); + tcg_temp_free_i32(t); +} + +static void trans_PFIRST(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + do_pfirst_pnext(s, a, gen_helper_sve_pfirst); +} + +static void trans_PNEXT(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + do_pfirst_pnext(s, a, gen_helper_sve_pnext); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d92886127a..2e27ef41cd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -23,20 +23,30 @@ # Named fields. These are primarily for disjoint fields. =20 %imm9_16_10 16:s6 10:3 +%preg4_5 5:4 =20 ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual # instruction patterns. =20 +&rr_esz rd rn esz &rri rd rn imm &rrr_esz rd rn rm esz &rprr_s rd pg rn rm s =20 +&ptrue rd esz pat s + ########################################################################### # Named instruction formats. These are generally used to # reduce the amount of duplication between instruction patterns. =20 +# Two operand with unused vector element size +@pd_pn_e0 ........ ........ ....... rn:4 . rd:4 &rr_esz esz=3D0 + +# Two operand +@pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 @@ -77,6 +87,37 @@ NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... 
@pd_= pg_pn_pm_s # SVE predicate test PTEST 00100101 01010000 11 pg:4 0 rn:4 00000 =20 +# SVE predicate initialize +PTRUE 00100101 esz:2 01100 s:1 111000 pat:5 0 rd:4 &ptrue + +# SVE initialize FFR (SETFFR) +PTRUE 00100101 0010 1100 1001 0000 0000 0000 \ + &ptrue rd=3D16 esz=3D0 pat=3D31 s=3D0 + +# SVE zero predicate register (PFALSE) +# Note that pat=3D32 is outside of the natural 0..31, and will +# always hit the default #uimm5 case of decode_pred_count. +PTRUE 00100101 0001 1000 1110 0100 0000 rd:4 \ + &ptrue esz=3D0 pat=3D32 s=3D0 + +# SVE predicate read from FFR (predicated) (RDFFR) +ORR_pppp 00100101 0 s:1 0110001111000 pg:4 0 rd:4 \ + &rprr_s rn=3D16 rm=3D16 + +# SVE predicate read from FFR (unpredicated) (RDFFR) +ORR_pppp 00100101 0001 1001 1111 0000 0000 rd:4 \ + &rprr_s rn=3D16 rm=3D16 pg=3D16 s=3D0 + +# SVE FFR write from predicate (WRFFR) +ORR_pppp 00100101 0010 1000 1001 000 rn:4 00000 \ + &rprr_s rd=3D16 rm=3D%preg4_5 pg=3D%preg4_5 s=3D0 + +# SVE predicate first active +PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0 + +# SVE predicate next active +PNEXT 00100101 .. 011 001 11000 10 .... 0 .... 
@pd_pn + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3
From nobody Fri Oct 24 09:33:45 2025 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:25 -0800 Message-Id: <20180217182323.25885-10-richard.henderson@linaro.org> In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> Subject: [Qemu-devel] [PATCH v2 09/67] target/arm: Implement SVE Integer Binary Arithmetic - Predicated Group Cc: qemu-arm@nongnu.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8"
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 145 +++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 196 +++++++++++++++++++++++++++++++++++++++++= +++- target/arm/translate-sve.c | 65 +++++++++++++++ target/arm/sve.decode | 42 ++++++++++ 4 files changed, 447 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c04afff8c..5b82ba1501 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -23,6 +23,151 @@ DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, p= tr, ptr, i32) DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_and_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_eor_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr,
ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_orr_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_bic_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_add_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_mul_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_h, TCG_CALL_NO_RWG, + 
void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cee7d9bcf6..26c177c2fd 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -25,6 +25,22 @@ #include "tcg/tcg-gvec-desc.h" =20 =20 +/* Note that vector data is stored in host-endian 64-bit chunks, + so addressing units smaller than that needs a host-endian fixup. */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + /* Return a value for NZCV as per the ARM PredTest pseudofunction. 
* * The return value has bit 31 set if N is set, bit 1 set if Z is clear, @@ -105,7 +121,7 @@ LOGICAL_PPPP(sve_orn_pppp, DO_ORN) LOGICAL_PPPP(sve_nor_pppp, DO_NOR) LOGICAL_PPPP(sve_nand_pppp, DO_NAND) =20 -#undef DO_ADD +#undef DO_AND #undef DO_BIC #undef DO_EOR #undef DO_ORR @@ -115,6 +131,184 @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND) #undef DO_SEL #undef LOGICAL_PPPP =20 +/* Fully general three-operand expander, controlled by a predicate. + * This is complicated by the host-endian storage of the register file. + */ +/* ??? I don't expect the compiler could ever vectorize this itself. + * With some tables we can convert bit masks to byte masks, and with + * extra care wrt byte/word ordering we could use gcc generic vectors + * and do 16 bytes at a time. + */ +#define DO_ZPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + for (i =3D 0; i < opr_sz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + TYPE mm =3D *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, mm); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); = \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE nn =3D n[i], mm =3D m[i]; \ + d[i] =3D OP(nn, mm); \ + } \ + } \ +} + +#define DO_AND(N, M) (N & M) +#define DO_EOR(N, M) (N ^ M) +#define DO_ORR(N, M) (N | M) +#define DO_BIC(N, M) (N & ~M) +#define DO_ADD(N, M) (N + M) +#define DO_SUB(N, M) (N - M) +#define DO_MAX(N, M) ((N) >=3D (M) ? (N) : (M)) +#define DO_MIN(N, M) ((N) >=3D (M) ? (M) : (N)) +#define DO_ABD(N, M) ((N) >=3D (M) ? 
(N) - (M) : (M) - (N)) +#define DO_MUL(N, M) (N * M) +#define DO_DIV(N, M) (M ? N / M : 0) + +DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND) +DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND) +DO_ZPZZ(sve_and_zpzz_s, uint32_t, H1_4, DO_AND) +DO_ZPZZ_D(sve_and_zpzz_d, uint64_t, DO_AND) + +DO_ZPZZ(sve_orr_zpzz_b, uint8_t, H1, DO_ORR) +DO_ZPZZ(sve_orr_zpzz_h, uint16_t, H1_2, DO_ORR) +DO_ZPZZ(sve_orr_zpzz_s, uint32_t, H1_4, DO_ORR) +DO_ZPZZ_D(sve_orr_zpzz_d, uint64_t, DO_ORR) + +DO_ZPZZ(sve_eor_zpzz_b, uint8_t, H1, DO_EOR) +DO_ZPZZ(sve_eor_zpzz_h, uint16_t, H1_2, DO_EOR) +DO_ZPZZ(sve_eor_zpzz_s, uint32_t, H1_4, DO_EOR) +DO_ZPZZ_D(sve_eor_zpzz_d, uint64_t, DO_EOR) + +DO_ZPZZ(sve_bic_zpzz_b, uint8_t, H1, DO_BIC) +DO_ZPZZ(sve_bic_zpzz_h, uint16_t, H1_2, DO_BIC) +DO_ZPZZ(sve_bic_zpzz_s, uint32_t, H1_4, DO_BIC) +DO_ZPZZ_D(sve_bic_zpzz_d, uint64_t, DO_BIC) + +DO_ZPZZ(sve_add_zpzz_b, uint8_t, H1, DO_ADD) +DO_ZPZZ(sve_add_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ(sve_add_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ_D(sve_add_zpzz_d, uint64_t, DO_ADD) + +DO_ZPZZ(sve_sub_zpzz_b, uint8_t, H1, DO_SUB) +DO_ZPZZ(sve_sub_zpzz_h, uint16_t, H1_2, DO_SUB) +DO_ZPZZ(sve_sub_zpzz_s, uint32_t, H1_4, DO_SUB) +DO_ZPZZ_D(sve_sub_zpzz_d, uint64_t, DO_SUB) + +DO_ZPZZ(sve_smax_zpzz_b, int8_t, H1, DO_MAX) +DO_ZPZZ(sve_smax_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_smax_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ_D(sve_smax_zpzz_d, int64_t, DO_MAX) + +DO_ZPZZ(sve_umax_zpzz_b, uint8_t, H1, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ_D(sve_umax_zpzz_d, uint64_t, DO_MAX) + +DO_ZPZZ(sve_smin_zpzz_b, int8_t, H1, DO_MIN) +DO_ZPZZ(sve_smin_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_smin_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ_D(sve_smin_zpzz_d, int64_t, DO_MIN) + +DO_ZPZZ(sve_umin_zpzz_b, uint8_t, H1, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ_D(sve_umin_zpzz_d, 
uint64_t, DO_MIN) + +DO_ZPZZ(sve_sabd_zpzz_b, int8_t, H1, DO_ABD) +DO_ZPZZ(sve_sabd_zpzz_h, int16_t, H1_2, DO_ABD) +DO_ZPZZ(sve_sabd_zpzz_s, int32_t, H1_4, DO_ABD) +DO_ZPZZ_D(sve_sabd_zpzz_d, int64_t, DO_ABD) + +DO_ZPZZ(sve_uabd_zpzz_b, uint8_t, H1, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_h, uint16_t, H1_2, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_s, uint32_t, H1_4, DO_ABD) +DO_ZPZZ_D(sve_uabd_zpzz_d, uint64_t, DO_ABD) + +/* Because the computation type is at least twice as large as required, + these work for both signed and unsigned source types. */ +static inline uint8_t do_mulh_b(int32_t n, int32_t m) +{ + return (n * m) >> 8; +} + +static inline uint16_t do_mulh_h(int32_t n, int32_t m) +{ + return (n * m) >> 16; +} + +static inline uint32_t do_mulh_s(int64_t n, int64_t m) +{ + return (n * m) >> 32; +} + +static inline uint64_t do_smulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + muls64(&lo, &hi, n, m); + return hi; +} + +static inline uint64_t do_umulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + mulu64(&lo, &hi, n, m); + return hi; +} + +DO_ZPZZ(sve_mul_zpzz_b, uint8_t, H1, DO_MUL) +DO_ZPZZ(sve_mul_zpzz_h, uint16_t, H1_2, DO_MUL) +DO_ZPZZ(sve_mul_zpzz_s, uint32_t, H1_4, DO_MUL) +DO_ZPZZ_D(sve_mul_zpzz_d, uint64_t, DO_MUL) + +DO_ZPZZ(sve_smulh_zpzz_b, int8_t, H1, do_mulh_b) +DO_ZPZZ(sve_smulh_zpzz_h, int16_t, H1_2, do_mulh_h) +DO_ZPZZ(sve_smulh_zpzz_s, int32_t, H1_4, do_mulh_s) +DO_ZPZZ_D(sve_smulh_zpzz_d, uint64_t, do_smulh_d) + +DO_ZPZZ(sve_umulh_zpzz_b, uint8_t, H1, do_mulh_b) +DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h) +DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s) +DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d) + +DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV) +DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) + +DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV) +DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) + +#undef DO_AND +#undef DO_ORR +#undef DO_EOR +#undef DO_BIC +#undef DO_ADD +#undef DO_SUB +#undef DO_MAX +#undef DO_MIN +#undef DO_ABD +#undef 
DO_MUL +#undef DO_DIV +#undef DO_ZPZZ +#undef DO_ZPZZ_D + /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found indication; e.g. not found for esz=3D3 is -8. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a9b6ae046d..116002792a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -211,6 +211,71 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz= *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } =20 +/* + *** SVE Integer Arithmetic - Binary Predicated Group + */ + +static void do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_= 4 *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZZ(NAME, name) \ +void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4 * const fns[4] =3D { = \ + gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \ + gen_helper_sve_##name##_zpzz_s, gen_helper_sve_##name##_zpzz_d, \ + }; \ + do_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_ZPZZ(AND, and) +DO_ZPZZ(EOR, eor) +DO_ZPZZ(ORR, orr) +DO_ZPZZ(BIC, bic) + +DO_ZPZZ(ADD, add) +DO_ZPZZ(SUB, sub) + +DO_ZPZZ(SMAX, smax) +DO_ZPZZ(UMAX, umax) +DO_ZPZZ(SMIN, smin) +DO_ZPZZ(UMIN, umin) +DO_ZPZZ(SABD, sabd) +DO_ZPZZ(UABD, uabd) + +DO_ZPZZ(MUL, mul) +DO_ZPZZ(SMULH, smulh) +DO_ZPZZ(UMULH, umulh) + +void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_4 * const fns[4] =3D { + NULL, NULL, gen_helper_sve_sdiv_zpzz_s, gen_helper_sve_sdiv_zpzz_d + }; + do_zpzz_ool(s, a, fns[a->esz]); +} + +void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static 
gen_helper_gvec_4 * const fns[4] =3D { + NULL, NULL, gen_helper_sve_udiv_zpzz_s, gen_helper_sve_udiv_zpzz_d + }; + do_zpzz_ool(s, a, fns[a->esz]); +} + +#undef DO_ZPZZ + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2e27ef41cd..5fafe02575 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -25,6 +25,10 @@ %imm9_16_10 16:s6 10:3 %preg4_5 5:4 =20 +# Either a copy of rd (at bit 0), or a different source +# as propagated via the MOVPRFX instruction. +%reg_movprfx 0:5 + ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual @@ -34,6 +38,7 @@ &rri rd rn imm &rrr_esz rd rn rm esz &rprr_s rd pg rn rm s +&rprr_esz rd pg rn rm esz =20 &ptrue rd esz pat s =20 @@ -53,6 +58,12 @@ # Three prediate operand, with governing predicate, flag setting @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s =20 +# Two register operand, with governing predicate, vector element size +@rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ + &rprr_esz rn=3D%reg_movprfx +@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ + &rprr_esz rm=3D%reg_movprfx + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=3D%imm9_16_10 @@ -62,6 +73,37 @@ ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 +### SVE Integer Arithmetic - Binary Predicated Group + +# SVE bitwise logical vector operations (predicated) +ORR_zpzz 00000100 .. 011 000 000 ... ..... ..... @rdn_pg_rm +EOR_zpzz 00000100 .. 011 001 000 ... ..... ..... @rdn_pg_rm +AND_zpzz 00000100 .. 011 010 000 ... ..... ..... @rdn_pg_rm +BIC_zpzz 00000100 .. 011 011 000 ... ..... ..... 
@rdn_pg_rm + +# SVE integer add/subtract vectors (predicated) +ADD_zpzz 00000100 .. 000 000 000 ... ..... ..... @rdn_pg_rm +SUB_zpzz 00000100 .. 000 001 000 ... ..... ..... @rdn_pg_rm +SUB_zpzz 00000100 .. 000 011 000 ... ..... ..... @rdm_pg_rn # SUBR + +# SVE integer min/max/difference (predicated) +SMAX_zpzz 00000100 .. 001 000 000 ... ..... ..... @rdn_pg_rm +UMAX_zpzz 00000100 .. 001 001 000 ... ..... ..... @rdn_pg_rm +SMIN_zpzz 00000100 .. 001 010 000 ... ..... ..... @rdn_pg_rm +UMIN_zpzz 00000100 .. 001 011 000 ... ..... ..... @rdn_pg_rm +SABD_zpzz 00000100 .. 001 100 000 ... ..... ..... @rdn_pg_rm +UABD_zpzz 00000100 .. 001 101 000 ... ..... ..... @rdn_pg_rm + +# SVE integer multiply/divide (predicated) +MUL_zpzz 00000100 .. 010 000 000 ... ..... ..... @rdn_pg_rm +SMULH_zpzz 00000100 .. 010 010 000 ... ..... ..... @rdn_pg_rm +UMULH_zpzz 00000100 .. 010 011 000 ... ..... ..... @rdn_pg_rm +# Note that divide requires size >=3D 2; below 2 is unallocated. +SDIV_zpzz 00000100 .. 010 100 000 ... ..... ..... @rdn_pg_rm +UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm +SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR +UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... 
@rdm_pg_rn # UDIVR + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892752513330.46964262868335; Sat, 17 Feb 2018 10:39:12 -0800 (PST) Received: from localhost ([::1]:48170 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7OB-0004SG-I2 for importer@patchew.org; Sat, 17 Feb 2018 13:39:11 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39688) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79G-0000L8-7C for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:49 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79E-0001ZN-DX for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:46 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:44350) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79E-0001Z9-5M for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:44 -0500 Received: by mail-pg0-x241.google.com with SMTP id l4so2191126pgp.11 for ; Sat, 17 Feb 2018 10:23:44 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.41 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:41 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=myYWCaqty6C0BoKxDWeQ92jAbY6QN/No9ULf3IngOig=; b=ReHJfn/HYnod2UKEDjHnbrlMuVM5zs9PKdeuHtkUbAJA8lGGHgqAz5P7VoycbniWDc qQlQC512MaMDWlZIjRQrYFTq64XEgayUaTKkrF5tkmXwYBVz6+gcRupPIWSh8xiXpAdg vC4sOKm1yV8xvDD9gijmInGM5J+kVBmJwDqvY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=myYWCaqty6C0BoKxDWeQ92jAbY6QN/No9ULf3IngOig=; b=HNxfSWoP7/3MdsYO7j1LDuJhyOgurQMJrUNx1qW2sMQAGNt35vv2enmUybYgIAJUR9 thENj6OzOmrdO9xEmmwyBsWi6oOjwd9S2uJ5a4pMwLIkc5MESiaZAN/xVUov5Z38Wv19 uiBIqSKGnu3mceNqtbhmL0VFG4PVlC1btlyyRxJtgAoVWDNLOWqOS0Ytw2t6Oh0mqyE9 dV7E8zgLKZ1R3skkZ4vo/C2lwIA2FAzTmo9zvua5+YGlf+bmbT8tAw1qcpx1Ei9hJTcL v1p7mdyOdtWcuEC93cR/nTeDx0re5F4Rqm8sKMwvbDW3zs0aBQ10r5l+M6ljfzdxfWwT pyJw== X-Gm-Message-State: APf1xPB2SHs6jw/4mNEUXsUBeXU51vc7FyUaO+6sjGZ+MsjF9g0WSiUA 5Ncw++qsD3ek6RGA9arcEJM2z60TKpk= X-Google-Smtp-Source: AH8x225+yTq082y7+qFdv0dtYobES3w0lUxJ17RqP+mcCXItgHaHP8iYAgRy9AQKpD9HD1BGCVy6eg== X-Received: by 10.98.215.12 with SMTP id b12mr9773712pfh.149.1518891822822; Sat, 17 Feb 2018 10:23:42 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:26 -0800 Message-Id: <20180217182323.25885-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 10/67] target/arm: Implement SVE Integer Reduction Group
Cc: qemu-arm@nongnu.org

Excepting MOVPRFX, which isn't a reduction.  Presumably it is
placed within the group because of its encoding.

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 44 +++++++++++++++++++++
 target/arm/sve_helper.c    | 95 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-sve.c | 65 +++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 22 +++++++++++
 4 files changed, 224 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 5b82ba1501..6b6bbeb272 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -168,6 +168,50 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_orv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_eorv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_eorv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_eorv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_eorv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_andv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_andv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_andv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_andv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_saddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_saddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_saddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_uaddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uaddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uaddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uaddv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_smaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_smaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_smaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_smaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_umaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_umaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_umaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_umaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_sminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_sminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_sminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_sminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_uminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 26c177c2fd..18fb27805e 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -295,6 +295,99 @@ DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
 DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
 DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
 
+#undef DO_ZPZZ
+#undef DO_ZPZZ_D
+
+/* Two-operand reduction expander, controlled by a predicate.
+ * The difference between TYPERED and TYPERET has to do with
+ * sign-extension.  E.g. for SMAX, TYPERED must be signed,
+ * but TYPERET must be unsigned so that e.g. a 32-bit value
+ * is not sign-extended to the ABI uint64_t return type.
+ */
+/* ??? If we were to vectorize this by hand the reduction ordering
+ * would change.  For integer operands, this is perfectly fine.
+ */
+#define DO_VPZ(NAME, TYPEELT, TYPERED, TYPERET, H, INIT, OP) \
+uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc)     \
+{                                                            \
+    intptr_t i, opr_sz = simd_oprsz(desc);                   \
+    TYPERED ret = INIT;                                      \
+    for (i = 0; i < opr_sz; ) {                              \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));      \
+        do {                                                 \
+            if (pg & 1) {                                    \
+                TYPEELT nn = *(TYPEELT *)(vn + H(i));        \
+                ret = OP(ret, nn);                           \
+            }                                                \
+            i += sizeof(TYPEELT), pg >>= sizeof(TYPEELT);    \
+        } while (i & 15);                                    \
+    }                                                        \
+    return (TYPERET)ret;                                     \
+}
+
+#define DO_VPZ_D(NAME, TYPEE, TYPER, INIT, OP)             \
+uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc)   \
+{                                                          \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;             \
+    TYPEE *n = vn;                                         \
+    uint8_t *pg = vg;                                      \
+    TYPER ret = INIT;                                      \
+    for (i = 0; i < opr_sz; i += 1) {                      \
+        if (pg[H1(i)] & 1) {                               \
+            TYPEE nn = n[i];                               \
+            ret = OP(ret, nn);                             \
+        }                                                  \
+    }                                                      \
+    return ret;                                            \
+}
+
+DO_VPZ(sve_orv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_ORR)
+DO_VPZ(sve_orv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_ORR)
+DO_VPZ(sve_orv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_ORR)
+DO_VPZ_D(sve_orv_d, uint64_t, uint64_t, 0, DO_ORR)
+
+DO_VPZ(sve_eorv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_EOR)
+DO_VPZ(sve_eorv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_EOR)
+DO_VPZ(sve_eorv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_EOR)
+DO_VPZ_D(sve_eorv_d, uint64_t, uint64_t, 0, DO_EOR)
+
+DO_VPZ(sve_andv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_AND)
+DO_VPZ(sve_andv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_AND)
+DO_VPZ(sve_andv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_AND)
+DO_VPZ_D(sve_andv_d, uint64_t, uint64_t, -1, DO_AND)
+
+DO_VPZ(sve_saddv_b, int8_t, uint64_t, uint64_t, H1, 0, DO_ADD)
+DO_VPZ(sve_saddv_h, int16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD)
+DO_VPZ(sve_saddv_s, int32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD)
+
+DO_VPZ(sve_uaddv_b, uint8_t, uint64_t, uint64_t, H1, 0, DO_ADD)
+DO_VPZ(sve_uaddv_h, uint16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD)
+DO_VPZ(sve_uaddv_s, uint32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD)
+DO_VPZ_D(sve_uaddv_d, uint64_t, uint64_t, 0, DO_ADD)
+
+DO_VPZ(sve_smaxv_b, int8_t, int8_t, uint8_t, H1, INT8_MIN, DO_MAX)
+DO_VPZ(sve_smaxv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MIN, DO_MAX)
+DO_VPZ(sve_smaxv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MIN, DO_MAX)
+DO_VPZ_D(sve_smaxv_d, int64_t, int64_t, INT64_MIN, DO_MAX)
+
+DO_VPZ(sve_umaxv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_MAX)
+DO_VPZ(sve_umaxv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_MAX)
+DO_VPZ(sve_umaxv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_MAX)
+DO_VPZ_D(sve_umaxv_d, uint64_t, uint64_t, 0, DO_MAX)
+
+DO_VPZ(sve_sminv_b, int8_t, int8_t, uint8_t, H1, INT8_MAX, DO_MIN)
+DO_VPZ(sve_sminv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MAX, DO_MIN)
+DO_VPZ(sve_sminv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MAX, DO_MIN)
+DO_VPZ_D(sve_sminv_d, int64_t, int64_t, INT64_MAX, DO_MIN)
+
+DO_VPZ(sve_uminv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_MIN)
+DO_VPZ(sve_uminv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_MIN)
+DO_VPZ(sve_uminv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_MIN)
+DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
+
+#undef DO_VPZ
+#undef DO_VPZ_D
+
 #undef DO_AND
 #undef DO_ORR
 #undef DO_EOR
@@ -306,8 +399,6 @@ DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
 #undef DO_ABD
 #undef DO_MUL
 #undef DO_DIV
-#undef DO_ZPZZ
-#undef DO_ZPZZ_D
 
 /* Similar to the ARM LastActiveElement pseudocode function, except the
    result is multiplied by the element size.  This includes the not found
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 116002792a..49251a53c1 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -276,6 +276,71 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
 
 #undef DO_ZPZZ
 
+/*
+ *** SVE Integer Reduction Group
+ */
+
+typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32);
+static void do_vpz_ool(DisasContext *s, arg_rpr_esz *a,
+                       gen_helper_gvec_reduc *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr t_zn, t_pg;
+    TCGv_i32 desc;
+    TCGv_i64 temp;
+
+    if (fn == 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    temp = tcg_temp_new_i64();
+    t_zn = tcg_temp_new_ptr();
+    t_pg = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
+    tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
+    fn(temp, t_zn, t_pg, desc);
+    tcg_temp_free_ptr(t_zn);
+    tcg_temp_free_ptr(t_pg);
+    tcg_temp_free_i32(desc);
+
+    write_fp_dreg(s, a->rd, temp);
+    tcg_temp_free_i64(temp);
+}
+
+#define DO_VPZ(NAME, name)                                               \
+static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{                                                                        \
+    static gen_helper_gvec_reduc * const fns[4] = {                      \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h,            \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d,            \
+    };                                                                   \
+    do_vpz_ool(s, a, fns[a->esz]);                                       \
+}
+
+DO_VPZ(ORV, orv)
+DO_VPZ(ANDV, andv)
+DO_VPZ(EORV, eorv)
+
+DO_VPZ(UADDV, uaddv)
+DO_VPZ(SMAXV, smaxv)
+DO_VPZ(UMAXV, umaxv)
+DO_VPZ(SMINV, sminv)
+DO_VPZ(UMINV, uminv)
+
+static void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_reduc * const fns[4] = {
+        gen_helper_sve_saddv_b, gen_helper_sve_saddv_h,
+        gen_helper_sve_saddv_s, NULL
+    };
+    do_vpz_ool(s, a, fns[a->esz]);
+}
+
+#undef DO_VPZ
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 5fafe02575..b390d8f398 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -37,6 +37,7 @@
 &rr_esz rd rn esz
 &rri rd rn imm
 &rrr_esz rd rn rm esz
+&rpr_esz rd pg rn esz
 &rprr_s rd pg rn rm s
 &rprr_esz rd pg rn rm esz
 
@@ -64,6 +65,9 @@
 @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
     &rprr_esz rm=%reg_movprfx
 
+# One register operand, with governing predicate, vector element size
+@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
+
 # Basic Load/Store with 9-bit immediate offset
 @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
     &rri imm=%imm9_16_10
@@ -104,6 +108,24 @@ UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm
 SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR
 UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn # UDIVR
 
+### SVE Integer Reduction Group
+
+# SVE bitwise logical reduction (predicated)
+ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn
+EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn
+ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn
+
+# SVE integer add reduction (predicated)
+# Note that saddv requires size != 3.
+UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn
+SADDV 00000100 .. 000 000 001 ... ..... ..... @rd_pg_rn
+
+# SVE integer min/max reduction (predicated)
+SMAXV 00000100 .. 001 000 001 ... ..... ..... @rd_pg_rn
+UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn
+SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn
+UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:27 -0800
Message-Id: <20180217182323.25885-12-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 11/67] target/arm: Implement SVE bitwise shift by immediate (predicated)
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |  25 +++++
 target/arm/sve_helper.c    | 265 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 128 ++++++++++++++++++++++
 target/arm/sve.decode      |  29 ++++-
 4 files changed, 445 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 6b6bbeb272..b3c89579af 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -212,6 +212,31 @@ DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_clr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_lsl_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_asrd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 18fb27805e..b1a170fd70 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -92,6 +92,150 @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words)
     return flags;
 }
 
+/* Expand active predicate bits to bytes, for byte elements.
+ *  for (i = 0; i < 256; ++i) {
+ *      unsigned long m = 0;
+ *      for (j = 0; j < 8; j++) {
+ *          if ((i >> j) & 1) {
+ *              m |= 0xfful << (j << 3);
+ *          }
+ *      }
+ *      printf("0x%016lx,\n", m);
+ *  }
+ */
+static inline uint64_t expand_pred_b(uint8_t byte)
+{
+    static const uint64_t word[256] = {
+        0x0000000000000000, 0x00000000000000ff, 0x000000000000ff00,
+        0x000000000000ffff, 0x0000000000ff0000, 0x0000000000ff00ff,
+        0x0000000000ffff00, 0x0000000000ffffff, 0x00000000ff000000,
+        0x00000000ff0000ff, 0x00000000ff00ff00, 0x00000000ff00ffff,
+        0x00000000ffff0000, 0x00000000ffff00ff, 0x00000000ffffff00,
+        0x00000000ffffffff, 0x000000ff00000000, 0x000000ff000000ff,
+        0x000000ff0000ff00, 0x000000ff0000ffff, 0x000000ff00ff0000,
+        0x000000ff00ff00ff, 0x000000ff00ffff00, 0x000000ff00ffffff,
+        0x000000ffff000000, 0x000000ffff0000ff, 0x000000ffff00ff00,
+        0x000000ffff00ffff, 0x000000ffffff0000, 0x000000ffffff00ff,
+        0x000000ffffffff00, 0x000000ffffffffff, 0x0000ff0000000000,
+        0x0000ff00000000ff, 0x0000ff000000ff00, 0x0000ff000000ffff,
+        0x0000ff0000ff0000, 0x0000ff0000ff00ff, 0x0000ff0000ffff00,
+        0x0000ff0000ffffff, 0x0000ff00ff000000, 0x0000ff00ff0000ff,
+        0x0000ff00ff00ff00, 0x0000ff00ff00ffff, 0x0000ff00ffff0000,
+        0x0000ff00ffff00ff, 0x0000ff00ffffff00, 0x0000ff00ffffffff,
+        0x0000ffff00000000, 0x0000ffff000000ff, 0x0000ffff0000ff00,
+        0x0000ffff0000ffff, 0x0000ffff00ff0000, 0x0000ffff00ff00ff,
+        0x0000ffff00ffff00, 0x0000ffff00ffffff, 0x0000ffffff000000,
+        0x0000ffffff0000ff, 0x0000ffffff00ff00, 0x0000ffffff00ffff,
+        0x0000ffffffff0000, 0x0000ffffffff00ff, 0x0000ffffffffff00,
+        0x0000ffffffffffff, 0x00ff000000000000, 0x00ff0000000000ff,
+        0x00ff00000000ff00, 0x00ff00000000ffff, 0x00ff000000ff0000,
+        0x00ff000000ff00ff, 0x00ff000000ffff00, 0x00ff000000ffffff,
+        0x00ff0000ff000000, 0x00ff0000ff0000ff, 0x00ff0000ff00ff00,
+        0x00ff0000ff00ffff, 0x00ff0000ffff0000, 0x00ff0000ffff00ff,
+        0x00ff0000ffffff00, 0x00ff0000ffffffff, 0x00ff00ff00000000,
+        0x00ff00ff000000ff,
0x00ff00ff0000ff00, 0x00ff00ff0000ffff, + 0x00ff00ff00ff0000, 0x00ff00ff00ff00ff, 0x00ff00ff00ffff00, + 0x00ff00ff00ffffff, 0x00ff00ffff000000, 0x00ff00ffff0000ff, + 0x00ff00ffff00ff00, 0x00ff00ffff00ffff, 0x00ff00ffffff0000, + 0x00ff00ffffff00ff, 0x00ff00ffffffff00, 0x00ff00ffffffffff, + 0x00ffff0000000000, 0x00ffff00000000ff, 0x00ffff000000ff00, + 0x00ffff000000ffff, 0x00ffff0000ff0000, 0x00ffff0000ff00ff, + 0x00ffff0000ffff00, 0x00ffff0000ffffff, 0x00ffff00ff000000, + 0x00ffff00ff0000ff, 0x00ffff00ff00ff00, 0x00ffff00ff00ffff, + 0x00ffff00ffff0000, 0x00ffff00ffff00ff, 0x00ffff00ffffff00, + 0x00ffff00ffffffff, 0x00ffffff00000000, 0x00ffffff000000ff, + 0x00ffffff0000ff00, 0x00ffffff0000ffff, 0x00ffffff00ff0000, + 0x00ffffff00ff00ff, 0x00ffffff00ffff00, 0x00ffffff00ffffff, + 0x00ffffffff000000, 0x00ffffffff0000ff, 0x00ffffffff00ff00, + 0x00ffffffff00ffff, 0x00ffffffffff0000, 0x00ffffffffff00ff, + 0x00ffffffffffff00, 0x00ffffffffffffff, 0xff00000000000000, + 0xff000000000000ff, 0xff0000000000ff00, 0xff0000000000ffff, + 0xff00000000ff0000, 0xff00000000ff00ff, 0xff00000000ffff00, + 0xff00000000ffffff, 0xff000000ff000000, 0xff000000ff0000ff, + 0xff000000ff00ff00, 0xff000000ff00ffff, 0xff000000ffff0000, + 0xff000000ffff00ff, 0xff000000ffffff00, 0xff000000ffffffff, + 0xff0000ff00000000, 0xff0000ff000000ff, 0xff0000ff0000ff00, + 0xff0000ff0000ffff, 0xff0000ff00ff0000, 0xff0000ff00ff00ff, + 0xff0000ff00ffff00, 0xff0000ff00ffffff, 0xff0000ffff000000, + 0xff0000ffff0000ff, 0xff0000ffff00ff00, 0xff0000ffff00ffff, + 0xff0000ffffff0000, 0xff0000ffffff00ff, 0xff0000ffffffff00, + 0xff0000ffffffffff, 0xff00ff0000000000, 0xff00ff00000000ff, + 0xff00ff000000ff00, 0xff00ff000000ffff, 0xff00ff0000ff0000, + 0xff00ff0000ff00ff, 0xff00ff0000ffff00, 0xff00ff0000ffffff, + 0xff00ff00ff000000, 0xff00ff00ff0000ff, 0xff00ff00ff00ff00, + 0xff00ff00ff00ffff, 0xff00ff00ffff0000, 0xff00ff00ffff00ff, + 0xff00ff00ffffff00, 0xff00ff00ffffffff, 0xff00ffff00000000, + 0xff00ffff000000ff, 
0xff00ffff0000ff00, 0xff00ffff0000ffff, + 0xff00ffff00ff0000, 0xff00ffff00ff00ff, 0xff00ffff00ffff00, + 0xff00ffff00ffffff, 0xff00ffffff000000, 0xff00ffffff0000ff, + 0xff00ffffff00ff00, 0xff00ffffff00ffff, 0xff00ffffffff0000, + 0xff00ffffffff00ff, 0xff00ffffffffff00, 0xff00ffffffffffff, + 0xffff000000000000, 0xffff0000000000ff, 0xffff00000000ff00, + 0xffff00000000ffff, 0xffff000000ff0000, 0xffff000000ff00ff, + 0xffff000000ffff00, 0xffff000000ffffff, 0xffff0000ff000000, + 0xffff0000ff0000ff, 0xffff0000ff00ff00, 0xffff0000ff00ffff, + 0xffff0000ffff0000, 0xffff0000ffff00ff, 0xffff0000ffffff00, + 0xffff0000ffffffff, 0xffff00ff00000000, 0xffff00ff000000ff, + 0xffff00ff0000ff00, 0xffff00ff0000ffff, 0xffff00ff00ff0000, + 0xffff00ff00ff00ff, 0xffff00ff00ffff00, 0xffff00ff00ffffff, + 0xffff00ffff000000, 0xffff00ffff0000ff, 0xffff00ffff00ff00, + 0xffff00ffff00ffff, 0xffff00ffffff0000, 0xffff00ffffff00ff, + 0xffff00ffffffff00, 0xffff00ffffffffff, 0xffffff0000000000, + 0xffffff00000000ff, 0xffffff000000ff00, 0xffffff000000ffff, + 0xffffff0000ff0000, 0xffffff0000ff00ff, 0xffffff0000ffff00, + 0xffffff0000ffffff, 0xffffff00ff000000, 0xffffff00ff0000ff, + 0xffffff00ff00ff00, 0xffffff00ff00ffff, 0xffffff00ffff0000, + 0xffffff00ffff00ff, 0xffffff00ffffff00, 0xffffff00ffffffff, + 0xffffffff00000000, 0xffffffff000000ff, 0xffffffff0000ff00, + 0xffffffff0000ffff, 0xffffffff00ff0000, 0xffffffff00ff00ff, + 0xffffffff00ffff00, 0xffffffff00ffffff, 0xffffffffff000000, + 0xffffffffff0000ff, 0xffffffffff00ff00, 0xffffffffff00ffff, + 0xffffffffffff0000, 0xffffffffffff00ff, 0xffffffffffffff00, + 0xffffffffffffffff, + }; + return word[byte]; +} + +/* Similarly for half-word elements. 
+ *  for (i = 0; i < 256; ++i) {
+ *      unsigned long m = 0;
+ *      if (i & 0xaa) {
+ *          continue;
+ *      }
+ *      for (j = 0; j < 8; j += 2) {
+ *          if ((i >> j) & 1) {
+ *              m |= 0xfffful << (j << 3);
+ *          }
+ *      }
+ *      printf("[0x%x] = 0x%016lx,\n", i, m);
+ *  }
+ */
+static inline uint64_t expand_pred_h(uint8_t byte)
+{
+    static const uint64_t word[] = {
+        [0x01] = 0x000000000000ffff, [0x04] = 0x00000000ffff0000,
+        [0x05] = 0x00000000ffffffff, [0x10] = 0x0000ffff00000000,
+        [0x11] = 0x0000ffff0000ffff, [0x14] = 0x0000ffffffff0000,
+        [0x15] = 0x0000ffffffffffff, [0x40] = 0xffff000000000000,
+        [0x41] = 0xffff00000000ffff, [0x44] = 0xffff0000ffff0000,
+        [0x45] = 0xffff0000ffffffff, [0x50] = 0xffffffff00000000,
+        [0x51] = 0xffffffff0000ffff, [0x54] = 0xffffffffffff0000,
+        [0x55] = 0xffffffffffffffff,
+    };
+    return word[byte & 0x55];
+}
+
+/* Similarly for single word elements.  */
+static inline uint64_t expand_pred_s(uint8_t byte)
+{
+    static const uint64_t word[] = {
+        [0x01] = 0x00000000ffffffffull,
+        [0x10] = 0xffffffff00000000ull,
+        [0x11] = 0xffffffffffffffffull,
+    };
+    return word[byte & 0x11];
+}
+
 #define LOGICAL_PPPP(NAME, FUNC) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
 { \
@@ -483,3 +627,124 @@ uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc)
 
     return flags;
 }
+
+/* Store zero into every active element of Zd.  We will use this for two
+ * and three-operand predicated instructions for which logic dictates a
+ * zero result.  In particular, logical shift by element size, which is
+ * otherwise undefined on the host.
+ *
+ * For element sizes smaller than uint64_t, we use tables to expand
+ * the N bits of the controlling predicate to a byte mask, and clear
+ * those bytes.
+ */
+void HELPER(sve_clr_b)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_b(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_h)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_h(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_s)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_s(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        if (pg[H1(i)] & 1) {
+            d[i] = 0;
+        }
+    }
+}
+
+/* Three-operand expander, immediate operand, controlled by a predicate.
+ */
+#define DO_ZPZI(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    TYPE imm = simd_data(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn, imm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands.  */
+#define DO_ZPZI_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn; \
+    TYPE imm = simd_data(desc); \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE nn = n[i]; \
+            d[i] = OP(nn, imm); \
+        } \
+    } \
+}
+
+#define DO_SHR(N, M) (N >> M)
+#define DO_SHL(N, M) (N << M)
+
+/* Arithmetic shift right for division.  This rounds negative numbers
+   toward zero as per signed division.  Therefore before shifting,
+   when N is negative, add 2**M-1.  */
+#define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M)
+
+DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR)
+DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR)
+DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR)
+DO_ZPZI_D(sve_asr_zpzi_d, int64_t, DO_SHR)
+
+DO_ZPZI(sve_lsr_zpzi_b, uint8_t, H1, DO_SHR)
+DO_ZPZI(sve_lsr_zpzi_h, uint16_t, H1_2, DO_SHR)
+DO_ZPZI(sve_lsr_zpzi_s, uint32_t, H1_4, DO_SHR)
+DO_ZPZI_D(sve_lsr_zpzi_d, uint64_t, DO_SHR)
+
+DO_ZPZI(sve_lsl_zpzi_b, uint8_t, H1, DO_SHL)
+DO_ZPZI(sve_lsl_zpzi_h, uint16_t, H1_2, DO_SHL)
+DO_ZPZI(sve_lsl_zpzi_s, uint32_t, H1_4, DO_SHL)
+DO_ZPZI_D(sve_lsl_zpzi_d, uint64_t, DO_SHL)
+
+DO_ZPZI(sve_asrd_b, int8_t, H1, DO_ASRD)
+DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD)
+DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD)
+DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
+
+#undef DO_SHR
+#undef DO_SHL
+#undef DO_ASRD
+
+#undef DO_ZPZI
+#undef DO_ZPZI_D
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 49251a53c1..4218300960 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -37,6 +37,30 @@ typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
 
+/*
+ * Helpers for extracting complex instruction fields.
+ */
+
+/* See e.g. ASR (immediate, predicated).
+ * Returns -1 for unallocated encoding; diagnose later.
+ */
+static int tszimm_esz(int x)
+{
+    x >>= 3;  /* discard imm3 */
+    return 31 - clz32(x);
+}
+
+static int tszimm_shr(int x)
+{
+    return (16 << tszimm_esz(x)) - x;
+}
+
+/* See e.g. LSL (immediate, predicated).  */
+static int tszimm_shl(int x)
+{
+    return x - (8 << tszimm_esz(x));
+}
+
 /*
  * Include the generated decoder.
  */
@@ -341,6 +365,110 @@ static void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 
 #undef DO_VPZ
 
+/*
+ *** SVE Shift by Immediate - Predicated Group
+ */
+
+/* Store zero into every active element of Zd.  We will use this for two
+ * and three-operand predicated instructions for which logic dictates a
+ * zero result.
+ */
+static void do_clr_zp(DisasContext *s, int rd, int pg, int esz)
+{
+    static gen_helper_gvec_2 * const fns[4] = {
+        gen_helper_sve_clr_b, gen_helper_sve_clr_h,
+        gen_helper_sve_clr_s, gen_helper_sve_clr_d,
+    };
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
+                       pred_full_reg_offset(s, pg),
+                       vsz, vsz, 0, fns[esz]);
+}
+
+static void do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
+                        gen_helper_gvec_3 *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, a->imm, fn);
+}
+
+static void trans_ASR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_asr_zpzi_b, gen_helper_sve_asr_zpzi_h,
+        gen_helper_sve_asr_zpzi_s, gen_helper_sve_asr_zpzi_d,
+    };
+    if (a->esz < 0) {
+        /* Invalid tsz encoding -- see tszimm_esz. */
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.  For
+       arithmetic right-shift, it's the same as by one less.  */
+    a->imm = MIN(a->imm, (8 << a->esz) - 1);
+    do_zpzi_ool(s, a, fns[a->esz]);
+}
+
+static void trans_LSR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_lsr_zpzi_b, gen_helper_sve_lsr_zpzi_h,
+        gen_helper_sve_lsr_zpzi_s, gen_helper_sve_lsr_zpzi_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.
+       For logical shifts, it is a zeroing operation.  */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
+static void trans_LSL_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_lsl_zpzi_b, gen_helper_sve_lsl_zpzi_h,
+        gen_helper_sve_lsl_zpzi_s, gen_helper_sve_lsl_zpzi_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.
+       For logical shifts, it is a zeroing operation.  */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
+static void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_asrd_b, gen_helper_sve_asrd_h,
+        gen_helper_sve_asrd_s, gen_helper_sve_asrd_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.  For arithmetic
+       right shift for division, it is a zeroing operation.  */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index b390d8f398..c265ff9899 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -22,12 +22,20 @@
 ###########################################################################
 # Named fields.  These are primarily for disjoint fields.
 
+%imm6_22_5 22:1 5:5
 %imm9_16_10 16:s6 10:3
 %preg4_5 5:4
 
+# A combination of tsz:imm3 -- extract esize.
+%tszimm_esz 22:2 5:5 !function=tszimm_esz
+# A combination of tsz:imm3 -- extract (2 * esize) - (tsz:imm3)
+%tszimm_shr 22:2 5:5 !function=tszimm_shr
+# A combination of tsz:imm3 -- extract (tsz:imm3) - esize
+%tszimm_shl 22:2 5:5 !function=tszimm_shl
+
 # Either a copy of rd (at bit 0), or a different source
 # as propagated via the MOVPRFX instruction.
-%reg_movprfx 0:5
+%reg_movprfx 0:5
 
 ###########################################################################
 # Named attribute sets.  These are used to make nice(er) names
@@ -40,7 +48,7 @@
 &rpr_esz rd pg rn esz
 &rprr_s rd pg rn rm s
 &rprr_esz rd pg rn rm esz
-
+&rpri_esz rd pg rn imm esz
 &ptrue rd esz pat s
 
 ###########################################################################
@@ -68,6 +76,11 @@
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
 
+# Two register operand, one immediate operand, with predicate,
+# element size encoded as TSZHL.  User must fill in imm.
+@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \
+    &rpri_esz rn=%reg_movprfx esz=%tszimm_esz
+
 # Basic Load/Store with 9-bit immediate offset
 @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
     &rri imm=%imm9_16_10
@@ -126,6 +139,18 @@ UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn
 SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn
 UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn
 
+### SVE Shift by Immediate - Predicated Group
+
+# SVE bitwise shift by immediate (predicated)
+ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... \
+    @rdn_pg_tszimm imm=%tszimm_shr
+LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... \
+    @rdn_pg_tszimm imm=%tszimm_shr
+LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \
+    @rdn_pg_tszimm imm=%tszimm_shl
+ASRD 00000100 .. 000 100 100 ... .. ... ..... \
+    @rdn_pg_tszimm imm=%tszimm_shr
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:28 -0800
Message-Id: <20180217182323.25885-13-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 12/67] target/arm: Implement SVE bitwise shift by vector (predicated)
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 27 +++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 25 +++++++++++++++++++++++++
 target/arm/translate-sve.c |  4 ++++
 target/arm/sve.decode      |  8 ++++++++
 4 files changed, 64 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index b3c89579af..0cc02ee59e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -168,6 +168,33 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index b1a170fd70..6ea806d12b 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -439,6 +439,28 @@ DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
 DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
 DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
 
+/* Note that all bits of the shift are significant
+   and not modulo the element size.  */
+#define DO_ASR(N, M) (N >> MIN(M, sizeof(N) * 8 - 1))
+#define DO_LSR(N, M) (M < sizeof(N) * 8 ? N >> M : 0)
+#define DO_LSL(N, M) (M < sizeof(N) * 8 ? N << M : 0)
+
+DO_ZPZZ(sve_asr_zpzz_b, int8_t, H1, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_b, uint8_t, H1, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_b, uint8_t, H1, DO_LSL)
+
+DO_ZPZZ(sve_asr_zpzz_h, int16_t, H1_2, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_h, uint16_t, H1_2, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_h, uint16_t, H1_2, DO_LSL)
+
+DO_ZPZZ(sve_asr_zpzz_s, int32_t, H1_4, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_s, uint32_t, H1_4, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_s, uint32_t, H1_4, DO_LSL)
+
+DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR)
+DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR)
+DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL)
+
 #undef DO_ZPZZ
 #undef DO_ZPZZ_D
 
@@ -543,6 +565,9 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
 #undef DO_ABD
 #undef DO_MUL
 #undef DO_DIV
+#undef DO_ASR
+#undef DO_LSR
+#undef DO_LSL
 
 /* Similar to the ARM LastActiveElement pseudocode function, except the
    result is multiplied by the element size.  This includes the not found
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4218300960..08c56e55a0 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -282,6 +282,10 @@ DO_ZPZZ(MUL, mul)
 DO_ZPZZ(SMULH, smulh)
 DO_ZPZZ(UMULH, umulh)
 
+DO_ZPZZ(ASR, asr)
+DO_ZPZZ(LSR, lsr)
+DO_ZPZZ(LSL, lsl)
+
 void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
 {
     static gen_helper_gvec_4 * const fns[4] = {
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index c265ff9899..7ddff8e6bb 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -151,6 +151,14 @@ LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \
 ASRD 00000100 .. 000 100 100 ... .. ... ..... \
     @rdn_pg_tszimm imm=%tszimm_shr
 
+# SVE bitwise shift by vector (predicated)
+ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm
+LSR_zpzz 00000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm
+LSL_zpzz 00000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm
+ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn # ASRR
+LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn # LSRR
+LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # LSLR
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:45 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:29 -0800
Message-Id: <20180217182323.25885-14-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::242
Subject: [Qemu-devel] [PATCH v2 13/67] target/arm: Implement SVE bitwise shift by wide elements (predicated)
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 21 +++++++++++++++++++++
 target/arm/sve_helper.c    | 35 +++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 25 +++++++++++++++++++++++++
 target/arm/sve.decode      |  6 ++++++
 4 files changed, 87 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0cc02ee59e..d516580134 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -195,6 +195,27 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzw_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzw_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzw_s, TCG_CALL_NO_RWG,
+                   void, 
ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 6ea806d12b..3054b3cc99 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -464,6 +464,41 @@ DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL)
 #undef DO_ZPZZ
 #undef DO_ZPZZ_D
 
+/* Three-operand expander, controlled by a predicate, in which the
+ * third operand is "wide".  That is, for D = N op M, the same 64-bit
+ * value of M is used with all of the narrower values of N.
+ */
+#define DO_ZPZW(NAME, TYPE, TYPEW, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint8_t pg = *(uint8_t *)(vg + H1(i >> 3)); \
+        TYPEW mm = *(TYPEW *)(vm + i); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn, mm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 7); \
+    } \
+}
+
+DO_ZPZW(sve_asr_zpzw_b, int8_t, uint64_t, H1, DO_ASR)
+DO_ZPZW(sve_lsr_zpzw_b, uint8_t, uint64_t, H1, DO_LSR)
+DO_ZPZW(sve_lsl_zpzw_b, uint8_t, uint64_t, H1, DO_LSL)
+
+DO_ZPZW(sve_asr_zpzw_h, int16_t, uint64_t, H1_2, DO_ASR)
+DO_ZPZW(sve_lsr_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSR)
+DO_ZPZW(sve_lsl_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSL)
+
+DO_ZPZW(sve_asr_zpzw_s, int32_t, uint64_t, H1_4, DO_ASR)
+DO_ZPZW(sve_lsr_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSR)
+DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
+
+#undef DO_ZPZW
+
 /* Two-operand reduction expander, controlled by a predicate.
  * The difference between TYPERED and TYPERET has to do with
  * sign-extension.  E.g. 
for SMAX, TYPERED must be signed,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 08c56e55a0..35bcd9229d 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -473,6 +473,31 @@ static void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
     }
 }
 
+/*
+ *** SVE Bitwise Shift - Predicated Group
+ */
+
+#define DO_ZPZW(NAME, name) \
+static void trans_##NAME##_zpzw(DisasContext *s, arg_rprr_esz *a, \
+                                uint32_t insn) \
+{ \
+    static gen_helper_gvec_4 * const fns[3] = { \
+        gen_helper_sve_##name##_zpzw_b, gen_helper_sve_##name##_zpzw_h, \
+        gen_helper_sve_##name##_zpzw_s, \
+    }; \
+    if (a->esz >= 0 && a->esz < 3) { \
+        do_zpzz_ool(s, a, fns[a->esz]); \
+    } else { \
+        unallocated_encoding(s); \
+    } \
+}
+
+DO_ZPZW(ASR, asr)
+DO_ZPZW(LSR, lsr)
+DO_ZPZW(LSL, lsl)
+
+#undef DO_ZPZW
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 7ddff8e6bb..177f338fed 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -159,6 +159,12 @@ ASR_zpzz        00000100 .. 010 100 100 ... ..... .....   @rdm_pg_rn # ASRR
 LSR_zpzz        00000100 .. 010 101 100 ... ..... .....   @rdm_pg_rn # LSRR
 LSL_zpzz        00000100 .. 010 111 100 ... ..... .....   @rdm_pg_rn # LSLR
 
+# SVE bitwise shift by wide elements (predicated)
+# Note these require size != 3.
+ASR_zpzw        00000100 .. 011 000 100 ... ..... .....   @rdn_pg_rm
+LSR_zpzw        00000100 .. 011 001 100 ... ..... .....   @rdn_pg_rm
+LSL_zpzw        00000100 .. 011 011 100 ... ..... .....   
@rdn_pg_rm + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892985907299.22858457638074; Sat, 17 Feb 2018 10:43:05 -0800 (PST) Received: from localhost ([::1]:48198 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Rx-0007fA-1w for importer@patchew.org; Sat, 17 Feb 2018 13:43:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39798) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79M-0000Sb-PP for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79K-0001cw-Uq for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:52 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:42839) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79K-0001cW-Mb for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:50 -0500 Received: by mail-pf0-x244.google.com with SMTP id b25so592216pfd.9 for ; Sat, 17 Feb 2018 10:23:50 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.47 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:48 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=FjMo7EZbo9fN48yISfUcPdtxkrSfQEMIR1b8FIyft8c=; b=gPSWg/L2BgM8dScJdxcZ8TbIij2E91/NkzUYycCIBTItntP65/dQZw9rtLLIgZFtrK fZlB4fVpkPbQYDbP7ZvGvq6wdpH6lfMoMUE+XpyUZIwqRzzsul0nBoBzxztQgac/bWI+ xV23aIo/L7K7d9G/zKkkjHUcI0fGIK21d+ReE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=FjMo7EZbo9fN48yISfUcPdtxkrSfQEMIR1b8FIyft8c=; b=UqaZ7J4mViIZsAZG4wZxjI5VgyeX5LCnvg0uwceTNJMgNeTIvI83MQLzthewjC+klO z6z8SRWmmNDDzq6c9JFy9CZL71UrQ56/r1X296gOzoA9dsJO45cEyY4eb++4ZvYOzxpm XVMAOAT4ANVqYh5sBkHEJplhRi8ymV0b6sj8bbGaig8F+CdvqTrZMRR2LvJ2UQcXvt11 3IBf3LIqO7btRaNReUlh+8EWnpOCJf5YKwm/8K1xRIFAOg8KAG2Jvpg6tCHrvSLXdaLk 60gAfznJEonsKQzJrMB5Xu6ShOUjyPbuBDDWck2QA/Hap1RsNhDeDHnTDPC4oUS8xSEI 6BTA== X-Gm-Message-State: APf1xPDbTvjJ6JyDsSDpYRYDOWifSPrGnIAT0y3CKgTdk46HfPujqZME EfKQlKhzXaOqs8LQcyuwc7swXW/BunI= X-Google-Smtp-Source: AH8x227EScwK7fLYSOqspSqI/JfkC9RH9b8jFQv/t++vm5FGkhCTRpT+Z9lXVIhBZQ9y5yMCuPoBzw== X-Received: by 10.98.83.6 with SMTP id h6mr5098866pfb.174.1518891829303; Sat, 17 Feb 2018 10:23:49 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:30 -0800 Message-Id: <20180217182323.25885-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::244
Subject: [Qemu-devel] [PATCH v2 14/67] target/arm: Implement SVE Integer Arithmetic - Unary Predicated Group
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |  60 +++++++++++++++++++++
 target/arm/sve_helper.c    | 127 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 111 +++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  23 ++++++++
 4 files changed, 321 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index d516580134..11644125d1 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -285,6 +285,66 @@ DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_cls_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_clz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_b, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_cnot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_not_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_abs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_neg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 3054b3cc99..e11823a727 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -499,6 +499,133 @@ DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
 
 #undef DO_ZPZW
 
+/* Fully general two-operand expander, controlled by a predicate.
+ */
+#define DO_ZPZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands.  
*/
+#define DO_ZPZ_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE nn = n[i]; \
+            d[i] = OP(nn); \
+        } \
+    } \
+}
+
+#define DO_CLS_B(N)   (clrsb32(N) - 24)
+#define DO_CLS_H(N)   (clrsb32(N) - 16)
+
+DO_ZPZ(sve_cls_b, int8_t, H1, DO_CLS_B)
+DO_ZPZ(sve_cls_h, int16_t, H1_2, DO_CLS_H)
+DO_ZPZ(sve_cls_s, int32_t, H1_4, clrsb32)
+DO_ZPZ_D(sve_cls_d, int64_t, clrsb64)
+
+#define DO_CLZ_B(N)   (clz32(N) - 24)
+#define DO_CLZ_H(N)   (clz32(N) - 16)
+
+DO_ZPZ(sve_clz_b, uint8_t, H1, DO_CLZ_B)
+DO_ZPZ(sve_clz_h, uint16_t, H1_2, DO_CLZ_H)
+DO_ZPZ(sve_clz_s, uint32_t, H1_4, clz32)
+DO_ZPZ_D(sve_clz_d, uint64_t, clz64)
+
+DO_ZPZ(sve_cnt_zpz_b, uint8_t, H1, ctpop8)
+DO_ZPZ(sve_cnt_zpz_h, uint16_t, H1_2, ctpop16)
+DO_ZPZ(sve_cnt_zpz_s, uint32_t, H1_4, ctpop32)
+DO_ZPZ_D(sve_cnt_zpz_d, uint64_t, ctpop64)
+
+#define DO_CNOT(N)    (N == 0)
+
+DO_ZPZ(sve_cnot_b, uint8_t, H1, DO_CNOT)
+DO_ZPZ(sve_cnot_h, uint16_t, H1_2, DO_CNOT)
+DO_ZPZ(sve_cnot_s, uint32_t, H1_4, DO_CNOT)
+DO_ZPZ_D(sve_cnot_d, uint64_t, DO_CNOT)
+
+#define DO_FABS(N)    (N & ((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fabs_h, uint16_t, H1_2, DO_FABS)
+DO_ZPZ(sve_fabs_s, uint32_t, H1_4, DO_FABS)
+DO_ZPZ_D(sve_fabs_d, uint64_t, DO_FABS)
+
+#define DO_FNEG(N)    (N ^ ~((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fneg_h, uint16_t, H1_2, DO_FNEG)
+DO_ZPZ(sve_fneg_s, uint32_t, H1_4, DO_FNEG)
+DO_ZPZ_D(sve_fneg_d, uint64_t, DO_FNEG)
+
+#define DO_NOT(N)    (~N)
+
+DO_ZPZ(sve_not_zpz_b, uint8_t, H1, DO_NOT)
+DO_ZPZ(sve_not_zpz_h, uint16_t, H1_2, DO_NOT)
+DO_ZPZ(sve_not_zpz_s, uint32_t, H1_4, DO_NOT)
+DO_ZPZ_D(sve_not_zpz_d, uint64_t, DO_NOT)
+
+#define DO_SXTB(N)    ((int8_t)N)
+#define DO_SXTH(N)    ((int16_t)N)
+#define DO_SXTS(N)    ((int32_t)N)
+#define DO_UXTB(N)    ((uint8_t)N)
+#define DO_UXTH(N)    ((uint16_t)N)
+#define DO_UXTS(N)    
((uint32_t)N)
+
+DO_ZPZ(sve_sxtb_h, uint16_t, H1_2, DO_SXTB)
+DO_ZPZ(sve_sxtb_s, uint32_t, H1_4, DO_SXTB)
+DO_ZPZ(sve_sxth_s, uint32_t, H1_4, DO_SXTH)
+DO_ZPZ_D(sve_sxtb_d, uint64_t, DO_SXTB)
+DO_ZPZ_D(sve_sxth_d, uint64_t, DO_SXTH)
+DO_ZPZ_D(sve_sxtw_d, uint64_t, DO_SXTS)
+
+DO_ZPZ(sve_uxtb_h, uint16_t, H1_2, DO_UXTB)
+DO_ZPZ(sve_uxtb_s, uint32_t, H1_4, DO_UXTB)
+DO_ZPZ(sve_uxth_s, uint32_t, H1_4, DO_UXTH)
+DO_ZPZ_D(sve_uxtb_d, uint64_t, DO_UXTB)
+DO_ZPZ_D(sve_uxth_d, uint64_t, DO_UXTH)
+DO_ZPZ_D(sve_uxtw_d, uint64_t, DO_UXTS)
+
+#define DO_ABS(N)    (N < 0 ? -N : N)
+
+DO_ZPZ(sve_abs_b, int8_t, H1, DO_ABS)
+DO_ZPZ(sve_abs_h, int16_t, H1_2, DO_ABS)
+DO_ZPZ(sve_abs_s, int32_t, H1_4, DO_ABS)
+DO_ZPZ_D(sve_abs_d, int64_t, DO_ABS)
+
+#define DO_NEG(N)    (-N)
+
+DO_ZPZ(sve_neg_b, uint8_t, H1, DO_NEG)
+DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
+DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
+DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)
+
+#undef DO_CLS_B
+#undef DO_CLS_H
+#undef DO_CLZ_B
+#undef DO_CLZ_H
+#undef DO_CNOT
+#undef DO_FABS
+#undef DO_FNEG
+#undef DO_ABS
+#undef DO_NEG
+#undef DO_ZPZ
+#undef DO_ZPZ_D
+
 /* Two-operand reduction expander, controlled by a predicate.
  * The difference between TYPERED and TYPERET has to do with
  * sign-extension.  E.g. 
for SMAX, TYPERED must be signed,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 35bcd9229d..dce8ba8dc0 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -304,6 +304,117 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
 
 #undef DO_ZPZZ
 
+/*
+ *** SVE Integer Arithmetic - Unary Predicated Group
+ */
+
+static void do_zpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_3 *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    if (fn == NULL) {
+        unallocated_encoding(s);
+        return;
+    }
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, 0, fn);
+}
+
+#define DO_ZPZ(NAME, name) \
+static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_3 * const fns[4] = { \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
+    }; \
+    do_zpz_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZPZ(CLS, cls)
+DO_ZPZ(CLZ, clz)
+DO_ZPZ(CNT_zpz, cnt_zpz)
+DO_ZPZ(CNOT, cnot)
+DO_ZPZ(NOT_zpz, not_zpz)
+DO_ZPZ(ABS, abs)
+DO_ZPZ(NEG, neg)
+
+static void trans_FABS(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fabs_h,
+        gen_helper_sve_fabs_s,
+        gen_helper_sve_fabs_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_FNEG(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fneg_h,
+        gen_helper_sve_fneg_s,
+        gen_helper_sve_fneg_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_sxtb_h,
+        gen_helper_sve_sxtb_s,
+        gen_helper_sve_sxtb_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_UXTB(DisasContext *s, arg_rpr_esz *a, 
uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_uxtb_h,
+        gen_helper_sve_uxtb_s,
+        gen_helper_sve_uxtb_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_sxth_s,
+        gen_helper_sve_sxth_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_UXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_uxth_s,
+        gen_helper_sve_uxth_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_sxtw_d : NULL);
+}
+
+static void trans_UXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_uxtw_d : NULL);
+}
+
+#undef DO_ZPZ
+
 /*
  *** SVE Integer Reduction Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 177f338fed..b875501475 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -165,6 +165,29 @@ ASR_zpzw        00000100 .. 011 000 100 ... ..... .....   @rdn_pg_rm
 LSR_zpzw        00000100 .. 011 001 100 ... ..... .....   @rdn_pg_rm
 LSL_zpzw        00000100 .. 011 011 100 ... ..... .....   @rdn_pg_rm
 
+### SVE Integer Arithmetic - Unary Predicated Group
+
+# SVE unary bit operations (predicated)
+# Note esz != 0 for FABS and FNEG.
+CLS             00000100 .. 011 000 101 ... ..... .....   @rd_pg_rn
+CLZ             00000100 .. 011 001 101 ... ..... .....   @rd_pg_rn
+CNT_zpz         00000100 .. 011 010 101 ... ..... .....   @rd_pg_rn
+CNOT            00000100 .. 011 011 101 ... ..... .....   @rd_pg_rn
+NOT_zpz         00000100 .. 011 110 101 ... ..... .....   @rd_pg_rn
+FABS            00000100 .. 011 100 101 ... ..... .....   @rd_pg_rn
+FNEG            00000100 .. 011 101 101 ... ..... .....   @rd_pg_rn
+
+# SVE integer unary operations (predicated)
+# Note esz > original size for extensions.
+ABS             00000100 .. 
010 110 101 ... ..... ..... @rd_pg_rn +NEG 00000100 .. 010 111 101 ... ..... ..... @rd_pg_rn +SXTB 00000100 .. 010 000 101 ... ..... ..... @rd_pg_rn +UXTB 00000100 .. 010 001 101 ... ..... ..... @rd_pg_rn +SXTH 00000100 .. 010 010 101 ... ..... ..... @rd_pg_rn +UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn +SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn +UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892090578453.6880626140578; Sat, 17 Feb 2018 10:28:10 -0800 (PST) Received: from localhost ([::1]:48081 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7DV-0003u5-Og for importer@patchew.org; Sat, 17 Feb 2018 13:28:09 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39824) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79O-0000VJ-7U for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79M-0001dX-QY for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:54 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:44552) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79M-0001dF-JA for 
qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:52 -0500 Received: by mail-pl0-x243.google.com with SMTP id w21so3426484plp.11 for ; Sat, 17 Feb 2018 10:23:52 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=JrLpQVNkAD2ooQuftQEUHc/FeKYLqxEFlP7jwgpSp7M=; b=bVWirNV3nDHul8zG/O7qFSIG1shh6nqF2GybheeEh3srcQfFjTUGfmFIXxZ3H73PgV hYZUxie8fh0zmZ5aqkWTv8zv27iUNqu6UIAvS8nRc9LtoHmyTr6Sdpe5mIIjbg7QOkI7 wv0ltXw6AQpZCrmxpbDOSSlaNhfkHK9k9asEo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=JrLpQVNkAD2ooQuftQEUHc/FeKYLqxEFlP7jwgpSp7M=; b=SyJR5VOdM5Y622Sw0VE3ojVddKYotSPChg2DKOXTSD7o4Dk6raiLWeSLgkdFK6h6hu 3HvbjdNCDHa2DFqlZ5q89rxPumGJQkcz4Gy794I62LN3L7FrSK/qJYFJE8YuTlh81CpD Ru5HBBLea0ybsGzJZ/aGorBir5NESYCUyXrRnsN8SH/Jnh5kw0Tykm3XBTu6YVPD92eI cd/ILn774o1NcpKQYQ/fnBZNFb2+dUDFqP8DyWftUSzbhVZTcZx3t7AONve6Ggpp+J34 XZcjc5oy08ItlsEwfQ2FNXhF3sqaG+ldBLjDRvXFkljZYyWl7XRnV+Yj5lswuH4tPDfS WL3w== X-Gm-Message-State: APf1xPAwkpA8+cF61GIUQuk4dTKFEjp/HnNnrf6tcR/J3aw4m2SWXPY3 AYREkhuuBxj+TnekiIxLSbNihj6o6M0= X-Google-Smtp-Source: AH8x225aSSYoUuR0wsV+gMiF5FY6f3mB2aKoLmGSojLGN+z9bu/dXT4Z5ZYlGfOkUp/U8MicC2Cnog== X-Received: by 2002:a17:902:6941:: with SMTP id k1-v6mr4060710plt.86.1518891831346; Sat, 17 Feb 2018 10:23:51 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:31 -0800 Message-Id: <20180217182323.25885-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> 
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::243
Subject: [Qemu-devel] [PATCH v2 15/67] target/arm: Implement SVE Integer Multiply-Add Group
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 18 ++++++++++++++
 target/arm/sve_helper.c    | 58 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-sve.c | 31 +++++++++++++++++++++++++
 target/arm/sve.decode      | 17 ++++++++++++++
 4 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 11644125d1..b31d497f31 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -345,6 +345,24 @@ DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_mla_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_mls_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG,
+                   
void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index e11823a727..4b08a38ce8 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -932,6 +932,62 @@ DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
 #undef DO_SHR
 #undef DO_SHL
 #undef DO_ASRD
-
 #undef DO_ZPZI
 #undef DO_ZPZI_D
+
+/* Fully general four-operand expander, controlled by a predicate.
+ */
+#define DO_ZPZZZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \
+                  void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                TYPE mm = *(TYPE *)(vm + H(i)); \
+                TYPE aa = *(TYPE *)(va + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(aa, nn, mm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands.  
*/ +#define DO_ZPZZZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \ + void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *a =3D va, *n =3D vn, *m =3D vm; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE aa =3D a[i], nn =3D n[i], mm =3D m[i]; \ + d[i] =3D OP(aa, nn, mm); \ + } \ + } \ +} + +#define DO_MLA(A, N, M) (A + N * M) +#define DO_MLS(A, N, M) (A - N * M) + +DO_ZPZZZ(sve_mla_b, uint8_t, H1, DO_MLA) +DO_ZPZZZ(sve_mls_b, uint8_t, H1, DO_MLS) + +DO_ZPZZZ(sve_mla_h, uint16_t, H1_2, DO_MLA) +DO_ZPZZZ(sve_mls_h, uint16_t, H1_2, DO_MLS) + +DO_ZPZZZ(sve_mla_s, uint32_t, H1_4, DO_MLA) +DO_ZPZZZ(sve_mls_s, uint32_t, H1_4, DO_MLS) + +DO_ZPZZZ_D(sve_mla_d, uint64_t, DO_MLA) +DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS) + +#undef DO_MLA +#undef DO_MLS +#undef DO_ZPZZZ +#undef DO_ZPZZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dce8ba8dc0..b956d87636 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -609,6 +609,37 @@ DO_ZPZW(LSL, lsl) =20 #undef DO_ZPZW =20 +/* + *** SVE Integer Multiply-Add Group + */ + +static void do_zpzzz_ool(DisasContext *s, arg_rprrr_esz *a, + gen_helper_gvec_5 *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_5_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->ra), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZZZ(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn)= \ +{ \ + static gen_helper_gvec_5 * const fns[4] =3D { \ + gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \ + }; \ + do_zpzzz_ool(s, a, fns[a->esz]); \ +} + +DO_ZPZZZ(MLA, mla) +DO_ZPZZZ(MLS, mls) + +#undef DO_ZPZZZ + /* *** SVE Predicate Logical Operations Group */ diff --git 
a/target/arm/sve.decode b/target/arm/sve.decode index b875501475..68a1823b72 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -48,6 +48,7 @@ &rpr_esz rd pg rn esz &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz +&rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s =20 @@ -73,6 +74,12 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=3D%reg_movprfx =20 +# Three register operand, with governing predicate, vector element size +@rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ + &rprrr_esz ra=3D%reg_movprfx +@rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \ + &rprrr_esz rn=3D%reg_movprfx + # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz =20 @@ -188,6 +195,16 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_= rn SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn =20 +### SVE Integer Multiply-Add Group + +# SVE integer multiply-add writing addend (predicated) +MLA 00000100 .. 0 ..... 010 ... ..... ..... @rda_pg_rn_rm +MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg_rn_rm + +# SVE integer multiply-add writing multiplicand (predicated) +MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm # MAD +MLS 00000100 .. 0 ..... 111 ... ..... ..... 
@rdn_pg_ra_rm # MSB + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892298989162.2410717095156; Sat, 17 Feb 2018 10:31:38 -0800 (PST) Received: from localhost ([::1]:48103 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Gm-0006oF-Qf for importer@patchew.org; Sat, 17 Feb 2018 13:31:32 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39854) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79P-0000Ws-Jx for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:58 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79O-0001eU-EZ for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:55 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:45434) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79O-0001eA-AO for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:54 -0500 Received: by mail-pl0-x243.google.com with SMTP id p5so3428335plo.12 for ; Sat, 17 Feb 2018 10:23:54 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:52 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=p2OJeeH3GeRDZFkskP3/uYl1RujJsL25/T8puntaOkM=; b=ZTtPAfCmp4AENayfBA1q3SlbKh7Qlc+yadSVWnyaqj3P2yqCTtBS+Xh5SduMTIlvgs fmU8vd3dH/idKnbBAPdQVwV43wAICu+no7J+z+Wa37RajwsatY8KFafyIFg+tYWabHQk X4iHsfoRGQvsjyqrMLmD9KM/0COyQvuN16/m8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=p2OJeeH3GeRDZFkskP3/uYl1RujJsL25/T8puntaOkM=; b=ftjpcoPfgH8k7SjK+GzrDniyU70sjKImCtb0ue+syu+Uq9Ea4RpMWT4usIgB2KpjJf fHCKG3R9Giaki3r/sbtEQMvUSHlKxIZbq96+th7m/Nnxcs8y4Wj8G/MXqZoYkq/lS4QH VU2cwvBglE4bwYvnDntP0thrLxxihq3No15facNyhS7SIShp/ROjYic4yRjXuPS2mdWd 1nxsr1smFsGvgXkQ48++JJWK/FSa6IfY2KGMfKche7wQYZDtjl6M2henb/iSns4doMAy svjiSQmmRIBzkgKJgZ0PvsgRAwamZekiytrZZdr8pcjHmN2qOioHexSj8P8Ae4mCII6T EuKg== X-Gm-Message-State: APf1xPBo6M/SPVP9a7Vy6s3MpOXWAj3aSiaSZRrwkkWbxcWdwvcoPk8E knjPz9tdoYEczImuEwpOHgKuXlcTLAQ= X-Google-Smtp-Source: AH8x2254Zx6s7ZXjTy0s+dKIK/73yKyhSUlwYL46IxVX3XuXhZjfZtSVOgnhVnj4jRyvzaNTu9Cf1A== X-Received: by 2002:a17:902:9686:: with SMTP id n6-v6mr9302113plp.333.1518891833069; Sat, 17 Feb 2018 10:23:53 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:32 -0800 Message-Id: <20180217182323.25885-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 16/67] target/arm: Implement SVE Integer Arithmetic - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 41 ++++++++++++++++++++++++++++++++++++++--- target/arm/sve.decode | 13 +++++++++++++ 2 files changed, 51 insertions(+), 3 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b956d87636..8baec6c674 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -235,6 +235,40 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz= *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } =20 +/* + *** SVE Integer Arithmetic - Unpredicated Group + */ + +static void trans_ADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_add, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_sub, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_ssadd, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_sssub, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_UQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_usadd, a->esz, a->rd, a->rn, a->rm); +} + 
+static void trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_ussub, a->esz, a->rd, a->rn, a->rm); +} + /* *** SVE Integer Arithmetic - Binary Predicated Group */ @@ -254,7 +288,8 @@ static void do_zpzz_ool(DisasContext *s, arg_rprr_esz *= a, gen_helper_gvec_4 *fn) } =20 #define DO_ZPZZ(NAME, name) \ -void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +static void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ { \ static gen_helper_gvec_4 * const fns[4] =3D { = \ gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \ @@ -286,7 +321,7 @@ DO_ZPZZ(ASR, asr) DO_ZPZZ(LSR, lsr) DO_ZPZZ(LSL, lsl) =20 -void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +static void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t ins= n) { static gen_helper_gvec_4 * const fns[4] =3D { NULL, NULL, gen_helper_sve_sdiv_zpzz_s, gen_helper_sve_sdiv_zpzz_d @@ -294,7 +329,7 @@ void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, = uint32_t insn) do_zpzz_ool(s, a, fns[a->esz]); } =20 -void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +static void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t ins= n) { static gen_helper_gvec_4 * const fns[4] =3D { NULL, NULL, gen_helper_sve_udiv_zpzz_s, gen_helper_sve_udiv_zpzz_d diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 68a1823b72..b40d7dc9a2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -68,6 +68,9 @@ # Three prediate operand, with governing predicate, flag setting @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s =20 +# Three operand, vector element size +@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz + # Two register operand, with governing predicate, vector element size @rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ &rprr_esz rn=3D%reg_movprfx @@ -205,6 +208,16 @@ MLS 00000100 .. 0 ..... 011 ... ..... ..... 
@rda_pg= _rn_rm MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm # MAD MLS 00000100 .. 0 ..... 111 ... ..... ..... @rdn_pg_ra_rm # MSB =20 +### SVE Integer Arithmetic - Unpredicated Group + +# SVE integer add/subtract vectors (unpredicated) +ADD_zzz 00000100 .. 1 ..... 000 000 ..... ..... @rd_rn_rm +SUB_zzz 00000100 .. 1 ..... 000 001 ..... ..... @rd_rn_rm +SQADD_zzz 00000100 .. 1 ..... 000 100 ..... ..... @rd_rn_rm +UQADD_zzz 00000100 .. 1 ..... 000 101 ..... ..... @rd_rn_rm +SQSUB_zzz 00000100 .. 1 ..... 000 110 ..... ..... @rd_rn_rm +UQSUB_zzz 00000100 .. 1 ..... 000 111 ..... ..... @rd_rn_rm + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893211981472.8664134271088; Sat, 17 Feb 2018 10:46:51 -0800 (PST) Received: from localhost ([::1]:48237 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Vb-0002Q9-4c for importer@patchew.org; Sat, 17 Feb 2018 13:46:51 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39897) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79S-0000ad-J5 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79Q-0001fD-Av for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:58 -0500 Received: from 
mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:46293) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79P-0001el-Ny for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:56 -0500 Received: by mail-pl0-x243.google.com with SMTP id x19so3428069plr.13 for ; Sat, 17 Feb 2018 10:23:55 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=STsmc4QHRnZfTt7wI7/HBaUPP3ioNeLwXu1h7b1S+p8=; b=hn1ZBSBu3dRtay/gpG4FfhIwDAwA1h0GdGOtLoITHFW+FmfwRPEhJrj3ezAM/ngfpF tQ7+P4vMkwgfVF2Ri9cg+4sFTQfXfit88ESvQABOC4oZwv7gJqZcuKOaUuNRtpzuRaZp jNzlDSGw39sU2xZMukOQ9k8ZMzhC9vy5WzdtE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=STsmc4QHRnZfTt7wI7/HBaUPP3ioNeLwXu1h7b1S+p8=; b=GRhZACHdiRAP/KJnROBULj3+CuEKXq9lxQe9nsbbTMiokhqMJ32743E8I/+oNDvQ9h KkNFNa+OfuDJVE9ExDi+xru0sDhb8y/EkCFqQVd/xAPoigmCHIb27I4rQgpxCVTbEHiJ aYWtF3mNcHurv5t4B7/gfrCx2cD7oIVpBPv1szYPIWOuTXI2K3B9KE4fuQCQR8VuVnq7 7JpqSQl0udz8aiq99wC4EaX3txPHABHl8eocKAj1EuH8tv0C5M2bec+MiLzazf/YTG/i MBKeGiDNYBYiVIub13JfVrojo5NchAmHCP8l0urf48Ux4XbvcI7CTjq5VRZ7khcZqx7H 6Nyg== X-Gm-Message-State: APf1xPAj/MXPbmyIiqq3vExSkidhGs/5040KuXJY813EBu+Xt6nOUyKo E4Eyt8ktnoZmHV3nrgnA9tN189puEAg= X-Google-Smtp-Source: AH8x225/cKx+aBck3hY+pi/TNWm6jiwvehDcv/wcIAF1B1X1JTAaoHjGlNgXtxjAxniHXPYpInD1BA== X-Received: by 2002:a17:902:33a5:: with SMTP id b34-v6mr2253184plc.263.1518891834489; Sat, 17 Feb 2018 10:23:54 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:33 -0800 Message-Id: 
<20180217182323.25885-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 17/67] target/arm: Implement SVE Index Generation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 5 ++++ target/arm/sve_helper.c | 40 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 67 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 14 ++++++++++ 4 files changed, 126 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b31d497f31..2a2dbe98dd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -363,6 +363,11 @@ DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_index_b, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) 
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4b08a38ce8..950012e70a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -991,3 +991,43 @@ DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS) #undef DO_MLS #undef DO_ZPZZZ #undef DO_ZPZZZ_D + +void HELPER(sve_index_b)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc); + uint8_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H1(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_h)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H2(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_s)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H4(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_d)(void *vd, uint64_t start, + uint64_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D start + i * incr; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 8baec6c674..773f0bfded 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -675,6 +675,73 @@ DO_ZPZZZ(MLS, mls) =20 #undef DO_ZPZZZ =20 +/* + *** SVE Index Generation Group + */ + +static void do_index(DisasContext *s, int esz, int rd, + TCGv_i64 start, TCGv_i64 incr) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd)); + if (esz =3D=3D 3) { + gen_helper_sve_index_d(t_zd, start, incr, desc); + } else { + typedef void index_fn(TCGv_ptr, TCGv_i32, 
TCGv_i32, TCGv_i32); + static index_fn * const fns[3] =3D { + gen_helper_sve_index_b, + gen_helper_sve_index_h, + gen_helper_sve_index_s, + }; + TCGv_i32 s32 =3D tcg_temp_new_i32(); + TCGv_i32 i32 =3D tcg_temp_new_i32(); + + tcg_gen_extrl_i64_i32(s32, start); + tcg_gen_extrl_i64_i32(i32, incr); + fns[esz](t_zd, s32, i32, desc); + + tcg_temp_free_i32(s32); + tcg_temp_free_i32(i32); + } + tcg_temp_free_ptr(t_zd); + tcg_temp_free_i32(desc); +} + +static void trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a, uint32_t insn) +{ + TCGv_i64 start =3D tcg_const_i64(a->imm1); + TCGv_i64 incr =3D tcg_const_i64(a->imm2); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); + tcg_temp_free_i64(incr); +} + +static void trans_INDEX_ir(DisasContext *s, arg_INDEX_ir *a, uint32_t insn) +{ + TCGv_i64 start =3D tcg_const_i64(a->imm); + TCGv_i64 incr =3D cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); +} + +static void trans_INDEX_ri(DisasContext *s, arg_INDEX_ri *a, uint32_t insn) +{ + TCGv_i64 start =3D cpu_reg(s, a->rn); + TCGv_i64 incr =3D tcg_const_i64(a->imm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(incr); +} + +static void trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn) +{ + TCGv_i64 start =3D cpu_reg(s, a->rn); + TCGv_i64 incr =3D cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b40d7dc9a2..d7b078e92f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -226,6 +226,20 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_= rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... 
@rd_rn_rm_e0 =20 +### SVE Index Generation Group + +# SVE index generation (immediate start, immediate increment) +INDEX_ii 00000100 esz:2 1 imm2:s5 010000 imm1:s5 rd:5 + +# SVE index generation (immediate start, register increment) +INDEX_ir 00000100 esz:2 1 rm:5 010010 imm:s5 rd:5 + +# SVE index generation (register start, immediate increment) +INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 + +# SVE index generation (register start, register increment) +INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892718098541.8057516150462; Sat, 17 Feb 2018 10:38:38 -0800 (PST) Received: from localhost ([::1]:48169 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7NX-0003w0-PB for importer@patchew.org; Sat, 17 Feb 2018 13:38:31 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39898) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79S-0000ae-J7 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79R-0001g3-BH for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:58 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:41240) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79R-0001fc-5R for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:57 -0500 Received: by mail-pf0-x244.google.com with SMTP id 68so590897pfj.8 for ; Sat, 17 Feb 2018 10:23:57 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.54 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=G5CwiAWO6kX58omnBN3XeNxT6S6emdMtL5CoxFcf+0c=; b=S7PQxOEVqNVgXFHdoZSxAFe23u3NriC/cohTEYvROBZ+/NrXFMncQ0OQ9gut8Sd2D9 DB2+EClwq63UW+H9OxY12FHpTc3fH+XK5bAHZI9byh5/SMu/7KTFpHVxnZA1oywO8hr0 u9IET6oge02IS6bHa76aajzbLufU1GpUhpvYo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=G5CwiAWO6kX58omnBN3XeNxT6S6emdMtL5CoxFcf+0c=; b=frVkF+Zkld73C4qbonxPgnXIgMPp7UwQU5b2pcdHx3AkOIKM7E1KzExDVbbNPAzcz/ sgmr5iPzA4ymihJ4o1d1vF/Khhbc9yz74YLCntV0CA6gVH6EsgiOnxDDfEqUouuq35HW uGeU1vtex8oQHSZqQMnyaz+Yjouc+wiaQTh7gwpz3A44qX8V+uAWi5NNmelzzZsUgTEx ivDkHNnDI9M6BEPOI0dh9OV33DpNy26EJjOQ2Ucl3DW8RMzmldTPCMZvwr8jNK2xCekZ Lmh6jz9p1XFLzFR7WaVq96fSdXZw8085DPJPEBUrJIKtvs8hR44ZF94bRi/o4TJSrlKr Rxpw== X-Gm-Message-State: APf1xPAVxSzGxXFULCNkSKEGTI1LsUwXBOmuPISkVISAbSu6JbRLoD4s as/yNs6tYYYk1W2ptngecaUYor01mXU= X-Google-Smtp-Source: AH8x225fR7MTDzeFqHz6p6D+emhGcJYmWadIY7qlpRgv71VxELrk1kfSx31Kkn0V13LTGtADMOQw/A== X-Received: by 10.98.58.129 with SMTP id v1mr1979921pfj.203.1518891835946; Sat, 17 Feb 2018 10:23:55 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:34 -0800 Message-Id: <20180217182323.25885-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> 
References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 18/67] target/arm: Implement SVE Stack Allocation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 24 ++++++++++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ 2 files changed, 36 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 773f0bfded..4a38020c8a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -742,6 +742,30 @@ static void trans_INDEX_rr(DisasContext *s, arg_INDEX_= rr *a, uint32_t insn) do_index(s, a->esz, a->rd, start, incr); } =20 +/* + *** SVE Stack Allocation Group + */ + +static void trans_ADDVL(DisasContext *s, arg_ADDVL *a, uint32_t insn) +{ + TCGv_i64 rd =3D cpu_reg_sp(s, a->rd); + TCGv_i64 rn =3D cpu_reg_sp(s, a->rn); + tcg_gen_addi_i64(rd, rn, a->imm * vec_full_reg_size(s)); +} + +static void trans_ADDPL(DisasContext *s, arg_ADDPL *a, uint32_t insn) +{ + TCGv_i64 rd =3D cpu_reg_sp(s, a->rd); + TCGv_i64 rn =3D cpu_reg_sp(s, a->rn); + tcg_gen_addi_i64(rd, rn, a->imm * pred_full_reg_size(s)); +} + +static void trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg(s, a->rd); + tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode 
b/target/arm/sve.decode index d7b078e92f..0b47869dcd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -86,6 +86,9 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz =20 +# Two register operands with a 6-bit signed immediate. +@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri + # Two register operand, one immediate operand, with predicate, # element size encoded as TSZHL. User must fill in imm. @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ @@ -240,6 +243,15 @@ INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 # SVE index generation (register start, register increment) INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm =20 +### SVE Stack Allocation Group + +# SVE stack frame adjustment +ADDVL 00000100 001 ..... 01010 ...... ..... @rd_rn_i6 +ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6 + +# SVE stack frame size +RDVL 00000100 101 11111 01010 imm:s6 rd:5 + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893181733135.5760606031438; Sat, 17 Feb 2018 10:46:21 -0800 (PST) Received: from localhost ([::1]:48236 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7V6-00022D-OZ for importer@patchew.org; Sat, 17 Feb 2018 13:46:20 -0500 
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39935) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79U-0000cC-KU for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79T-0001gs-3T for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:00 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:44352) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79S-0001gM-Rz for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:59 -0500 Received: by mail-pg0-x243.google.com with SMTP id l4so2191244pgp.11 for ; Sat, 17 Feb 2018 10:23:58 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.56 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=flMxh8Nvg2soBYq5gB6IVe/v9x76mKBYlcXjLqsvSu8=; b=SmON7BiB1Oa4kH7rcwQy2TmkZau4ONwPUQS/n8dEP1UstUk+wmYz9R70otlU1K6/xo tfnTG63eQREzoSNSGuffcw2zweJENbZbXzw5j0k4XrEb26pZS+OVOCUGfVMMgOv1WMci 1WI+w7gSQIROucxLE2y77dhKez/w1ecelyiZM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=flMxh8Nvg2soBYq5gB6IVe/v9x76mKBYlcXjLqsvSu8=; b=HSBdK0On+ahWE+BfOewzRRvP0fV36BL8qKhlDmiIN+JdUNlsBWLKRBchxqxnZRa53v bPL2VXY1D6ibBdZFwsWWpi4tWWwqRF8gyIjWveyu8Veo78qoGjHz2Api1r5BoSjGHxSx cSLaiWkn6Dkfsdi1pBLVFSOl0KkpWn4phTJDllC7+9+fpbbZkSFSiv99XRmW97vqoBOL o/S5IfkMmnM9s+/+5sFN4oj9oj/mz69bgEoBRmPtoYDG/Y3dizGLr3UL42Xu39U7jMz4 tYUdJ10fUODIwnEeQGW6Y+rA3z1qaad5TI2tpL68m68NEImQXB5pe+rt4I4GfxLUmmpd E/9A== X-Gm-Message-State: APf1xPAo1H5FBvkGuf/AcRMX4vG+US7I3SFFKu3HwSMkzrvmjoFIpzC0 lt9hiuI7goNWz1bu+9JYB+LJSi/4H3Q= 
X-Google-Smtp-Source: AH8x2274+Azxc84h6CqkBvw5gwwCIMz+vfQXtIvmz4g4WXh1TPnF/N2Wb2Ii/6Tt9AcVOSeKxZOMWg== X-Received: by 10.98.38.134 with SMTP id m128mr9653690pfm.154.1518891837624; Sat, 17 Feb 2018 10:23:57 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:35 -0800 Message-Id: <20180217182323.25885-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 19/67] target/arm: Implement SVE Bitwise Shift - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 12 +++++++ target/arm/sve_helper.c | 30 +++++++++++++++++ target/arm/translate-sve.c | 81 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 26 +++++++++++++++ 4 files changed, 149 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2a2dbe98dd..00e3cd48bb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -368,6 +368,18 @@ DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void,= ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) =20 +DEF_HELPER_FLAGS_4(sve_asr_zzw_b, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_asr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_asr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_lsr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 950012e70a..4c6e2713fa 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -614,6 +614,36 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) =20 +/* Three-operand expander, unpredicated, in which the third operand is "wi= de". 
+ */ +#define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + for (i =3D 0; i < opr_sz; ) { \ + TYPEW mm =3D *(TYPEW *)(vm + i); \ + do { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, mm); \ + i +=3D sizeof(TYPE); \ + } while (i & 7); \ + } \ +} + +DO_ZZW(sve_asr_zzw_b, int8_t, uint64_t, H1, DO_ASR) +DO_ZZW(sve_lsr_zzw_b, uint8_t, uint64_t, H1, DO_LSR) +DO_ZZW(sve_lsl_zzw_b, uint8_t, uint64_t, H1, DO_LSL) + +DO_ZZW(sve_asr_zzw_h, int16_t, uint64_t, H1_2, DO_ASR) +DO_ZZW(sve_lsr_zzw_h, uint16_t, uint64_t, H1_2, DO_LSR) +DO_ZZW(sve_lsl_zzw_h, uint16_t, uint64_t, H1_2, DO_LSL) + +DO_ZZW(sve_asr_zzw_s, int32_t, uint64_t, H1_4, DO_ASR) +DO_ZZW(sve_lsr_zzw_s, uint32_t, uint64_t, H1_4, DO_LSR) +DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) + +#undef DO_ZZW + #undef DO_CLS_B #undef DO_CLS_H #undef DO_CLZ_B diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4a38020c8a..43e9f1ad08 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -130,6 +130,13 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } =20 +/* Initialize a Zreg with replications of a 64-bit immediate. */ +static void do_dupi_z(DisasContext *s, int rd, uint64_t word) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_dup64i(vec_full_reg_offset(s, rd), vsz, vsz, word); +} + /* Invoke a vector expander on two Pregs. 
*/ static void do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -644,6 +651,80 @@ DO_ZPZW(LSL, lsl) =20 #undef DO_ZPZW =20 +/* + *** SVE Bitwise Shift - Unpredicated Group + */ + +static void do_shift_imm(DisasContext *s, arg_rri_esz *a, bool asr, + void (*gvec_fn)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + unsigned vsz =3D vec_full_reg_size(s); + if (a->esz < 0) { + /* Invalid tsz encoding -- see tszimm_esz. */ + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. For + arithmetic right-shift, it's the same as by one less. + Otherwise it is a zeroing operation. */ + if (a->imm >=3D 8 << a->esz) { + if (asr) { + a->imm =3D (8 << a->esz) - 1; + } else { + do_dupi_z(s, a->rd, 0); + return; + } + } + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); +} + +static void trans_ASR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, true, tcg_gen_gvec_sari); +} + +static void trans_LSR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, false, tcg_gen_gvec_shri); +} + +static void trans_LSL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, false, tcg_gen_gvec_shli); +} + +static void do_zzw_ool(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 = *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fn); +} + +#define DO_ZZW(NAME, name) \ +static void trans_##NAME##_zzw(DisasContext *s, arg_rrr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_3 * const fns[4] =3D { = \ + gen_helper_sve_##name##_zzw_b, gen_helper_sve_##name##_zzw_h, \ + gen_helper_sve_##name##_zzw_s, NULL \ + }; \ + do_zzw_ool(s, a, fns[a->esz]); \ +} + +DO_ZZW(ASR, asr) +DO_ZZW(LSR, lsr) 
+DO_ZZW(LSL, lsl) + +#undef DO_ZZW + /* *** SVE Integer Multiply-Add Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0b47869dcd..f71ea1b60d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -33,6 +33,11 @@ # A combination of tsz:imm3 -- extract (tsz:imm3) - esize %tszimm_shl 22:2 5:5 !function=3Dtszimm_shl =20 +# Similarly for the tszh/tszl pair at 22/16 for zzi +%tszimm16_esz 22:2 16:5 !function=3Dtszimm_esz +%tszimm16_shr 22:2 16:5 !function=3Dtszimm_shr +%tszimm16_shl 22:2 16:5 !function=3Dtszimm_shl + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -44,6 +49,7 @@ =20 &rr_esz rd rn esz &rri rd rn imm +&rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_s rd pg rn rm s @@ -94,6 +100,10 @@ @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ &rpri_esz rn=3D%reg_movprfx esz=3D%tszimm_esz =20 +# Similarly without predicate. +@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=3D%tszimm16_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=3D%imm9_16_10 @@ -252,6 +262,22 @@ ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6 # SVE stack frame size RDVL 00000100 101 11111 01010 imm:s6 rd:5 =20 +### SVE Bitwise Shift - Unpredicated Group + +# SVE bitwise shift by immediate (unpredicated) +ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \ + @rd_rn_tszimm imm=3D%tszimm16_shr +LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \ + @rd_rn_tszimm imm=3D%tszimm16_shr +LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \ + @rd_rn_tszimm imm=3D%tszimm16_shl + +# SVE bitwise shift by wide elements (unpredicated) +# Note esz !=3D 3 +ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm +LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm +LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... 
@rd_rn_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518892920083716.4813226051023; Sat, 17 Feb 2018 10:42:00 -0800 (PST) Received: from localhost ([::1]:48195 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Qq-0006lp-S3 for importer@patchew.org; Sat, 17 Feb 2018 13:41:56 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39957) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79V-0000cH-Nv for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79U-0001hY-Hj for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:01 -0500 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:47051) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79U-0001hC-AC for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:00 -0500 Received: by mail-pf0-x242.google.com with SMTP id z24so587774pfh.13 for ; Sat, 17 Feb 2018 10:24:00 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=qUivDrBs/deC0vPF+CXEPEAAPcV9POzH7MV2Tc4RKIU=; b=FjCGSzOrl59WAmxuTF3fp9cSnWOvbvUgfkBEWUwBwkG9lu8QMIiknQeTGNI/UBvJL3 2MoyEOQphAC3KlOTqLXCYkJhLwyiVlZ7vMcC8trb2Wb3jChCrAP4w2TefmACPQ5aSrTK 4bZvwBG+audRb0NOqQmMDYm/3BdsDWDcLoQsM= From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:36 -0800 Message-Id: <20180217182323.25885-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v2 20/67] target/arm: Implement SVE Compute Vector Address Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 5 +++++ target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 33 +++++++++++++++++++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ 4 files changed, 90 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 00e3cd48bb..5280d375f9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -380,6 +380,11 @@ DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, voi= d, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) =20 +DEF_HELPER_FLAGS_4(sve_adr_p32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4c6e2713fa..a290a58c02 100644 --- 
a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1061,3 +1061,43 @@ void HELPER(sve_index_d)(void *vd, uint64_t start, d[i] =3D start + i * incr; } } + +void HELPER(sve_adr_p32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t sh =3D simd_data(desc); + uint32_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_p64)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_s32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + ((uint64_t)(int32_t)m[i] << sh); + } +} + +void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + ((uint64_t)(uint32_t)m[i] << sh); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 43e9f1ad08..34cc8c2773 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -847,6 +847,39 @@ static void trans_RDVL(DisasContext *s, arg_RDVL *a, u= int32_t insn) tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); } =20 +/* + *** SVE Compute Vector Address Group + */ + +static void do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->imm, fn); +} + +static void 
trans_ADR_p32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p32); +} + +static void trans_ADR_p64(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p64); +} + +static void trans_ADR_s32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_s32); +} + +static void trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_u32); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f71ea1b60d..6ec1f94832 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -49,6 +49,7 @@ =20 &rr_esz rd rn esz &rri rd rn imm +&rrri rd rn rm imm &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz @@ -77,6 +78,9 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz =20 +# Three operand with "memory" size, aka immediate left shift +@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri + # Two register operand, with governing predicate, vector element size @rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ &rprr_esz rn=3D%reg_movprfx @@ -278,6 +282,14 @@ ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_= rn_rm LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm =20 +### SVE Compute Vector Address Group + +# SVE vector address generation +ADR_s32 00000100 00 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... 
@rd_rn_msz_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893419407553.6469223537896; Sat, 17 Feb 2018 10:50:19 -0800 (PST) Received: from localhost ([::1]:48264 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Yw-00053a-HD for importer@patchew.org; Sat, 17 Feb 2018 13:50:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40002) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79X-0000cK-Uo for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:07 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79W-0001ic-0n for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:03 -0500 Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:43777) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79V-0001i7-Op for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:01 -0500 Received: by mail-pg0-x244.google.com with SMTP id f6so4342956pgs.10 for ; Sat, 17 Feb 2018 10:24:01 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.59 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:59 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=118KeuWXZVwxjQFkY5AB5YsFbc+jn6EHA5V7kRQzsGs=; b=jvQ2AbmbvBM8Tp3q9d5eHKlWbe6OTUZiM+YZINPTsvgpwL/Oj5AXb5A+8zW+vcsfpH LLn3sNe/gFUe5URO6SvMbVxiGsk9khfZ87Q2aIuMLBStV0QFxzxECj1L089KVoblGv4O eeMrou7G9WEWNQmM1mWiOiHKwhlEAEo8Sv1W4= From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:37 -0800 Message-Id: <20180217182323.25885-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v2 21/67] target/arm: Implement SVE floating-point exponential accelerator X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 22 +++++++++++++ target/arm/sve.decode | 7 ++++ 4 files changed, 114 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5280d375f9..e2925ff8ec 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -385,6 +385,10 @@ DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void,= ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a290a58c02..4d42653eef 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1101,3 +1101,84 @@ void HELPER(sve_adr_u32)(void *vd, void 
*vn, void *v= m, uint32_t desc) d[i] =3D n[i] + ((uint64_t)(uint32_t)m[i] << sh); } } + +void HELPER(sve_fexpa_h)(void *vd, void *vn, uint32_t desc) +{ + static const uint16_t coeff[] =3D { + 0x0000, 0x0016, 0x002d, 0x0045, 0x005d, 0x0075, 0x008e, 0x00a8, + 0x00c2, 0x00dc, 0x00f8, 0x0114, 0x0130, 0x014d, 0x016b, 0x0189, + 0x01a8, 0x01c8, 0x01e8, 0x0209, 0x022b, 0x024e, 0x0271, 0x0295, + 0x02ba, 0x02e0, 0x0306, 0x032e, 0x0356, 0x037f, 0x03a9, 0x03d4, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint16_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 5); + uint16_t exp =3D extract32(nn, 5, 5); + d[i] =3D coeff[idx] | (exp << 10); + } +} + +void HELPER(sve_fexpa_s)(void *vd, void *vn, uint32_t desc) +{ + static const uint32_t coeff[] =3D { + 0x000000, 0x0164d2, 0x02cd87, 0x043a29, + 0x05aac3, 0x071f62, 0x08980f, 0x0a14d5, + 0x0b95c2, 0x0d1adf, 0x0ea43a, 0x1031dc, + 0x11c3d3, 0x135a2b, 0x14f4f0, 0x16942d, + 0x1837f0, 0x19e046, 0x1b8d3a, 0x1d3eda, + 0x1ef532, 0x20b051, 0x227043, 0x243516, + 0x25fed7, 0x27cd94, 0x29a15b, 0x2b7a3a, + 0x2d583f, 0x2f3b79, 0x3123f6, 0x3311c4, + 0x3504f3, 0x36fd92, 0x38fbaf, 0x3aff5b, + 0x3d08a4, 0x3f179a, 0x412c4d, 0x4346cd, + 0x45672a, 0x478d75, 0x49b9be, 0x4bec15, + 0x4e248c, 0x506334, 0x52a81e, 0x54f35b, + 0x5744fd, 0x599d16, 0x5bfbb8, 0x5e60f5, + 0x60ccdf, 0x633f89, 0x65b907, 0x68396a, + 0x6ac0c7, 0x6d4f30, 0x6fe4ba, 0x728177, + 0x75257d, 0x77d0df, 0x7a83b3, 0x7d3e0c, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint32_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 6); + uint32_t exp =3D extract32(nn, 6, 8); + d[i] =3D coeff[idx] | (exp << 23); + } +} + +void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc) +{ + static const uint64_t coeff[] =3D { + 0x0000000000000, 0x02C9A3E778061, 0x059B0D3158574, 0x0874518759BC8, + 0x0B5586CF9890F, 0x0E3EC32D3D1A2, 0x11301D0125B51, 
0x1429AAEA92DE0, + 0x172B83C7D517B, 0x1A35BEB6FCB75, 0x1D4873168B9AA, 0x2063B88628CD6, + 0x2387A6E756238, 0x26B4565E27CDD, 0x29E9DF51FDEE1, 0x2D285A6E4030B, + 0x306FE0A31B715, 0x33C08B26416FF, 0x371A7373AA9CB, 0x3A7DB34E59FF7, + 0x3DEA64C123422, 0x4160A21F72E2A, 0x44E086061892D, 0x486A2B5C13CD0, + 0x4BFDAD5362A27, 0x4F9B2769D2CA7, 0x5342B569D4F82, 0x56F4736B527DA, + 0x5AB07DD485429, 0x5E76F15AD2148, 0x6247EB03A5585, 0x6623882552225, + 0x6A09E667F3BCD, 0x6DFB23C651A2F, 0x71F75E8EC5F74, 0x75FEB564267C9, + 0x7A11473EB0187, 0x7E2F336CF4E62, 0x82589994CCE13, 0x868D99B4492ED, + 0x8ACE5422AA0DB, 0x8F1AE99157736, 0x93737B0CDC5E5, 0x97D829FDE4E50, + 0x9C49182A3F090, 0xA0C667B5DE565, 0xA5503B23E255D, 0xA9E6B5579FDBF, + 0xAE89F995AD3AD, 0xB33A2B84F15FB, 0xB7F76F2FB5E47, 0xBCC1E904BC1D2, + 0xC199BDD85529C, 0xC67F12E57D14B, 0xCB720DCEF9069, 0xD072D4A07897C, + 0xD5818DCFBA487, 0xDA9E603DB3285, 0xDFC97337B9B5F, 0xE502EE78B3FF6, + 0xEA4AFA2A490DA, 0xEFA1BEE615A27, 0xF50765B6E4540, 0xFA7C1819E90D8, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint64_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 6); + uint64_t exp =3D extract32(nn, 6, 11); + d[i] =3D coeff[idx] | (exp << 52); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 34cc8c2773..2f23f1b192 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -880,6 +880,28 @@ static void trans_ADR_u32(DisasContext *s, arg_rrri *a= , uint32_t insn) do_adr(s, a, gen_helper_sve_adr_u32); } =20 +/* + *** SVE Integer Misc - Unpredicated Group + */ + +static void trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] =3D { + NULL, + gen_helper_sve_fexpa_h, + gen_helper_sve_fexpa_s, + gen_helper_sve_fexpa_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, 
a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6ec1f94832..e791fe8031 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -68,6 +68,7 @@ =20 # Two operand @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz +@rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz =20 # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 @@ -290,6 +291,12 @@ ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_= rn_msz_rm ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm =20 +### SVE Integer Misc - Unpredicated Group + +# SVE floating-point exponential accelerator +# Note esz !=3D 0 +FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893613168599.2174904491634; Sat, 17 Feb 2018 10:53:33 -0800 (PST) Received: from localhost ([::1]:48297 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7c4-0007nw-69 for importer@patchew.org; Sat, 17 Feb 2018 13:53:32 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40018) by lists.gnu.org with 
esmtp (Exim 4.71) (envelope-from ) id 1en79Y-0000dI-KO for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:07 -0500 X-Google-Smtp-Source:
AH8x225NgcDwDGWb4Ez+7/CvBZiaWqwixnt49ZD3/yFHI+YCgeOgFr9xtXH8Ao84BbD9kyPpzCr7VA== X-Received: by 10.98.245.131 with SMTP id b3mr9812757pfm.20.1518891842022; Sat, 17 Feb 2018 10:24:02 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:38 -0800 Message-Id: <20180217182323.25885-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH v2 22/67] target/arm: Implement SVE floating-point trig select coefficient X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 4 ++++ target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ target/arm/sve.decode | 4 ++++ 4 files changed, 70 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index e2925ff8ec..4f1bd5a62f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -389,6 +389,10 @@ DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4d42653eef..b4f70af23f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -23,6 +23,7 @@ #include "exec/cpu_ldst.h" #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" +#include "fpu/softfloat.h" =20 =20 /* Note that vector data is stored in host-endian 64-bit chunks, @@ -1182,3 +1183,45 @@ void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_= t desc) d[i] =3D coeff[idx] | (exp << 52); } } + +void HELPER(sve_ftssel_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint16_t nn =3D n[i]; + uint16_t mm =3D m[i]; + if (mm & 1) { + nn =3D float16_one; + } + d[i] =3D nn ^ (mm & 2) << 14; + } +} + +void HELPER(sve_ftssel_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint32_t nn =3D n[i]; + uint32_t mm =3D m[i]; + if (mm & 1) { + nn =3D float32_one; + } + d[i] =3D nn ^ (mm & 2) << 30; + } +} + +void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + uint64_t mm =3D m[i]; + if (mm & 1) { + nn =3D float64_one; + } + d[i] =3D nn ^ (mm & 2) << 62; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 
2f23f1b192..e32be385fd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -902,6 +902,25 @@ static void trans_FEXPA(DisasContext *s, arg_rr_esz *a= , uint32_t insn) vsz, vsz, 0, fns[a->esz]); } =20 +static void trans_FTSSEL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_ftssel_h, + gen_helper_sve_ftssel_s, + gen_helper_sve_ftssel_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->esz]); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e791fe8031..4ea3f33919 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -297,6 +297,10 @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_= rn_msz_rm # Note esz !=3D 0 FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn =20 +# SVE floating-point trig select coefficient +# Note esz !=3D 0 +FTSSEL 00000100 .. 1 ..... 101100 ..... ..... 
@rd_rn_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:45 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893437364173.34977810488374; Sat, 17 Feb 2018 10:50:37 -0800 (PST) Received: from localhost ([::1]:48265 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7ZE-0005Jz-DW for importer@patchew.org; Sat, 17 Feb 2018 13:50:36 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40072) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79c-0000ho-1p for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:10 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79Z-0001kR-Fa for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:08 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:44407) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79Z-0001k9-6V for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:05 -0500 Received: by mail-pf0-x244.google.com with SMTP id 17so590931pfw.11 for ; Sat, 17 Feb 2018 10:24:05 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.02 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=rsh86Uyni96VeArd/ErWXmUyZyr8qyIKJCMXUaKWZU0=; b=ghV2xgAlpAyAP7tkVG4GrFsiHoirQYM8f+3fMFeHpeaGG20n5nCteUNfgOcoOjMrf2 7Y4+pB6+l4ikquN+LOzSWDNiHwYhcjNDK+AZ69QMN8jyV4QIB2wBEdGa7QgDI16lqh1M GCIuHHH5atOAJH+aiJbArC+p4rzp22kKOmcLQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=rsh86Uyni96VeArd/ErWXmUyZyr8qyIKJCMXUaKWZU0=; b=D1bxWO31C63itXncltWZbHQf/fVmQ7q7D47e0KAM4awxcdhZFBeFbvpry6FcDi17zQ aTFL7x2hF9kgAuruQZz4ldYelSZc74pFb1GG3W8QJYbDtavJxXQ9MlYg8PYOfhnslada sh7wSXuWn6+PQEOJOonJAHAXnEuG+YdmEyaO7fltWw5eOHAgnMjhTDi8a8xLTm9Vzbec hI8vS3SY1xvB5WU7cSEEnZNXas2qTmYFOQDRDSjkCnjP0JPBrXTra4QPB+RLnpTJs7kp nNxk2nM4OcjMVnzABFC2B8cvW50fT7p4LFm5OhKUltjsm2WZrawHCansqhihUE1WL63O UjCA== X-Gm-Message-State: APf1xPAHJWrYsRnEVJSu8/I52sRrWIj8sQQwfjPvUREAzfH/6ePNaxZn PnTjW5XJ0cMON4eD74D/tArOz2yF83U= X-Google-Smtp-Source: AH8x226NKR9aWxSYjHKhD1vLp83ga+UGr+MVEywWmGUnVCXU6psNu+jESciHoqFh2V+aaqlLq+9LiQ== X-Received: by 10.98.208.3 with SMTP id p3mr9790778pfg.8.1518891843780; Sat, 17 Feb 2018 10:24:03 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:39 -0800 Message-Id: <20180217182323.25885-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 23/67] target/arm: Implement SVE Element Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 11 ++ target/arm/sve_helper.c | 136 ++++++++++++++++++++++ target/arm/translate-sve.c | 274 +++++++++++++++++++++++++++++++++++++++++= +++- target/arm/sve.decode | 30 ++++- 4 files changed, 448 insertions(+), 3 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4f1bd5a62f..2831e1643b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -393,6 +393,17 @@ DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void= , ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_sqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) + +DEF_HELPER_FLAGS_4(sve_uqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, 
void, ptr, ptr, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b4f70af23f..cfda16d520 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1225,3 +1225,139 @@ void HELPER(sve_ftssel_d)(void *vd, void *vn, void = *vm, uint32_t desc) d[i] =3D nn ^ (mm & 2) << 62; } } + +/* + * Signed saturating addition with scalar operand. + */ + +void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(int8_t)) { + int r =3D *(int8_t *)(a + i) + b; + if (r > INT8_MAX) { + r =3D INT8_MAX; + } else if (r < INT8_MIN) { + r =3D INT8_MIN; + } + *(int8_t *)(d + i) =3D r; + } +} + +void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(int16_t)) { + int r =3D *(int16_t *)(a + i) + b; + if (r > INT16_MAX) { + r =3D INT16_MAX; + } else if (r < INT16_MIN) { + r =3D INT16_MIN; + } + *(int16_t *)(d + i) =3D r; + } +} + +void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(int32_t)) { + int64_t r =3D *(int32_t *)(a + i) + b; + if (r > INT32_MAX) { + r =3D INT32_MAX; + } else if (r < INT32_MIN) { + r =3D INT32_MIN; + } + *(int32_t *)(d + i) =3D r; + } +} + +void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(int64_t)) { + int64_t ai =3D *(int64_t *)(a + i); + int64_t r =3D ai + b; + if (((r ^ ai) & ~(ai ^ b)) < 0) { + /* Signed overflow. */ + r =3D (r < 0 ? 
INT64_MAX : INT64_MIN); + } + *(int64_t *)(d + i) =3D r; + } +} + +/* + * Unsigned saturating addition with scalar operand. + */ + +void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(uint8_t)) { + int r =3D *(uint8_t *)(a + i) + b; + if (r > UINT8_MAX) { + r =3D UINT8_MAX; + } else if (r < 0) { + r =3D 0; + } + *(uint8_t *)(d + i) =3D r; + } +} + +void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(uint16_t)) { + int r =3D *(uint16_t *)(a + i) + b; + if (r > UINT16_MAX) { + r =3D UINT16_MAX; + } else if (r < 0) { + r =3D 0; + } + *(uint16_t *)(d + i) =3D r; + } +} + +void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(uint32_t)) { + int64_t r =3D *(uint32_t *)(a + i) + b; + if (r > UINT32_MAX) { + r =3D UINT32_MAX; + } else if (r < 0) { + r =3D 0; + } + *(uint32_t *)(d + i) =3D r; + } +} + +void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(uint64_t)) { + uint64_t r =3D *(uint64_t *)(a + i) + b; + if (r < b) { + r =3D UINT64_MAX; + } + *(uint64_t *)(d + i) =3D r; + } +} + +void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc); + + for (i =3D 0; i < oprsz; i +=3D sizeof(uint64_t)) { + uint64_t ai =3D *(uint64_t *)(a + i); + *(uint64_t *)(d + i) =3D (ai < b ? 
0 : ai - b); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e32be385fd..702f20e97b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -61,6 +61,11 @@ static int tszimm_shl(int x) return x - (8 << tszimm_esz(x)); } =20 +static inline int plus1(int x) +{ + return x + 1; +} + /* * Include the generated decoder. */ @@ -127,7 +132,9 @@ static void do_vector3_z(DisasContext *s, GVecGen3Fn *g= vec_fn, /* Invoke a vector move on two Zregs. */ static void do_mov_z(DisasContext *s, int rd, int rn) { - do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); + if (rd !=3D rn) { + do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); + } } =20 /* Initialize a Zreg with replications of a 64-bit immediate. */ @@ -168,7 +175,9 @@ static void do_vecop4_p(DisasContext *s, const GVecGen4= *gvec_op, /* Invoke a vector move on two Pregs. */ static void do_mov_p(DisasContext *s, int rd, int rn) { - do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); + if (rd !=3D rn) { + do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); + } } =20 /* Set the cpu flags as per a return from an SVE helper. */ @@ -1378,6 +1387,267 @@ static void trans_PNEXT(DisasContext *s, arg_rr_esz= *a, uint32_t insn) do_pfirst_pnext(s, a, gen_helper_sve_pnext); } =20 +/* + *** SVE Element Count Group + */ + +/* Perform an inline saturating addition of a 32-bit value within + * a 64-bit register. The second operand is known to be positive, + * which halves the comparisons we must perform to bound the result. + */ +static void do_sat_addsub_32(TCGv_i64 reg, TCGv_i64 val, bool u, bool d) +{ + int64_t ibound; + TCGv_i64 bound; + TCGCond cond; + + /* Use normal 64-bit arithmetic to detect 32-bit overflow. */ + if (u) { + tcg_gen_ext32u_i64(reg, reg); + } else { + tcg_gen_ext32s_i64(reg, reg); + } + if (d) { + tcg_gen_sub_i64(reg, reg, val); + ibound =3D (u ? 0 : INT32_MIN); + cond =3D TCG_COND_LT; + } else { + tcg_gen_add_i64(reg, reg, val); + ibound =3D (u ?
UINT32_MAX : INT32_MAX); + cond =3D TCG_COND_GT; + } + bound =3D tcg_const_i64(ibound); + tcg_gen_movcond_i64(cond, reg, reg, bound, bound, reg); + tcg_temp_free_i64(bound); +} + +/* Similarly with 64-bit values. */ +static void do_sat_addsub_64(TCGv_i64 reg, TCGv_i64 val, bool u, bool d) +{ + TCGv_i64 t0 =3D tcg_temp_new_i64(); + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 t2; + + if (u) { + if (d) { + tcg_gen_sub_i64(t0, reg, val); + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, reg, val, t1, t0); + } else { + tcg_gen_add_i64(t0, reg, val); + tcg_gen_movi_i64(t1, -1); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, t0, reg, t1, t0); + } + } else { + if (d) { + /* Detect signed overflow for subtraction. */ + tcg_gen_xor_i64(t0, reg, val); + tcg_gen_sub_i64(t1, reg, val); + tcg_gen_xor_i64(reg, reg, t1); + tcg_gen_and_i64(t0, t0, reg); + + /* Bound the result. */ + tcg_gen_movi_i64(reg, INT64_MIN); + t2 =3D tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, reg, t1); + } else { + /* Detect signed overflow for addition. */ + tcg_gen_xor_i64(t0, reg, val); + tcg_gen_add_i64(reg, reg, val); + tcg_gen_xor_i64(t1, reg, val); + tcg_gen_andc_i64(t0, t1, t0); + + /* Bound the result. */ + tcg_gen_movi_i64(t1, INT64_MAX); + t2 =3D tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, t1, reg); + } + tcg_temp_free_i64(t2); + } + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); +} + +/* Similarly with a vector and a scalar operand.
*/ +static void do_sat_addsub_vec(DisasContext *s, int esz, int rd, int rn, + TCGv_i64 val, bool u, bool d) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr dptr, nptr; + TCGv_i32 t32, desc; + TCGv_i64 t64; + + dptr =3D tcg_temp_new_ptr(); + nptr =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(dptr, cpu_env, vec_full_reg_offset(s, rd)); + tcg_gen_addi_ptr(nptr, cpu_env, vec_full_reg_offset(s, rn)); + desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + + switch (esz) { + case MO_8: + t32 =3D tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t32, val); + if (d) { + tcg_gen_neg_i32(t32, t32); + } + if (u) { + gen_helper_sve_uqaddi_b(dptr, nptr, t32, desc); + } else { + gen_helper_sve_sqaddi_b(dptr, nptr, t32, desc); + } + tcg_temp_free_i32(t32); + break; + + case MO_16: + t32 =3D tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t32, val); + if (d) { + tcg_gen_neg_i32(t32, t32); + } + if (u) { + gen_helper_sve_uqaddi_h(dptr, nptr, t32, desc); + } else { + gen_helper_sve_sqaddi_h(dptr, nptr, t32, desc); + } + tcg_temp_free_i32(t32); + break; + + case MO_32: + t64 =3D tcg_temp_new_i64(); + if (d) { + tcg_gen_neg_i64(t64, val); + } else { + tcg_gen_mov_i64(t64, val); + } + if (u) { + gen_helper_sve_uqaddi_s(dptr, nptr, t64, desc); + } else { + gen_helper_sve_sqaddi_s(dptr, nptr, t64, desc); + } + tcg_temp_free_i64(t64); + break; + + case MO_64: + if (u) { + if (d) { + gen_helper_sve_uqsubi_d(dptr, nptr, val, desc); + } else { + gen_helper_sve_uqaddi_d(dptr, nptr, val, desc); + } + } else if (d) { + t64 =3D tcg_temp_new_i64(); + tcg_gen_neg_i64(t64, val); + gen_helper_sve_sqaddi_d(dptr, nptr, t64, desc); + tcg_temp_free_i64(t64); + } else { + gen_helper_sve_sqaddi_d(dptr, nptr, val, desc); + } + break; + + default: + g_assert_not_reached(); + } + + tcg_temp_free_ptr(dptr); + tcg_temp_free_ptr(nptr); + tcg_temp_free_i32(desc); +} + +static void trans_CNT_r(DisasContext *s, arg_CNT_r *a, uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D 
decode_pred_count(fullsz, a->pat, a->esz); + + tcg_gen_movi_i64(cpu_reg(s, a->rd), numelem * a->imm); +} + +static void trans_INCDEC_r(DisasContext *s, arg_incdec_cnt *a, uint32_t in= sn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm * (a->d ? -1 : 1); + TCGv_i64 reg =3D cpu_reg(s, a->rd); + + tcg_gen_addi_i64(reg, reg, inc); +} + +static void trans_SINCDEC_r_32(DisasContext *s, arg_incdec_cnt *a, + uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm; + TCGv_i64 reg =3D cpu_reg(s, a->rd); + + /* Use normal 64-bit arithmetic to detect 32-bit overflow. */ + if (inc =3D=3D 0) { + if (a->u) { + tcg_gen_ext32u_i64(reg, reg); + } else { + tcg_gen_ext32s_i64(reg, reg); + } + } else { + TCGv_i64 t =3D tcg_const_i64(inc); + do_sat_addsub_32(reg, t, a->u, a->d); + tcg_temp_free_i64(t); + } +} + +static void trans_SINCDEC_r_64(DisasContext *s, arg_incdec_cnt *a, + uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm; + TCGv_i64 reg =3D cpu_reg(s, a->rd); + + if (inc !=3D 0) { + TCGv_i64 t =3D tcg_const_i64(inc); + do_sat_addsub_64(reg, t, a->u, a->d); + tcg_temp_free_i64(t); + } +} + +static void trans_INCDEC_v(DisasContext *s, arg_incdec2_cnt *a, uint32_t i= nsn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + if (inc !=3D 0) { + TCGv_i64 t =3D tcg_const_i64(a->d ? 
-inc : inc); + tcg_gen_gvec_adds(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), t, fullsz, fullsz= ); + tcg_temp_free_i64(t); + } else { + do_mov_z(s, a->rd, a->rn); + } +} + +static void trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a, + uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + if (inc !=3D 0) { + TCGv_i64 t =3D tcg_const_i64(inc); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, t, a->u, a->d); + tcg_temp_free_i64(t); + } else { + do_mov_z(s, a->rd, a->rn); + } +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4ea3f33919..5690b5fcb9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -22,6 +22,7 @@ ########################################################################### # Named fields. These are primarily for disjoint fields. =20 +%imm4_16_p1 16:4 !function=3Dplus1 %imm6_22_5 22:1 5:5 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 @@ -58,6 +59,8 @@ &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s +&incdec_cnt rd pat esz imm d u +&incdec2_cnt rd rn pat esz imm d u =20 ########################################################################### # Named instruction formats. These are generally used to @@ -115,6 +118,13 @@ @rd_rn_i9 ........ ........ ...... rn:5 rd:5 \ &rri imm=3D%imm9_16_10 =20 +# One register, pattern, and uint4+1. +# User must fill in U and D. +@incdec_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ + &incdec_cnt imm=3D%imm4_16_p1 +@incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ + &incdec2_cnt imm=3D%imm4_16_p1 rn=3D%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. 
=20 @@ -301,7 +311,25 @@ FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn # Note esz !=3D 0 FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm =20 -### SVE Predicate Logical Operations Group +### SVE Element Count Group + +# SVE element count +CNT_r 00000100 .. 10 .... 1110 0 0 ..... ..... @incdec_cnt d=3D0 u=3D1 + +# SVE inc/dec register by element count +INCDEC_r 00000100 .. 11 .... 1110 0 d:1 ..... ..... @incdec_cnt u=3D1 + +# SVE saturating inc/dec register by element count +SINCDEC_r_32 00000100 .. 10 .... 1111 d:1 u:1 ..... ..... @incdec_cnt +SINCDEC_r_64 00000100 .. 11 .... 1111 d:1 u:1 ..... ..... @incdec_cnt + +# SVE inc/dec vector by element count +# Note this requires esz !=3D 0. +INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... @incdec2_cnt u=3D1 + +# SVE saturating inc/dec vector by element count +# Note these require esz !=3D 0. +SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt =20 # SVE predicate logical operations AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... 
@pd_pg_pn_pm_s --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893787306577.9055505528875; Sat, 17 Feb 2018 10:56:27 -0800 (PST) Received: from localhost ([::1]:48326 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7es-0001us-Hf for importer@patchew.org; Sat, 17 Feb 2018 13:56:26 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40068) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79b-0000hM-OU for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79a-0001lB-MY for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:07 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:43777) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79a-0001ks-Gz for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:06 -0500 Received: by mail-pg0-x243.google.com with SMTP id f6so4342993pgs.10 for ; Sat, 17 Feb 2018 10:24:06 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.03 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=nSNLVT4lf1W1WDzUmj9fhH5V3ch8xkv1gwPhXGeJ6r0=; b=XMEvR+3c4GAUicBd+RM9a/qV1xj0SccDnRWRLh972nZAg6gYVuzi+wWl3Tu3cC8TwB fg63rVue9xMigQqDJBiS76/0vabjcdEQMdtlJ71BjC/JN9ru/dhUUC4Phz8pujpesuUF aseGu2/VROm3JvmJ5H5lAf3gKFBTmE8H2PZWM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=nSNLVT4lf1W1WDzUmj9fhH5V3ch8xkv1gwPhXGeJ6r0=; b=HloNR/NPi1fKJwpoIc34aglffPUt651ymjvia15rIc2zhEql2WxruSqCnMRmvUZ7eZ eoQYnJaRKk+SM5aUsREwjk/ziq3HZb9Csbb5fJArF8IzYMk+EmKycl88STpQsf68KW3/ qObkec7+eRPgIjYfkvAbKRrisJxlHSA6ukZMOY2cgWpmvbmd6T2X/UUQNTu51qoAnrqG eQFwaTg/Q7gaDJ8Iv+jRZC4GKoE23NYOeS37ujoMB1K9Wm81+hCda+oJIPSEXrIeSSPI vw2IhheqxJewoGItHMsNQmObOx+0wIonjhbLONvj7N+vLZJOVmcG6VOc+RlhPjKoEa45 ve3A== X-Gm-Message-State: APf1xPAEqPs2tbADJeaiI9vvjYjxFsCPK5DbHmWmmcOGrolvz6pNeOHz eQ6VZCN8ocgwseDnjBKm2vYYkEIQrUI= X-Google-Smtp-Source: AH8x227IRDdwB84xJH9Oj92FGhPz2iDvmbTQ0wTbP1GIrhsGCOeDYEHg/IM4Kfo69vMoSZlhXiNWgQ== X-Received: by 10.98.200.80 with SMTP id z77mr787317pff.85.1518891845282; Sat, 17 Feb 2018 10:24:05 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:40 -0800 Message-Id: <20180217182323.25885-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 24/67] target/arm: Implement SVE Bitwise Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 17 ++++++++++++++++ 2 files changed, 67 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 702f20e97b..21b1e4df85 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -34,6 +34,8 @@ #include "translate-a64.h" =20 typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); +typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t); typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); =20 @@ -1648,6 +1650,54 @@ static void trans_SINCDEC_v(DisasContext *s, arg_inc= dec2_cnt *a, } } =20 +/* + *** SVE Bitwise Immediate Group + */ + +static void do_zz_dbm(DisasContext *s, arg_rr_dbm *a, GVecGen2iFn *gvec_fn) +{ + unsigned vsz; + uint64_t imm; + + if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1), + extract32(a->dbm, 0, 6), + extract32(a->dbm, 6, 6))) { + unallocated_encoding(s); + return; + } + + vsz =3D vec_full_reg_size(s); + gvec_fn(MO_64, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), imm, vsz, vsz); +} + +static void trans_AND_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_andi); +} + +static void 
trans_ORR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_ori); +} + +static void trans_EOR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_xori); +} + +static void trans_DUPM(DisasContext *s, arg_DUPM *a, uint32_t insn) +{ + uint64_t imm; + if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1), + extract32(a->dbm, 0, 6), + extract32(a->dbm, 6, 6))) { + unallocated_encoding(s); + return; + } + do_dupi_z(s, a->rd, imm); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5690b5fcb9..0990d135f4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -50,6 +50,7 @@ =20 &rr_esz rd rn esz &rri rd rn imm +&rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz &rrr_esz rd rn rm esz @@ -112,6 +113,10 @@ @rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ &rri_esz esz=3D%tszimm16_esz =20 +# Two register operand, one encoded bitmask. +@rdn_dbm ........ .. .... dbm:13 rd:5 \ + &rr_dbm rn=3D%reg_movprfx + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=3D%imm9_16_10 @@ -331,6 +336,18 @@ INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... = @incdec2_cnt u=3D1 # Note these require esz !=3D 0. SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt =20 +### SVE Bitwise Immediate Group + +# SVE bitwise logical with immediate (unpredicated) +ORR_zzi 00000101 00 0000 ............. ..... @rdn_dbm +EOR_zzi 00000101 01 0000 ............. ..... @rdn_dbm +AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm + +# SVE broadcast bitmask immediate +DUPM 00000101 11 0000 dbm:13 rd:5 + +### SVE Predicate Logical Operations Group + # SVE predicate logical operations AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... 
@pd_pg_pn_pm_s --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893618954296.94788864331906; Sat, 17 Feb 2018 10:53:38 -0800 (PST) Received: from localhost ([::1]:48300 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7cA-0007uW-6j for importer@patchew.org; Sat, 17 Feb 2018 13:53:38 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40111) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79e-0000la-9F for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79c-0001m2-Cn for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:10 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:44554) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79c-0001ld-4z for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:08 -0500 Received: by mail-pl0-x244.google.com with SMTP id w21so3426672plp.11 for ; Sat, 17 Feb 2018 10:24:08 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=GpiGhGCBLCpMu5HsxIdaTUuHdNgL0asf15r6AUQljAo=; b=DvkOv3ISU3xNKZSB91aWS50R8HNAYKHeuoVbRx0BFkQpMtYeUy5W3licpKHAexmX8b KkOj+w7TQBgDYH2Kz0hbvG/IdUS16gRdo8SmVXTjBE90cmAk2hdE+DtvY8a0HZWtFqD1 hP/Fh3xMPGWBVM6MtRHOJPKCbbz2VrEBSiaZE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=GpiGhGCBLCpMu5HsxIdaTUuHdNgL0asf15r6AUQljAo=; b=WdNZEOcMYcHLJbWd129XejzZ1/GNUL6EXBvq7MtsCT+Gng86z60ctq/Att/YxA3DWv p+U1z+V1IiFrocbugqUJsDX9amLQdm+XzEz2dJR5auZVxeBzu2tIBOgeprpddQKDvZuO M3Kgx2wem0gaDjokWRxAOzFQdDtqtcrFFNlCVRpujsct2Z1SLU9q0vpW/79yBIfICEE/ TDQJ0wQte6gQADzWdBVFlw2wFrXH1ocOUbfp59BReYBmMVnzp+0AaiZUZxp/NrWOa1E8 Im8G61mj2jgk7KnnrASbn6dWm3Hs9YigSZHWjr9L63LCAbRD7gm2BEr+HDgdAIwHP+oi 34TQ== X-Gm-Message-State: APf1xPDo66ybsU2LsaOlcLpYDw6Usc41sRZgK5q6SDwhHkh42RFBBZ4a vlk8YjX9Mfa77NtCk9+LnlmVXQsBkrw= X-Google-Smtp-Source: AH8x226ZlKmyrAwKDVy9lOzic56OFhvYE9pIj/LMrW/f6Xwq+sHJbeQfipbm37jxIRum4pTIxOasQA== X-Received: by 2002:a17:902:7808:: with SMTP id p8-v6mr9620755pll.161.1518891846870; Sat, 17 Feb 2018 10:24:06 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:41 -0800 Message-Id: <20180217182323.25885-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 25/67] target/arm: Implement SVE Integer Wide Immediate - Predicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 10 +++++ target/arm/sve_helper.c | 108 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++ 4 files changed, 227 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2831e1643b..79493ab647 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -404,6 +404,16 @@ DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void= , ptr, ptr, s64, i32) DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 +DEF_HELPER_FLAGS_5(sve_cpy_m_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64,= i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64,= i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64,= i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64,= i32) + +DEF_HELPER_FLAGS_4(sve_cpy_z_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + 
DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cfda16d520..6a95d1ec48 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1361,3 +1361,111 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_= t b, uint32_t desc) *(uint64_t *)(d + i) =3D (ai < b ? 0 : ai - b); } } + +/* Two operand predicated copy immediate with merge. All valid immediates + * can fit within 17 signed bits in the simd_data field. + */ +void HELPER(sve_cpy_m_b)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn; + uint8_t *pg =3D vg; + + mm =3D (mm & 0xff) * (-1ull / 0xff); + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + uint64_t pp =3D expand_pred_b(pg[H1(i)]); + d[i] =3D (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_h)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn; + uint8_t *pg =3D vg; + + mm =3D (mm & 0xffff) * (-1ull / 0xffff); + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + uint64_t pp =3D expand_pred_h(pg[H1(i)]); + d[i] =3D (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_s)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn; + uint8_t *pg =3D vg; + + mm =3D deposit64(mm, 32, 32, mm); + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + uint64_t pp =3D expand_pred_s(pg[H1(i)]); + d[i] =3D (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_d)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d 
=3D vd, *n =3D vn; + uint8_t *pg =3D vg; + + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + d[i] =3D (pg[H1(i)] & 1 ? mm : nn); + } +} + +void HELPER(sve_cpy_z_b)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + + val =3D (val & 0xff) * (-1ull / 0xff); + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D val & expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_h)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + + val =3D (val & 0xffff) * (-1ull / 0xffff); + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D val & expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_s)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + + val =3D deposit64(val, 32, 32, val); + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D val & expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D (pg[H1(i)] & 1 ? val : 0); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21b1e4df85..dd085b084b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -68,6 +68,12 @@ static inline int plus1(int x) return x + 1; } =20 +/* The SH bit is in bit 8. Extract the low 8 and shift. */ +static inline int expand_imm_sh8s(int x) +{ + return (int8_t)x << (x & 0x100 ? 8 : 0); +} + /* * Include the generated decoder. */ @@ -1698,6 +1704,92 @@ static void trans_DUPM(DisasContext *s, arg_DUPM *a,= uint32_t insn) do_dupi_z(s, a->rd, imm); } =20 +/* + *** SVE Integer Wide Immediate - Predicated Group + */ + +/* Implement all merging copies. 
This is used for CPY (immediate), + * FCPY, CPY (scalar), CPY (SIMD&FP scalar). + */ +static void do_cpy_m(DisasContext *s, int esz, int rd, int rn, int pg, + TCGv_i64 val) +{ + typedef void gen_cpy(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_cpy * const fns[4] =3D { + gen_helper_sve_cpy_m_b, gen_helper_sve_cpy_m_h, + gen_helper_sve_cpy_m_s, gen_helper_sve_cpy_m_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd =3D tcg_temp_new_ptr(); + TCGv_ptr t_zn =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + + fns[esz](t_zd, t_zn, t_pg, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static void trans_FCPY(DisasContext *s, arg_FCPY *a, uint32_t insn) +{ + uint64_t imm; + TCGv_i64 t_imm; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + + /* Decode the VFP immediate. 
*/ + imm =3D vfp_expand_imm(a->esz, a->imm); + + t_imm =3D tcg_const_i64(imm); + do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm); + tcg_temp_free_i64(t_imm); +} + +static void trans_CPY_m_i(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + TCGv_i64 t_imm; + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + + t_imm =3D tcg_const_i64(a->imm); + do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm); + tcg_temp_free_i64(t_imm); +} + +static void trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a, uint32_t insn) +{ + static gen_helper_gvec_2i * const fns[4] =3D { + gen_helper_sve_cpy_z_b, gen_helper_sve_cpy_z_h, + gen_helper_sve_cpy_z_s, gen_helper_sve_cpy_z_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i64 t_imm; + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + + t_imm =3D tcg_const_i64(a->imm); + tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd), + pred_full_reg_offset(s, a->pg), + t_imm, vsz, vsz, 0, fns[a->esz]); + tcg_temp_free_i64(t_imm); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0990d135f4..e6e10a4f84 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -39,6 +39,9 @@ %tszimm16_shr 22:2 16:5 !function=3Dtszimm_shr %tszimm16_shl 22:2 16:5 !function=3Dtszimm_shl =20 +# Signed 8-bit immediate, optionally shifted left by 8. +%sh8_i8s 5:9 !function=3Dexpand_imm_sh8s + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -113,6 +116,11 @@ @rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ &rri_esz esz=3D%tszimm16_esz =20 +# Two register operand, one immediate operand, with 4-bit predicate. +# User must fill in imm. +@rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \ + &rpri_esz rn=3D%reg_movprfx + # Two register operand, one encoded bitmask. @rdn_dbm ........ .. .... 
dbm:13 rd:5 \ &rr_dbm rn=3D%reg_movprfx @@ -346,6 +354,15 @@ AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm # SVE broadcast bitmask immediate DUPM 00000101 11 0000 dbm:13 rd:5 =20 +### SVE Integer Wide Immediate - Predicated Group + +# SVE copy floating-point immediate (predicated) +FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4 + +# SVE copy integer immediate (predicated) +CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=3D%sh8_i8s +CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=3D%sh8_i8s + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3
From nobody Fri Oct 24 09:33:46 2025 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:42 -0800 Message-Id: <20180217182323.25885-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: 
<20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 26/67] target/arm: Implement SVE Permute - Extract Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 2 ++ target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 29 +++++++++++++++++ target/arm/sve.decode | 9 +++++- 4 files changed, 120 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 79493ab647..94f4356ce9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -414,6 +414,8 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, = ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) =20 +DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6a95d1ec48..fb3f54300b 100644 --- a/target/arm/sve_helper.c +++ 
b/target/arm/sve_helper.c @@ -1469,3 +1469,84 @@ void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_= t val, uint32_t desc) d[i] =3D (pg[H1(i)] & 1 ? val : 0); } } + +/* Big-endian hosts need to frob the byte indices. If the copy + * happens to be 8-byte aligned, then no frobbing is necessary. + */ +static void swap_memmove(void *vd, void *vs, size_t n) +{ + uintptr_t d =3D (uintptr_t)vd; + uintptr_t s =3D (uintptr_t)vs; + uintptr_t o =3D (d | s | n) & 7; + size_t i; + +#ifndef HOST_WORDS_BIGENDIAN + o =3D 0; +#endif + switch (o) { + case 0: + memmove(vd, vs, n); + break; + + case 4: + if (d < s || d >=3D s + n) { + for (i =3D 0; i < n; i +=3D 4) { + *(uint32_t *)H1_4(d + i) =3D *(uint32_t *)H1_4(s + i); + } + } else { + for (i =3D n; i > 0; ) { + i -=3D 4; + *(uint32_t *)H1_4(d + i) =3D *(uint32_t *)H1_4(s + i); + } + } + break; + + case 2: + case 6: + if (d < s || d >=3D s + n) { + for (i =3D 0; i < n; i +=3D 2) { + *(uint16_t *)H1_2(d + i) =3D *(uint16_t *)H1_2(s + i); + } + } else { + for (i =3D n; i > 0; ) { + i -=3D 2; + *(uint16_t *)H1_2(d + i) =3D *(uint16_t *)H1_2(s + i); + } + } + break; + + default: + if (d < s || d >=3D s + n) { + for (i =3D 0; i < n; i++) { + *(uint8_t *)H1(d + i) =3D *(uint8_t *)H1(s + i); + } + } else { + for (i =3D n; i > 0; ) { + i -=3D 1; + *(uint8_t *)H1(d + i) =3D *(uint8_t *)H1(s + i); + } + } + break; + } +} + +void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t opr_sz =3D simd_oprsz(desc); + size_t n_ofs =3D simd_data(desc); + size_t n_siz =3D opr_sz - n_ofs; + + if (vd !=3D vm) { + swap_memmove(vd, vn + n_ofs, n_siz); + swap_memmove(vd + n_siz, vm, n_ofs); + } else if (vd !=3D vn) { + swap_memmove(vd + n_siz, vd, n_ofs); + swap_memmove(vd, vn + n_ofs, n_siz); + } else { + /* vd =3D=3D vn =3D=3D vm. Need temp space. 
*/ + ARMVectorReg tmp; + swap_memmove(&tmp, vm, n_ofs); + swap_memmove(vd, vd + n_ofs, n_siz); + memcpy(vd + n_siz, &tmp, n_ofs); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dd085b084b..07a5eac092 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1790,6 +1790,35 @@ static void trans_CPY_z_i(DisasContext *s, arg_CPY_z= _i *a, uint32_t insn) tcg_temp_free_i64(t_imm); } =20 +/* + *** SVE Permute Extract Group + */ + +static void trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned n_ofs =3D a->imm >=3D vsz ? 0 : a->imm; + unsigned n_siz =3D vsz - n_ofs; + unsigned d =3D vec_full_reg_offset(s, a->rd); + unsigned n =3D vec_full_reg_offset(s, a->rn); + unsigned m =3D vec_full_reg_offset(s, a->rm); + + /* Use host vector move insns if we have appropriate sizes + and no unfortunate overlap. */ + if (m !=3D d + && n_ofs =3D=3D size_for_gvec(n_ofs) + && n_siz =3D=3D size_for_gvec(n_siz) + && (d !=3D n || n_siz <=3D n_ofs)) { + tcg_gen_gvec_mov(0, d, n + n_ofs, n_siz, n_siz); + if (n_ofs !=3D 0) { + tcg_gen_gvec_mov(0, d + n_siz, m, n_ofs, n_ofs); + } + return; + } + + tcg_gen_gvec_3_ool(d, n, m, vsz, vsz, n_ofs, gen_helper_sve_ext); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e6e10a4f84..5e3a9839d4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -22,8 +22,9 @@ ########################################################################### # Named fields. These are primarily for disjoint fields. =20 -%imm4_16_p1 16:4 !function=3Dplus1 +%imm4_16_p1 16:4 !function=3Dplus1 %imm6_22_5 22:1 5:5 +%imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 =20 @@ -363,6 +364,12 @@ FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4 CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=3D%sh8_i8s CPY_z_i 00000101 .. 01 .... 00 . ........ ..... 
@rdn_pg4 imm=3D%sh8_i8s =20 +### SVE Permute - Extract Group + +# SVE extract vector (immediate offset) +EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ + &rrri rn=3D%reg_movprfx imm=3D%imm8_16_10 + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3
From nobody Fri Oct 24 09:33:46 2025 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:43 -0800 Message-Id: <20180217182323.25885-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 27/67] target/arm: Implement SVE Permute - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 23 +++++++++ target/arm/translate-a64.h | 14 +++--- target/arm/sve_helper.c | 114 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 113 +++++++++++++++++++++++++++++++++++++++++= +++ target/arm/sve.decode | 29 +++++++++++- 5 files changed, 285 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 94f4356ce9..0c9aad575e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -416,6 +416,29 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void,= ptr, ptr, i64, i32) =20 DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_insr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_3(sve_rev_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index e519aee314..328aa7fce1 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -66,18 +66,18 @@ static inline void assert_fp_access_checked(DisasContex= t *s) static inline int vec_reg_offset(DisasContext *s, int regno, int element, TCGMemOp size) { - int offs =3D 0; + int element_size =3D 1 << size; + int offs =3D element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* This is complicated slightly because vfp.zregs[n].d[0] is * still the low half and vfp.zregs[n].d[1] the high half * of the 128 bit vector, even on big endian systems. - * Calculate the offset assuming a fully bigendian 128 bits, - * then XOR to account for the order of the two 64 bit halves. + * Calculate the offset assuming a fully little-endian 128 bits, + * then XOR to account for the order of the 64 bit units. 
*/ - offs +=3D (16 - ((element + 1) * (1 << size))); - offs ^=3D 8; -#else - offs +=3D element * (1 << size); + if (element_size < 8) { + offs ^=3D 8 - element_size; + } #endif offs +=3D offsetof(CPUARMState, vfp.zregs[regno]); assert_fp_access_checked(s); diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fb3f54300b..466a209c1e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1550,3 +1550,117 @@ void HELPER(sve_ext)(void *vd, void *vn, void *vm, = uint32_t desc) memcpy(vd + n_siz, &tmp, n_ofs); } } + +#define DO_INSR(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t val, uint32_t desc) \ +{ \ + intptr_t opr_sz =3D simd_oprsz(desc); \ + swap_memmove(vd + sizeof(TYPE), vn, opr_sz - sizeof(TYPE)); \ + *(TYPE *)(vd + H(0)) =3D val; \ +} + +DO_INSR(sve_insr_b, uint8_t, H1) +DO_INSR(sve_insr_h, uint16_t, H1_2) +DO_INSR(sve_insr_s, uint32_t, H1_4) +DO_INSR(sve_insr_d, uint64_t, ) + +#undef DO_INSR + +void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz =3D simd_oprsz(desc); + for (i =3D 0, j =3D opr_sz - 8; i < opr_sz / 2; i +=3D 8, j -=3D 8) { + uint64_t f =3D *(uint64_t *)(vn + i); + uint64_t b =3D *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) =3D bswap64(b); + *(uint64_t *)(vd + j) =3D bswap64(f); + } +} + +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m =3D 0x0000ffff0000ffffull; + h =3D rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz =3D simd_oprsz(desc); + for (i =3D 0, j =3D opr_sz - 8; i < opr_sz / 2; i +=3D 8, j -=3D 8) { + uint64_t f =3D *(uint64_t *)(vn + i); + uint64_t b =3D *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) =3D hswap64(b); + *(uint64_t *)(vd + j) =3D hswap64(f); + } +} + +void HELPER(sve_rev_s)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz =3D simd_oprsz(desc); + for (i =3D 0, j =3D opr_sz - 8; i < opr_sz / 2; i +=3D 8, j -=3D 8) { + 
uint64_t f =3D *(uint64_t *)(vn + i); + uint64_t b =3D *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) =3D rol64(b, 32); + *(uint64_t *)(vd + j) =3D rol64(f, 32); + } +} + +void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz =3D simd_oprsz(desc); + for (i =3D 0, j =3D opr_sz - 8; i < opr_sz / 2; i +=3D 8, j -=3D 8) { + uint64_t f =3D *(uint64_t *)(vn + i); + uint64_t b =3D *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) =3D b; + *(uint64_t *)(vd + j) =3D f; + } +} + +#define DO_TBL(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + uintptr_t elem =3D opr_sz / sizeof(TYPE); \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; \ + ARMVectorReg tmp; \ + if (unlikely(vd =3D=3D vn)) { \ + n =3D memcpy(&tmp, vn, opr_sz); \ + } \ + for (i =3D 0; i < elem; i++) { \ + TYPE j =3D m[H(i)]; \ + d[H(i)] =3D j < elem ? n[H(j)] : 0; \ + } \ +} + +DO_TBL(sve_tbl_b, uint8_t, H1) +DO_TBL(sve_tbl_h, uint16_t, H2) +DO_TBL(sve_tbl_s, uint32_t, H4) +DO_TBL(sve_tbl_d, uint64_t, ) + +#undef DO_TBL + +#define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + TYPED *d =3D vd; \ + TYPES *n =3D vn; \ + ARMVectorReg tmp; \ + if (unlikely(vn - vd < opr_sz)) { \ + n =3D memcpy(&tmp, n, opr_sz / 2); \ + } \ + for (i =3D 0; i < opr_sz / sizeof(TYPED); i++) { \ + d[HD(i)] =3D n[HS(i)]; \ + } \ +} + +DO_UNPK(sve_sunpk_h, int16_t, int8_t, H2, H1) +DO_UNPK(sve_sunpk_s, int32_t, int16_t, H4, H2) +DO_UNPK(sve_sunpk_d, int64_t, int32_t, , H4) + +DO_UNPK(sve_uunpk_h, uint16_t, uint8_t, H2, H1) +DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) +DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) + +#undef DO_UNPK diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 07a5eac092..3724f6290c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1819,6 +1819,119 @@ static void 
trans_EXT(DisasContext *s, arg_EXT *a, = uint32_t insn) tcg_gen_gvec_3_ool(d, n, m, vsz, vsz, n_ofs, gen_helper_sve_ext); } =20 +/* + *** SVE Permute - Unpredicated Group + */ + +static void trans_DUP_s(DisasContext *s, arg_DUP_s *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(a->esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, cpu_reg_sp(s, a->rn)); +} + +static void trans_DUP_x(DisasContext *s, arg_DUP_x *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned dofs =3D vec_full_reg_offset(s, a->rd); + unsigned esz, index; + + if ((a->imm & 0x1f) =3D=3D 0) { + unallocated_encoding(s); + return; + } + esz =3D ctz32(a->imm); + index =3D a->imm >> (esz + 1); + + if ((index << esz) < vsz) { + unsigned nofs =3D vec_reg_offset(s, a->rn, index, esz); + tcg_gen_gvec_dup_mem(esz, dofs, nofs, vsz, vsz); + } else { + tcg_gen_gvec_dup64i(dofs, vsz, vsz, 0); + } +} + +static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val) +{ + typedef void gen_insr(TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_insr * const fns[4] =3D { + gen_helper_sve_insr_b, gen_helper_sve_insr_h, + gen_helper_sve_insr_s, gen_helper_sve_insr_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd =3D tcg_temp_new_ptr(); + TCGv_ptr t_zn =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + + fns[a->esz](t_zd, t_zn, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_i32(desc); +} + +static void trans_INSR_f(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + tcg_gen_ld_i64(t, cpu_env, vec_reg_offset(s, a->rm, 0, MO_64)); + do_insr_i64(s, a, t); + tcg_temp_free_i64(t); +} + +static void trans_INSR_r(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_insr_i64(s, a, cpu_reg(s, a->rm)); +} + 
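Reading aid, not part of the patch: the index decode in trans_DUP_x above can be tried outside QEMU with the standalone C sketch below. It assumes imm7 is the 7-bit value that %imm7_22_16 builds (imm2 in the high two bits, tsz in the low five). The names DupIndex, decode_dup_index and dup_index_in_range are hypothetical, and a plain loop stands in for QEMU's ctz32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for illustration; not part of the patch. */
typedef struct {
    int valid;        /* 0 when tsz == 0, i.e. the unallocated encoding */
    unsigned esz;     /* log2 of the element size in bytes */
    unsigned index;   /* index of the element to broadcast */
} DupIndex;

/* Model of the decode in trans_DUP_x; imm7 is assumed to be imm2:tsz. */
static DupIndex decode_dup_index(uint32_t imm7)
{
    DupIndex r = { 0, 0, 0 };

    assert(imm7 < 128);
    if ((imm7 & 0x1f) == 0) {
        return r;                 /* the patch calls unallocated_encoding() */
    }
    r.valid = 1;
    /* Portable stand-in for ctz32(imm7): position of the lowest set bit. */
    while (((imm7 >> r.esz) & 1) == 0) {
        r.esz++;
    }
    /* Everything above the lowest set bit is the element index. */
    r.index = imm7 >> (r.esz + 1);
    return r;
}

/*
 * Nonzero when the selected element lies inside a vector of vsz bytes.
 * When it does not, the patch fills the destination with zeros.
 */
static int dup_index_in_range(DupIndex r, unsigned vsz)
{
    return r.valid && (r.index << r.esz) < vsz;
}
```

For instance, imm7 = 0x06 (tsz = 0b00110) decodes to esz = 1 and index = 1, that is, halfword element 1. imm7 = 0x70 selects 128-bit element 3, which is out of range for a 16-byte vector and so produces an all-zero result.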
+static void trans_REV_v(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] =3D { + gen_helper_sve_rev_b, gen_helper_sve_rev_h, + gen_helper_sve_rev_s, gen_helper_sve_rev_d + }; + unsigned vsz =3D vec_full_reg_size(s); + + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); +} + +static void trans_TBL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_tbl_b, gen_helper_sve_tbl_h, + gen_helper_sve_tbl_s, gen_helper_sve_tbl_d + }; + unsigned vsz =3D vec_full_reg_size(s); + + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->esz]); +} + +static void trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4][2] =3D { + { NULL, NULL }, + { gen_helper_sve_sunpk_h, gen_helper_sve_uunpk_h }, + { gen_helper_sve_sunpk_s, gen_helper_sve_uunpk_s }, + { gen_helper_sve_sunpk_d, gen_helper_sve_uunpk_d }, + }; + unsigned vsz =3D vec_full_reg_size(s); + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + (a->h ? vsz / 2 : 0= ), + vsz, vsz, 0, fns[a->esz][a->u]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5e3a9839d4..8af47ad27b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -24,6 +24,7 @@ =20 %imm4_16_p1 16:4 !function=3Dplus1 %imm6_22_5 22:1 5:5 +%imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 @@ -85,7 +86,9 @@ @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s =20 # Three operand, vector element size -@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rd_rn_rm ........ esz:2 . rm:5 ... ... 
rn:5 rd:5 &rrr_esz +@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ + &rrr_esz rn=3D%reg_movprfx =20 # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -370,6 +373,30 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rd= n_pg4 imm=3D%sh8_i8s EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=3D%reg_movprfx imm=3D%imm8_16_10 =20 +### SVE Permute - Unpredicated Group + +# SVE broadcast general register +DUP_s 00000101 .. 1 00000 001110 ..... ..... @rd_rn + +# SVE broadcast indexed element +DUP_x 00000101 .. 1 ..... 001000 rn:5 rd:5 \ + &rri imm=3D%imm7_22_16 + +# SVE insert SIMD&FP scalar register +INSR_f 00000101 .. 1 10100 001110 ..... ..... @rdn_rm + +# SVE insert general register +INSR_r 00000101 .. 1 00100 001110 ..... ..... @rdn_rm + +# SVE reverse vector elements +REV_v 00000101 .. 1 11000 001110 ..... ..... @rd_rn + +# SVE vector table lookup +TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm + +# SVE unpack vector elements +UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893458554839.1456771377713; Sat, 17 Feb 2018 10:50:58 -0800 (PST) Received: from localhost ([::1]:48271 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 
1en7ZZ-0005dI-JP for importer@patchew.org; Sat, 17 Feb 2018 13:50:57 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40199) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79k-0000sp-6u for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79h-0001pf-Gj for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:16 -0500 Received: from mail-pf0-x241.google.com ([2607:f8b0:400e:c00::241]:32803) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79h-0001pG-0e for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:13 -0500 Received: by mail-pf0-x241.google.com with SMTP id b8so525376pfh.0 for ; Sat, 17 Feb 2018 10:24:12 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.10 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gs8jk3SnvU+rkac2DB/AmJreOmTjHJdoAqlQrUA8wrI=; b=LJskhUr+Lqcy5YCUHndW+/zRbO1Yr+Xb46iNwLPDW+Qfl9kYsg2QJYohKg8ctgeX4E d3QhS4fU4tzMj35JfRn8C97WEdULBgrFvF0dYljq4YizNitO+79qa1PNcv2crCYvTw5M A0suKjhxnqfuIFfW1a+Oza5KwkUYCRG/9dxBQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gs8jk3SnvU+rkac2DB/AmJreOmTjHJdoAqlQrUA8wrI=; b=TTLpgb8vDZTBf/UC42k+81ChOHq22U2JqF25mNPH3jP8trp4ptCJCnKgIxSvXHWWzF RkGMOQlXfngdnku1oWOA/O/u9maoRwF8hDnlNMdpQjucKRtimROi8F+p6UsmyUGo44BG p54glat9U+iKdiijKYnDS/DrX7c3CkP3Pyh7ta1h2eBsjIksSL1nZfz8hDu2EgdjBZxZ ipsoelRDVuJbGf6lQSKl9rITB5S4WAyxZooKs4crnT5Jh6NdXNPFsnJOgFrAV9aQinUo fCNfe9ckWUmAbrf6QI3aNc+HcIsftoCOLfldSz1cgQMoJEZIrYW01hEvgUCjEgVB5FnA ChGA== X-Gm-Message-State: 
APf1xPDETgwCE37oXnBxEvqxKzVPCBS2UXd7EHIr/3/7P3Bnx6ex+fwL OcXpv3SI+p5IJo6uI9RDwLSSdnMGm7g= X-Google-Smtp-Source: AH8x225zliw4gPiMEt8X7Y3nRS6IMQ7cNPKCVVPqfs1lTcL8d0SbqGLvsPYJ4Oj2Qyf87u7wpPXglw== X-Received: by 10.99.63.9 with SMTP id m9mr8511033pga.247.1518891851631; Sat, 17 Feb 2018 10:24:11 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:44 -0800 Message-Id: <20180217182323.25885-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::241 Subject: [Qemu-devel] [PATCH v2 28/67] target/arm: Implement SVE Permute - Predicates Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 6 + target/arm/sve_helper.c | 280 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 110 ++++++++++++++++++ target/arm/sve.decode | 18 +++ 4 files changed, 414 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c9aad575e..ff958fcebd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -439,6 +439,12 @@ DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 
+DEF_HELPER_FLAGS_4(sve_zip_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 466a209c1e..c3a2706a16 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1664,3 +1664,283 @@ DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2)
 DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4)
 
 #undef DO_UNPK
+
+static const uint64_t expand_bit_data[5][2] = {
+    { 0x1111111111111111ull, 0x2222222222222222ull },
+    { 0x0303030303030303ull, 0x0c0c0c0c0c0c0c0cull },
+    { 0x000f000f000f000full, 0x00f000f000f000f0ull },
+    { 0x000000ff000000ffull, 0x0000ff000000ff00ull },
+    { 0x000000000000ffffull, 0x00000000ffff0000ull }
+};
+
+/* Expand units of 2**N bits to units of 2**(N+1) bits,
+   with the higher bits zero. */
+static uint64_t expand_bits(uint64_t x, int n)
+{
+    int i, sh;
+    for (i = 4, sh = 16; i >= n; i--, sh >>= 1) {
+        x = ((x & expand_bit_data[i][1]) << sh) | (x & expand_bit_data[i][0]);
+    }
+    return x;
+}
+
+/* Compress units of 2**(N+1) bits to units of 2**N bits. */
+static uint64_t compress_bits(uint64_t x, int n)
+{
+    int i, sh;
+    for (i = n, sh = 1 << n; i <= 4; i++, sh <<= 1) {
+        x = ((x >> sh) & expand_bit_data[i][1]) | (x & expand_bit_data[i][0]);
+    }
+    return x;
+}
+
+void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+    intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
+    uint64_t *d = vd;
+    intptr_t i;
+
+    if (oprsz <= 8) {
+        uint64_t nn = *(uint64_t *)vn;
+        uint64_t mm = *(uint64_t *)vm;
+        int half = 4 * oprsz;
+
+        nn = extract64(nn, high * half, half);
+        mm = extract64(mm, high * half, half);
+        nn = expand_bits(nn, esz);
+        mm = expand_bits(mm, esz);
+        d[0] = nn + (mm << (1 << esz));
+    } else {
+        ARMPredicateReg tmp_n, tmp_m;
+
+        /* We produce output faster than we consume input.
+           Therefore we must be mindful of possible overlap. */
+        if ((vn - vd) < (uintptr_t)oprsz) {
+            vn = memcpy(&tmp_n, vn, oprsz);
+        }
+        if ((vm - vd) < (uintptr_t)oprsz) {
+            vm = memcpy(&tmp_m, vm, oprsz);
+        }
+        if (high) {
+            high = oprsz >> 1;
+        }
+
+        if ((high & 3) == 0) {
+            uint32_t *n = vn, *m = vm;
+            high >>= 2;
+
+            for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
+                uint64_t nn = n[H4(high + i)];
+                uint64_t mm = m[H4(high + i)];
+
+                nn = expand_bits(nn, esz);
+                mm = expand_bits(mm, esz);
+                d[i] = nn + (mm << (1 << esz));
+            }
+        } else {
+            uint8_t *n = vn, *m = vm;
+            uint16_t *d16 = vd;
+
+            for (i = 0; i < oprsz / 2; i++) {
+                uint16_t nn = n[H1(high + i)];
+                uint16_t mm = m[H1(high + i)];
+
+                nn = expand_bits(nn, esz);
+                mm = expand_bits(mm, esz);
+                d16[H2(i)] = nn + (mm << (1 << esz));
+            }
+        }
+    }
+}
+
+void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+    int odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1) << esz;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint64_t l, h;
+    intptr_t i;
+
+    if (oprsz <= 8) {
+        l = compress_bits(n[0] >> odd, esz);
+        h = compress_bits(m[0] >> odd, esz);
+        d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz);
+    } else {
+        ARMPredicateReg tmp_m;
+        intptr_t oprsz_16 = oprsz / 16;
+
+        if ((vm - vd) < (uintptr_t)oprsz) {
+            m = memcpy(&tmp_m, vm, oprsz);
+        }
+
+        for (i = 0; i < oprsz_16; i++) {
+            l = n[2 * i + 0];
+            h = n[2 * i + 1];
+            l = compress_bits(l >> odd, esz);
+            h = compress_bits(h >> odd, esz);
+            d[i] = l + (h << 32);
+        }
+
+        /* For VL which is not a power of 2, the results from M do not
+           align nicely with the uint64_t for D. Put the aligned results
+           from M into TMP_M and then copy it into place afterward. */
+        if (oprsz & 15) {
+            d[i] = compress_bits(n[2 * i] >> odd, esz);
+
+            for (i = 0; i < oprsz_16; i++) {
+                l = m[2 * i + 0];
+                h = m[2 * i + 1];
+                l = compress_bits(l >> odd, esz);
+                h = compress_bits(h >> odd, esz);
+                tmp_m.p[i] = l + (h << 32);
+            }
+            tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz);
+
+            swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2);
+        } else {
+            for (i = 0; i < oprsz_16; i++) {
+                l = m[2 * i + 0];
+                h = m[2 * i + 1];
+                l = compress_bits(l >> odd, esz);
+                h = compress_bits(h >> odd, esz);
+                d[oprsz_16 + i] = l + (h << 32);
+            }
+        }
+    }
+}
+
+static const uint64_t even_bit_esz_masks[4] = {
+    0x5555555555555555ull,
+    0x3333333333333333ull,
+    0x0f0f0f0f0f0f0f0full,
+    0x00ff00ff00ff00ffull
+};
+
+void HELPER(sve_trn_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    uintptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+    bool odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint64_t mask;
+    int shr, shl;
+    intptr_t i;
+
+    shl = 1 << esz;
+    shr = 0;
+    mask = even_bit_esz_masks[esz];
+    if (odd) {
+        mask <<= shl;
+        shr = shl;
+        shl = 0;
+    }
+
+    for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
+        uint64_t nn = (n[i] & mask) >> shr;
+        uint64_t mm = (m[i] & mask) << shl;
+        d[i] = nn + mm;
+    }
+}
+
+/* Reverse units of 2**N bits. */
+static uint64_t reverse_bits_64(uint64_t x, int n)
+{
+    int i, sh;
+
+    x = bswap64(x);
+    for (i = 2, sh = 4; i >= n; i--, sh >>= 1) {
+        uint64_t mask = even_bit_esz_masks[i];
+        x = ((x & mask) << sh) | ((x >> sh) & mask);
+    }
+    return x;
+}
+
+static uint8_t reverse_bits_8(uint8_t x, int n)
+{
+    static const uint8_t mask[3] = { 0x55, 0x33, 0x0f };
+    int i, sh;
+
+    for (i = 2, sh = 4; i >= n; i--, sh >>= 1) {
+        x = ((x & mask[i]) << sh) | ((x >> sh) & mask[i]);
+    }
+    return x;
+}
+
+void HELPER(sve_rev_p)(void *vd, void *vn, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+    intptr_t i, oprsz_2 = oprsz / 2;
+
+    if (oprsz <= 8) {
+        uint64_t l = *(uint64_t *)vn;
+        l = reverse_bits_64(l << (64 - 8 * oprsz), esz);
+        *(uint64_t *)vd = l;
+    } else if ((oprsz & 15) == 0) {
+        for (i = 0; i < oprsz_2; i += 8) {
+            intptr_t ih = oprsz - 8 - i;
+            uint64_t l = reverse_bits_64(*(uint64_t *)(vn + i), esz);
+            uint64_t h = reverse_bits_64(*(uint64_t *)(vn + ih), esz);
+            *(uint64_t *)(vd + i) = h;
+            *(uint64_t *)(vd + ih) = l;
+        }
+    } else {
+        for (i = 0; i < oprsz_2; i += 1) {
+            intptr_t il = H1(i);
+            intptr_t ih = H1(oprsz - 1 - i);
+            uint8_t l = reverse_bits_8(*(uint8_t *)(vn + il), esz);
+            uint8_t h = reverse_bits_8(*(uint8_t *)(vn + ih), esz);
+            *(uint8_t *)(vd + il) = h;
+            *(uint8_t *)(vd + ih) = l;
+        }
+    }
+}
+
+void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1);
+    uint64_t *d = vd;
+    intptr_t i;
+
+    if (oprsz <= 8) {
+        uint64_t nn = *(uint64_t *)vn;
+        int half = 4 * oprsz;
+
+        nn = extract64(nn, high * half, half);
+        nn = expand_bits(nn, 0);
+        d[0] = nn;
+    } else {
+        ARMPredicateReg tmp_n;
+
+        /* We produce output faster than we consume input.
+           Therefore we must be mindful of possible overlap. */
+        if ((vn - vd) < (uintptr_t)oprsz) {
+            vn = memcpy(&tmp_n, vn, oprsz);
+        }
+        if (high) {
+            high = oprsz >> 1;
+        }
+
+        if ((high & 3) == 0) {
+            uint32_t *n = vn;
+            high >>= 2;
+
+            for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) {
+                uint64_t nn = n[H4(high + i)];
+                d[i] = expand_bits(nn, 0);
+            }
+        } else {
+            uint16_t *d16 = vd;
+            uint8_t *n = vn;
+
+            for (i = 0; i < oprsz / 2; i++) {
+                uint16_t nn = n[H1(high + i)];
+                d16[H2(i)] = expand_bits(nn, 0);
+            }
+        }
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 3724f6290c..45e1ea87bf 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -1932,6 +1932,116 @@ static void trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn)
                        vsz, vsz, 0, fns[a->esz][a->u]);
 }
 
+/*
+ *** SVE Permute - Predicates Group
+ */
+
+static void do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd,
+                          gen_helper_gvec_3 *fn)
+{
+    unsigned vsz = pred_full_reg_size(s);
+
+    /* Predicate sizes may be smaller and cannot use simd_desc.
+       We cannot round up, as we do elsewhere, because we need
+       the exact size for ZIP2 and REV. We retain the style for
+       the other helpers for consistency. */
+    TCGv_ptr t_d = tcg_temp_new_ptr();
+    TCGv_ptr t_n = tcg_temp_new_ptr();
+    TCGv_ptr t_m = tcg_temp_new_ptr();
+    TCGv_i32 t_desc;
+    int desc;
+
+    desc = vsz - 2;
+    desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
+    desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd);
+
+    tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
+    tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn));
+    tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm));
+    t_desc = tcg_const_i32(desc);
+
+    fn(t_d, t_n, t_m, t_desc);
+
+    tcg_temp_free_ptr(t_d);
+    tcg_temp_free_ptr(t_n);
+    tcg_temp_free_ptr(t_m);
+    tcg_temp_free_i32(t_desc);
+}
+
+static void do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd,
+                          gen_helper_gvec_2 *fn)
+{
+    unsigned vsz = pred_full_reg_size(s);
+    TCGv_ptr t_d = tcg_temp_new_ptr();
+    TCGv_ptr t_n = tcg_temp_new_ptr();
+    TCGv_i32 t_desc;
+    int desc;
+
+    tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd));
+    tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn));
+
+    /* Predicate sizes may be smaller and cannot use simd_desc.
+       We cannot round up, as we do elsewhere, because we need
+       the exact size for ZIP2 and REV. We retain the style for
+       the other helpers for consistency. */
+
+    desc = vsz - 2;
+    desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
+    desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd);
+    t_desc = tcg_const_i32(desc);
+
+    fn(t_d, t_n, t_desc);
+
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(t_d);
+    tcg_temp_free_ptr(t_n);
+}
+
+static void trans_ZIP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 0, gen_helper_sve_zip_p);
+}
+
+static void trans_ZIP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 1, gen_helper_sve_zip_p);
+}
+
+static void trans_UZP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 0, gen_helper_sve_uzp_p);
+}
+
+static void trans_UZP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 1, gen_helper_sve_uzp_p);
+}
+
+static void trans_TRN1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 0, gen_helper_sve_trn_p);
+}
+
+static void trans_TRN2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_perm_pred3(s, a, 1, gen_helper_sve_trn_p);
+}
+
+static void trans_REV_p(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+    do_perm_pred2(s, a, 0, gen_helper_sve_rev_p);
+}
+
+static void trans_PUNPKLO(DisasContext *s, arg_PUNPKLO *a, uint32_t insn)
+{
+    do_perm_pred2(s, a, 0, gen_helper_sve_punpk_p);
+}
+
+static void trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn)
+{
+    do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 8af47ad27b..bcbe84c3a6 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -87,6 +87,7 @@
 
 # Three operand, vector element size
 @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz
+@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz
 @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \
         &rrr_esz rn=%reg_movprfx
 
@@ -397,6 +398,23 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm
 # SVE unpack vector elements
 UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5
 
+### SVE Permute - Predicates Group
+
+# SVE permute predicate elements
+ZIP1_p 00000101 .. 10 .... 010 000 0 .... 0 .... @pd_pn_pm
+ZIP2_p 00000101 .. 10 .... 010 001 0 .... 0 .... @pd_pn_pm
+UZP1_p 00000101 .. 10 .... 010 010 0 .... 0 .... @pd_pn_pm
+UZP2_p 00000101 .. 10 .... 010 011 0 .... 0 .... @pd_pn_pm
+TRN1_p 00000101 .. 10 .... 010 100 0 .... 0 .... @pd_pn_pm
+TRN2_p 00000101 .. 10 .... 010 101 0 .... 0 .... @pd_pn_pm
+
+# SVE reverse predicate elements
+REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn
+
+# SVE unpack predicate elements
+PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0
+PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Date: Sat, 17 Feb 2018 10:22:45 -0800
Message-Id: <20180217182323.25885-30-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 29/67] target/arm: Implement SVE Permute - Interleaving Group

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 15 ++++++++++
 target/arm/sve_helper.c    | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 10 +++++++
 4 files changed, 166 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index ff958fcebd..bab20345c6 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -445,6 +445,21 @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_d,
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c3a2706a16..62982bd099 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1944,3 +1944,75 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_= t pred_desc) } } } + +#define DO_ZIP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz =3D simd_oprsz(desc); \ + intptr_t i, oprsz_2 =3D oprsz / 2; \ + ARMVectorReg tmp_n, tmp_m; \ + /* We produce output faster than we consume input. \ + Therefore we must be mindful of possible overlap. 
*/ \ + if (unlikely((vn - vd) < (uintptr_t)oprsz)) { \ + vn =3D memcpy(&tmp_n, vn, oprsz_2); \ + } \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm =3D memcpy(&tmp_m, vm, oprsz_2); \ + } \ + for (i =3D 0; i < oprsz_2; i +=3D sizeof(TYPE)) { \ + *(TYPE *)(vd + H(2 * i + 0)) =3D *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) =3D *(TYPE *)(vm + H(i)); \ + } \ +} + +DO_ZIP(sve_zip_b, uint8_t, H1) +DO_ZIP(sve_zip_h, uint16_t, H1_2) +DO_ZIP(sve_zip_s, uint32_t, H1_4) +DO_ZIP(sve_zip_d, uint64_t, ) + +#define DO_UZP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz =3D simd_oprsz(desc); \ + intptr_t oprsz_2 =3D oprsz / 2; \ + intptr_t odd_ofs =3D simd_data(desc); \ + intptr_t i; \ + ARMVectorReg tmp_m; \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm =3D memcpy(&tmp_m, vm, oprsz); \ + } \ + for (i =3D 0; i < oprsz_2; i +=3D sizeof(TYPE)) { = \ + *(TYPE *)(vd + H(i)) =3D *(TYPE *)(vn + H(2 * i + odd_ofs)); \ + } \ + for (i =3D 0; i < oprsz_2; i +=3D sizeof(TYPE)) { = \ + *(TYPE *)(vd + H(oprsz_2 + i)) =3D *(TYPE *)(vm + H(2 * i + odd_of= s)); \ + } \ +} + +DO_UZP(sve_uzp_b, uint8_t, H1) +DO_UZP(sve_uzp_h, uint16_t, H1_2) +DO_UZP(sve_uzp_s, uint32_t, H1_4) +DO_UZP(sve_uzp_d, uint64_t, ) + +#define DO_TRN(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz =3D simd_oprsz(desc); \ + intptr_t odd_ofs =3D simd_data(desc); \ + intptr_t i; \ + for (i =3D 0; i < oprsz; i +=3D 2 * sizeof(TYPE)) { = \ + TYPE ae =3D *(TYPE *)(vn + H(i + odd_ofs)); \ + TYPE be =3D *(TYPE *)(vm + H(i + odd_ofs)); \ + *(TYPE *)(vd + H(i + 0)) =3D ae; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) =3D be; \ + } \ +} + +DO_TRN(sve_trn_b, uint8_t, H1) +DO_TRN(sve_trn_h, uint16_t, H1_2) +DO_TRN(sve_trn_s, uint32_t, H1_4) +DO_TRN(sve_trn_d, uint64_t, ) + +#undef DO_ZIP +#undef DO_UZP +#undef DO_TRN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 
45e1ea87bf..09ac955a36 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2042,6 +2042,75 @@ static void trans_PUNPKHI(DisasContext *s, arg_PUNPK= HI *a, uint32_t insn) do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); } =20 +/* + *** SVE Permute - Interleaving Group + */ + +static void do_zip(DisasContext *s, arg_rrr_esz *a, bool high) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_zip_b, gen_helper_sve_zip_h, + gen_helper_sve_zip_s, gen_helper_sve_zip_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + unsigned high_ofs =3D high ? vsz / 2 : 0; + + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + high_ofs, + vec_full_reg_offset(s, a->rm) + high_ofs, + vsz, vsz, 0, fns[a->esz]); +} + +static void do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data, + gen_helper_gvec_3 *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); +} + +static void trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zip(s, a, false); +} + +static void trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zip(s, a, true); +} + +static gen_helper_gvec_3 * const uzp_fns[4] =3D { + gen_helper_sve_uzp_b, gen_helper_sve_uzp_h, + gen_helper_sve_uzp_s, gen_helper_sve_uzp_d, +}; + +static void trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]); +} + +static void trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]); +} + +static gen_helper_gvec_3 * const trn_fns[4] =3D { + gen_helper_sve_trn_b, gen_helper_sve_trn_h, + gen_helper_sve_trn_s, gen_helper_sve_trn_d, +}; + +static void trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 0, trn_fns[a->esz]); +} + +static void trans_TRN2_z(DisasContext *s, 
arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index bcbe84c3a6..2efa3773fc 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -415,6 +415,16 @@ REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 =20 +### SVE Permute - Interleaving Group + +# SVE permute vector elements +ZIP1_z 00000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm +ZIP2_z 00000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm +UZP1_z 00000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm +UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm +TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm +TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894204339377.9456236299227; Sat, 17 Feb 2018 11:03:24 -0800 (PST) Received: from localhost ([::1]:48404 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7lY-00089c-Fk for importer@patchew.org; Sat, 17 Feb 2018 14:03:20 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40218) by 
lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79l-0000uS-Bg for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79k-0001r8-AB for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:17 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:37119) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79k-0001qY-1h for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:16 -0500 Received: by mail-pg0-x241.google.com with SMTP id o1so4353677pgn.4 for ; Sat, 17 Feb 2018 10:24:15 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.13 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=nvmjmNBSSd8KlTGFZjxDZR+Pc5M+VYp/8trMxJq8PP0=; b=d9nEwwYSjSaZUIaYquQNmpyKFIsbiRl3tuTsBjkgE9UJK+HgSsr5ZfaeFGJR+zf4eM /A7lagrN+kO7Oaum7aHT1gt+zZbeuXnd5R7CdrZwo4EoL+2G7rkUVCsvqbopNMoJ43KC X/7TsJcXEbLefuEyNkIyf/IItVAz6ahZ1EJm0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=nvmjmNBSSd8KlTGFZjxDZR+Pc5M+VYp/8trMxJq8PP0=; b=N/m+XUnknLMXKMmHkqqkhxRN+OAKwavRlDFq5pfcpDr9qjZNVJYMCLVPRiakhCg6H1 nNSRc3XyQMvTB1ZZIpLmpj2NIvBDSv9hz+WERMA/oUwUSMStEAoHpdebAqOpK5Uh1ZTq pG6zGSGpokuy1dJl+wmCMKzMtwJ+qkelRPmD9/AHSctQPg2yWiQxtXsT7Y1Pjb0D5MvT /+ushBY7I/ytMhty0vGFjKqpLSTKp7ik7Iw22IUdwr39VNVTG83VvFj470bnID6bQVc0 WXRzNVPocL1dGhlycUvilBXVdrJl0TFdW8m69d/fd8L9/+re96KbVyny5VqYgCC27Jbg VY/g== X-Gm-Message-State: APf1xPCHaBFk2LF0y3oTAig5a0Jobu9G9w7sBt/YvxhP3KJ4bGO3KlsK LVZVbZp2NTIrLvCLQBusRu0cQzUYtlI= X-Google-Smtp-Source: 
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:46 -0800
Message-Id: <20180217182323.25885-31-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 30/67] target/arm: Implement SVE compress active elements
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |  3 +++
 target/arm/sve_helper.c    | 34 ++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 12 ++++++++++++
 target/arm/sve.decode      |  6 ++++++
 4 files changed, 55 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index bab20345c6..d977aea00d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -460,6 +460,9 @@ DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 62982bd099..87a1a32232 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2016,3 +2016,37 @@ DO_TRN(sve_trn_d, uint64_t, )
 #undef DO_ZIP
 #undef DO_UZP
 #undef DO_TRN
+
+void HELPER(sve_compact_s)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc) / 4;
+    uint32_t *d = vd, *n = vn;
+    uint8_t *pg = vg;
+
+    for (i = j = 0; i < opr_sz; i++) {
+        if (pg[H1(i / 2)] & (i & 1 ? 0x10 : 0x01)) {
+            d[H4(j)] = n[H4(i)];
+            j++;
+        }
+    }
+    for (; j < opr_sz; j++) {
+        d[H4(j)] = 0;
+    }
+}
+
+void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn;
+    uint8_t *pg = vg;
+
+    for (i = j = 0; i < opr_sz; i++) {
+        if (pg[H1(i)] & 1) {
+            d[j] = n[i];
+            j++;
+        }
+    }
+    for (; j < opr_sz; j++) {
+        d[j] = 0;
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 09ac955a36..21531b259c 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2111,6 +2111,18 @@ static void trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
     do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
 }
 
+/*
+ *** SVE Permute Vector - Predicated Group
+ */
+
+static void trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2efa3773fc..a89bd37eeb 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -425,6 +425,12 @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm
 TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm
 TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm
 
+### SVE Permute - Predicated Group
+
+# SVE compress active elements
+# Note esz >= 2
+COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:47 -0800
Message-Id: <20180217182323.25885-32-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 31/67] target/arm: Implement SVE conditionally broadcast/extract element
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |   2 +
 target/arm/sve_helper.c    |  11 ++
 target/arm/translate-sve.c | 299 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  20 +++
 4 files changed, 332 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index d977aea00d..a58fb4ba01 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -463,6 +463,8 @@ DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 87a1a32232..ee289be642 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2050,3 +2050,14 @@ void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
         d[j] = 0;
     }
 }
+
+/* Similar to the ARM LastActiveElement pseudocode function, except the
+   result is multiplied by the element size. This includes the not found
+   indication; e.g. not found for esz=3 is -8. */
+int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+
+    return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 21531b259c..207a22a0bc 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2123,6 +2123,305 @@ static void trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_zpz_ool(s, a, fns[a->esz]);
 }
 
+/* Call the helper that computes the ARM LastActiveElement pseudocode
+   function, scaled by the element size. This includes the not found
+   indication; e.g. not found for esz=3 is -8. */
+static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg)
+{
+    /* Predicate sizes may be smaller and cannot use simd_desc. We cannot
+       round up, as we do elsewhere, because we need the exact size. */
+    TCGv_ptr t_p = tcg_temp_new_ptr();
+    TCGv_i32 t_desc;
+    unsigned vsz = pred_full_reg_size(s);
+    unsigned desc;
+
+    desc = vsz - 2;
+    desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz);
+
+    tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg));
+    t_desc = tcg_const_i32(desc);
+
+    gen_helper_sve_last_active_element(ret, t_p, t_desc);
+
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(t_p);
+}
+
+/* Increment LAST to the offset of the next element in the vector,
+   wrapping around to 0. */
+static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz)
+{
+    unsigned vsz = vec_full_reg_size(s);
+
+    tcg_gen_addi_i32(last, last, 1 << esz);
+    if (is_power_of_2(vsz)) {
+        tcg_gen_andi_i32(last, last, vsz - 1);
+    } else {
+        TCGv_i32 max = tcg_const_i32(vsz);
+        TCGv_i32 zero = tcg_const_i32(0);
+        tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last);
+        tcg_temp_free_i32(max);
+        tcg_temp_free_i32(zero);
+    }
+}
+
+/* If LAST < 0, set LAST to the offset of the last element in the vector. */
+static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz)
+{
+    unsigned vsz = vec_full_reg_size(s);
+
+    if (is_power_of_2(vsz)) {
+        tcg_gen_andi_i32(last, last, vsz - 1);
+    } else {
+        TCGv_i32 max = tcg_const_i32(vsz - (1 << esz));
+        TCGv_i32 zero = tcg_const_i32(0);
+        tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last);
+        tcg_temp_free_i32(max);
+        tcg_temp_free_i32(zero);
+    }
+}
+
+/* Load an unsigned element of ESZ from BASE+OFS. */
+static TCGv_i64 load_esz(TCGv_ptr base, int ofs, int esz)
+{
+    TCGv_i64 r = tcg_temp_new_i64();
+
+    switch (esz) {
+    case 0:
+        tcg_gen_ld8u_i64(r, base, ofs);
+        break;
+    case 1:
+        tcg_gen_ld16u_i64(r, base, ofs);
+        break;
+    case 2:
+        tcg_gen_ld32u_i64(r, base, ofs);
+        break;
+    case 3:
+        tcg_gen_ld_i64(r, base, ofs);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return r;
+}
+
+/* Load an unsigned element of ESZ from RM[LAST]. */
+static TCGv_i64 load_last_active(DisasContext *s, TCGv_i32 last,
+                                 int rm, int esz)
+{
+    TCGv_ptr p = tcg_temp_new_ptr();
+    TCGv_i64 r;
+
+    /* Convert offset into vector into offset into ENV.
+       The final adjustment for the vector register base
+       is added via constant offset to the load. */
+#ifdef HOST_WORDS_BIGENDIAN
+    /* Adjust for element ordering. See vec_reg_offset. */
+    if (esz < 3) {
+        tcg_gen_xori_i32(last, last, 8 - (1 << esz));
+    }
+#endif
+    tcg_gen_ext_i32_ptr(p, last);
+    tcg_gen_add_ptr(p, p, cpu_env);
+
+    r = load_esz(p, vec_full_reg_offset(s, rm), esz);
+    tcg_temp_free_ptr(p);
+
+    return r;
+}
+
+/* Compute CLAST for a Zreg. */
+static void do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before)
+{
+    TCGv_i32 last = tcg_temp_local_new_i32();
+    TCGLabel *over = gen_new_label();
+    TCGv_i64 ele;
+    unsigned vsz, esz = a->esz;
+
+    find_last_active(s, last, esz, a->pg);
+
+    /* There is of course no movcond for a 2048-bit vector,
+       so we must branch over the actual store. */
+    tcg_gen_brcondi_i32(TCG_COND_LT, last, 0, over);
+
+    if (!before) {
+        incr_last_active(s, last, esz);
+    }
+
+    ele = load_last_active(s, last, a->rm, esz);
+    tcg_temp_free_i32(last);
+
+    vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), vsz, vsz, ele);
+    tcg_temp_free_i64(ele);
+
+    /* If this insn used MOVPRFX, we may need a second move. */
+    if (a->rd != a->rn) {
+        TCGLabel *done = gen_new_label();
+        tcg_gen_br(done);
+
+        gen_set_label(over);
+        do_mov_z(s, a->rd, a->rn);
+
+        gen_set_label(done);
+    } else {
+        gen_set_label(over);
+    }
+}
+
+static void trans_CLASTA_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    do_clast_vector(s, a, false);
+}
+
+static void trans_CLASTB_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    do_clast_vector(s, a, true);
+}
+
+/* Compute CLAST for a scalar. */
+static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm,
+                            bool before, TCGv_i64 reg_val)
+{
+    TCGv_i32 last = tcg_temp_new_i32();
+    TCGv_i64 ele, cmp, zero;
+
+    find_last_active(s, last, esz, pg);
+
+    /* Extend the original value of last prior to incrementing. */
+    cmp = tcg_temp_new_i64();
+    tcg_gen_ext_i32_i64(cmp, last);
+
+    if (!before) {
+        incr_last_active(s, last, esz);
+    }
+
+    /* The conceit here is that while last < 0 indicates not found, after
+       adjusting for cpu_env->vfp.zregs[rm], it is still a valid address
+       from which we can load garbage. We then discard the garbage with
+       a conditional move. */
+    ele = load_last_active(s, last, rm, esz);
+    tcg_temp_free_i32(last);
+
+    zero = tcg_const_i64(0);
+    tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val);
+
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(cmp);
+    tcg_temp_free_i64(ele);
+}
+
+/* Compute CLAST for a Vreg. */
+static void do_clast_fp(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    int esz = a->esz;
+    int ofs = vec_reg_offset(s, a->rd, 0, esz);
+    TCGv_i64 reg = load_esz(cpu_env, ofs, esz);
+
+    do_clast_scalar(s, esz, a->pg, a->rn, before, reg);
+    write_fp_dreg(s, a->rd, reg);
+    tcg_temp_free_i64(reg);
+}
+
+static void trans_CLASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_clast_fp(s, a, false);
+}
+
+static void trans_CLASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_clast_fp(s, a, true);
+}
+
+/* Compute CLAST for a Xreg. */
+static void do_clast_general(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    TCGv_i64 reg = cpu_reg(s, a->rd);
+
+    switch (a->esz) {
+    case 0:
+        tcg_gen_ext8u_i64(reg, reg);
+        break;
+    case 1:
+        tcg_gen_ext16u_i64(reg, reg);
+        break;
+    case 2:
+        tcg_gen_ext32u_i64(reg, reg);
+        break;
+    case 3:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    do_clast_scalar(s, a->esz, a->pg, a->rn, before, cpu_reg(s, a->rd));
+}
+
+static void trans_CLASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_clast_general(s, a, false);
+}
+
+static void trans_CLASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_clast_general(s, a, true);
+}
+
+/* Compute LAST for a scalar. */
+static TCGv_i64 do_last_scalar(DisasContext *s, int esz,
+                               int pg, int rm, bool before)
+{
+    TCGv_i32 last = tcg_temp_new_i32();
+    TCGv_i64 ret;
+
+    find_last_active(s, last, esz, pg);
+    if (before) {
+        wrap_last_active(s, last, esz);
+    } else {
+        incr_last_active(s, last, esz);
+    }
+
+    ret = load_last_active(s, last, rm, esz);
+    tcg_temp_free_i32(last);
+    return ret;
+}
+
+/* Compute LAST for a Vreg. */
+static void do_last_fp(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
+    write_fp_dreg(s, a->rd, val);
+    tcg_temp_free_i64(val);
+}
+
+static void trans_LASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_last_fp(s, a, false);
+}
+
+static void trans_LASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_last_fp(s, a, true);
+}
+
+/* Compute LAST for a Xreg. */
+static void do_last_general(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
+    tcg_gen_mov_i64(cpu_reg(s, a->rd), val);
+    tcg_temp_free_i64(val);
+}
+
+static void trans_LASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_last_general(s, a, false);
+}
+
+static void trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_last_general(s, a, true);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index a89bd37eeb..1370802c12 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -431,6 +431,26 @@ TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm
 # Note esz >= 2
 COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn
 
+# SVE conditionally broadcast element to vector
+CLASTA_z 00000101 .. 10100 0 100 ... ..... ..... @rdn_pg_rm
+CLASTB_z 00000101 .. 10100 1 100 ... ..... ..... @rdn_pg_rm
+
+# SVE conditionally copy element to SIMD&FP scalar
+CLASTA_v 00000101 .. 10101 0 100 ... ..... ..... @rd_pg_rn
+CLASTB_v 00000101 .. 10101 1 100 ... ..... ..... @rd_pg_rn
+
+# SVE conditionally copy element to general register
+CLASTA_r 00000101 .. 11000 0 101 ... ..... ..... @rd_pg_rn
+CLASTB_r 00000101 .. 11000 1 101 ... ..... ..... @rd_pg_rn
+
+# SVE copy element to SIMD&FP scalar register
+LASTA_v 00000101 .. 10001 0 100 ... ..... ..... @rd_pg_rn
+LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn
+
+# SVE copy element to general register
+LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn
+LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:48 -0800
Message-Id: <20180217182323.25885-33-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 32/67] target/arm: Implement SVE copy to vector (predicated)
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/translate-sve.c | 13 +++++++++++++
 target/arm/sve.decode      |  6 ++++++
 2 files changed, 19 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 207a22a0bc..fc2a295ab7 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2422,6 +2422,19 @@ static void trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_last_general(s, a, true);
 }
 
+static void trans_CPY_m_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, cpu_reg_sp(s, a->rn));
+}
+
+static void trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    int ofs = vec_reg_offset(s, a->rn, 0, a->esz);
+    TCGv_i64 t = load_esz(cpu_env, ofs, a->esz);
+    do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, t);
+    tcg_temp_free_i64(t);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 1370802c12..5e127de88c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -451,6 +451,12 @@ LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn
 LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn
 LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn
 
+# SVE copy element from SIMD&FP scalar register
+CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn
+
+# SVE copy element from general register to vector (predicated)
+CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:49 -0800
Message-Id: <20180217182323.25885-34-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 33/67] target/arm: Implement SVE reverse within elements
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 14 ++++++++++++++
 target/arm/sve_helper.c    | 41 ++++++++++++++++++++++++++++++++++-------
 target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  7 +++++++
 4 files changed, 93 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index a58fb4ba01..3b7c54905d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -465,6 +465,20 @@ DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_revb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_revh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ee289be642..a67bb579b8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -237,6 +237,26 @@ static inline uint64_t expand_pred_s(uint8_t byte) return word[byte & 0x11]; } =20 +/* Swap 16-bit words within a 32-bit word. */ +static inline uint32_t hswap32(uint32_t h) +{ + return rol32(h, 16); +} + +/* Swap 16-bit words within a 64-bit word. */ +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m =3D 0x0000ffff0000ffffull; + h =3D rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +/* Swap 32-bit words within a 64-bit word. */ +static inline uint64_t wswap64(uint64_t h) +{ + return rol64(h, 32); +} + #define LOGICAL_PPPP(NAME, FUNC) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ { \ @@ -615,6 +635,20 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) =20 +DO_ZPZ(sve_revb_h, uint16_t, H1_2, bswap16) +DO_ZPZ(sve_revb_s, uint32_t, H1_4, bswap32) +DO_ZPZ_D(sve_revb_d, uint64_t, bswap64) + +DO_ZPZ(sve_revh_s, uint32_t, H1_4, hswap32) +DO_ZPZ_D(sve_revh_d, uint64_t, hswap64) + +DO_ZPZ_D(sve_revw_d, uint64_t, wswap64) + +DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8) +DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) +DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) +DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) + /* Three-operand expander, unpredicated, in which the third operand is "wi= de". 
  */
 #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \
@@ -1577,13 +1611,6 @@ void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc)
     }
 }
 
-static inline uint64_t hswap64(uint64_t h)
-{
-    uint64_t m = 0x0000ffff0000ffffull;
-    h = rol64(h, 32);
-    return ((h & m) << 16) | ((h >> 16) & m);
-}
-
 void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc)
 {
     intptr_t i, j, opr_sz = simd_oprsz(desc);
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index fc2a295ab7..5a1ed379ad 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2435,6 +2435,44 @@ static void trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     tcg_temp_free_i64(t);
 }
 
+static void trans_REVB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_revb_h,
+        gen_helper_sve_revb_s,
+        gen_helper_sve_revb_d,
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_REVH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        NULL,
+        gen_helper_sve_revh_s,
+        gen_helper_sve_revh_d,
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_REVW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_revw_d : NULL);
+}
+
+static void trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_rbit_b,
+        gen_helper_sve_rbit_h,
+        gen_helper_sve_rbit_s,
+        gen_helper_sve_rbit_d,
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 5e127de88c..8903fb6592 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -457,6 +457,13 @@ CPY_m_v         00000101 .. 100000 100 ... ..... .....  @rd_pg_rn
 # SVE copy element from general register to vector (predicated)
 CPY_m_r         00000101 .. 101000 101 ... ..... .....  @rd_pg_rn
 
+# SVE reverse within elements
+# Note esz >= operation size
+REVB            00000101 .. 1001 00 100 ... ..... .....  @rd_pg_rn
+REVH            00000101 .. 1001 01 100 ... ..... .....  @rd_pg_rn
+REVW            00000101 .. 1001 10 100 ... ..... .....  @rd_pg_rn
+RBIT            00000101 .. 1001 11 100 ... ..... .....  @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:50 -0800
Message-Id: <20180217182323.25885-35-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 34/67] target/arm: Implement SVE vector splice (predicated)
Cc: qemu-arm@nongnu.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |  2 ++
 target/arm/sve_helper.c    | 37 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 10 ++++++++++
 target/arm/sve.decode      |  3 +++
 4 files changed, 52 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 3b7c54905d..c3f8a2b502 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -479,6 +479,8 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index a67bb579b8..f524a1ddce 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2088,3 +2088,40 @@ int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
 
     return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
 }
+
+void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
+{
+    intptr_t opr_sz = simd_oprsz(desc) / 8;
+    int esz = simd_data(desc);
+    uint64_t pg, first_g, last_g, len, mask = pred_esz_masks[esz];
+    intptr_t i, first_i, last_i;
+    ARMVectorReg tmp;
+
+    first_i = last_i = 0;
+    first_g = last_g = 0;
+
+    /* Find the extent of the active elements within VG. */
+    for (i = QEMU_ALIGN_UP(opr_sz, 8) - 8; i >= 0; i -= 8) {
+        pg = *(uint64_t *)(vg + i) & mask;
+        if (pg) {
+            if (last_g == 0) {
+                last_g = pg;
+                last_i = i;
+            }
+            first_g = pg;
+            first_i = i;
+        }
+    }
+
+    len = 0;
+    if (first_g != 0) {
+        first_i = first_i * 8 + ctz64(first_g);
+        last_i = last_i * 8 + 63 - clz64(last_g);
+        len = last_i - first_i + (1 << esz);
+        if (vd == vm) {
+            vm = memcpy(&tmp, vm, opr_sz * 8);
+        }
+        swap_memmove(vd, vn + first_i, len);
+    }
+    swap_memmove(vd + len, vm, opr_sz * 8 - len);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 5a1ed379ad..559fb41fd6 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2473,6 +2473,16 @@ static void trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_zpz_ool(s, a, fns[a->esz]);
 }
 
+static void trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       vec_full_reg_offset(s, a->rm),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, a->esz, gen_helper_sve_splice);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 8903fb6592..70feb448e6 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -464,6 +464,9 @@ REVH            00000101 .. 1001 01 100 ... ..... .....  @rd_pg_rn
 REVW            00000101 .. 1001 10 100 ... ..... .....  @rd_pg_rn
 RBIT            00000101 .. 1001 11 100 ... ..... .....  @rd_pg_rn
 
+# SVE vector splice (predicated)
+SPLICE          00000101 .. 101 100 100 ... ..... .....  @rdn_pg_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:51 -0800
Message-Id: <20180217182323.25885-36-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 35/67] target/arm: Implement SVE Select Vectors Group
Cc: qemu-arm@nongnu.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    |  9 ++++++++
 target/arm/sve_helper.c    | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  2 ++
 target/arm/sve.decode      |  6 +++++
 4 files changed, 72 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index c3f8a2b502..0f57f64895 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -195,6 +195,15 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index f524a1ddce..86cd792cdf 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2125,3 +2125,58 @@ void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
     }
     swap_memmove(vd + len, vm, opr_sz * 8 - len);
 }
+
+void HELPER(sve_sel_zpzz_b)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_b(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_h)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_h(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_s)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_s(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        d[i] = (pg[H1(i)] & 1 ? nn : mm);
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 559fb41fd6..021b33ced9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -361,6 +361,8 @@ static void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
     do_zpzz_ool(s, a, fns[a->esz]);
 }
 
+DO_ZPZZ(SEL, sel)
+
 #undef DO_ZPZZ
 
 /*
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 70feb448e6..7ec84fdd80 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -99,6 +99,7 @@
                 &rprr_esz rn=%reg_movprfx
 @rdm_pg_rn      ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
                 &rprr_esz rm=%reg_movprfx
+@rd_pg4_rn_rm   ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5  &rprr_esz
 
 # Three register operand, with governing predicate, vector element size
 @rda_pg_rn_rm   ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
@@ -467,6 +468,11 @@ RBIT            00000101 .. 1001 11 100 ... ..... .....  @rd_pg_rn
 # SVE vector splice (predicated)
 SPLICE          00000101 .. 101 100 100 ... ..... .....  @rdn_pg_rm
 
+### SVE Select Vectors Group
+
+# SVE select vector elements (predicated)
+SEL_zpzz        00000101 .. 1 ..... 11 .... ..... .....  @rd_pg4_rn_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:52 -0800
Message-Id: <20180217182323.25885-37-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 36/67] target/arm: Implement SVE Integer Compare - Vectors Group
Cc: qemu-arm@nongnu.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper-sve.h    | 115 +++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 193 ++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-sve.c |  87 ++++++++++++++++++++
 target/arm/sve.decode      |  24 ++++++
 4 files changed, 416 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0f57f64895..6ffd1fbe8e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -490,6 +490,121 @@ DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 86cd792cdf..ae433861f8 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -46,14 +46,14 @@
  *
  * The return value has bit 31 set if N is set, bit 1 set if Z is clear,
  * and bit 0 set if C is set.
- *
- * This is an iterative function, called for each Pd and Pg word
- * moving forward.
  */
 
 /* For no G bits set, NZCV = C. */
 #define PREDTEST_INIT 1
 
+/* This is an iterative function, called for each Pd and Pg word
+ * moving forward.
+ */
 static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
 {
     if (likely(g)) {
@@ -73,6 +73,28 @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
     return flags;
 }
 
+/* This is an iterative function, called for each Pd and Pg word
+ * moving backward.
+ */
+static uint32_t iter_predtest_bwd(uint64_t d, uint64_t g, uint32_t flags)
+{
+    if (likely(g)) {
+        /* Compute C from first (i.e last) !(D & G).
+           Use bit 2 to signal first G bit seen. */
+        if (!(flags & 4)) {
+            flags += 4 - 1; /* add bit 2, subtract C from PREDTEST_INIT */
+            flags |= (d & pow2floor(g)) == 0;
+        }
+
+        /* Accumulate Z from each D & G. */
+        flags |= ((d & g) != 0) << 1;
+
+        /* Compute N from last (i.e first) D & G. Replace previous. */
+        flags = deposit32(flags, 31, 1, (d & (g & -g)) != 0);
+    }
+    return flags;
+}
+
 /* The same for a single word predicate. */
 uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g)
 {
@@ -2180,3 +2202,168 @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
         d[i] = (pg[H1(i)] & 1 ? nn : mm);
     }
 }
+
+/* Two operand comparison controlled by a predicate.
+ * ??? It is very tempting to want to be able to expand this inline
+ * with x86 instructions, e.g.
+ *
+ *    vcmpeqw    zm, zn, %ymm0
+ *    vpmovmskb  %ymm0, %eax
+ *    and        $0x5555, %eax
+ *    and        pg, %eax
+ *
+ * or even aarch64, e.g.
+ *
+ *    // mask = 4000 1000 0400 0100 0040 0010 0004 0001
+ *    cmeq       v0.8h, zn, zm
+ *    and        v0.8h, v0.8h, mask
+ *    addv       h0, v0.8h
+ *    and        v0.8b, pg
+ *
+ * However, coming up with an abstraction that allows vector inputs and
+ * a scalar output, and also handles the byte-ordering of sub-uint64_t
+ * scalar outputs, is tricky.
+ */
+#define DO_CMP_PPZZ(NAME, TYPE, OP, H, MASK) \
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    uint32_t flags = PREDTEST_INIT; \
+    intptr_t i = opr_sz; \
+    do { \
+        uint64_t out = 0, pg; \
+        do { \
+            i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+            TYPE nn = *(TYPE *)(vn + H(i)); \
+            TYPE mm = *(TYPE *)(vm + H(i)); \
+            out |= nn OP mm; \
+        } while (i & 63); \
+        pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
+        out &= pg; \
+        *(uint64_t *)(vd + (i >> 3)) = out; \
+        flags = iter_predtest_bwd(out, pg, flags); \
+    } while (i > 0); \
+    return flags; \
+}
+
+#define DO_CMP_PPZZ_B(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1, 0xffffffffffffffffull)
+#define DO_CMP_PPZZ_H(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1_2, 0x5555555555555555ull)
+#define DO_CMP_PPZZ_S(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1_4, 0x1111111111111111ull)
+#define DO_CMP_PPZZ_D(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, , 0x0101010101010101ull)
+
+DO_CMP_PPZZ_B(sve_cmpeq_ppzz_b, uint8_t, ==)
+DO_CMP_PPZZ_H(sve_cmpeq_ppzz_h, uint16_t, ==)
+DO_CMP_PPZZ_S(sve_cmpeq_ppzz_s, uint32_t, ==)
+DO_CMP_PPZZ_D(sve_cmpeq_ppzz_d, uint64_t, ==)
+
+DO_CMP_PPZZ_B(sve_cmpne_ppzz_b, uint8_t, !=)
+DO_CMP_PPZZ_H(sve_cmpne_ppzz_h, uint16_t, !=)
+DO_CMP_PPZZ_S(sve_cmpne_ppzz_s, uint32_t, !=)
+DO_CMP_PPZZ_D(sve_cmpne_ppzz_d, uint64_t, !=)
+
+DO_CMP_PPZZ_B(sve_cmpgt_ppzz_b, int8_t, >)
+DO_CMP_PPZZ_H(sve_cmpgt_ppzz_h, int16_t, >)
+DO_CMP_PPZZ_S(sve_cmpgt_ppzz_s, int32_t, >)
+DO_CMP_PPZZ_D(sve_cmpgt_ppzz_d, int64_t, >)
+
+DO_CMP_PPZZ_B(sve_cmpge_ppzz_b, int8_t, >=)
+DO_CMP_PPZZ_H(sve_cmpge_ppzz_h, int16_t, >=)
+DO_CMP_PPZZ_S(sve_cmpge_ppzz_s, int32_t, >=)
+DO_CMP_PPZZ_D(sve_cmpge_ppzz_d, int64_t, >=)
+
+DO_CMP_PPZZ_B(sve_cmphi_ppzz_b, uint8_t, >)
+DO_CMP_PPZZ_H(sve_cmphi_ppzz_h, uint16_t, >)
+DO_CMP_PPZZ_S(sve_cmphi_ppzz_s, uint32_t, >)
+DO_CMP_PPZZ_D(sve_cmphi_ppzz_d, uint64_t, >)
+
+DO_CMP_PPZZ_B(sve_cmphs_ppzz_b, uint8_t, >=)
+DO_CMP_PPZZ_H(sve_cmphs_ppzz_h, uint16_t, >=)
+DO_CMP_PPZZ_S(sve_cmphs_ppzz_s, uint32_t, >=)
+DO_CMP_PPZZ_D(sve_cmphs_ppzz_d, uint64_t, >=)
+
+#undef DO_CMP_PPZZ_B
+#undef DO_CMP_PPZZ_H
+#undef DO_CMP_PPZZ_S
+#undef DO_CMP_PPZZ_D
+#undef DO_CMP_PPZZ
+
+/* Similar, but the second source is "wide". */
+#define DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H, MASK) \
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    uint32_t flags = PREDTEST_INIT; \
+    intptr_t i = opr_sz; \
+    do { \
+        uint64_t out = 0, pg; \
+        do { \
+            TYPEW mm = *(TYPEW *)(vm + i - 8); \
+            do { \
+                i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                out |= nn OP mm; \
+            } while (i & 7); \
+        } while (i & 63); \
+        pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
+        out &= pg; \
+        *(uint64_t *)(vd + (i >> 3)) = out; \
+        flags = iter_predtest_bwd(out, pg, flags); \
+    } while (i > 0); \
+    return flags; \
+}
+
+#define DO_CMP_PPZW_B(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1, 0xffffffffffffffffull)
+#define DO_CMP_PPZW_H(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_2, 0x5555555555555555ull)
+#define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0x1111111111111111ull)
+
+DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t, uint64_t, ==)
+DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==)
+DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==)
+
+DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t, uint64_t, !=)
+DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=)
+DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=)
+
+DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t, int64_t, >)
+DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t, int64_t, >)
+DO_CMP_PPZW_S(sve_cmpgt_ppzw_s, int32_t, int64_t, >)
+
+DO_CMP_PPZW_B(sve_cmpge_ppzw_b, int8_t, int64_t, >=)
+DO_CMP_PPZW_H(sve_cmpge_ppzw_h, int16_t, int64_t, >=)
+DO_CMP_PPZW_S(sve_cmpge_ppzw_s, int32_t, int64_t, >=)
+
+DO_CMP_PPZW_B(sve_cmphi_ppzw_b, uint8_t, uint64_t, >)
+DO_CMP_PPZW_H(sve_cmphi_ppzw_h, uint16_t, uint64_t, >)
+DO_CMP_PPZW_S(sve_cmphi_ppzw_s, uint32_t, uint64_t, >)
+
+DO_CMP_PPZW_B(sve_cmphs_ppzw_b, uint8_t, uint64_t, >=)
+DO_CMP_PPZW_H(sve_cmphs_ppzw_h, uint16_t, uint64_t, >=)
+DO_CMP_PPZW_S(sve_cmphs_ppzw_s, uint32_t, uint64_t, >=)
+
+DO_CMP_PPZW_B(sve_cmplt_ppzw_b, int8_t, int64_t, <)
+DO_CMP_PPZW_H(sve_cmplt_ppzw_h, int16_t, int64_t, <)
+DO_CMP_PPZW_S(sve_cmplt_ppzw_s, int32_t, int64_t, <)
+
+DO_CMP_PPZW_B(sve_cmple_ppzw_b, int8_t, int64_t, <=)
+DO_CMP_PPZW_H(sve_cmple_ppzw_h, int16_t, int64_t, <=)
+DO_CMP_PPZW_S(sve_cmple_ppzw_s, int32_t, int64_t, <=)
+
+DO_CMP_PPZW_B(sve_cmplo_ppzw_b, uint8_t, uint64_t, <)
+DO_CMP_PPZW_H(sve_cmplo_ppzw_h, uint16_t, uint64_t, <)
+DO_CMP_PPZW_S(sve_cmplo_ppzw_s, uint32_t, uint64_t, <)
+
+DO_CMP_PPZW_B(sve_cmpls_ppzw_b, uint8_t, uint64_t, <=)
+DO_CMP_PPZW_H(sve_cmpls_ppzw_h, uint16_t, uint64_t, <=)
+DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=)
+
+#undef DO_CMP_PPZW_B
+#undef DO_CMP_PPZW_H
+#undef DO_CMP_PPZW_S
+#undef DO_CMP_PPZW
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 021b33ced9..cb54777108 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -39,6 +39,9 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t,
 typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
 
+typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
+                                     TCGv_ptr, TCGv_ptr, TCGv_i32);
+
 /*
  * Helpers for extracting complex instruction fields.
*/ @@ -2485,6 +2488,90 @@ static void trans_SPLICE(DisasContext *s, arg_rprr_e= sz *a, uint32_t insn) vsz, vsz, a->esz, gen_helper_sve_splice); } =20 +/* + *** SVE Integer Compare - Vectors Group + */ + +static void do_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *gen_fn) +{ + TCGv_ptr pd, zn, zm, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + + vsz =3D vec_full_reg_size(s); + t =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + pd =3D tcg_temp_new_ptr(); + zn =3D tcg_temp_new_ptr(); + zm =3D tcg_temp_new_ptr(); + pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, zm, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(zm); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); +} + +#define DO_PPZZ(NAME, name) \ +static void trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] =3D { = \ + gen_helper_sve_##name##_ppzz_b, gen_helper_sve_##name##_ppzz_h, \ + gen_helper_sve_##name##_ppzz_s, gen_helper_sve_##name##_ppzz_d, \ + }; \ + do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZZ(CMPEQ, cmpeq) +DO_PPZZ(CMPNE, cmpne) +DO_PPZZ(CMPGT, cmpgt) +DO_PPZZ(CMPGE, cmpge) +DO_PPZZ(CMPHI, cmphi) +DO_PPZZ(CMPHS, cmphs) + +#undef DO_PPZZ + +#define DO_PPZW(NAME, name) \ +static void trans_##NAME##_ppzw(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] =3D { = \ + gen_helper_sve_##name##_ppzw_b, gen_helper_sve_##name##_ppzw_h, \ + gen_helper_sve_##name##_ppzw_s, NULL \ + }; \ + do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZW(CMPEQ, cmpeq) +DO_PPZW(CMPNE, cmpne) 
+DO_PPZW(CMPGT, cmpgt) +DO_PPZW(CMPGE, cmpge) +DO_PPZW(CMPHI, cmphi) +DO_PPZW(CMPHS, cmphs) +DO_PPZW(CMPLT, cmplt) +DO_PPZW(CMPLE, cmple) +DO_PPZW(CMPLO, cmplo) +DO_PPZW(CMPLS, cmpls) + +#undef DO_PPZW + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7ec84fdd80..deedc9163b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -100,6 +100,7 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=3D%reg_movprfx @rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz +@pd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 . rd:4 &rprr_esz =20 # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ @@ -473,6 +474,29 @@ SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_= pg_rm # SVE select vector elements (predicated) SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm =20 +### SVE Integer Compare - Vectors Group + +# SVE integer compare_vectors +CMPHS_ppzz 00100100 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzz 00100100 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzz 00100100 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzz 00100100 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +CMPEQ_ppzz 00100100 .. 0 ..... 101 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzz 00100100 .. 0 ..... 101 ... ..... 1 .... @pd_pg_rn_rm + +# SVE integer compare with wide elements +# Note these require esz !=3D 3. +CMPEQ_ppzw 00100100 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzw 00100100 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzw 00100100 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzw 00100100 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +CMPLT_ppzw 00100100 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +CMPLE_ppzw 00100100 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +CMPHS_ppzw 00100100 .. 0 ..... 110 ... ..... 0 .... 
@pd_pg_rn_rm +CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm +CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894367271393.66230902952896; Sat, 17 Feb 2018 11:06:07 -0800 (PST) Received: from localhost ([::1]:48423 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7oE-0002Fp-An for importer@patchew.org; Sat, 17 Feb 2018 14:06:06 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40400) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79w-00019A-Ra for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79v-0001wQ-00 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:28 -0500 Received: from mail-pl0-x242.google.com ([2607:f8b0:400e:c01::242]:41328) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79u-0001w3-Of for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:26 -0500 Received: by mail-pl0-x242.google.com with SMTP id k8so3438728pli.8 for ; Sat, 17 Feb 2018 10:24:26 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by 
smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.23 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=vgUeQVcNdAqSYVjf7X4B2XDy4dpJwgizl072wHry8oE=; b=DLQoYF4Cqu8sg/sk846kXGwwprCPhEsGsixoGHFURg6+OnyUsHryI+nIAB9i8khzk4 8GzLvzs9d+g7oDgLHkM/NBFl7utG5TCVDn4Fs0XOn24fyQt3XoLC1fyfN8Y9I2KRM5Lp ThLEY9VAj2/3YtZTw9mpQf2B9JbYz5rUesBWA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=vgUeQVcNdAqSYVjf7X4B2XDy4dpJwgizl072wHry8oE=; b=RX7RXDLxmi8+9HUrJU/a3aJopB5XfuK5hZ/+DJtnPu3JJlosIMFqJmTs9g5gaYVEzA kNPX33t4YeYLdXKR9H/QPNCYNzpluGOPwq+DR/aqSt43Z7YwgoHbcEDCiO1gKfCHUixK Z+XRNNVFTDEnGPeRuXW1PDD7PVp4QTlJfHPcrqGorxGhJC0TFHdTclTQBR0nd1aTtXhi +NWFgSVKwO7dCfhbUGF3ug+jvZXMMellltz5mwUNlQQgVB0zW/JrjRJR7E3fztRf6WNS iO+o3adh9Dv70g65OAp6/M/fx1scc2d7n39L3WYPoqxv4xscUZ1wFpdVWdMk8A8akNGe +uMA== X-Gm-Message-State: APf1xPBULIFqcPesxchyjN53+uFlXs7phcXRcaanr9Btm9UG4iAfM8jQ 5XmOiTylH9zcVpdcLoUcSDaYIy5BJmY= X-Google-Smtp-Source: AH8x226iOqX6UaJqHEDyOHHxvqtbOYFeGVyI64ZIPKzfcBvMfcsJsbLTul+2zkp0ASp7ROTWuCInYQ== X-Received: by 2002:a17:902:ab85:: with SMTP id f5-v6mr9594068plr.199.1518891865475; Sat, 17 Feb 2018 10:24:25 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:53 -0800 Message-Id: <20180217182323.25885-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::242 Subject: [Qemu-devel] [PATCH v2 37/67] target/arm: Implement SVE Integer Compare - Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 44 +++++++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 63 +++++++++++++++++++++++++++++++++ target/arm/sve.decode | 23 ++++++++++++ 4 files changed, 218 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6ffd1fbe8e..ae38c0a4be 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -605,6 +605,50 @@ DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, 
ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) 
+DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, = i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ae433861f8..b74db681f2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2367,3 +2367,91 @@ DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, = <=3D) #undef DO_CMP_PPZW_H #undef DO_CMP_PPZW_S #undef DO_CMP_PPZW + +/* Similar, but the second source is immediate. 
*/ +#define DO_CMP_PPZI(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz =3D simd_oprsz(desc); \ + uint32_t flags =3D PREDTEST_INIT; \ + TYPE mm =3D simd_data(desc); \ + intptr_t i =3D opr_sz; \ + do { \ + uint64_t out =3D 0, pg; \ + do { \ + i -=3D sizeof(TYPE), out <<=3D sizeof(TYPE); \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + out |=3D nn OP mm; \ + } while (i & 63); \ + pg =3D *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &=3D pg; \ + *(uint64_t *)(vd + (i >> 3)) =3D out; \ + flags =3D iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZI_B(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZI_H(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZI_S(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZI_D(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZI_B(sve_cmpeq_ppzi_b, uint8_t, =3D=3D) +DO_CMP_PPZI_H(sve_cmpeq_ppzi_h, uint16_t, =3D=3D) +DO_CMP_PPZI_S(sve_cmpeq_ppzi_s, uint32_t, =3D=3D) +DO_CMP_PPZI_D(sve_cmpeq_ppzi_d, uint64_t, =3D=3D) + +DO_CMP_PPZI_B(sve_cmpne_ppzi_b, uint8_t, !=3D) +DO_CMP_PPZI_H(sve_cmpne_ppzi_h, uint16_t, !=3D) +DO_CMP_PPZI_S(sve_cmpne_ppzi_s, uint32_t, !=3D) +DO_CMP_PPZI_D(sve_cmpne_ppzi_d, uint64_t, !=3D) + +DO_CMP_PPZI_B(sve_cmpgt_ppzi_b, int8_t, >) +DO_CMP_PPZI_H(sve_cmpgt_ppzi_h, int16_t, >) +DO_CMP_PPZI_S(sve_cmpgt_ppzi_s, int32_t, >) +DO_CMP_PPZI_D(sve_cmpgt_ppzi_d, int64_t, >) + +DO_CMP_PPZI_B(sve_cmpge_ppzi_b, int8_t, >=3D) +DO_CMP_PPZI_H(sve_cmpge_ppzi_h, int16_t, >=3D) +DO_CMP_PPZI_S(sve_cmpge_ppzi_s, int32_t, >=3D) +DO_CMP_PPZI_D(sve_cmpge_ppzi_d, int64_t, >=3D) + +DO_CMP_PPZI_B(sve_cmphi_ppzi_b, uint8_t, >) +DO_CMP_PPZI_H(sve_cmphi_ppzi_h, uint16_t, >) +DO_CMP_PPZI_S(sve_cmphi_ppzi_s, uint32_t, >) +DO_CMP_PPZI_D(sve_cmphi_ppzi_d, uint64_t, >) 
+ +DO_CMP_PPZI_B(sve_cmphs_ppzi_b, uint8_t, >=3D) +DO_CMP_PPZI_H(sve_cmphs_ppzi_h, uint16_t, >=3D) +DO_CMP_PPZI_S(sve_cmphs_ppzi_s, uint32_t, >=3D) +DO_CMP_PPZI_D(sve_cmphs_ppzi_d, uint64_t, >=3D) + +DO_CMP_PPZI_B(sve_cmplt_ppzi_b, int8_t, <) +DO_CMP_PPZI_H(sve_cmplt_ppzi_h, int16_t, <) +DO_CMP_PPZI_S(sve_cmplt_ppzi_s, int32_t, <) +DO_CMP_PPZI_D(sve_cmplt_ppzi_d, int64_t, <) + +DO_CMP_PPZI_B(sve_cmple_ppzi_b, int8_t, <=3D) +DO_CMP_PPZI_H(sve_cmple_ppzi_h, int16_t, <=3D) +DO_CMP_PPZI_S(sve_cmple_ppzi_s, int32_t, <=3D) +DO_CMP_PPZI_D(sve_cmple_ppzi_d, int64_t, <=3D) + +DO_CMP_PPZI_B(sve_cmplo_ppzi_b, uint8_t, <) +DO_CMP_PPZI_H(sve_cmplo_ppzi_h, uint16_t, <) +DO_CMP_PPZI_S(sve_cmplo_ppzi_s, uint32_t, <) +DO_CMP_PPZI_D(sve_cmplo_ppzi_d, uint64_t, <) + +DO_CMP_PPZI_B(sve_cmpls_ppzi_b, uint8_t, <=3D) +DO_CMP_PPZI_H(sve_cmpls_ppzi_h, uint16_t, <=3D) +DO_CMP_PPZI_S(sve_cmpls_ppzi_s, uint32_t, <=3D) +DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=3D) + +#undef DO_CMP_PPZI_B +#undef DO_CMP_PPZI_H +#undef DO_CMP_PPZI_S +#undef DO_CMP_PPZI_D +#undef DO_CMP_PPZI diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cb54777108..a7eeb122e3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -39,6 +39,8 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); =20 +typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); =20 @@ -2572,6 +2574,67 @@ DO_PPZW(CMPLS, cmpls) =20 #undef DO_PPZW =20 +/* + *** SVE Integer Compare - Immediate Groups + */ + +static void do_ppzi_flags(DisasContext *s, arg_rpri_esz *a, + gen_helper_gvec_flags_3 *gen_fn) +{ + TCGv_ptr pd, zn, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + + vsz =3D vec_full_reg_size(s); + t =3D 
tcg_const_i32(simd_desc(vsz, vsz, a->imm)); + pd =3D tcg_temp_new_ptr(); + zn =3D tcg_temp_new_ptr(); + pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); +} + +#define DO_PPZI(NAME, name) \ +static void trans_##NAME##_ppzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_3 * const fns[4] =3D { = \ + gen_helper_sve_##name##_ppzi_b, gen_helper_sve_##name##_ppzi_h, \ + gen_helper_sve_##name##_ppzi_s, gen_helper_sve_##name##_ppzi_d, \ + }; \ + do_ppzi_flags(s, a, fns[a->esz]); \ +} + +DO_PPZI(CMPEQ, cmpeq) +DO_PPZI(CMPNE, cmpne) +DO_PPZI(CMPGT, cmpgt) +DO_PPZI(CMPGE, cmpge) +DO_PPZI(CMPHI, cmphi) +DO_PPZI(CMPHS, cmphs) +DO_PPZI(CMPLT, cmplt) +DO_PPZI(CMPLE, cmple) +DO_PPZI(CMPLO, cmplo) +DO_PPZI(CMPLS, cmpls) + +#undef DO_PPZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index deedc9163b..0e317d7d48 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -132,6 +132,11 @@ @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=3D%reg_movprfx =20 +# Predicate output, vector and immediate input, +# controlling predicate, element size. +@pd_pg_rn_i7 ........ esz:2 . imm:7 . pg:3 rn:5 . rd:4 &rpri_esz +@pd_pg_rn_i5 ........ esz:2 . imm:s5 ... pg:3 rn:5 . rd:4 &rpri_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=3D%imm9_16_10 @@ -497,6 +502,24 @@ CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @p= d_pg_rn_rm CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... 
@pd_pg_rn_rm =20 +### SVE Integer Compare - Unsigned Immediate Group + +# SVE integer compare with unsigned immediate +CMPHS_ppzi 00100100 .. 1 ....... 0 ... ..... 0 .... @pd_pg_rn_i7 +CMPHI_ppzi 00100100 .. 1 ....... 0 ... ..... 1 .... @pd_pg_rn_i7 +CMPLO_ppzi 00100100 .. 1 ....... 1 ... ..... 0 .... @pd_pg_rn_i7 +CMPLS_ppzi 00100100 .. 1 ....... 1 ... ..... 1 .... @pd_pg_rn_i7 + +### SVE Integer Compare - Signed Immediate Group + +# SVE integer compare with signed immediate +CMPGE_ppzi 00100101 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_i5 +CMPGT_ppzi 00100101 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_i5 +CMPLT_ppzi 00100101 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_i5 +CMPLE_ppzi 00100101 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_i5 +CMPEQ_ppzi 00100101 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_i5 +CMPNE_ppzi 00100101 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_i5 + ### SVE Predicate Logical Operations Group =20 # SVE predicate logical operations --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894614370546.6642231349798; Sat, 17 Feb 2018 11:10:14 -0800 (PST) Received: from localhost ([::1]:48450 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7sD-0005sG-C3 for importer@patchew.org; Sat, 17 Feb 2018 14:10:13 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40444) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 
1en79z-00019t-Of for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79w-0001xS-M4 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:30 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:41329) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79w-0001ws-EP for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:28 -0500 Received: by mail-pl0-x243.google.com with SMTP id k8so3438744pli.8 for ; Sat, 17 Feb 2018 10:24:28 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=bCA93J2gh2zptPDp0XGn339+9IiVcOPVbDJIhtHGNOw=; b=BfuUYEHPCP0T2TiyD3RHVHf32mbxNfu2qVDqbWjr6RYaPlOGa5rwnv6iHv5ZV0T699 Af0YQ7eka2KR/NUdU0EYrePj+aVvOef+/378xEW6pl+mRDRD8AjHIit5HV6JLdgVSYeC rK2y8nA82dAt77fe+WppKL62SQsAaeQDRXHG0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=bCA93J2gh2zptPDp0XGn339+9IiVcOPVbDJIhtHGNOw=; b=LEg6XvahE7rFfiHqfHUNPz/IPlsO0TVCMp0tcspBmYoSsdiAF0DfawxmlIu1EKfE8s AP5dNL06PLIjhRW+cjw8bKFQQ40jV/vCyaGMLmdwLEoboiPBf8L8n86k0I26XnzyzH6d JXVk6ub2TTFIwq4QzQnReegV3nu59JeIzAUHib3NuwCp66gubwLZWw6xLNXqe6r1z3AQ mW+4bjY5gMPmUPuPYD8NytRoYhgEBtke1xJ2aXYhRM0lC+7fjeC8fedbL8fHYm7PSoGU MQK5n4ytiBwczofwrpbch3D1PzSNVqqFlFKfTLpVcgBSCLXaPl4PhzwdaIAH7m7+tUod 4SeQ== X-Gm-Message-State: APf1xPBl1d1tEeaDPQJUp2YUXZhCxdhIpQHMQ3nuZRpg5rYEBMywmIn6 FFWymRL8G6hJYe9rn+oT3jnZsyXDC1s= X-Google-Smtp-Source: AH8x227JZU4WCh/acioO93806sifoTgyocDWzE+WvvT4/nlk25W0DbcTGbEJPgNzZgM8PdESg5vNQA== X-Received: by 
2002:a17:902:6ac2:: with SMTP id i2-v6mr1484066plt.368.1518891867027; Sat, 17 Feb 2018 10:24:27 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:54 -0800 Message-Id: <20180217182323.25885-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 38/67] target/arm: Implement SVE Partition Break Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 18 ++++ target/arm/sve_helper.c | 247 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 96 ++++++++++++++++++ target/arm/sve.decode | 19 ++++ 4 files changed, 380 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ae38c0a4be..f0a3ed3414 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -658,3 +658,21 @@ DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void= , ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_brkpa, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_5(sve_brkpb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i= 32) 
+DEF_HELPER_FLAGS_5(sve_brkpas, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_5(sve_brkpbs, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i= 32) + +DEF_HELPER_FLAGS_4(sve_brka_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brka_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkas_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkas_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b74db681f2..d6d2220f8b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2455,3 +2455,250 @@ DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=3D) #undef DO_CMP_PPZI_S #undef DO_CMP_PPZI_D #undef DO_CMP_PPZI + +/* Similar to the ARM LastActive pseudocode function. */ +static bool last_active_pred(void *vd, void *vg, intptr_t oprsz) +{ + intptr_t i; + + for (i =3D QEMU_ALIGN_UP(oprsz, 8) - 8; i >=3D 0; i -=3D 8) { + uint64_t pg =3D *(uint64_t *)(vg + i); + if (pg) { + return (pow2floor(pg) & *(uint64_t *)(vd + i)) !=3D 0; + } + } + return 0; +} + +/* Compute a mask into RETB that is true for all G, up to and including + * (if after) or excluding (if !after) the first G & N. + * Return true if BRK found. + */ +static bool compute_brk(uint64_t *retb, uint64_t n, uint64_t g, + bool brk, bool after) +{ + uint64_t b; + + if (brk) { + b =3D 0; + } else if ((g & n) =3D=3D 0) { + /* For all G, no N are set; break not found. */ + b =3D g; + } else { + /* Break somewhere in N. Locate it. 
*/ + b =3D g & n; /* guard true, pred true */ + b =3D b & -b; /* first such */ + if (after) { + b =3D b | (b - 1); /* break after same */ + } else { + b =3D b - 1; /* break before same */ + } + brk =3D true; + } + + *retb =3D b; + return brk; +} + +/* Compute a zeroing BRK. */ +static void compute_brk_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk =3D false; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g =3D g[i]; + + brk =3D compute_brk(&this_b, n[i], this_g, brk, after); + d[i] =3D this_b & this_g; + } +} + +/* Likewise, but also compute flags. */ +static uint32_t compute_brks_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags =3D PREDTEST_INIT; + bool brk =3D false; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d, this_g =3D g[i]; + + brk =3D compute_brk(&this_b, n[i], this_g, brk, after); + d[i] =3D this_d =3D this_b & this_g; + flags =3D iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +/* Compute a merging BRK. */ +static void compute_brk_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk =3D false; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g =3D g[i]; + + brk =3D compute_brk(&this_b, n[i], this_g, brk, after); + d[i] =3D (this_b & this_g) | (d[i] & ~this_g); + } +} + +/* Likewise, but also compute flags.
*/ +static uint32_t compute_brks_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags =3D PREDTEST_INIT; + bool brk =3D false; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d =3D d[i], this_g =3D g[i]; + + brk =3D compute_brk(&this_b, n[i], this_g, brk, after); + d[i] =3D this_d =3D (this_b & this_g) | (this_d & ~this_g); + flags =3D iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +static uint32_t do_zero(ARMPredicateReg *d, intptr_t oprsz) +{ + /* It is quicker to zero the whole predicate than loop on OPRSZ. + The compiler should turn this into 4 64-bit integer stores. */ + memset(d, 0, sizeof(ARMPredicateReg)); + return PREDTEST_INIT; +} + +void HELPER(sve_brkpa)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, true); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpas)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, true); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brkpb)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, false); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpbs)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, false); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brka_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz
=3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_z)(void *vd, void *vn, void *vg, uint32_t pred_d= esc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_z)(void *vd, void *vn, void *vg, uint32_t pred_d= esc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brka_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_m)(void *vd, void *vn, void *vg, uint32_t pred_d= esc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_m)(void *vd, void *vn, void *vg, uint32_t pred_d= esc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brkn)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (!last_active_pred(vn, vg, oprsz)) { + do_zero(vd, oprsz); + } +} + +/* As if PredTest(Ones(PL), D, esz). 
*/ +static uint32_t predtest_ones(ARMPredicateReg *d, intptr_t oprsz, + uint64_t esz_mask) +{ + uint32_t flags =3D PREDTEST_INIT; + intptr_t i; + + for (i =3D 0; i < oprsz / 8; i++) { + flags =3D iter_predtest_fwd(d->p[i], esz_mask, flags); + } + if (oprsz & 7) { + uint64_t mask =3D ~(-1ULL << (8 * (oprsz & 7))); + flags =3D iter_predtest_fwd(d->p[i], esz_mask & mask, flags); + } + return flags; +} + +uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_des= c) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (last_active_pred(vn, vg, oprsz)) { + return predtest_ones(vd, oprsz, -1); + } else { + return do_zero(vd, oprsz); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a7eeb122e3..dc95d68867 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2635,6 +2635,102 @@ DO_PPZI(CMPLS, cmpls) =20 #undef DO_PPZI =20 +/* + *** SVE Partition Break Group + */ + +static void do_brk3(DisasContext *s, arg_rprr_s *a, + gen_helper_gvec_4 *fn, gen_helper_gvec_flags_4 *fn_s) +{ + unsigned vsz =3D pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. 
*/ + TCGv_ptr d =3D tcg_temp_new_ptr(); + TCGv_ptr n =3D tcg_temp_new_ptr(); + TCGv_ptr m =3D tcg_temp_new_ptr(); + TCGv_ptr g =3D tcg_temp_new_ptr(); + TCGv_i32 t =3D tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(m, cpu_env, pred_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, m, g, t); + do_pred_flags(t); + } else { + fn(d, n, m, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(m); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); +} + +static void do_brk2(DisasContext *s, arg_rpr_s *a, + gen_helper_gvec_3 *fn, gen_helper_gvec_flags_3 *fn_s) +{ + unsigned vsz =3D pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. */ + TCGv_ptr d =3D tcg_temp_new_ptr(); + TCGv_ptr n =3D tcg_temp_new_ptr(); + TCGv_ptr g =3D tcg_temp_new_ptr(); + TCGv_i32 t =3D tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, g, t); + do_pred_flags(t); + } else { + fn(d, n, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); +} + +void trans_BRKPA(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + do_brk3(s, a, gen_helper_sve_brkpa, gen_helper_sve_brkpas); +} + +void trans_BRKPB(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + do_brk3(s, a, gen_helper_sve_brkpb, gen_helper_sve_brkpbs); +} + +void trans_BRKA_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brka_m, gen_helper_sve_brkas_m); +} + +void trans_BRKB_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkb_m, gen_helper_sve_brkbs_m); +} + +void 
trans_BRKA_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brka_z, gen_helper_sve_brkas_z); +} + +void trans_BRKB_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkb_z, gen_helper_sve_brkbs_z); +} + +void trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0e317d7d48..1c19129e55 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -60,6 +60,7 @@ &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz +&rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rprrr_esz rd pg rn rm ra esz @@ -79,6 +80,9 @@ @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz @rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz =20 +# Two operand with governing predicate, flags setting +@pd_pg_pn_s ........ . s:1 ...... .. pg:4 . rn:4 . rd:4 &rpr_s + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 @@ -568,6 +572,21 @@ PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_p= n_e0 # SVE predicate next active PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn =20 +### SVE Partition Break Group + +# SVE propagate break from previous partition +BRKPA 00100101 0. 00 .... 11 .... 0 .... 0 .... @pd_pg_pn_pm_s +BRKPB 00100101 0. 00 .... 11 .... 0 .... 1 .... @pd_pg_pn_pm_s + +# SVE partition break condition +BRKA_z 00100101 0. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKB_z 00100101 1. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKA_m 00100101 0. 01000001 .... 0 .... 1 .... @pd_pg_pn_s +BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s + +# SVE propagate break to next partition +BRKN 00100101 0. 01100001 .... 0 .... 0 .... 
@pd_pg_pn_s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893167562963.1963161938124; Sat, 17 Feb 2018 10:46:07 -0800 (PST) Received: from localhost ([::1]:48233 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Us-0001pX-GU for importer@patchew.org; Sat, 17 Feb 2018 13:46:06 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40455) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79z-00019v-Ok for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79y-0001y1-1K for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:31 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:40420) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79x-0001xk-Pk for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:29 -0500 Received: by mail-pl0-x241.google.com with SMTP id g18so3435581plo.7 for ; Sat, 17 Feb 2018 10:24:29 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:27 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=FVmGqUj6eUA4cUBaHP2RMtNc+5HxMAEYAfQlDELLJ08=; b=czs9ma53fn5O+exBK5ohzl/uOcsR6hdYQQvCHS/EOaDx5gCUA2NiCExeOuFOH5PWUE 7tdazZtdEFl5fvSvwIvfNNlRmv0Xw3uQ77l2BtN/B+wSVCny6pYUfxY2OcEpZVcV6cd5 2dHFgt0Q/AZslo9DVnP5gQvt53qdHubNiLfio= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=FVmGqUj6eUA4cUBaHP2RMtNc+5HxMAEYAfQlDELLJ08=; b=VXwJSg79LaAiOLhilnvCjCol2CDpbM09LimRYQrEnd2WY8zSjwZHj96wEOLcJoGmB2 lzU7bxF+aKXqNRakB/X9WYSBtbRtX3psILQ7zsytRox2UyXLapwLLNaKRhaxYUQr+wh7 QIyzogMBHX15Pame29l4jK3iHdZypa+CJuqWSUSgOc3yt+sZY+isr56D8i6miFReNvAO 7x7BXm/Lad+Ux1/xCNlYK0aGDYoYaA+WduzueHXyitlfUr0rS36QRsSe0vMB9/uoEv5H AFK1INOJJYyAN8sfJGpNUUxbiBf4vSpPYsB6ZV6tOVTElhlXt3al/uaFmuMsvZgnC1Z8 Z4zw== X-Gm-Message-State: APf1xPAzE9YkYQ+kfvghTA9kpmDeEshbRiR6jK28v7Nf7Bo6qdE4BGlQ nTJzM+3A1fF9bSHz370mzj5GHJxvoYU= X-Google-Smtp-Source: AH8x227B4s8uEITQGuLJxk8NUeAreL6ZGzUG4vGGvFFKFIs1ML352ea+SUiSIcr4C1UMhlxKuczs9g== X-Received: by 2002:a17:902:6e8c:: with SMTP id v12-v6mr9361431plk.424.1518891868501; Sat, 17 Feb 2018 10:24:28 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:55 -0800 Message-Id: <20180217182323.25885-40-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 39/67] target/arm: Implement SVE Predicate Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 14 ++++++ target/arm/translate-sve.c | 116 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 27 +++++++++++ 4 files changed, 159 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f0a3ed3414..dd4f8f754d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -676,3 +676,5 @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, p= tr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d6d2220f8b..dd884bdd1c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2702,3 +2702,17 @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void = *vg, uint32_t pred_desc) return do_zero(vd, oprsz); } } + +uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz =3D extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t *n =3D vn, *g =3D vg, sum =3D 0, mask =3D pred_esz_masks[esz]; + intptr_t i; + + for (i =3D 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + 
uint64_t t =3D n[i] & g[i] & mask; + sum +=3D ctpop64(t); + } + return sum; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dc95d68867..038800cc86 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -36,6 +36,8 @@ typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t); +typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, + TCGv_i64, uint32_t, uint32_t); typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); =20 @@ -2731,6 +2733,120 @@ void trans_BRKN(DisasContext *s, arg_rpr_s *a, uint= 32_t insn) do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); } =20 +/* + *** SVE Predicate Count Group + */ + +static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg) +{ + unsigned psz =3D pred_full_reg_size(s); + + if (psz <=3D 8) { + uint64_t psz_mask; + + tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn)); + if (pn !=3D pg) { + TCGv_i64 g =3D tcg_temp_new_i64(); + tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_and_i64(val, val, g); + tcg_temp_free_i64(g); + } + + /* Reduce the pred_esz_masks value simply to reduce the + size of the code generated here. 
*/ + psz_mask =3D deposit64(0, 0, psz * 8, -1); + tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask); + + tcg_gen_ctpop_i64(val, val); + } else { + TCGv_ptr t_pn =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + unsigned desc; + TCGv_i32 t_desc; + + desc =3D psz - 2; + desc =3D deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + t_desc =3D tcg_const_i32(desc); + + gen_helper_sve_cntp(val, t_pn, t_pg, t_desc); + tcg_temp_free_ptr(t_pn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + } +} + +static void trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn) +{ + do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg); +} + +static void trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + if (a->d) { + tcg_gen_sub_i64(reg, reg, val); + } else { + tcg_gen_add_i64(reg, reg, val); + } + tcg_temp_free_i64(val); +} + +static void trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i64 val =3D tcg_temp_new_i64(); + GVecGen2sFn *gvec_fn =3D a->d ? 
tcg_gen_gvec_subs : tcg_gen_gvec_adds; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + do_cntp(s, val, a->esz, a->pg, a->pg); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), val, vsz, vsz); +} + +static void trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_32(reg, val, a->u, a->d); +} + +static void trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 val =3D tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_64(reg, val, a->u, a->d); +} + +static void trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + TCGv_i64 val =3D tcg_temp_new_i64(); + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1c19129e55..76c084d43e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -68,6 +68,8 @@ &ptrue rd esz pat s &incdec_cnt rd pat esz imm d u &incdec2_cnt rd rn pat esz imm d u +&incdec_pred rd pg esz d u +&incdec2_pred rd rn pg esz d u =20 ########################################################################### # Named instruction formats. These are generally used to @@ -114,6 +116,7 @@ =20 # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz =20 # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -154,6 +157,12 @@ @incdec2_cnt ........ esz:2 .. .... ...... 
pat:5 rd:5 \ &incdec2_cnt imm=3D%imm4_16_p1 rn=3D%reg_movprfx =20 +# One register, predicate. +# User must fill in U and D. +@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pred +@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ + &incdec2_pred rn=3D%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -587,6 +596,24 @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_p= g_pn_s # SVE propagate break to next partition BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s =20 +### SVE Predicate Count Group + +# SVE predicate count +CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn + +# SVE inc/dec register by predicate count +INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred u=3D1 + +# SVE inc/dec vector by predicate count +INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred u=3D1 + +# SVE saturating inc/dec register by predicate count +SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pred +SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred + +# SVE saturating inc/dec vector by predicate count +SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... 
@incdec2_pred + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894584947436.1153906903522; Sat, 17 Feb 2018 11:09:44 -0800 (PST) Received: from localhost ([::1]:48448 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7rg-0005O4-1x for importer@patchew.org; Sat, 17 Feb 2018 14:09:40 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40505) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A2-0001DX-7o for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:35 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79z-0001yw-Ad for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:34 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:34791) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79z-0001yY-32 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:31 -0500 Received: by mail-pl0-x243.google.com with SMTP id bd10so3450055plb.1 for ; Sat, 17 Feb 2018 10:24:31 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.28 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:29 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=5Y5i1Z/QDoiE237plg9CGLy21f/g2av1w+1fO98BbyU=; b=D6Xv3L8wcdUuu17M6/nAzjHQfUC82mYwKRU2s1p89C9WCeDQG2erUwT8LxN4sa+Scd zeOmL49xGuJ3El7P0RJ7lK+erpektPOX0NLF9bFnkqk0I1iKkbPVa2DAHdO5pA/AXs50 FrmCvhnQuBwWNXk7GY6/hG2zVlo0rUgFRuHrM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=5Y5i1Z/QDoiE237plg9CGLy21f/g2av1w+1fO98BbyU=; b=IowTsfAX4U2IlcqD9+1Ct4vHTOOZCTaPhXJ+xITskFtBpzsb14u+LIf6JDX4X/Iw1B blDcH5H7+fdaUU+VVzJ19p3zi/OwtDCiKN5EeZ2PXSosEuNqvuesD26VjCzLxbHHZ5uZ bpRbL6GGKW4r1izq3dqY5FJ2QI6IgJxtgVl4M6cn+J7u3vbGeeyoT6ZXVCaM9EY3QYII T2YS7IwOIZNMEbdSrPGUKssTp957UH9E8AiNga+y4DNufWBnYsVRW5/JXjGIKiP+5PW1 4YrFZHh9uurx7C7WVmOmLRTbbfDMx8DOoW4IxZqhtaYpsDuO/aV/ATB4hu2xIWLzelwY hMXQ== X-Gm-Message-State: APf1xPBn5M1qKn2Pm/VBLbvRajNhFLLZeNhgOs0txYuz4ckzGk87miIM H5a1AWJvhP7gJ72pMKZ9v7fpC+5DUv4= X-Google-Smtp-Source: AH8x227YF1d8mRWn6k639XQnDH4ktC6rbZ4TI3Oel3JOjqAy+gpvB83phqBMZfKX0l4gyOkeu/6aIQ== X-Received: by 2002:a17:902:6bcb:: with SMTP id m11-v6mr2326167plt.326.1518891869820; Sat, 17 Feb 2018 10:24:29 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:56 -0800 Message-Id: <20180217182323.25885-41-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 40/67] target/arm: Implement SVE Integer Compare - Scalars Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 31 ++++++++++++++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 8 ++++ 4 files changed, 133 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index dd4f8f754d..1863106d0f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -678,3 +678,5 @@ DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr= , ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index dd884bdd1c..80b78da834 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2716,3 +2716,34 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32= _t pred_desc) } return sum; } + +uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +{ + uintptr_t oprsz =3D extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz =3D extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t esz_mask =3D pred_esz_masks[esz]; + ARMPredicateReg *d =3D vd; + uint32_t flags; + intptr_t i; + + /* Begin with a zero predicate register. 
*/ + flags =3D do_zero(d, oprsz); + if (count =3D=3D 0) { + return flags; + } + + /* Scale from predicate element count to bits. */ + count <<=3D esz; + /* Bound to the bits in the predicate. */ + count =3D MIN(count, oprsz * 8); + + /* Set all of the requested bits. */ + for (i =3D 0; i < count / 64; ++i) { + d->p[i] =3D esz_mask; + } + if (count & 63) { + d->p[i] =3D ~(-1ull << (count & 63)) & esz_mask; + } + + return predtest_ones(d, oprsz, esz_mask); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 038800cc86..4b92a55c21 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2847,6 +2847,98 @@ static void trans_SINCDECP_z(DisasContext *s, arg_in= cdec2_pred *a, do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); } =20 +/* + *** SVE Integer Compare Scalars Group + */ + +static void trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn) +{ + TCGCond cond =3D (a->ne ? TCG_COND_NE : TCG_COND_EQ); + TCGv_i64 rn =3D read_cpu_reg(s, a->rn, a->sf); + TCGv_i64 rm =3D read_cpu_reg(s, a->rm, a->sf); + TCGv_i64 cmp =3D tcg_temp_new_i64(); + + tcg_gen_setcond_i64(cond, cmp, rn, rm); + tcg_gen_extrl_i64_i32(cpu_NF, cmp); + tcg_temp_free_i64(cmp); + + /* VF =3D !NF & !CF. */ + tcg_gen_xori_i32(cpu_VF, cpu_NF, 1); + tcg_gen_andc_i32(cpu_VF, cpu_VF, cpu_CF); + + /* Both NF and VF actually look at bit 31. 
*/ + tcg_gen_neg_i32(cpu_NF, cpu_NF); + tcg_gen_neg_i32(cpu_VF, cpu_VF); +} + +static void trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn) +{ + TCGv_i64 op0 =3D read_cpu_reg(s, a->rn, 1); + TCGv_i64 op1 =3D read_cpu_reg(s, a->rm, 1); + TCGv_i64 t0 =3D tcg_temp_new_i64(); + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i32 t2, t3; + TCGv_ptr ptr; + unsigned desc, vsz =3D vec_full_reg_size(s); + TCGCond cond; + + if (!a->sf) { + if (a->u) { + tcg_gen_ext32u_i64(op0, op0); + tcg_gen_ext32u_i64(op1, op1); + } else { + tcg_gen_ext32s_i64(op0, op0); + tcg_gen_ext32s_i64(op1, op1); + } + } + + /* For the helper, compress the different conditions into a computation + * of how many iterations for which the condition is true. + * + * This is slightly complicated by 0 <=3D UINT64_MAX, which is nominal= ly + * 2**64 iterations, overflowing to 0. Of course, predicate registers + * aren't that large, so any value >=3D predicate size is sufficient. + */ + tcg_gen_sub_i64(t0, op1, op0); + + /* t0 =3D MIN(op1 - op0, vsz). */ + if (a->eq) { + /* Equality means one more iteration. */ + tcg_gen_movi_i64(t1, vsz - 1); + tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1); + tcg_gen_addi_i64(t0, t0, 1); + } else { + tcg_gen_movi_i64(t1, vsz); + tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1); + } + + /* t0 =3D (condition true ? t0 : 0). */ + cond =3D (a->u + ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU) + : (a->eq ? 
TCG_COND_LE : TCG_COND_LT)); + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1); + + t2 =3D tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t2, t0); + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); + + desc =3D (vsz / 8) - 2; + desc =3D deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + t3 =3D tcg_const_i32(desc); + + ptr =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); + + gen_helper_sve_while(t2, ptr, t2, t3); + do_pred_flags(t2); + + tcg_temp_free_ptr(ptr); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 76c084d43e..b5bc7e9546 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -614,6 +614,14 @@ SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... .= .... @incdec_pred # SVE saturating inc/dec vector by predicate count SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred =20 +### SVE Integer Compare - Scalars Group + +# SVE conditionally terminate scalars +CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 + +# SVE integer compare scalar count and limit +WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with 
SMTPS id 1518893728524702.7615977050554; Sat, 17 Feb 2018 10:55:28 -0800 (PST) Received: from localhost ([::1]:48317 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7dt-0000y0-NX for importer@patchew.org; Sat, 17 Feb 2018 13:55:25 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40504) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A2-0001DW-7n for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:35 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A0-0001zc-M8 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:34 -0500 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:45874) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A0-0001zM-G9 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:32 -0500 Received: by mail-pf0-x242.google.com with SMTP id w83so592594pfi.12 for ; Sat, 17 Feb 2018 10:24:32 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.29 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Sbjzr4oP/2U5rXhk+eeI3UjsbSb3BRWAaUy7LI44TGI=; b=PivhMbZN4oKeFBK3typmjkvZQ71C+JQ3JiYxLXTBJzKP5klTatnO525dkf/tXWU/rA AiaaBByN9gbGc1dNyf6ZVkr6JS1cvRayazZ4Fh2KtO3puhQq24ieo2fDO/TImRhp50tD 63ReSVo1pLmf8exbQH6MiQQhLu+IIeyWrSfdA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Sbjzr4oP/2U5rXhk+eeI3UjsbSb3BRWAaUy7LI44TGI=; b=LG9UIc8UNbBQxna3+FCWQT3kOanlET4Z14G4N590jnPNT9NbYR4mbn51mZFDNvHcZq GP8biS2jYtdd9iudcAMq+EVjbph/Dqn0YnvdcJF+gb6C1VY7cUETP7EK9XerqSOpXS5a 
noLHA0hjyMhMLhsrAjqJkQLqLLrO5BA0HZ0jawQdKVApZ/BsEFOKZ4nF9mkcwOF5IQpI SAV51h4uAmIgXTSXig4lewCjoFZhMCGkEaIVoLDCF+L0ErwYCibgv8wYz08IGP/2c+np XaPRtYOK/MjHtriJ4CYOYrPYwzZid3A8rRXHbrkZkQyCNtndn34ifp5CuKk0t3mM2fjK Dwww== X-Gm-Message-State: APf1xPCpYPGh1Kc0PcWr+4gkHu8iTRodRQCAW6UmvAnSHQ8QwO6WmNEq E//hiZVa/qffHdSck0num72ACA0x2YQ= X-Google-Smtp-Source: AH8x2253CODkeK8tDn4NAjEbo/bQcKdf1Us3eaSh44sgoVlqPbIIKEfGXQg5KN0f3dYZoLLUSFuEyA== X-Received: by 10.98.92.68 with SMTP id q65mr9797853pfb.4.1518891871266; Sat, 17 Feb 2018 10:24:31 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:57 -0800 Message-Id: <20180217182323.25885-42-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v2 41/67] target/arm: Implement FDUP/DUP X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 35 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 2 files changed, 43 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4b92a55c21..7571d02237 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2939,6 +2939,41 @@ static void trans_WHILE(DisasContext *s, arg_WHILE *= a, uint32_t insn) 
tcg_temp_free_i32(t3); } =20 +/* + *** SVE Integer Wide Immediate - Unpredicated Group + */ + +static void trans_FDUP(DisasContext *s, arg_FDUP *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + int dofs =3D vec_full_reg_offset(s, a->rd); + uint64_t imm; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + + /* Decode the VFP immediate. */ + imm =3D vfp_expand_imm(a->esz, a->imm); + imm =3D dup_const(a->esz, imm); + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, imm); +} + +static void trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + int dofs =3D vec_full_reg_offset(s, a->rd); + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm)); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b5bc7e9546..ea1bfe7579 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -622,6 +622,14 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 =20 +### SVE Integer Wide Immediate - Unpredicated Group + +# SVE broadcast floating-point immediate (unpredicated) +FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5 + +# SVE broadcast integer immediate (unpredicated) +DUP_i 00100101 esz:2 111 00 011 . ........ 
rd:5 imm=3D%sh8_i8s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894799163157.82816844817478; Sat, 17 Feb 2018 11:13:19 -0800 (PST) Received: from localhost ([::1]:48480 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7vC-000080-9F for importer@patchew.org; Sat, 17 Feb 2018 14:13:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40545) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A4-0001Fz-10 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A2-00020l-Ei for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:36 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:35233) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A2-000201-5H for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:34 -0500 Received: by mail-pf0-x244.google.com with SMTP id a6so591965pfi.2 for ; Sat, 17 Feb 2018 10:24:34 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.31 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:31 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=f6tnM4rLbtSBZqU3j1NYbg9btYpfaZ4siRzrSFF1A54=; b=EhjmOh6lGPWybODY5lXUS71MTW38IIzcnhajIriQbP0P00tlB7d3HRAmBYCW9OYuEW PnDLVI6XqhG1kH9EpIbQaDp+RLTzMQZV6y8+xJRr7r21PiuLVxg3MEs1bxh+iXaO4UML F7ncbuivsjEJhKJeQ3UltlropW6EgvJcC4BlA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=f6tnM4rLbtSBZqU3j1NYbg9btYpfaZ4siRzrSFF1A54=; b=OelQaEjt2wJuTYuYOu/h/IQYPh9A9rRDMbCKEy5tTaEtlBhMH3fg2PizC59g+Nzlre MiKzgeBXL6CqtpvvaYJ8P0us9/YFYH0YZpIcxJj8LEpuY3r8yTvDvZOoANtmr6Trx2SS p5J6zlwVQmWT6iTyDnqC3t9C/mNj7sYShyxBpkurEmdL7kudPbyU5G7uy3KglU39mKT5 1iuVbmqPXEpTSUbVwTypCQXeENjcovx2wG9XhjxPDKA5rxp3+VlOd4+cuOAOJu9XHeFt +lG3uCIGV3ncM9EdWlPWjv2EGzVcUbPAO8hrSYN96B+6rwpoL2UTfvFiZ2cDNgVuztpa 0udA== X-Gm-Message-State: APf1xPAf13vN9+u1uTHj2vAVQvX+27xmCyJ+wMLcDIWWkunMc8uItyEl znjjrLUuKoJqj5BO4x09VzAvlNi9anE= X-Google-Smtp-Source: AH8x226+9BKoPD+oLfcpccrbiOiajmtdVhfEfo/cM+ViUTXA4AukMSg83jFrnevbhsyXLuEgC7lKhw== X-Received: by 10.98.33.204 with SMTP id o73mr2432335pfj.54.1518891872719; Sat, 17 Feb 2018 10:24:32 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:58 -0800 Message-Id: <20180217182323.25885-43-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 42/67] target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 25 +++++++++ target/arm/sve_helper.c | 41 ++++++++++++++ target/arm/translate-sve.c | 135 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 26 +++++++++ 4 files changed, 227 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 1863106d0f..97bfe0f47b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -680,3 +680,28 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, pt= r, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) + +DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, 
i32) +DEF_HELPER_FLAGS_4(sve_smini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 80b78da834..4f45f11bff 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -803,6 +803,46 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_VPZ #undef DO_VPZ_D =20 +/* Two vector operand, one scalar operand, unpredicated. 
*/ +#define DO_ZZI(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t s64, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / sizeof(TYPE); \ + TYPE s =3D s64, *d =3D vd, *n =3D vn; = \ + for (i =3D 0; i < opr_sz; ++i) { \ + d[i] =3D OP(n[i], s); \ + } \ +} + +#define DO_SUBR(X, Y) (Y - X) + +DO_ZZI(sve_subri_b, uint8_t, DO_SUBR) +DO_ZZI(sve_subri_h, uint16_t, DO_SUBR) +DO_ZZI(sve_subri_s, uint32_t, DO_SUBR) +DO_ZZI(sve_subri_d, uint64_t, DO_SUBR) + +DO_ZZI(sve_smaxi_b, int8_t, DO_MAX) +DO_ZZI(sve_smaxi_h, int16_t, DO_MAX) +DO_ZZI(sve_smaxi_s, int32_t, DO_MAX) +DO_ZZI(sve_smaxi_d, int64_t, DO_MAX) + +DO_ZZI(sve_smini_b, int8_t, DO_MIN) +DO_ZZI(sve_smini_h, int16_t, DO_MIN) +DO_ZZI(sve_smini_s, int32_t, DO_MIN) +DO_ZZI(sve_smini_d, int64_t, DO_MIN) + +DO_ZZI(sve_umaxi_b, uint8_t, DO_MAX) +DO_ZZI(sve_umaxi_h, uint16_t, DO_MAX) +DO_ZZI(sve_umaxi_s, uint32_t, DO_MAX) +DO_ZZI(sve_umaxi_d, uint64_t, DO_MAX) + +DO_ZZI(sve_umini_b, uint8_t, DO_MIN) +DO_ZZI(sve_umini_h, uint16_t, DO_MIN) +DO_ZZI(sve_umini_s, uint32_t, DO_MIN) +DO_ZZI(sve_umini_d, uint64_t, DO_MIN) + +#undef DO_ZZI + #undef DO_AND #undef DO_ORR #undef DO_EOR @@ -817,6 +857,7 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_ASR #undef DO_LSR #undef DO_LSL +#undef DO_SUBR =20 /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7571d02237..72abcb543a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -81,6 +81,11 @@ static inline int expand_imm_sh8s(int x) return (int8_t)x << (x & 0x100 ? 8 : 0); } =20 +static inline int expand_imm_sh8u(int x) +{ + return (uint8_t)x << (x & 0x100 ? 8 : 0); +} + /* * Include the generated decoder. 
*/ @@ -2974,6 +2979,136 @@ static void trans_DUP_i(DisasContext *s, arg_DUP_i = *a, uint32_t insn) tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm)); } =20 +static void trans_ADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_addi(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); +} + +static void trans_SUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + a->imm =3D -a->imm; + trans_ADD_zzi(s, a, insn); +} + +static void trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + static const GVecGen2s op[4] =3D { + { .fni8 =3D tcg_gen_vec_sub8_i64, + .fniv =3D tcg_gen_sub_vec, + .fno =3D gen_helper_sve_subri_b, + .opc =3D INDEX_op_sub_vec, + .vece =3D MO_8, + .scalar_first =3D true }, + { .fni8 =3D tcg_gen_vec_sub16_i64, + .fniv =3D tcg_gen_sub_vec, + .fno =3D gen_helper_sve_subri_h, + .opc =3D INDEX_op_sub_vec, + .vece =3D MO_16, + .scalar_first =3D true }, + { .fni4 =3D tcg_gen_sub_i32, + .fniv =3D tcg_gen_sub_vec, + .fno =3D gen_helper_sve_subri_s, + .opc =3D INDEX_op_sub_vec, + .vece =3D MO_32, + .scalar_first =3D true }, + { .fni8 =3D tcg_gen_sub_i64, + .fniv =3D tcg_gen_sub_vec, + .fno =3D gen_helper_sve_subri_d, + .opc =3D INDEX_op_sub_vec, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + .vece =3D MO_64, + .scalar_first =3D true } + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i64 c; + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + c =3D tcg_const_i64(a->imm); + tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), vsz, vsz, c, &op[a->esz= ]); + tcg_temp_free_i64(c); +} + +static void trans_MUL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_muli(a->esz, vec_full_reg_offset(s, a->rd), + 
vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); +} + +static void do_zzi_sat(DisasContext *s, arg_rri_esz *a, uint32_t insn, + bool u, bool d) +{ + TCGv_i64 val; + + if (a->esz =3D=3D 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + val =3D tcg_const_i64(a->imm); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d); + tcg_temp_free_i64(val); +} + +static void trans_SQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_zzi_sat(s, a, insn, false, false); +} + +static void trans_UQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_zzi_sat(s, a, insn, true, false); +} + +static void trans_SQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_zzi_sat(s, a, insn, false, true); +} + +static void trans_UQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_zzi_sat(s, a, insn, true, true); +} + +static void do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i= *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i64 c =3D tcg_const_i64(a->imm); + + tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + c, vsz, vsz, 0, fn); + tcg_temp_free_i64(c); +} + +#define DO_ZZI(NAME, name) \ +static void trans_##NAME##_zzi(DisasContext *s, arg_rri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_2i * const fns[4] =3D { \ + gen_helper_sve_##name##i_b, gen_helper_sve_##name##i_h, \ + gen_helper_sve_##name##i_s, gen_helper_sve_##name##i_d, \ + }; \ + do_zzi_ool(s, a, fns[a->esz]); \ +} + +DO_ZZI(SMAX, smax) +DO_ZZI(UMAX, umax) +DO_ZZI(SMIN, smin) +DO_ZZI(UMIN, umin) + +#undef DO_ZZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ea1bfe7579..1ede152360 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -43,6 +43,8 @@ =20 # Signed 8-bit immediate, optionally shifted left by 8. 
%sh8_i8s 5:9 !function=3Dexpand_imm_sh8s +# Unsigned 8-bit immediate, optionally shifted left by 8. +%sh8_i8u 5:9 !function=3Dexpand_imm_sh8u =20 # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. @@ -96,6 +98,12 @@ @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=3D%reg_movprfx +@rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \ + &rri_esz rn=3D%reg_movprfx imm=3D%sh8_i8u +@rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \ + &rri_esz rn=3D%reg_movprfx +@rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \ + &rri_esz rn=3D%reg_movprfx =20 # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -630,6 +638,24 @@ FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5 # SVE broadcast integer immediate (unpredicated) DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=3D%sh8_i8s =20 +# SVE integer add/subtract immediate (unpredicated) +ADD_zzi 00100101 .. 100 000 11 . ........ ..... @rdn_sh_i8u +SUB_zzi 00100101 .. 100 001 11 . ........ ..... @rdn_sh_i8u +SUBR_zzi 00100101 .. 100 011 11 . ........ ..... @rdn_sh_i8u +SQADD_zzi 00100101 .. 100 100 11 . ........ ..... @rdn_sh_i8u +UQADD_zzi 00100101 .. 100 101 11 . ........ ..... @rdn_sh_i8u +SQSUB_zzi 00100101 .. 100 110 11 . ........ ..... @rdn_sh_i8u +UQSUB_zzi 00100101 .. 100 111 11 . ........ ..... @rdn_sh_i8u + +# SVE integer min/max immediate (unpredicated) +SMAX_zzi 00100101 .. 101 000 110 ........ ..... @rdn_i8s +UMAX_zzi 00100101 .. 101 001 110 ........ ..... @rdn_i8u +SMIN_zzi 00100101 .. 101 010 110 ........ ..... @rdn_i8s +UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u + +# SVE integer multiply immediate (unpredicated) +MUL_zzi 00100101 .. 110 000 110 ........ ..... 
@rdn_i8s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 151889337987438.71630197303057; Sat, 17 Feb 2018 10:49:39 -0800 (PST) Received: from localhost ([::1]:48263 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7YI-0004YR-Ul for importer@patchew.org; Sat, 17 Feb 2018 13:49:39 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40571) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A5-0001IC-LM for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A3-00021Q-VE for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:37 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:34489) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A3-000214-NV for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:35 -0500 Received: by mail-pg0-x243.google.com with SMTP id m19so4360482pgn.1 for ; Sat, 17 Feb 2018 10:24:35 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:33 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=phitCBYJ3/hkDUn72beMxZ/Q6P0rFSGJrO+mIm6pzNw=; b=fELIXxqgPeTLsz6llK3Nn+TB4c9sygTze5gXA7CYaafPCwe37NEkyYp0Ut4xVJmtNv 7Mx09tp2ajASVTdwzkF0GERr8jqXLk75stt46rQm+mClhDIX4i53I7R1wZYCFv8d9Syb R9gU8F2/VbyBAxP/1ICaAuXsl8ZCWsCunvtro= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=phitCBYJ3/hkDUn72beMxZ/Q6P0rFSGJrO+mIm6pzNw=; b=lPgEPLJMaWk15bDXOifF8bG+2jxt0h6SmsDhRUeVCLNfAdDFe5NwVlmm9ikcZ2fUFc 5l2h3cWF6rOjAmkAIJ4SJ7wWNWLlQgmGfkfxNCXVvqMwXamvnkbBB72+meyesfyPhVrF vDKC0R9QkS3yNB80hZvoTJxJVK33282JaL8B0klwl1ciDatz9Q0d7OGxXaROv6oii3L5 Y0I43aCmUvneihLlhkFZh4/VGu7AkVCMTokG7lCNieAGssafRav9AJbvPqdifMbaiEBJ Lh6FPoTORIJtOMWJ6VZ5OgWUx/4OXI48UyTT6fOpQhLTi61Vqb6s36wnsydZIjQGHIL4 3HAA== X-Gm-Message-State: APf1xPClV9tbQtx593M1S5zLjJXpn7I7SYFl3lIZmsqHue/OS3M0sckB Rx8bftelrYazZrbfVguC9i4xGf/oPYk= X-Google-Smtp-Source: AH8x225ZRLa/QvXlBYuZesQsCJuCCCHzZnsN/ezVuPA3Dvy/CwmEq3UDMuv1Y028gCWrIdlctasutw== X-Received: by 10.167.130.193 with SMTP id f1mr9610466pfn.241.1518891874395; Sat, 17 Feb 2018 10:24:34 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:59 -0800 Message-Id: <20180217182323.25885-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 43/67] target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 14 +++++++ target/arm/helper.h | 19 ++++++++++ target/arm/translate-sve.c | 41 ++++++++++++++++++++ target/arm/vec_helper.c | 94 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/Makefile.objs | 2 +- target/arm/sve.decode | 10 +++++ 6 files changed, 179 insertions(+), 1 deletion(-) create mode 100644 target/arm/vec_helper.c diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 97bfe0f47b..2e76084992 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -705,3 +705,17 @@ DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void,= ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_rsqrts_d, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/helper.h b/target/arm/helper.h index be3c2fcdc0..f3ce58e276 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -565,6 +565,25 @@ DEF_HELPER_2(dc_zva, void, env, i64) DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64) DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64) =20 +DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) + +DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) + +DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) +DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) + +DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 72abcb543a..f9a3ad1434 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3109,6 +3109,47 @@ DO_ZZI(UMIN, umin) =20 #undef DO_ZZI =20 +/* + *** SVE Floating Point Arithmetic - Unpredicated Group + */ + +static void do_zzz_fp(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status; + + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + status 
=3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, 0, fn); +} + + +#define DO_FP3(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[4] =3D { \ + NULL, gen_helper_gvec_##name##_h, \ + gen_helper_gvec_##name##_s, gen_helper_gvec_##name##_d \ + }; \ + do_zzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zzz, fadd) +DO_FP3(FSUB_zzz, fsub) +DO_FP3(FMUL_zzz, fmul) +DO_FP3(FTSMUL, ftsmul) +DO_FP3(FRECPS, recps) +DO_FP3(FRSQRTS, rsqrts) + +#undef DO_FP3 + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c new file mode 100644 index 0000000000..ad5c29cdd5 --- /dev/null +++ b/target/arm/vec_helper.c @@ -0,0 +1,94 @@ +/* + * ARM Shared AdvSIMD / SVE Operations + * + * Copyright (c) 2018 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" +#include "fpu/softfloat.h" + + +/* Floating-point trigonometric starting value. + * See the ARM ARM pseudocode function FPTrigSMul. 
+ */ +static float16 float16_ftsmul(float16 op1, uint16_t op2, float_status *sta= t) +{ + float16 result =3D float16_mul(op1, op1, stat); + if (!float16_is_any_nan(result)) { + result =3D float16_set_sign(result, op2 & 1); + } + return result; +} + +static float32 float32_ftsmul(float32 op1, uint32_t op2, float_status *sta= t) +{ + float32 result =3D float32_mul(op1, op1, stat); + if (!float32_is_any_nan(result)) { + result =3D float32_set_sign(result, op2 & 1); + } + return result; +} + +static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *sta= t) +{ + float64 result =3D float64_mul(op1, op1, stat); + if (!float64_is_any_nan(result)) { + result =3D float64_set_sign(result, op2 & 1); + } + return result; +} + +#define DO_3OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc)= \ +{ = \ + intptr_t i, oprsz =3D simd_oprsz(desc); = \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; = \ + for (i =3D 0; i < oprsz / sizeof(TYPE); i++) { = \ + d[i] =3D FUNC(n[i], m[i], stat); = \ + } = \ +} + +DO_3OP(gvec_fadd_h, float16_add, float16) +DO_3OP(gvec_fadd_s, float32_add, float32) +DO_3OP(gvec_fadd_d, float64_add, float64) + +DO_3OP(gvec_fsub_h, float16_sub, float16) +DO_3OP(gvec_fsub_s, float32_sub, float32) +DO_3OP(gvec_fsub_d, float64_sub, float64) + +DO_3OP(gvec_fmul_h, float16_mul, float16) +DO_3OP(gvec_fmul_s, float32_mul, float32) +DO_3OP(gvec_fmul_d, float64_mul, float64) + +DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16) +DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32) +DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64) + +#ifdef TARGET_AARCH64 + +DO_3OP(gvec_recps_h, helper_recpsf_f16, float16) +DO_3OP(gvec_recps_s, helper_recpsf_f32, float32) +DO_3OP(gvec_recps_d, helper_recpsf_f64, float64) + +DO_3OP(gvec_rsqrts_h, helper_rsqrtsf_f16, float16) +DO_3OP(gvec_rsqrts_s, helper_rsqrtsf_f32, float32) +DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) + +#endif +#undef DO_3OP diff --git a/target/arm/Makefile.objs 
b/target/arm/Makefile.objs index 452ac6f453..50a521876d 100644 --- a/target/arm/Makefile.objs +++ b/target/arm/Makefile.objs @@ -8,7 +8,7 @@ obj-y +=3D translate.o op_helper.o helper.o cpu.o obj-y +=3D neon_helper.o iwmmxt_helper.o obj-y +=3D gdbstub.o obj-$(TARGET_AARCH64) +=3D cpu64.o translate-a64.o helper-a64.o gdbstub64.o -obj-y +=3D crypto_helper.o +obj-y +=3D crypto_helper.o vec_helper.o obj-$(CONFIG_SOFTMMU) +=3D arm-powerctl.o =20 DECODETREE =3D $(SRC_PATH)/scripts/decodetree.py diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1ede152360..42d14994a1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -656,6 +656,16 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_= i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s =20 +### SVE Floating Point Arithmetic - Unpredicated Group + +# SVE floating-point arithmetic (unpredicated) +FADD_zzz 01100101 .. 0 ..... 000 000 ..... ..... @rd_rn_rm +FSUB_zzz 01100101 .. 0 ..... 000 001 ..... ..... @rd_rn_rm +FMUL_zzz 01100101 .. 0 ..... 000 010 ..... ..... @rd_rn_rm +FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm +FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm +FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... 
@rd_rn_rm + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 151889392006087.29086562434452; Sat, 17 Feb 2018 10:58:40 -0800 (PST) Received: from localhost ([::1]:48359 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7gx-0003qo-L2 for importer@patchew.org; Sat, 17 Feb 2018 13:58:35 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40595) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A8-0001Kq-4T for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A5-00022D-PJ for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:40 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:35793) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A5-00021n-H0 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:37 -0500 Received: by mail-pl0-x243.google.com with SMTP id bb3so3441299plb.2 for ; Sat, 17 Feb 2018 10:24:37 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.34 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:35 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=dj1dGlY0FjUyA12pbp3Budo7JX7KcWM9Ev96He6goX4=; b=kdRAj7O62tkXm4TcGmj59SWnYoEIl6YkIh1cV0+xpKdFGaQlFwrSFpW5mP/0H0Tr+v FfX0Hv5C3MUjH/wHsSwAmkeImSfOJFgXhiqcel+jCP7mTpjIiXSUEDaHfinWtqJl4Xqx zzegl7jAT5PGNkt/IWFuDyoRSfB2Y0KcyyMuE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=dj1dGlY0FjUyA12pbp3Budo7JX7KcWM9Ev96He6goX4=; b=csfV8TWdjIBhBgCj85cls+0RRSLng4G/cuyQBQGYmtEB8pElyrSt6vjUAH/xt2L5W+ Vr6adE26szejxJnO3xE+WV0PNDNY1BQmdi2qIbfTX3bssKoAvMlGrbyQ/xks+9hrt3cm LP/F4LR7VJIwlmi7HQY0Be7aiFFZX3YmVgCkrsAxAXVM5/2eqDBztqBPdIZD5AeAf/Ow n9mIc0xoV7Dd2NVRcuAkVMgLgsEr+cknUHO/b1y5eBbbfU4oDwpCtHl/Dm+kyhlrcOJk 9T2vaqeD4CRifgys5Pdfv8g/izf3tPk9sIa/RLP0HQRFnz7PPtK3fG92+vqcwZGsTVAr pVgw== X-Gm-Message-State: APf1xPBBV8doXsI9e+BoFVV3dF1PpL2fBIDvlM93ffB367n0qmbzDOqA R95BbhbaEFYLMJUA/xdb96SL8bvkFzU= X-Google-Smtp-Source: AH8x227JxpKG1CA1bvBAVELcDVo1rjBp8VVy7q6Ke04socZnEc6KJbfqnF7XHRS21TcZ8M+4dellnA== X-Received: by 2002:a17:902:a983:: with SMTP id bh3-v6mr1820821plb.359.1518891876101; Sat, 17 Feb 2018 10:24:36 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:00 -0800 Message-Id: <20180217182323.25885-45-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 44/67] target/arm: Implement SVE Memory Contiguous Load Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 35 +++++++ target/arm/sve_helper.c | 235 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 130 +++++++++++++++++++++++++ target/arm/sve.decode | 44 ++++++++- 4 files changed, 442 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e76084992..fcc9ba5f50 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) 
+DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4f45f11bff..e542725113 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2788,3 +2788,238 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count= , uint32_t pred_desc) =20 return predtest_ones(d, oprsz, esz_mask); } + +/* + * Load contiguous data, protected by a governing predicate. 
+ */ +#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *vd =3D &env->vfp.zregs[rd]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m =3D 0; \ + if (pg & 1) { \ + m =3D FN(env, addr, ra); \ + } \ + *(TYPEE *)(vd + H(i)) =3D m; \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + uint64_t *d =3D &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < oprsz; i +=3D 1) { \ + TYPEM m =3D 0; \ + if (pg[H1(i)] & 1) { \ + m =3D FN(env, addr, ra); \ + } \ + d[i] =3D m; \ + addr +=3D sizeof(TYPEM); \ + } \ +} + +#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 =3D 0, m2 =3D 0; \ + if (pg & 1) { \ + m1 =3D FN(env, addr, ra); \ + m2 =3D FN(env, addr + sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) =3D m1; \ + *(TYPEE *)(d2 + H(i)) =3D m2; \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D 
simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 =3D &env->vfp.zregs[(rd + 2) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 =3D 0, m2 =3D 0, m3 =3D 0; \ + if (pg & 1) { \ + m1 =3D FN(env, addr, ra); \ + m2 =3D FN(env, addr + sizeof(TYPEM), ra); \ + m3 =3D FN(env, addr + 2 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) =3D m1; \ + *(TYPEE *)(d2 + H(i)) =3D m2; \ + *(TYPEE *)(d3 + H(i)) =3D m3; \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 =3D &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 =3D &env->vfp.zregs[(rd + 3) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 =3D 0, m2 =3D 0, m3 =3D 0, m4 =3D 0; \ + if (pg & 1) { \ + m1 =3D FN(env, addr, ra); \ + m2 =3D FN(env, addr + sizeof(TYPEM), ra); \ + m3 =3D FN(env, addr + 2 * sizeof(TYPEM), ra); \ + m4 =3D FN(env, addr + 3 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) =3D m1; \ + *(TYPEE *)(d2 + H(i)) =3D m2; \ + *(TYPEE *)(d3 + H(i)) =3D m3; \ + *(TYPEE *)(d4 + H(i)) =3D m4; \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, 
int8_t, H1_4) +DO_LD1_D(sve_ld1bdu_r, cpu_ldub_data_ra, uint8_t) +DO_LD1_D(sve_ld1bds_r, cpu_ldsb_data_ra, int8_t) + +DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4) +DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4) +DO_LD1_D(sve_ld1hdu_r, cpu_lduw_data_ra, uint16_t) +DO_LD1_D(sve_ld1hds_r, cpu_ldsw_data_ra, int16_t) + +DO_LD1_D(sve_ld1sdu_r, cpu_ldl_data_ra, uint32_t) +DO_LD1_D(sve_ld1sds_r, cpu_ldl_data_ra, int32_t) + +DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) + +DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) + +DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) + +DO_LD1_D(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t) + +void HELPER(sve_ld2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + uint64_t m1 =3D 0, m2 =3D 0; + if (pg[H1(i)] & 1) { + m1 =3D cpu_ldq_data_ra(env, addr, ra); + m2 =3D cpu_ldq_data_ra(env, addr + 8, ra); + } + d1[i] =3D m1; + d2[i] =3D m2; + addr +=3D 2 * 8; + } +} + +void HELPER(sve_ld3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D
GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 =3D &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + uint64_t m1 =3D 0, m2 =3D 0, m3 =3D 0; + if (pg[H1(i)] & 1) { + m1 =3D cpu_ldq_data_ra(env, addr, ra); + m2 =3D cpu_ldq_data_ra(env, addr + 8, ra); + m3 =3D cpu_ldq_data_ra(env, addr + 16, ra); + } + d1[i] =3D m1; + d2[i] =3D m2; + d3[i] =3D m3; + addr +=3D 3 * 8; + } +} + +void HELPER(sve_ld4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 =3D &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 =3D &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + uint64_t m1 =3D 0, m2 =3D 0, m3 =3D 0, m4 =3D 0; + if (pg[H1(i)] & 1) { + m1 =3D cpu_ldq_data_ra(env, addr, ra); + m2 =3D cpu_ldq_data_ra(env, addr + 8, ra); + m3 =3D cpu_ldq_data_ra(env, addr + 16, ra); + m4 =3D cpu_ldq_data_ra(env, addr + 24, ra); + } + d1[i] =3D m1; + d2[i] =3D m2; + d3[i] =3D m3; + d4[i] =3D m4; + addr +=3D 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f9a3ad1434..aa8bfd2ae7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -46,6 +46,8 @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, = TCGv_ptr, typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); =20 +typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); + /* * Helpers for extracting complex instruction fields. */ @@ -86,6 +88,15 @@ static inline int expand_imm_sh8u(int x) return (uint8_t)x << (x & 0x100 ? 
8 : 0); } =20 +/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype) + * with unsigned data. C.f. SVE Memory Contiguous Load Group. + */ +static inline int msz_dtype(int msz) +{ + static const uint8_t dtype[4] =3D { 0, 5, 10, 15 }; + return dtype[msz]; +} + /* * Include the generated decoder. */ @@ -3268,3 +3279,122 @@ static void trans_LDR_pri(DisasContext *s, arg_rri = *a, uint32_t insn) int size =3D pred_full_reg_size(s); do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); } + +/* + *** SVE Memory - Contiguous Load Group + */ + +/* The memory element size of dtype. */ +static const TCGMemOp dtype_mop[16] =3D { + MO_UB, MO_UB, MO_UB, MO_UB, + MO_SL, MO_UW, MO_UW, MO_UW, + MO_SW, MO_SW, MO_UL, MO_UL, + MO_SB, MO_SB, MO_SB, MO_Q +}; + +#define dtype_msz(x) (dtype_mop[x] & MO_SIZE) + +/* The vector element size of dtype. */ +static const uint8_t dtype_esz[16] =3D { + 0, 1, 2, 3, + 3, 1, 2, 3, + 3, 2, 2, 3, + 3, 2, 1, 3 +}; + +static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + gen_helper_gvec_mem *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* For e.g. LD4, there are not enough arguments to pass all 4 + registers as pointers, so encode the regno into the data field. + For consistency, do this even for LD1. 
*/ + desc =3D tcg_const_i32(simd_desc(vsz, vsz, zt)); + t_pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fn(cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + tcg_temp_free_i64(addr); +} + +static void do_ld_zpa(DisasContext *s, int zt, int pg, + TCGv_i64 addr, int dtype, int nreg) +{ + static gen_helper_gvec_mem * const fns[16][4] =3D { + { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r, + gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r }, + { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r, + gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r }, + { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r, + gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r }, + }; + gen_helper_gvec_mem *fn =3D fns[dtype][nreg]; + + /* While there are holes in the table, they are not + accessible via the instruction encoding. 
*/ + assert(fn !=3D NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static void trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + TCGv_i64 addr; + + if (a->rm =3D=3D 31) { + unallocated_encoding(s); + return; + } + + addr =3D tcg_temp_new_i64(); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), + (a->nreg + 1) << dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); +} + +static void trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned elements =3D vsz >> dtype_esz[a->dtype]; + TCGv_i64 addr =3D tcg_temp_new_i64(); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) + << dtype_msz(a->dtype)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); +} + +static void trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t i= nsn) +{ + /* FIXME */ + trans_LD_zprr(s, a, insn); +} + +static void trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t i= nsn) +{ + /* FIXME */ + trans_LD_zpri(s, a, insn); +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 42d14994a1..d2b3869c58 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -42,9 +42,12 @@ %tszimm16_shl 22:2 16:5 !function=3Dtszimm_shl =20 # Signed 8-bit immediate, optionally shifted left by 8. -%sh8_i8s 5:9 !function=3Dexpand_imm_sh8s +%sh8_i8s 5:9 !function=3Dexpand_imm_sh8s # Unsigned 8-bit immediate, optionally shifted left by 8. -%sh8_i8u 5:9 !function=3Dexpand_imm_sh8u +%sh8_i8u 5:9 !function=3Dexpand_imm_sh8u + +# Unsigned load of msz into esz=3D2, represented as a dtype. +%msz_dtype 23:2 !function=3Dmsz_dtype =20 # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. 
@@ -72,6 +75,8 @@ &incdec2_cnt rd rn pat esz imm d u &incdec_pred rd pg esz d u &incdec2_pred rd rn pg esz d u +&rprr_load rd pg rn rm dtype nreg +&rpri_load rd pg rn imm dtype nreg =20 ########################################################################### # Named instruction formats. These are generally used to @@ -171,6 +176,15 @@ @incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ &incdec2_pred rn=3D%reg_movprfx =20 +# Loads; user must fill in NREG. +@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load +@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load + +@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_load dtype=3D%msz_dtype +@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ + &rpri_load dtype=3D%msz_dtype + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -673,3 +687,29 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_= rn_i9 =20 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 + +### SVE Memory Contiguous Load Group + +# SVE contiguous load (scalar plus scalar) +LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=3D0 + +# SVE contiguous first-fault load (scalar plus scalar) +LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=3D0 + +# SVE contiguous load (scalar plus immediate) +LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=3D0 + +# SVE contiguous non-fault load (scalar plus immediate) +LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=3D0 + +# SVE contiguous non-temporal load (scalar plus scalar) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus scalar) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... 
@rprr_load_msz + +# SVE contiguous non-temporal load (scalar plus immediate) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus immediate) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894798885942.4935643552072; Sat, 17 Feb 2018 11:13:18 -0800 (PST) Received: from localhost ([::1]:48479 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7vB-00006T-SK for importer@patchew.org; Sat, 17 Feb 2018 14:13:17 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40617) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7A9-0001LR-Ga for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A7-00023r-Fe for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:41 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:46297) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A7-00023Q-7f for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:39 -0500 Received: by mail-pl0-x244.google.com with SMTP id x19so3428626plr.13 for ; Sat, 17 Feb 2018 10:24:39 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id 
h15sm13466712pfi.56.2018.02.17.10.24.36 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DlfGTdNWi8xMhHNSKulM72AN7w8vQPIOR0ZxZAqGW9M=; b=OOStrE8SG5sLNenC8VHYkLz3zpB2NaXAQtZQ32+a58mXcDwhw2Gk/bE1LPWibS4nOC m112gW2ng8XfgI454iXfUQSlBanWdtTkNaqO8/EZkl3FKe89c/ynb3YuSCat4gK7hzMX A1fhxeFMYpsoVgJUC3zSv8Ems0q0CFB8IOi/Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DlfGTdNWi8xMhHNSKulM72AN7w8vQPIOR0ZxZAqGW9M=; b=jTH/jG4+fDgnRR0bJePbXbFCLmF/VkmJ2vWhRq8XVNwKZBeosZ+pg/KaQrmCa+Dft2 PZwRjRGO5wwr54LqySNsB8OsCNvu63d5lTYmDWNlimCLm3YCloaIsBSZLt/fm94nH3wI Z3OGciENiCDOoRRES6PtsCjuCYXmEVNlvtqN2QgcSsY0CZytHbkiXTTAWrRubbGTG4m7 4u+wSjKLd0kIFiydOh14gMvz203SxcXx2u19DIqqQ1rd9AnBnigRjnOqWSSWWSwIEr7W V5IDEck9Q723yQ4sTJGR22tJd0C9wm9Zl6zHBm1722t8VvmljJBycDq9btjwYY9WAVy0 UxOA== X-Gm-Message-State: APf1xPAN8e9Sd1fPVc+7EZhJusLO0tz+jvbgKpw02icsb8nr6q6MLIam AHuxYuicxXqi3C2OOEqK6L9e7GbvDVk= X-Google-Smtp-Source: AH8x226LWdipet6JkulVBxpeYS61K7jsjb59Bd/qRJI99/Qnfrgx6fHyB9LbY5NyxIwGIUhN4JTSqA== X-Received: by 2002:a17:902:8646:: with SMTP id y6-v6mr9633644plt.406.1518891877821; Sat, 17 Feb 2018 10:24:37 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:01 -0800 Message-Id: <20180217182323.25885-46-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 45/67] target/arm: Implement SVE Memory Contiguous Store Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 29 +++++++ target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 68 ++++++++++++++- target/arm/sve.decode | 38 ++++++++ 4 files changed, 343 insertions(+), 3 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fcc9ba5f50..74c2d642a3 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -754,3 +754,32 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void,= env, ptr, tl, i32) =20 DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) 
+DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e542725113..e259e910de 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3023,3 +3023,214 @@ void HELPER(sve_ld4dd_r)(CPUARMState *env, void *vg, addr +=3D 4 * 8; } } + +/* + * Store contiguous data, protected by a governing predicate. 
+ */ +#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *vd =3D &env->vfp.zregs[rd]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m =3D *(TYPEE *)(vd + H(i)); \ + FN(env, addr, m, ra); \ + } \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + uint64_t *d =3D &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < oprsz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + FN(env, addr, d[i], ra); \ + } \ + addr +=3D sizeof(TYPEM); \ + } \ +} + +#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 =3D *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 =3D *(TYPEE *)(d2 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + } \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D 
simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 =3D &env->vfp.zregs[(rd + 2) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 =3D *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 =3D *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 =3D *(TYPEE *)(d3 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + } \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc); \ + intptr_t ra =3D GETPC(); \ + unsigned rd =3D simd_data(desc); \ + void *d1 =3D &env->vfp.zregs[rd]; \ + void *d2 =3D &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 =3D &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 =3D &env->vfp.zregs[(rd + 3) & 31]; \ + for (i =3D 0; i < oprsz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 =3D *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 =3D *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 =3D *(TYPEE *)(d3 + H(i)); \ + TYPEM m4 =3D *(TYPEE *)(d4 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \ + } \ + i +=3D sizeof(TYPEE), pg >>=3D sizeof(TYPEE); \ + addr +=3D 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2) +DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4) +DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t) + +DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4) +DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t) + +DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t) + +DO_ST1(sve_st1bb_r, 
cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) + +DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) + +DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) + +DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t) + +void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + } + addr +=3D 2 * 8; + } +} + +void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 =3D &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + } + addr +=3D 3 * 8; + } +} + +void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + 
intptr_t i, oprsz =3D simd_oprsz(desc) / 8; + intptr_t ra =3D GETPC(); + unsigned rd =3D simd_data(desc); + uint64_t *d1 =3D &env->vfp.zregs[rd].d[0]; + uint64_t *d2 =3D &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 =3D &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 =3D &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg =3D vg; + + for (i =3D 0; i < oprsz; i +=3D 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + cpu_stq_data_ra(env, addr + 24, d4[i], ra); + } + addr +=3D 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa8bfd2ae7..fda9a56fd5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3320,7 +3320,6 @@ static void do_mem_zpa(DisasContext *s, int zt, int p= g, TCGv_i64 addr, =20 tcg_temp_free_ptr(t_pg); tcg_temp_free_i32(desc); - tcg_temp_free_i64(addr); } =20 static void do_ld_zpa(DisasContext *s, int zt, int pg, @@ -3368,7 +3367,7 @@ static void trans_LD_zprr(DisasContext *s, arg_rprr_l= oad *a, uint32_t insn) return; } =20 - addr =3D tcg_temp_new_i64(); + addr =3D new_tmp_a64(s); tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << dtype_msz(a->dtype)); tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); @@ -3379,7 +3378,7 @@ static void trans_LD_zpri(DisasContext *s, arg_rpri_l= oad *a, uint32_t insn) { unsigned vsz =3D vec_full_reg_size(s); unsigned elements =3D vsz >> dtype_esz[a->dtype]; - TCGv_i64 addr =3D tcg_temp_new_i64(); + TCGv_i64 addr =3D new_tmp_a64(s); =20 tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), (a->imm * elements * (a->nreg + 1)) @@ -3398,3 +3397,66 @@ static void trans_LDNF1_zpri(DisasContext *s, arg_rp= ri_load *a, uint32_t insn) /* FIXME */ trans_LD_zpri(s, a, insn); } + +static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + int msz, int esz, int nreg) +{ + static gen_helper_gvec_mem * const fn_single[4][4] =3D { + { 
gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r, + gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r }, + { NULL, gen_helper_sve_st1hh_r, + gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r }, + { NULL, NULL, + gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r }, + { NULL, NULL, NULL, gen_helper_sve_st1dd_r }, + }; + static gen_helper_gvec_mem * const fn_multiple[3][4] =3D { + { gen_helper_sve_st1hh_r, gen_helper_sve_st2hh_r, + gen_helper_sve_st3hh_r, gen_helper_sve_st4hh_r }, + { gen_helper_sve_st1ss_r, gen_helper_sve_st2ss_r, + gen_helper_sve_st3ss_r, gen_helper_sve_st4ss_r }, + { gen_helper_sve_st1dd_r, gen_helper_sve_st2dd_r, + gen_helper_sve_st3dd_r, gen_helper_sve_st4dd_r }, + }; + gen_helper_gvec_mem *fn; + + if (nreg =3D=3D 0) { + /* ST1 */ + fn =3D fn_single[msz][esz]; + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + } else { + /* ST2, ST3, ST4 -- msz =3D=3D esz, enforced by encoding */ + assert(msz =3D=3D esz); + fn =3D fn_multiple[msz][nreg - 1]; + } + do_mem_zpa(s, zt, pg, addr, fn); +} + +static void trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t ins= n) +{ + TCGv_i64 addr; + + if (a->rm =3D=3D 31) { + unallocated_encoding(s); + return; + } + + addr =3D new_tmp_a64(s); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); +} + +static void trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t ins= n) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned elements =3D vsz >> a->esz; + TCGv_i64 addr =3D new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) << a->msz); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d2b3869c58..41b8cd8746 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -28,6 +28,7 @@ %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 
+%size_23 23:2 =20 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=3Dtszimm_esz @@ -77,6 +78,8 @@ &incdec2_pred rd rn pg esz d u &rprr_load rd pg rn rm dtype nreg &rpri_load rd pg rn imm dtype nreg +&rprr_store rd pg rn rm msz esz nreg +&rpri_store rd pg rn imm msz esz nreg =20 ########################################################################### # Named instruction formats. These are generally used to @@ -185,6 +188,12 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=3D%msz_dtype =20 +# Stores; user must fill in ESZ, MSZ, NREG as needed. +@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store +@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_= store +@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_store nreg=3D0 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -713,3 +722,32 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @= rprr_load_msz # SVE load multiple structures (scalar plus immediate) # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz + +### SVE Memory Store Group + +# SVE contiguous store (scalar plus immediate) +# ST1B, ST1H, ST1W, ST1D; require msz <=3D esz +ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ + @rpri_store_msz nreg=3D0 + +# SVE contiguous store (scalar plus scalar) +# ST1B, ST1H, ST1W, ST1D; require msz <=3D esz +# Enumerate msz lest we conflict with STR_zri. +ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=3D0 +ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=3D1 +ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=3D2 +ST_zprr 1110010 11 11 ..... 010 ... ..... ..... 
\ + @rprr_store msz=3D3 esz=3D3 nreg=3D0 + +# SVE contiguous non-temporal store (scalar plus immediate) (nreg =3D=3D = 0) +# SVE store multiple structures (scalar plus immediate) (nreg !=3D 0) +ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \ + @rpri_store_msz esz=3D%size_23 + +# SVE contiguous non-temporal store (scalar plus scalar) (nreg =3D=3D = 0) +# SVE store multiple structures (scalar plus scalar) (nreg !=3D 0) +ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ + @rprr_store esz=3D%size_23 --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895010754305.52249137498416; Sat, 17 Feb 2018 11:16:50 -0800 (PST) Received: from localhost ([::1]:48569 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7yb-0003Nb-V8 for importer@patchew.org; Sat, 17 Feb 2018 14:16:50 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40639) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AB-0001Mr-7I for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:44 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7A8-00024w-Oq for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:43 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:34790) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7A8-00024Y-Jq for 
qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:40 -0500 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:02 -0800 Message-Id: <20180217182323.25885-47-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 46/67] target/arm: Implement SVE load and broadcast quadword X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 51 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 9 ++++++++ 2 files changed, 60 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fda9a56fd5..7b21102b7e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3398,6 +3398,57 @@ static void trans_LDNF1_zpri(DisasContext *s, arg_rp= ri_load *a, uint32_t insn) trans_LD_zpri(s, a, insn); } =20 +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int ms= z) +{ + static gen_helper_gvec_mem * const fns[4] =3D { + gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r, + gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* Load the first quadword using the normal predicated load helpers. = */ + desc =3D tcg_const_i32(simd_desc(16, 16, zt)); + t_pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fns[msz](cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + + /* Replicate that first quadword. 
*/ + if (vsz > 16) { + unsigned dofs =3D vec_full_reg_offset(s, zt); + tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16); + } +} + +static void trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t i= nsn) +{ + TCGv_i64 addr; + int msz =3D dtype_msz(a->dtype); + + if (a->rm =3D=3D 31) { + unallocated_encoding(s); + return; + } + + addr =3D new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ldrq(s, a->rd, a->pg, addr, msz); +} + +static void trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t i= nsn) +{ + TCGv_i64 addr =3D new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); + do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 41b8cd8746..6c906e25e9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -723,6 +723,15 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @= rprr_load_msz # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz =20 +# SVE load and broadcast quadword (scalar plus scalar) +LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ + @rprr_load_msz nreg=3D0 + +# SVE load and broadcast quadword (scalar plus immediate) +# LD1RQB, LD1RQH, LD1RQS, LD1RQD +LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... 
\ + @rpri_load_msz nreg=3D0 + ### SVE Memory Store Group =20 # SVE contiguous store (scalar plus immediate) --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893587119298.42090006904357; Sat, 17 Feb 2018 10:53:07 -0800 (PST) Received: from localhost ([::1]:48295 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7be-0007Ro-2Z for importer@patchew.org; Sat, 17 Feb 2018 13:53:06 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40666) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AC-0001On-Mi for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AA-00025r-UN for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:44 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:38170) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AA-00025S-Lt for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:42 -0500 Received: by mail-pl0-x241.google.com with SMTP id h10so3439633plt.5 for ; Sat, 17 Feb 2018 10:24:42 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.39 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:40 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7ABdzz1wITfCNiODAQGc4uzbhTYqYtjwPZz9V9qFcXA=; b=WIhqa3mih7ta9C/mSlgyt29PjH06zUxuey5fGbCpXsZMB37Ycu70Aymdq7FPknrFPF K2ip6LnGuh/RJPbHghxloZGepjGEmq8Npdi6gGCMXBKyAPM2+F4xrgqdX3AuGCR7ByOI 6sdlWqCxcvFuE7p9BTb6qmlGHIQ3z2rK2XwGo= From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:03 -0800 Message-Id: <20180217182323.25885-48-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 47/67] target/arm: Implement SVE integer convert to floating-point X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 +++++++++++++++ target/arm/sve_helper.c | 52 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 22 +++++++++++ 4 files changed, 196 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 74c2d642a3..fb7609f9ef 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e259e910de..a1e0ceb5fb 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,58 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count,= uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } =20 +/* Fully general two-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t des= c) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + for (i =3D 0; i < opr_sz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, status); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZ_FP_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t des= c) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + d[i] =3D OP(n[i], status); \ + } \ + } \ +} + +DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) +DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) +DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) +DO_ZPZ_FP_D(sve_scvt_sd, uint64_t, int32_to_float64) +DO_ZPZ_FP_D(sve_scvt_dh, uint64_t, int64_to_float16) +DO_ZPZ_FP_D(sve_scvt_ds, uint64_t, int64_to_float32) +DO_ZPZ_FP_D(sve_scvt_dd, uint64_t, int64_to_float64) + +DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16) +DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16) +DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32) +DO_ZPZ_FP_D(sve_ucvt_sd, uint64_t, uint32_to_float64) +DO_ZPZ_FP_D(sve_ucvt_dh, uint64_t, uint64_to_float16) +DO_ZPZ_FP_D(sve_ucvt_ds, uint64_t, uint64_to_float32) +DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64) + +#undef DO_ZPZ_FP +#undef DO_ZPZ_FP_D + /* * Load contiguous data, protected by a governing predicate. 
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7b21102b7e..05c684222e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3161,6 +3161,98 @@ DO_FP3(FRSQRTS, rsqrts) =20 #undef DO_FP3 =20 + +/* + *** SVE Floating Point Unary Operations Predicated Group + */ + +static void do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, + bool is_fp16, gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status; + + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + status =3D get_fpstatus_ptr(is_fp16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + status, vsz, vsz, 0, fn); +} + +static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); +} + +static void trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh); +} + +static void trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh); +} + +static void trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss); +} + +static void trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds); +} + +static void trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd); +} + +static void trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd); +} + +static void trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh); +} + +static void trans_UCVTF_sh(DisasContext *s, 
arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh); +} + +static void trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh); +} + +static void trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss); +} + +static void trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds); +} + +static void trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd); +} + +static void trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6c906e25e9..b571b70050 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -134,6 +134,9 @@ @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz =20 +# One register operand, with governing predicate, no vector element size +@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=3D0 + # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri =20 @@ -689,6 +692,25 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_r= n_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm =20 +### SVE FP Unary Operations Predicated Group + +# SVE integer convert to floating-point +SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... 
@rd_pg_rn_e0 +SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 + +UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group =20 # SVE load predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893765193702.394544586058; Sat, 17 Feb 2018 10:56:05 -0800 (PST) Received: from localhost ([::1]:48325 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7eW-0001cA-AL for importer@patchew.org; Sat, 17 Feb 2018 13:56:04 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40693) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AE-0001Qz-Ez for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) 
(envelope-from ) id 1en7AC-00026n-Kw for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:46 -0500 From: Richard Henderson To: 
qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:04 -0800 Message-Id: <20180217182323.25885-49-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 48/67] target/arm: Implement SVE floating-point arithmetic (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 77 ++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 107 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 47 ++++++++++++++++++++ target/arm/sve.decode | 17 +++++++ 4 files changed, 248 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fb7609f9ef..84d0a8978c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_s, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, 
i32) + +DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a1e0ceb5fb..d80babfae7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,113 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count= , uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } =20 +/* Fully general three-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + for (i =3D 0; i < opr_sz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + TYPE mm =3D *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, mm, status); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZZ_FP_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + d[i] =3D OP(n[i], m[i], status); \ + } \ + } \ +} + +DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add) +DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add) +DO_ZPZZ_FP_D(sve_fadd_d, uint64_t, float64_add) + +DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub) +DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub) +DO_ZPZZ_FP_D(sve_fsub_d, uint64_t, float64_sub) + +DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul) +DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul) +DO_ZPZZ_FP_D(sve_fmul_d, uint64_t, float64_mul) + +DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div) +DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div) +DO_ZPZZ_FP_D(sve_fdiv_d, uint64_t, float64_div) + +DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min) +DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min) +DO_ZPZZ_FP_D(sve_fmin_d, uint64_t, float64_min) + +DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max) +DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max) +DO_ZPZZ_FP_D(sve_fmax_d, uint64_t, float64_max) + +DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum) +DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum) +DO_ZPZZ_FP_D(sve_fminnum_d, uint64_t, float64_minnum) + +DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum) +DO_ZPZZ_FP_D(sve_fmaxnum_d, uint64_t, float64_maxnum) + +static inline uint16_t abd_h(float16 a, float16 b, float_status *s) +{ + return float16_abs(float16_sub(a, b, s)); + +} + +static inline uint32_t abd_s(float32 a, float32 b, float_status *s) +{ + return float32_abs(float32_sub(a, b, s)); + +} + +static inline uint64_t abd_d(float64 a, float64 b, float_status *s) +{ + return 
float64_abs(float64_sub(a, b, s)); + +} + +DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h) +DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s) +DO_ZPZZ_FP_D(sve_fabd_d, uint64_t, abd_d) + +static inline uint64_t scalbn_d(float64 a, int64_t b, float_status *s) +{ + int b_int =3D MIN(MAX(b, INT_MIN), INT_MAX); + return float64_scalbn(a, b_int, s); +} + +DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn) +DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn) +DO_ZPZZ_FP_D(sve_fscalbn_d, uint64_t, scalbn_d) + +DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh) +DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs) +DO_ZPZZ_FP_D(sve_fmulx_d, uint64_t, helper_vfp_mulxd) + +#undef DO_ZPZZ_FP +#undef DO_ZPZZ_FP_D + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 05c684222e..1692980d20 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3161,6 +3161,52 @@ DO_FP3(FRSQRTS, rsqrts) =20 #undef DO_FP3 =20 +/* + *** SVE Floating Point Arithmetic - Predicated Group + */ + +static void do_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status; + + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_FP3(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] =3D { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + do_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zpzz, fadd) +DO_FP3(FSUB_zpzz, fsub) 
+DO_FP3(FMUL_zpzz, fmul) +DO_FP3(FMIN_zpzz, fmin) +DO_FP3(FMAX_zpzz, fmax) +DO_FP3(FMINNM_zpzz, fminnum) +DO_FP3(FMAXNM_zpzz, fmaxnum) +DO_FP3(FABD, fabd) +DO_FP3(FSCALE, fscalbn) +DO_FP3(FDIV, fdiv) +DO_FP3(FMULX, fmulx) + +#undef DO_FP3 =20 /* *** SVE Floating Point Unary Operations Prediated Group @@ -3181,6 +3227,7 @@ static void do_zpz_ptr(DisasContext *s, int rd, int r= n, int pg, vec_full_reg_offset(s, rn), pred_full_reg_offset(s, pg), status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); } =20 static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b571b70050..1a13c603ff 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -692,6 +692,23 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_r= n_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm =20 +### SVE FP Arithmetic Predicated Group + +# SVE floating-point arithmetic (predicated) +FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm +FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR +FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm +FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm +FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm +FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm +FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm +FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm +FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm +FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR +FDIV 01100101 .. 00 1101 100 ... ..... ..... 
@rdn_pg_rm + ### SVE FP Unary Operations Predicated Group =20 # SVE integer convert to floating-point --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895258794586.6465501657779; Sat, 17 Feb 2018 11:20:58 -0800 (PST) Received: from localhost ([::1]:49489 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en82Z-0007qJ-Jc for importer@patchew.org; Sat, 17 Feb 2018 14:20:55 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40710) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AF-0001SM-Ko for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:49 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AD-00027L-U8 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:47 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:45436) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AD-000276-MH for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:45 -0500 Received: by mail-pl0-x243.google.com with SMTP id p5so3428969plo.12 for ; Sat, 17 Feb 2018 10:24:45 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.43 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:43 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=N/uyfV2EnnPtS8uXZFNB+wp3yS83EN710i4Qp70kVz4=; b=H/MUIYn1loOyvN60iecRE0FQjCwd/BjSnnyc15z+QbX2c/H8rVKH38gGmh604YWggC 0FKT8NXeIKlxk4p9mXj1VpQS3Vw942fgR+Zv9AP1fWaMKC5QxzeYiaF3OHKU7YZSINoa QiqPd7izm5AMwGANa1dlHyrrfkUfzI/Wg5oMY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=N/uyfV2EnnPtS8uXZFNB+wp3yS83EN710i4Qp70kVz4=; b=VF7u2CAImX0FKxYzpwqgGQRbOGzGSkxAdhHxAbMPTo1Ldd7z7XXi6Csx6DqLT99kVU woi9t8SwpgpZphcqxKXTQ96i7aihTI76jcbwKA6fIp1C6vvtjLf6BpYJdPMUIw7kBhiq e/XFrmvjw1LWAGrNEmfGt/Zg5vu+NnbcVO9Lk4WYrnysbS28k8n+/WIUntSERrl//jT6 VzwzE1EqP0XFOG7pdg1pM7ZxKIrDspsdEas9abi61mlg7NmQTksG/24ilzpPzJwn/kKb V4PthdGjEUXI8dy/IXczW5dOsMrynwb4G1KcstmoGyCWE8Hu4wm42RvD3G0Jwvkg5h0h EZHg== X-Gm-Message-State: APf1xPCDvNBKBYNDdzeFNcLiJxPwe+VDNoEFQEh52961dUgihMgMYf+b bxMLypynfC8G0isplcSH/Ser5YMGACY= X-Google-Smtp-Source: AH8x227vUBrxj+wMTFEJi+VZF2kkIGO8K+988jlw3KCo5KnSuXvRWs9Ps+bCTqJ60+dEtEMvTuokZg== X-Received: by 2002:a17:902:7808:: with SMTP id p8-v6mr9622082pll.161.1518891884401; Sat, 17 Feb 2018 10:24:44 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:05 -0800 Message-Id: <20180217182323.25885-50-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 49/67] target/arm: Implement SVE FP Multiply-Add Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++++++++++++ target/arm/sve_helper.c | 53 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++++++++++ 4 files changed, 127 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 84d0a8978c..a95f077c7f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) 
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d80babfae7..6622275b44 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2948,6 +2948,59 @@ DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64) #undef DO_ZPZ_FP #undef DO_ZPZ_FP_D =20 +/* 4-operand predicated multiply-add. This requires 7 operands to pass + * "properly", so we need to encode some of the registers into DESC. + */ +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); + +#define DO_FMLA(NAME, N, H, NEG1, NEG3) = \ +void HELPER(NAME)(CPUARMState *env, void *vg, uint32_t desc) = \ +{ = \ + intptr_t i =3D 0, opr_sz =3D simd_oprsz(desc); = \ + unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); = \ + unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); = \ + unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); = \ + unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); = \ + void *vd =3D &env->vfp.zregs[rd]; = \ + void *vn =3D &env->vfp.zregs[rn]; = \ + void *vm =3D &env->vfp.zregs[rm]; = \ + void *va =3D &env->vfp.zregs[ra]; = \ + do { = \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); = \ + do { = \ + if (likely(pg & 1)) { = \ + float##N e1 =3D *(uint##N##_t *)(vn + H(i)); = \ + float##N e2 =3D *(uint##N##_t *)(vm + H(i)); = \ + float##N e3 =3D *(uint##N##_t *)(va + H(i)); = \ + float##N r; = \ + if (NEG1) e1 =3D float##N##_chs(e1); = \ + if (NEG3) e3 =3D float##N##_chs(e3); = \ + r =3D float##N##_muladd(e1, e2, e3, 0, &env->vfp.fp_status= ); \ + *(uint##N##_t *)(vd + H(i)) =3D r; = \ + } = \ + i +=3D sizeof(float##N), pg >>=3D sizeof(float##N); = \ + } while (i & 15); = \ + } while 
(i < opr_sz); = \ +} + +DO_FMLA(sve_fmla_zpzzz_h, 16, H1_2, 0, 0) +DO_FMLA(sve_fmla_zpzzz_s, 32, H1_4, 0, 0) +DO_FMLA(sve_fmla_zpzzz_d, 64, , 0, 0) + +DO_FMLA(sve_fmls_zpzzz_h, 16, H1_2, 1, 0) +DO_FMLA(sve_fmls_zpzzz_s, 32, H1_4, 1, 0) +DO_FMLA(sve_fmls_zpzzz_d, 64, , 1, 0) + +DO_FMLA(sve_fnmla_zpzzz_h, 16, H1_2, 1, 1) +DO_FMLA(sve_fnmla_zpzzz_s, 32, H1_4, 1, 1) +DO_FMLA(sve_fnmla_zpzzz_d, 64, , 1, 1) + +DO_FMLA(sve_fnmls_zpzzz_h, 16, H1_2, 0, 1) +DO_FMLA(sve_fnmls_zpzzz_s, 32, H1_4, 0, 1) +DO_FMLA(sve_fnmls_zpzzz_d, 64, , 0, 1) + +#undef DO_FMLA + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1692980d20..3124368fb5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3208,6 +3208,47 @@ DO_FP3(FMULX, fmulx) =20 #undef DO_FP3 =20 +typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); + +static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla= *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg =3D tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. 
+ */ + desc =3D deposit32(a->rd, 5, 5, a->rn); + desc =3D deposit32(desc, 10, 5, a->rm); + desc =3D deposit32(desc, 15, 5, a->ra); + desc =3D simd_desc(vsz, vsz, desc); + + t_desc =3D tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); +} + +#define DO_FMLA(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn)= \ +{ \ + static gen_helper_sve_fmla * const fns[4] =3D { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + do_fmla(s, a, fns[a->esz]); \ +} + +DO_FMLA(FMLA_zpzzz, fmla_zpzzz) +DO_FMLA(FMLS_zpzzz, fmls_zpzzz) +DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz) +DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) + +#undef DO_FMLA + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1a13c603ff..817833f96e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -129,6 +129,8 @@ &rprrr_esz ra=3D%reg_movprfx @rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \ &rprrr_esz rn=3D%reg_movprfx +@rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ + &rprrr_esz rn=3D%reg_movprfx =20 # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -709,6 +711,21 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn= _pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm =20 +### SVE FP Multiply-Add Group + +# SVE floating-point multiply-accumulate writing addend +FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm +FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm +FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm +FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... 
@rda_pg_rn_rm + +# SVE floating-point multiply-accumulate writing multiplicand +# FMAD, FMSB, FNMAD, FNMSB +FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra +FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra +FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra +FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra + ### SVE FP Unary Operations Predicated Group =20 # SVE integer convert to floating-point --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895438519831.5258955602146; Sat, 17 Feb 2018 11:23:58 -0800 (PST) Received: from localhost ([::1]:49522 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en85V-0001us-L9 for importer@patchew.org; Sat, 17 Feb 2018 14:23:57 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40737) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AG-0001Ts-Us for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:52 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AF-00028D-Nx for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:48 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:43140) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AF-00027k-8N for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:47 -0500 Received: by 
mail-pl0-x244.google.com with SMTP id f4so3436597plr.10 for ; Sat, 17 Feb 2018 10:24:47 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.44 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cokBBhuDkWzo28jRgpOw6j/fAbncWNu8+0qc6t2uB2A=; b=VAo/oF8A+CY8ToSIg/4fbQP52ZtyxwcUn0hg4PSmo+ru7LWv7WHA/UuerjwCRlOzz4 NnjH5dK+8OVJLUlan+plCftRZjgD1QsyQN+DxXWcSbAqdrGNnroV4dRLCwNc4Mi5xvoT EEP9qMp2RVsDTH1sKJYoz3Fa2ib42s6P3/pzU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cokBBhuDkWzo28jRgpOw6j/fAbncWNu8+0qc6t2uB2A=; b=qKoeP2YVJIkdXrJ00DRMSmtjMTytztNtEHC12ZrJTlXMNU16OgzsBJwh08uTJVa0fn u/2pOUA7RRBSNUXuG64oi/5+AaSWMq/qkxzAxyTUdWCtMdVpPCL42Av5I+dr4miWQL9a GZ6lCedj2MfnG+qr+ADVFeeIONu1ernqu1cwjIIakCZsqi8UcbccY3cqha5Fs1bt5C/t vGwczhP/Vj77AAGyQnDju09GWy54Vsf9nXR6ExY/6F0889UG63wsv7/l0ZKLr/3Z7wXQ 8PU2XDBkmfvt1l20oRgrHMCaPjGFWrxOQ3CpRJge82SYwo6A/8M0vfA5XmmBs+7F/o+5 h8IQ== X-Gm-Message-State: APf1xPAtstjNX1KbKB5HtLjHZJXcwWmQ1uME2Lizt6deFs4a5LQF4fAT xPSzVXnPZVQ4elcNTe5hzlshM1cEWyY= X-Google-Smtp-Source: AH8x225i8K5eWnpZ34KCpQWGE/DYbuVxImf+7xbFwVTSH1Vuv87gBUFibyekkC9iro90vqOpmtOz6w== X-Received: by 2002:a17:902:2bc5:: with SMTP id l63-v6mr9565360plb.108.1518891886027; Sat, 17 Feb 2018 10:24:46 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:06 -0800 Message-Id: <20180217182323.25885-51-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not 
recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 50/67] target/arm: Implement SVE Floating Point Accumulating Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 7 ++++++ target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 42 ++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 +++++ 4 files changed, 110 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a95f077c7f..c4502256d5 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6622275b44..0e2b3091b0 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count,= uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } =20 +uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, + void *status, 
uint32_t desc) +{ + intptr_t i =3D 0, opr_sz =3D simd_oprsz(desc); + float16 result =3D nn; + + do { + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float16 mm =3D *(float16 *)(vm + H1_2(i)); + result =3D float16_add(result, mm, status); + } + i +=3D sizeof(float16), pg >>=3D sizeof(float16); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i =3D 0, opr_sz =3D simd_oprsz(desc); + float32 result =3D nn; + + do { + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float32 mm =3D *(float32 *)(vm + H1_4(i)); + result =3D float32_add(result, mm, status); + } + i +=3D sizeof(float32), pg >>=3D sizeof(float32); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i =3D 0, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *m =3D vm; + uint8_t *pg =3D vg; + + for (i =3D 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + nn =3D float64_add(nn, m[i], status); + } + } + + return nn; +} + /* Fully general three-operand expander, controlled by a predicate, * With the extra float_status parameter. 
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3124368fb5..32f0340738 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3120,6 +3120,48 @@ DO_ZZI(UMIN, umin) =20 #undef DO_ZZI =20 +/* + *** SVE Floating Point Accumulating Reduction Group + */ + +static void trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + static fadda_fn * const fns[3] =3D { + gen_helper_sve_fadda_h, + gen_helper_sve_fadda_s, + gen_helper_sve_fadda_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr t_rm, t_pg, t_fpst; + TCGv_i64 t_val; + TCGv_i32 t_desc; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + + t_val =3D load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->es= z); + t_rm =3D tcg_temp_new_ptr(); + t_pg =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + t_fpst =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + t_desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + + fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_fpst); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_rm); + + write_fp_dreg(s, a->rd, t_val); + tcg_temp_free_i64(t_val); +} + /* *** SVE Floating Point Arithmetic - Unpredicated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 817833f96e..95a290aed0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -684,6 +684,11 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_= i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s =20 +### SVE FP Accumulating Reduction Group + +# SVE floating-point serial reduction (predicated) +FADDA 01100101 .. 011 000 001 ... ..... ..... 
@rdn_pg_rm + ### SVE Floating Point Arithmetic - Unpredicated Group =20 # SVE floating-point arithmetic (unpredicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894993356647.7894198896346; Sat, 17 Feb 2018 11:16:33 -0800 (PST) Received: from localhost ([::1]:48568 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7yK-00039a-Cl for importer@patchew.org; Sat, 17 Feb 2018 14:16:32 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40779) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AK-0001YK-6W for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AH-000292-3g for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:52 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:41329) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AG-00028c-SX for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:49 -0500 Received: by mail-pl0-x241.google.com with SMTP id k8so3439026pli.8 for ; Sat, 17 Feb 2018 10:24:48 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:46 -0800 (PST) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=zzawyRmfL2KGIFwz+BVgdCeNmKyizy+NZ2+wOIU6zVw=; b=ioy7bBnl43iUet/aePnOz6XoxwlBK5M3a+tlgeYlQo7rbaCYNDvx3xKURxe/+Fzbk5 Dl93j2rfX6xPDPlRUQ7egH+/PbhnoeRinFIxKEUDPej1r/Xn+NPm2WTK43knEqPsghO+ wA3DNZFHnqNRy96mCycDXpx8SpoZ0Iq+FkG1A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=zzawyRmfL2KGIFwz+BVgdCeNmKyizy+NZ2+wOIU6zVw=; b=TMOfD+VA8aIgI+catPBgw/gEFXiw4TH9hwKKgt65kvVa3dAdZp5Yt755FqB4Lblqed gWUp0VUBrS3DCKOJTfsrjfmJzM47uZFHDNQxBuGTBycw2UPG/+jwsjq3EDabP8WyOdl5 Bc6sonIvbQUBLECaGFV/jFWOOUbRU6t39YqTy9AGICDjmLE12e0WWOynr1xyaJNlPS3L 6XvbbSJsG2QaHV3MhJXCUuIgQuE/HJ9DRjHsVGvU+G3ao5M0UV36kjYuvo69JmfjH0Yt WlyOYUucu1QMH3JCH+NL1jyqOEOcHM7P+b3DMnwan78CSEDSe0PCzg/LC6Hpt2uCFQWt ZxPg== X-Gm-Message-State: APf1xPDs8ah4iTXeimAF7NCH44UPjoebA7joVUnS0KOufSi6dW8Dyqh3 8s6XwG/7Sncm0mzIlrB8AkWL0GIZK/A= X-Google-Smtp-Source: AH8x227IMqdf2tZV2qL7KGqJ1A7J9cX53mJHgX7KsBciwKV78steNtr0z5MTSq8yHVredcahQ+tJlg== X-Received: by 2002:a17:902:8509:: with SMTP id bj9-v6mr9563834plb.386.1518891887554; Sat, 17 Feb 2018 10:24:47 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:07 -0800 Message-Id: <20180217182323.25885-52-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 51/67] target/arm: Implement SVE load and broadcast element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 55 ++++++++++++++++++++++++++++++++++++++++++= +++- target/arm/sve.decode | 5 +++++ 4 files changed, 107 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c4502256d5..6c640a92ff 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, p= tr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_clri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 0e2b3091b0..a7dc6f6164 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -994,6 +994,49 @@ 
void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t de= sc) } } =20 +/* Store zero into every inactive element of Zd. */ +void HELPER(sve_clri_b)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_clri_h)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_clri_s)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_clri_d)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + if (!(pg[H1(i)] & 1)) { + d[i] =3D 0; + } + } +} + /* Three-operand expander, immediate operand, controlled by a predicate. */ #define DO_ZPZI(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 32f0340738..b000a2482e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -584,6 +584,19 @@ static void do_clr_zp(DisasContext *s, int rd, int pg,= int esz) vsz, vsz, 0, fns[esz]); } =20 +/* Store zero into every inactive element of Zd. 
*/ +static void do_clr_inactive_zp(DisasContext *s, int rd, int pg, int esz) +{ + static gen_helper_gvec_2 * const fns[4] =3D { + gen_helper_sve_clri_b, gen_helper_sve_clri_h, + gen_helper_sve_clri_s, gen_helper_sve_clri_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + static void do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, gen_helper_gvec_3 *fn) { @@ -3506,7 +3519,7 @@ static void trans_LDR_pri(DisasContext *s, arg_rri *a= , uint32_t insn) *** SVE Memory - Contiguous Load Group */ =20 -/* The memory element size of dtype. */ +/* The memory mode of the dtype. */ static const TCGMemOp dtype_mop[16] =3D { MO_UB, MO_UB, MO_UB, MO_UB, MO_SL, MO_UW, MO_UW, MO_UW, @@ -3671,6 +3684,46 @@ static void trans_LD1RQ_zpri(DisasContext *s, arg_rp= ri_load *a, uint32_t insn) do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); } =20 +/* Load and broadcast element. */ +static void trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t in= sn) +{ + unsigned vsz =3D vec_full_reg_size(s); + unsigned psz =3D pred_full_reg_size(s); + unsigned esz =3D dtype_esz[a->dtype]; + TCGLabel *over =3D gen_new_label(); + TCGv_i64 temp; + + /* If the guarding predicate has no bits set, no load occurs. */ + if (psz <=3D 8) { + temp =3D tcg_temp_new_i64(); + tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg)); + tcg_gen_andi_i64(temp, temp, + deposit64(0, 0, psz * 8, pred_esz_masks[esz])); + tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over); + tcg_temp_free_i64(temp); + } else { + TCGv_i32 t32 =3D tcg_temp_new_i32(); + find_last_active(s, t32, esz, a->pg); + tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over); + tcg_temp_free_i32(t32); + } + + /* Load the data. */ + temp =3D tcg_temp_new_i64(); + tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm); + tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s), + s->be_data | dtype_mop[a->dtype]); + + /* Broadcast to *all* elements. 
*/ + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, temp); + tcg_temp_free_i64(temp); + + /* Zero the inactive elements. */ + gen_set_label(over); + do_clr_inactive_zp(s, a->rd, a->pg, esz); +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 95a290aed0..3e30985a09 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -29,6 +29,7 @@ %imm9_16_10 16:s6 10:3 %preg4_5 5:4 %size_23 23:2 +%dtype_23_13 23:2 13:2 =20 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=3Dtszimm_esz @@ -758,6 +759,10 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_= rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 =20 +# SVE load and broadcast element +LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ + &rpri_load dtype=3D%dtype_23_13 nreg=3D0 + ### SVE Memory Contiguous Load Group =20 # SVE contiguous load (scalar plus scalar) --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 151889422721358.58095330123422; Sat, 17 Feb 2018 11:03:47 -0800 (PST) Received: from localhost ([::1]:48405 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7ly-00005F-CX for importer@patchew.org; Sat, 17 Feb 2018 14:03:46 -0500 Received: from eggs.gnu.org 
([2001:4830:134:3::10]:40780) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AK-0001YL-6v for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AI-00029j-HF for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:52 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:38172) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AI-00029H-8q for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:50 -0500 Received: by mail-pl0-x243.google.com with SMTP id h10so3439731plt.5 for ; Sat, 17 Feb 2018 10:24:50 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.47 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=szV9p2aTTZ+9ZvXrV44xeIdEsoeNuGFa0EWnSbjdE/M=; b=MvXSwKgV4KRlsBmoQxUd7Y+CIlWGYasYDm4sXGBUtiOpbCnE6YNApp9YGxhcpKBYuy +0AZYN7MAMbsC9wgNTApclQica0EU8GbEq8jaN1nYIgkckAMS8TjYJ2D1vuaHuvAQjYJ dCKeQsIW7gs4q5x6qAc+N3lYkfztfnC3X3OiA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=szV9p2aTTZ+9ZvXrV44xeIdEsoeNuGFa0EWnSbjdE/M=; b=XmfZX3MdUGRQFrV5XGvH41vYiLI9zpGFDD2FPlMUl5CYUT0tBasjQf5E9s/oOwMU5z XHYjO8uwZ5ImPbLcGXadNIyF4SL+wU3DzkrxeRiMCXc0Djnm70j0h3zWDxKCMz3wUvf9 UEBIbnWreBzyivD8pZTFP2LiMj5NYq/jzI+BFVGlGDaZXDXJZdGSP9jZZEL+DcHI3XXw F8gwupWumQ+nSd9al5j5M+srjXBjpsGoDMYLb5M5EJHE+WQcAaULzsHQM7c86h/sZT07 8sVtn7xKMq0aAee25UISBiAER2tPVilJYQ9w7bihas44+Q9tNcxPlzqFvP5LgvMw1IjR jRMQ== X-Gm-Message-State: APf1xPBgIniup/srer5+pIm2BP/wk3mn+0i+327wA/uQwLYYZF6XmjaY XJSEySDf7yuvYkntBsAaraRAXLQU/As= X-Google-Smtp-Source: 
AH8x224p6Nr5P4r7+FebMOogU2huyhVI6XIfaKHCMpTNMFSSS/JLpQZW1PTEIHwKclLGjfShOX9DFg== X-Received: by 2002:a17:902:461:: with SMTP id 88-v6mr9362814ple.88.1518891889000; Sat, 17 Feb 2018 10:24:49 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:08 -0800 Message-Id: <20180217182323.25885-53-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 52/67] target/arm: Implement SVE store vector/predicate register X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 101 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 6 +++ 2 files changed, 107 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b000a2482e..9c724980a0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3501,6 +3501,95 @@ static void do_ldr(DisasContext *s, uint32_t vofs, u= int32_t len, tcg_temp_free_i64(t0); } =20 +/* Similarly for stores. 
*/ +static void do_str(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ + uint32_t len_align =3D QEMU_ALIGN_DOWN(len, 8); + uint32_t len_remain =3D len % 8; + uint32_t nparts =3D len / 8 + ctpop8(len_remain); + int midx =3D get_mem_index(s); + TCGv_i64 addr, t0; + + addr =3D tcg_temp_new_i64(); + t0 =3D tcg_temp_new_i64(); + + /* Note that unpredicated load/store of vector/predicate registers + * are defined as a stream of bytes, which equates to little-endian + * operations on larger quantities. There is no nice way to force + * a little-endian store for aarch64_be-linux-user out of line. + * + * Attempt to keep code expansion to a minimum by limiting the + * amount of unrolling done. + */ + if (nparts <=3D 4) { + int i; + + for (i =3D 0; i < len_align; i +=3D 8) { + tcg_gen_ld_i64(t0, cpu_env, vofs + i); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + } + } else { + TCGLabel *loop =3D gen_new_label(); + TCGv_ptr i =3D TCGV_NAT_TO_PTR(glue(tcg_const_local_, ptr)(0)); + TCGv_ptr src; + + gen_set_label(loop); + + src =3D tcg_temp_new_ptr(); + tcg_gen_add_ptr(src, cpu_env, i); + tcg_gen_ld_i64(t0, src, vofs); + + /* Minimize the number of local temps that must be re-read from + * the stack each iteration. Instead, re-compute values other + * than the loop counter. + */ + tcg_gen_addi_ptr(src, i, imm); +#if UINTPTR_MAX =3D=3D UINT32_MAX + tcg_gen_extu_i32_i64(addr, TCGV_PTR_TO_NAT(src)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn)); +#else + tcg_gen_add_i64(addr, TCGV_PTR_TO_NAT(src), cpu_reg_sp(s, rn)); +#endif + tcg_temp_free_ptr(src); + + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + + tcg_gen_addi_ptr(i, i, 8); + + glue(tcg_gen_brcondi_, ptr)(TCG_COND_LTU, TCGV_PTR_TO_NAT(i), + len_align, loop); + tcg_temp_free_ptr(i); + } + + /* Predicate register stores can be any multiple of 2.
*/ + if (len_remain) { + tcg_gen_ld_i64(t0, cpu_env, vofs + len_align); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align); + + switch (len_remain) { + case 2: + case 4: + case 8: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain)); + break; + + case 6: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL); + tcg_gen_addi_i64(addr, addr, 4); + tcg_gen_shri_i64(t0, t0, 32); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW); + break; + + default: + g_assert_not_reached(); + } + } + tcg_temp_free_i64(addr); + tcg_temp_free_i64(t0); +} + #undef ptr =20 static void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) @@ -3515,6 +3604,18 @@ static void trans_LDR_pri(DisasContext *s, arg_rri *= a, uint32_t insn) do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); } =20 +static void trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + int size =3D vec_full_reg_size(s); + do_str(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} + +static void trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + int size =3D pred_full_reg_size(s); + do_str(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} + /* *** SVE Memory - Contiguous Load Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 3e30985a09..5d8e1481d7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -800,6 +800,12 @@ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ =20 ### SVE Memory Store Group =20 +# SVE store predicate register +STR_pri 1110010 11 0. ..... 000 ... ..... 0 .... @pd_rn_i9 + +# SVE store vector register +STR_zri 1110010 11 0. ..... 010 ... ..... ..... @rd_rn_i9 + # SVE contiguous store (scalar plus immediate) # ST1B, ST1H, ST1W, ST1D; require msz <=3D esz ST_zpri 1110010 .. esz:2 0.... 111 ... ..... .....
\ --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518893956215642.9762148903325; Sat, 17 Feb 2018 10:59:16 -0800 (PST) Received: from localhost ([::1]:48360 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7hb-0004UF-Dr for importer@patchew.org; Sat, 17 Feb 2018 13:59:15 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40813) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AL-0001bz-OB for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AK-0002AG-2v for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:53 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:37016) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AJ-00029z-Ri for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:52 -0500 Received: by mail-pl0-x241.google.com with SMTP id ay8so3443369plb.4 for ; Sat, 17 Feb 2018 10:24:51 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=IhToWs1C7AovvETP6XOvXMuMMPnPAh0jKOUin6dQbJU=; b=QB9g90y2TEkyU1V27BzKlf0vX0kK5U2vJncmfjTKWsy4dmrrR7r/oGGoPZlIu/zvUd l8/GFpDFU8ohWjjQ36ULEbIvZnU1XsLrSXauYEbJBMpIXecpxmAc6P52AtLMGR1RLnri t+Dirc7B5c7M8M5TcFTIqj80cmYLkI0K/4Gu4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=IhToWs1C7AovvETP6XOvXMuMMPnPAh0jKOUin6dQbJU=; b=Ht510ic2dQlEk3XZBRNdjlQTlVNJ34Xd8yxlASzdpcjsx94xQd5gRfYPss2whq4ds9 wpf8E9gZcF5YM1wjillLoPkrSLGKnnOQsn+/I84hPH9YiQBHOkTCsp3D1rImBx2uYAwZ gfUMioKkTAP1evFHyiQkEcAB19b78ZgFNiRYMSvyaoZjoQpOWVUzzzJloC2Vksref+c+ V3SABcL/xlSRbcapE3Ht25sv0t8NUoO1g5OBdTPc1JA3Qf8zw4kpdL7ckI2nwBDc+5MA bo5pOYZ3RoVRSsLKjOyQRYM+LetiTKjHtSFP1ogThJYRvxkCTkzuJAch44/AX3Gt9OnW xTcg== X-Gm-Message-State: APf1xPDkx/QvW4Z4QObLO/oml/Ss8eXUa1IUkfnw6/zaSy5yTcxnLvrc 1cqw/wSz2Icet4NP7xeExfupObv/xRY= X-Google-Smtp-Source: AH8x225GxDz/AUak1xXpAMHxnaNfA0YLPwRPyyQMXyA0iLbK2wtYBX7nmLY+oRgLV+LQPWDkLfUqXQ== X-Received: by 2002:a17:902:b488:: with SMTP id y8-v6mr8856075plr.432.1518891890498; Sat, 17 Feb 2018 10:24:50 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:09 -0800 Message-Id: <20180217182323.25885-54-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 53/67] target/arm: Implement SVE scatter stores X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 41 ++++++++++++++++++++++++++ target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 71 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 39 +++++++++++++++++++++++++ 4 files changed, 213 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6c640a92ff..b5c093f2fd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -918,3 +918,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, = env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) =20 DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG, + 
void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a7dc6f6164..07b3d285f2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3545,3 +3545,65 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, addr +=3D 4 * 8; } } + +/* Stores with a vector index. 
*/ + +#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; \ + unsigned scale =3D simd_data(desc); \ + uintptr_t ra =3D GETPC(); \ + uint32_t *d =3D vd; TYPEI *m =3D vm; uint8_t *pg =3D vg; = \ + for (i =3D 0; i < oprsz; i++) { \ + uint8_t pp =3D pg[H1(i)]; \ + if (pp & 0x01) { \ + target_ulong off =3D (target_ulong)m[H4(i * 2)] << scale; \ + FN(env, base + off, d[H4(i * 2)], ra); \ + } \ + if (pp & 0x10) { \ + target_ulong off =3D (target_ulong)m[H4(i * 2 + 1)] << scale; \ + FN(env, base + off, d[H4(i * 2 + 1)], ra); \ + } \ + } \ +} + +#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz =3D simd_oprsz(desc) / 8; \ + unsigned scale =3D simd_data(desc); \ + uintptr_t ra =3D GETPC(); \ + uint64_t *d =3D vd, *m =3D vm; uint8_t *pg =3D vg; = \ + for (i =3D 0; i < oprsz; i++) { \ + if (pg[H1(i)] & 1) { \ + target_ulong off =3D (target_ulong)(TYPEI)m[i] << scale; \ + FN(env, base + off, d[i], ra); \ + } \ + } \ +} + +DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, 
cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9c724980a0..ca49b94924 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -47,6 +47,8 @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, = TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); =20 typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); +typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i64, TCGv_i32); =20 /* * Helpers for extracting complex instruction fields. @@ -3887,3 +3889,72 @@ static void trans_ST_zpri(DisasContext *s, arg_rpri_= store *a, uint32_t insn) (a->imm * elements * (a->nreg + 1)) << a->msz); do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); } + +/* + *** SVE gather loads / scatter stores + */ + +static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, + TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, scale)); + TCGv_ptr t_zm =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + TCGv_ptr t_zt =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm)); + tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt)); + fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc); + + tcg_temp_free_ptr(t_zt); + tcg_temp_free_ptr(t_zm); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) +{ + /* Indexed by [xs][msz]. 
*/ + static gen_helper_gvec_mem_scatter * const fn32[2][3] =3D { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, + }; + static gen_helper_gvec_mem_scatter * const fn64[3][4] =3D { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, + }; + gen_helper_gvec_mem_scatter *fn; + + if (a->esz < a->msz || (a->msz =3D=3D 0 && a->scale)) { + unallocated_encoding(s); + return; + } + switch (a->esz) { + case MO_32: + fn =3D fn32[a->xs][a->msz]; + break; + case MO_64: + fn =3D fn64[a->xs][a->msz]; + break; + default: + g_assert_not_reached(); + } + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5d8e1481d7..edd9340c02 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -81,6 +81,7 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_scatter_store rd pg rn rm esz msz xs scale =20 ########################################################################### # Named instruction formats. These are generally used to @@ -199,6 +200,8 @@ @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_= store @rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ &rprr_store nreg=3D0 +@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_scatter_store =20 ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -832,3 +835,39 @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... 
\ # SVE store multiple structures (scalar plus scalar) (nreg !=3D 0) ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ @rprr_store esz=3D%size_23 + +# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets) +# Require msz > 0 && msz <=3D esz. +ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=3D0 esz=3D2 scale=3D1 +ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=3D1 esz=3D2 scale=3D1 + +# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets) +# Require msz <=3D esz. +ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=3D0 esz=3D2 scale=3D0 +ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=3D1 esz=3D2 scale=3D0 + +# SVE 64-bit scatter store (scalar plus 64-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=3D2 esz=3D3 scale=3D1 + +# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=3D2 esz=3D3 scale=3D0 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=3D0 esz=3D3 scale=3D1 +ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=3D1 esz=3D3 scale=3D1 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=3D0 esz=3D3 scale=3D0 +ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... 
\ + @rprr_scatter_store xs=3D1 esz=3D3 scale=3D0 --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894411274762.9754611867834; Sat, 17 Feb 2018 11:06:51 -0800 (PST) Received: from localhost ([::1]:48426 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7ow-0002st-Ce for importer@patchew.org; Sat, 17 Feb 2018 14:06:50 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40821) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AM-0001cr-H0 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AL-0002Au-EB for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:54 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:40423) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AL-0002Ad-8u for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:53 -0500 Received: by mail-pl0-x244.google.com with SMTP id g18so3435893plo.7 for ; Sat, 17 Feb 2018 10:24:53 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.50 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=d4HbPykktVYLS4tE1HcEFAw5Qf7NVL1FMMy69rPPEE0=; b=TnhRGHO515AMY9RNZj7jrkToPGL9H070saLqxSGas1cvhg9KJ0QIZanFDGXLYt+Fdy EmkG7fi6omiG23Z9YmNoQjC37e3FKDhsv/3vkN4wukDIslXiJiq7G3iCtCIFYpgEfyKc cxJlStYyhnzD8FTiT0cA5FeNKVeOMGDJy0lLc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=d4HbPykktVYLS4tE1HcEFAw5Qf7NVL1FMMy69rPPEE0=; b=HL9r2fIKIL42/J3UW2Ya8ddL/OFbJ1Sb9M2D9cYYU/q44Qteb21mygNIY+ri3b/E+T d+PPbF6F3664f75EfyM4A40NUh7kITz0kcTpRZP/u+ZFlB5snoFG4DO4qaTw2Y15KOuI q3mINQbrdrvz+HDRzMAsHATRvbnRts4ZUT/3KvZHULnfAV3ctMjdywCGWF8BfEoThS7t /0Dgr9S+PZOkAd7CfJ/JiH9GsJwcNY+Z1BwpbBbF9LHR7i6c60ORjE8MvtTPN++bLbMf bwV8X9znlA5ceQlYDJx3GdOWMU4J4a71hBSfWC/Nr1b05vaIReWYFiDqkTyznHuNZp1x 5KWw== X-Gm-Message-State: APf1xPDNRqsIQLKl1jKe68g6TGW/6H5EVmtMhf+K/wNcwxT3yTK69Li8 St7RBbGNR5i3gdYb+ngG8SoHmL3hXjI= X-Google-Smtp-Source: AH8x2246PX7oSq6ddtVI79EboBOh7VEnAql6sDaE/4SnI3jLjmIvA/jKeZJJA4xVw3Widjs5lQaQsg== X-Received: by 2002:a17:902:ab85:: with SMTP id f5-v6mr9594980plr.199.1518891892033; Sat, 17 Feb 2018 10:24:52 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:10 -0800 Message-Id: <20180217182323.25885-55-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 54/67] target/arm: Implement SVE prefetches X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 9 +++++++++ target/arm/sve.decode | 23 +++++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ca49b94924..63c7a0e8d8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3958,3 +3958,12 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_= zprz *a, uint32_t insn) do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, cpu_reg_sp(s, a->rn), fn); } + +/* + * Prefetches + */ + +static void trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn) +{ + /* Prefetch is a nop within QEMU. */ +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index edd9340c02..f0144aa2d0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -801,6 +801,29 @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... 
\ @rpri_load_msz nreg=3D0 =20 +# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets) +PRF 1000010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 32-bit gather prefetch (vector plus immediate) +PRF 1000010 -- 00 ----- 111 --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus immediate) +PRF 1000010 11 1- ----- 0-- --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus scalar) +PRF 1000010 -- 00 ----- 110 --- ----- 0 ---- + +### SVE Memory 64-bit Gather Group + +# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) +PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets) +PRF 1100010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (vector plus immediate) +PRF 1100010 -- 00 ----- 111 --- ----- 0 ---- + ### SVE Memory Store Group =20 # SVE store predicate register --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895649823979.1016891864393; Sat, 17 Feb 2018 11:27:29 -0800 (PST) Received: from localhost ([::1]:50396 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en88n-0005p7-55 for importer@patchew.org; Sat, 17 Feb 2018 14:27:21 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40853) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AP-0001fN-8p for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:59 -0500 
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AN-0002Bt-I9 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:57 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:46298) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AN-0002BW-8w for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:55 -0500 Received: by mail-pl0-x244.google.com with SMTP id x19so3428833plr.13 for ; Sat, 17 Feb 2018 10:24:55 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.52 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:52 -0800 (PST) X-Received: by 2002:a17:902:bf01:: with SMTP id bi1-v6mr9308650plb.254.1518891893898; Sat, 17
Feb 2018 10:24:53 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:11 -0800 Message-Id: <20180217182323.25885-56-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 55/67] target/arm: Implement SVE gather loads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 67 ++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 75 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 53 +++++++++++++++++++++++++ 4 files changed, 292 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b5c093f2fd..3cb7ab9ef2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -919,6 +919,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zsu,
TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) 
+DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 07b3d285f2..4edd3d4367 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3546,6 +3546,81 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, } } +/* Loads with a vector index. */ + +#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + uint8_t pp = pg[H1(i)]; \ + if (pp & 0x01) { \ + target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \ + d[H4(i * 2)] = (TYPEM)FN(env, base + off, ra); \ + } \ + if (pp & 0x10) { \ + target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \ + d[H4(i * 2 + 1)] = (TYPEM)FN(env, base + off, ra); \ + } \ + } \ +} + +#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + if (pg[H1(i)] & 1) { \ + target_ulong off =
(target_ulong)(TYPEI)m[i] << scale; \ + d[i] = (TYPEM)FN(env, base + off, ra); \ + } \ + } \ +} + +DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t,
cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) + /* Stores with a vector index. */ #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 63c7a0e8d8..6484ecd257 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3914,6 +3914,103 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, tcg_temp_free_i32(desc); } +/* Indexed by [xs][u][msz]. */ +static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = { + { { gen_helper_sve_ldbss_zsu, + gen_helper_sve_ldhss_zsu, + NULL, }, + { gen_helper_sve_ldbsu_zsu, + gen_helper_sve_ldhsu_zsu, + gen_helper_sve_ldssu_zsu, } }, + { { gen_helper_sve_ldbss_zss, + gen_helper_sve_ldhss_zss, + NULL, }, + { gen_helper_sve_ldbsu_zss, + gen_helper_sve_ldhsu_zss, + gen_helper_sve_ldssu_zss, } }, +}; + +static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = { + { { gen_helper_sve_ldbds_zsu, + gen_helper_sve_ldhds_zsu, + gen_helper_sve_ldsds_zsu, + NULL, }, + { gen_helper_sve_ldbdu_zsu, + gen_helper_sve_ldhdu_zsu, + gen_helper_sve_ldsdu_zsu, + gen_helper_sve_ldddu_zsu, } }, + { { gen_helper_sve_ldbds_zss, + gen_helper_sve_ldhds_zss, + gen_helper_sve_ldsds_zss, + NULL, }, + { gen_helper_sve_ldbdu_zss, + gen_helper_sve_ldhdu_zss, + gen_helper_sve_ldsdu_zss, + gen_helper_sve_ldddu_zss, } }, + { { gen_helper_sve_ldbds_zd, + gen_helper_sve_ldhds_zd, + gen_helper_sve_ldsds_zd, + NULL, }, + { gen_helper_sve_ldbdu_zd, + gen_helper_sve_ldhdu_zd, + gen_helper_sve_ldsdu_zd, + gen_helper_sve_ldddu_zd, } }, +}; + +static void trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + + if (a->esz < a->msz + || (a->msz == 0 && a->scale) + || (a->esz == a->msz && !a->u)) { + unallocated_encoding(s); + return; + } + + /* TODO: handle LDFF1.
*/ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[a->xs][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[a->xs][a->u][a->msz]; + break; + } + assert(fn != NULL); + + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); +} + +static void trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz || (a->esz == a->msz && !a->u)) { + unallocated_encoding(s); + return; + } + + /* TODO: handle LDFF1. */ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[0][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[2][a->u][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x]) + by loading the immediate into the scalar parameter. */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); +} + static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { /* Indexed by [xs][msz]. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f0144aa2d0..f85d82e009 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -81,6 +81,8 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_gather_load rd pg rn rm esz msz u ff xs scale +&rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale ########################################################################### @@ -195,6 +197,18 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Gather Loads. +@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 .
u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rpri_gather_load + # Stores; user must fill in ESZ, MSZ, NREG as needed. @rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store @@ -766,6 +780,19 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ &rpri_load dtype=%dtype_23_13 nreg=0 +# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets) +# SVE 32-bit gather load (scalar plus 32-bit scaled offsets) +LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=2 msz=0 scale=0 +LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=1 +LD1_zprz 1000010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=2 + +# SVE 32-bit gather load (vector plus immediate) +LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=2 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar) @@ -815,6 +842,32 @@ PRF 1000010 -- 00 ----- 110 --- ----- 0 ---- ### SVE Memory 64-bit Gather Group +# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets) +# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets) +LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 .. ..... 0.. ... ..... .....
\ + @rprr_g_load_xs_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets) +# SVE 64-bit gather load (scalar plus 64-bit scaled offsets) +LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \ + @rprr_g_load_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (vector plus immediate) +LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=3 + # SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- -- 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895834120970.9153881497022; Sat, 17 Feb 2018 11:30:34 -0800 (PST) Received: from localhost ([::1]:50773 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en8Bs-0000M9-T0 for importer@patchew.org; Sat, 17 Feb 2018 14:30:33 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40867) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AQ-0001gq-6k for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:01 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AO-0002CF-Ua for
qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:58 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:45437) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AO-0002C4-Me for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:56 -0500 Received: by mail-pl0-x243.google.com with SMTP id p5so3429097plo.12 for ; Sat, 17 Feb 2018 10:24:56 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:54 -0800 (PST) X-Received: by 2002:a17:902:6ac2:: with SMTP id i2-v6mr1484963plt.368.1518891895405; Sat, 17 Feb 2018 10:24:55 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:12
-0800 Message-Id: <20180217182323.25885-57-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 56/67] target/arm: Implement SVE scatter store vector immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 79 +++++++++++++++++++++++++++++++--------------- target/arm/sve.decode | 11 +++++++ 2 files changed, 65 insertions(+), 25 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6484ecd257..0241e8e707 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4011,31 +4011,33 @@ static void trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) tcg_temp_free_i64(imm); } +/* Indexed by [xs][msz].
*/ +static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, +}; + +static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, +}; + static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { - /* Indexed by [xs][msz]. */ - static gen_helper_gvec_mem_scatter * const fn32[2][3] = { - { gen_helper_sve_stbs_zsu, - gen_helper_sve_sths_zsu, - gen_helper_sve_stss_zsu, }, - { gen_helper_sve_stbs_zss, - gen_helper_sve_sths_zss, - gen_helper_sve_stss_zss, }, - }; - static gen_helper_gvec_mem_scatter * const fn64[3][4] = { - { gen_helper_sve_stbd_zsu, - gen_helper_sve_sthd_zsu, - gen_helper_sve_stsd_zsu, - gen_helper_sve_stdd_zsu, }, - { gen_helper_sve_stbd_zss, - gen_helper_sve_sthd_zss, - gen_helper_sve_stsd_zss, - gen_helper_sve_stdd_zss, }, - { gen_helper_sve_stbd_zd, - gen_helper_sve_sthd_zd, - gen_helper_sve_stsd_zd, - gen_helper_sve_stdd_zd, }, - }; gen_helper_gvec_mem_scatter *fn; if (a->esz < a->msz || (a->msz == 0 && a->scale)) { @@ -4044,10 +4046,10 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) } switch (a->esz) { case MO_32: - fn = fn32[a->xs][a->msz]; + fn = scatter_store_fn32[a->xs][a->msz]; break; case MO_64: - fn = fn64[a->xs][a->msz]; + fn = scatter_store_fn64[a->xs][a->msz]; break; default: g_assert_not_reached(); @@ -4056,6 +4058,33 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) cpu_reg_sp(s, a->rn), fn); } +static
void trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz) { + unallocated_encoding(s); + return; + } + + switch (a->esz) { + case MO_32: + fn = scatter_store_fn32[0][a->msz]; + break; + case MO_64: + fn = scatter_store_fn64[2][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x]) + by loading the immediate into the scalar parameter. */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); +} + /* * Prefetches */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f85d82e009..6ccb4289fc 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -84,6 +84,7 @@ &rprr_gather_load rd pg rn rm esz msz u ff xs scale &rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale +&rpri_scatter_store rd pg rn imm esz msz ########################################################################### # Named instruction formats. These are generally used to @@ -216,6 +217,8 @@ &rprr_store nreg=0 @rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ &rprr_scatter_store +@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ + &rpri_scatter_store ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -935,6 +938,14 @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ @rprr_scatter_store xs=2 esz=3 scale=0 +# SVE 64-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \ + @rpri_scatter_store esz=3 + +# SVE 32-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 11 ..... 101 ... ..... .....
\ + @rpri_scatter_store esz=2 + # SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) # Require msz > 0 ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ -- 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894588686640.9636451754745; Sat, 17 Feb 2018 11:09:48 -0800 (PST) Received: from localhost ([::1]:48449 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7rn-0005UK-QH for importer@patchew.org; Sat, 17 Feb 2018 14:09:47 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40914) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AT-0001l4-5B for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AQ-0002DC-IC for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:01 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:32806) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AQ-0002Cb-Au for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:58 -0500 Received: by mail-pf0-x244.google.com with SMTP id b8so525773pfh.0 for ; Sat, 17 Feb 2018 10:24:58 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.55 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305
bits=256/256); Sat, 17 Feb 2018 10:24:55 -0800 (PST) X-Received: by 10.99.113.90 with SMTP id b26mr8168202pgn.10.1518891896945; Sat, 17 Feb 2018 10:24:56 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:13 -0800 Message-Id: <20180217182323.25885-58-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 57/67] target/arm: Implement SVE floating-point compare vectors X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 49 +++++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++ target/arm/sve.decode | 11 ++++++++ 4 files changed, 165 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3cb7ab9ef2..30373e3fc7 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG, +
void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4edd3d4367..ace613684d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3100,6 +3100,70 @@ DO_FMLA(sve_fnmls_zpzzz_d, 64, , 1, 1) #undef DO_FMLA +/* Two operand floating-point comparison controlled by a predicate. + * Unlike the integer version, we are not allowed to optimistically + * compare operands, since the comparison may have side effects wrt + * the FPSR.
+ */ +#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t opr_sz =3D simd_oprsz(desc); \ + intptr_t i =3D opr_sz, j =3D ((opr_sz - 1) & -64) >> 3; = \ + do { \ + uint64_t out =3D 0; \ + uint64_t pg =3D *(uint64_t *)(vg + j); \ + do { \ + i -=3D sizeof(TYPE), out <<=3D sizeof(TYPE); = \ + if ((pg >> (i & 63)) & 1) { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + TYPE mm =3D *(TYPE *)(vm + H(i)); \ + out |=3D OP(TYPE, nn, mm, status); \ + } \ + } while (i & 63); \ + *(uint64_t *)(vd + j) =3D out; \ + j -=3D 8; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZZ_D(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZZ_ALL(NAME, OP) \ + DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ_D(NAME, OP) + +#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <=3D 0 +#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) =3D=3D 0 +#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) !=3D 0 +#define DO_FCMUO(TYPE, X, Y, ST) \ + TYPE##_compare_quiet(X, Y, ST) =3D=3D float_relation_unordered +#define DO_FACGE(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <=3D 0 +#define DO_FACGT(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0 + +DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE) +DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT) +DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ) +DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE) +DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO) +DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE) +DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) + +#undef DO_FPCMP_PPZZ_ALL +#undef DO_FPCMP_PPZZ_D +#undef DO_FPCMP_PPZZ_S +#undef DO_FPCMP_PPZZ_H +#undef DO_FPCMP_PPZZ + /* * Load contiguous 
data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0241e8e707..8fcb9dd2be 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3265,6 +3265,47 @@ DO_FP3(FMULX, fmulx) =20 #undef DO_FP3 =20 +static void do_fp_cmp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status; + + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + + status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_FPCMP(NAME, name) \ +static void trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] =3D { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + do_fp_cmp(s, a, fns[a->esz]); \ +} + +DO_FPCMP(FCMGE, fcmge) +DO_FPCMP(FCMGT, fcmgt) +DO_FPCMP(FCMEQ, fcmeq) +DO_FPCMP(FCMNE, fcmne) +DO_FPCMP(FCMUO, fcmuo) +DO_FPCMP(FACGE, facge) +DO_FPCMP(FACGT, facgt) + +#undef DO_FPCMP + typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); =20 static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla= *fn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6ccb4289fc..f82cef2d7e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -321,6 +321,17 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_= rn SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn =20 +### SVE Floating Point Compare - Vectors Group + +# SVE floating-point compare vectors +FCMGE_ppzz 01100101 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +FCMGT_ppzz 01100101 .. 0 ..... 010 ... ..... 1 .... 
@pd_pg_rn_rm +FCMEQ_ppzz 01100101 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +FCMNE_ppzz 01100101 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +FCMUO_ppzz 01100101 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +FACGE_ppzz 01100101 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +FACGT_ppzz 01100101 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Integer Multiply-Add Group =20 # SVE integer multiply-add writing addend (predicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894182917113.72402787948943; Sat, 17 Feb 2018 11:03:02 -0800 (PST) Received: from localhost ([::1]:48402 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7lC-0007tM-7d for importer@patchew.org; Sat, 17 Feb 2018 14:02:58 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40921) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AT-0001lc-GZ for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AS-0002Dn-0f for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:01 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:37889) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AR-0002DS-Oc for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:24:59 -0500 Received: by 
mail-pg0-x243.google.com with SMTP id l24so4354929pgc.5 for ; Sat, 17 Feb 2018 10:24:59 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=oHroeQ0pW6nI9r1pBVle/D5G83QwYsIdCkVximFFDyU=; b=eSaB5noFZRf1UWPb5mMz49m7GVFXpYPd0DlVoOpcubgaaPWOsc+aVd1Pj/vbPfuo2O vm9uCwojs6QevwjuphuP3/VAeRsW3llcI4qkyJ8c1qX2D37HGI1kudCFdBuRO4BpqAt9 npfqENOlv9LSsSYjXHuGqq9jxVsiAmtKzzRy0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=oHroeQ0pW6nI9r1pBVle/D5G83QwYsIdCkVximFFDyU=; b=ohSMy1Dxe9zQnXev/xzcyXGwPY0aZ6TJbsCj2B3lHvwEvXme3iDF2lSCHfua0SzZQq 5X99VRC664gexER8XxUCfLlm6/Otycg+5F4hQAAjk0+DWKMeSvWHGApQUcdoL5xYw51j JllZrHywDwhM8LEE+j31RE2GoYXS4qOgMJqBeW39n3GsyuAMcie+vbphpYbg+eXiKpL7 NmG6D/bhXmegg+ptYZvCs8kEiwTHfqC54vRsSK03C3r8b+0AIRiTsEWgS1NcbrCxVYlF cwt+Y2kOg3pfelTFQlYnpR4uyZjm14F/YtxRbKZGxNyuLIsqz1ULqAebSYIC1CHdGWMr Kv3g== X-Gm-Message-State: APf1xPCzws+PpYROYZnviQMIpYLcKO+dSlT/GCqL3h5IpsHlv5UyGt0Y p25mVKin9vHJGAvKQpRkf+FPSc8Mvuk= X-Google-Smtp-Source: AH8x22618gNPfoEE8GFhOjihi60YLKraL0liydWKsxzlmOLAb9i8nJ+6tV+6YjzWapik+jMcZr5XSw== X-Received: by 10.101.87.132 with SMTP id b4mr8350083pgr.332.1518891898432; Sat, 17 Feb 2018 10:24:58 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:14 -0800 Message-Id: <20180217182323.25885-59-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 58/67] target/arm: Implement SVE floating-point arithmetic with immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 56 +++++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 68 ++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 73 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 14 +++++++++ 4 files changed, 211 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 30373e3fc7..7ada12687b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) 
+DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ace613684d..9378c8f0b2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2995,6 +2995,74 @@ DO_ZPZZ_FP_D(sve_fmulx_d, uint64_t, helper_vfp_mulxd) #undef DO_ZPZZ_FP #undef DO_ZPZZ_FP_D =20 +/* Three-operand expander, with one scalar operand, controlled by + * a predicate, with the extra float_status parameter. 
+ */ +#define DO_ZPZS_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + TYPE mm =3D scalar; \ + for (i =3D 0; i < opr_sz; ) { \ + uint16_t pg =3D *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, mm, status); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add) +DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add) +DO_ZPZS_FP(sve_fadds_d, float64, , float64_add) + +DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub) +DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub) +DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub) + +DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul) +DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul) +DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul) + +static inline float16 subr_h(float16 a, float16 b, float_status *s) +{ + return float16_sub(b, a, s); +} + +static inline float32 subr_s(float32 a, float32 b, float_status *s) +{ + return float32_sub(b, a, s); +} + +static inline float64 subr_d(float64 a, float64 b, float_status *s) +{ + return float64_sub(b, a, s); +} + +DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h) +DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s) +DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d) + +DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum) +DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum) +DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum) + +DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum) +DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum) +DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum) + +DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max) +DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max) +DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max) + +DO_ZPZS_FP(sve_fmins_h, float16, H1_2, 
float16_min) +DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min) +DO_ZPZS_FP(sve_fmins_d, float64, , float64_min) + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 8fcb9dd2be..6ce1b01b9a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -32,6 +32,7 @@ #include "exec/log.h" #include "trace-tcg.h" #include "translate-a64.h" +#include "fpu/softfloat.h" =20 typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, @@ -3265,6 +3266,78 @@ DO_FP3(FMULX, fmulx) =20 #undef DO_FP3 =20 +typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr, + TCGv_i64, TCGv_ptr, TCGv_i32); + +static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_= fp16, + TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn) +{ + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr t_zd, t_zn, t_pg, status; + TCGv_i32 desc; + + t_zd =3D tcg_temp_new_ptr(); + t_zn =3D tcg_temp_new_ptr(); + t_pg =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + + status =3D get_fpstatus_ptr(is_fp16); + desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + fn(t_zd, t_zn, t_pg, scalar, status, desc); + + tcg_temp_free_i32(desc); + tcg_temp_free_ptr(status); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_zd); +} + +static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm, + gen_helper_sve_fp2scalar *fn) +{ + TCGv_i64 temp =3D tcg_const_i64(imm); + do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz =3D=3D MO_16, temp, fn); + tcg_temp_free_i64(temp); +} + +#define DO_FP_IMM(NAME, name, const0, const1) \ +static void trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn)
\ +{ \ + static gen_helper_sve_fp2scalar * const fns[3] =3D { = \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d \ + }; \ + static uint64_t const val[3][2] =3D { = \ + { float16_##const0, float16_##const1 }, \ + { float32_##const0, float32_##const1 }, \ + { float64_##const0, float64_##const1 }, \ + }; \ + if (a->esz =3D=3D 0) { = \ + unallocated_encoding(s); \ + return; \ + } \ + do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \ +} + +#define float16_two make_float16(0x4000) +#define float32_two make_float32(0x40000000) +#define float64_two make_float64(0x4000000000000000ULL) + +DO_FP_IMM(FADD, fadds, half, one) +DO_FP_IMM(FSUB, fsubs, half, one) +DO_FP_IMM(FMUL, fmuls, half, two) +DO_FP_IMM(FSUBR, fsubrs, half, one) +DO_FP_IMM(FMAXNM, fmaxnms, zero, one) +DO_FP_IMM(FMINNM, fminnms, zero, one) +DO_FP_IMM(FMAX, fmaxs, zero, one) +DO_FP_IMM(FMIN, fmins, zero, one) + +#undef DO_FP_IMM + static void do_fp_cmp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f82cef2d7e..258d14b729 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -161,6 +161,10 @@ @rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \ &rpri_esz rn=3D%reg_movprfx =20 +# Two register operand, one one-bit floating-point operand. +@rdn_i1 ........ esz:2 ......... pg:3 .... imm:1 rd:5 \ + &rpri_esz rn=3D%reg_movprfx + # Two register operand, one encoded bitmask. @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=3D%reg_movprfx @@ -748,6 +752,16 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn= _pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm =20 +# SVE floating-point arithmetic with immediate (predicated) +FADD_zpzi 01100101 .. 011 000 100 ... 0000 . ..... @rdn_i1 +FSUB_zpzi 01100101 .. 011 001 100 ... 0000 . ..... @rdn_i1 +FMUL_zpzi 01100101 .. 011 010 100 ... 0000 . ..... 
@rdn_i1 +FSUBR_zpzi 01100101 .. 011 011 100 ... 0000 . ..... @rdn_i1 +FMAXNM_zpzi 01100101 .. 011 100 100 ... 0000 . ..... @rdn_i1 +FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 +FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 +FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1 + ### SVE FP Multiply-Add Group =20 # SVE floating-point multiply-accumulate writing addend --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894378348447.93860189173097; Sat, 17 Feb 2018 11:06:18 -0800 (PST) Received: from localhost ([::1]:48424 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7oP-0002PE-5R for importer@patchew.org; Sat, 17 Feb 2018 14:06:17 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40950) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AU-0001n8-Rq for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AT-0002Ej-Ii for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:02 -0500 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:47055) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AT-0002Dx-Ba for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:01 -0500 Received: by mail-pf0-x243.google.com with SMTP id z24so588312pfh.13 
for ; Sat, 17 Feb 2018 10:25:01 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.24.58 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:24:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=y77VXTcItoEr50fnaXeizXOxi+B96vWDrfG5IAacJoI=; b=NVHBDsMh2kDQxF8XKMUixrckyYlojinVvRxKUWy2ZfWF0NWAUfhJKJKb21uAmDV7RT hapg0Zi4HQNmksI4aLR2Jg+HX9mHVpaiYA1ep3eKgka6G9t+g5oKHclqDgFcH2FfvEHG SjBeBQPyOPqiukgRY8rwREmjZgLUYq1AdCBg8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=y77VXTcItoEr50fnaXeizXOxi+B96vWDrfG5IAacJoI=; b=hHgpDNwwtVQ9H4caaNNQX2oGhVlx5M3N1dew48wbsf3ruGToLL47n7nEJ7HTCAFxs0 C5hEumSw3oic63YHJwqTyVkRal1vRICwFS6L75SFenVOXH+vTCseBffdmFiXugpZ+Xhh HLJsLR2C6eXjcECx5/O87W0x2/E3gW20xNYmM4YghOxT7hxilhbiXarlOncpljU5qDGx fztKbYZgi33md5OBGzmAvR4jJGPXxAUg6WDG5rYWZtfivQq1ZVmnuAMJ66FWj6/D2X4Y bIKc+WPtBqCqC9HjjcfcI4dT9GsjEDWbA8fqWEisGa5uaB405aQZJTL3IRnDMNIKlTpI 6KlA== X-Gm-Message-State: APf1xPDGzLB9FIOu79XxVbo2uZqZ8GVwIydhbRDEk4eoZU4iIFyVjY1w AsB+Bk5b1NhlwM0O+BqRYLCefG5nn8c= X-Google-Smtp-Source: AH8x2258mwmqOb+Bqq/XzxNSdeDl4Kzxyd2EWVcOaHT7lkxVCR88QL437fgP6Huz2tKG4exIsI65bQ== X-Received: by 10.101.100.208 with SMTP id t16mr7928770pgv.398.1518891899959; Sat, 17 Feb 2018 10:24:59 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:15 -0800 Message-Id: <20180217182323.25885-60-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v2 59/67] target/arm: Implement SVE Floating Point Multiply Indexed Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper.h | 14 ++++++++++ target/arm/translate-sve.c | 44 +++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 64 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 19 ++++++++++++++ 4 files changed, 141 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index f3ce58e276..a8d824b085 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -584,6 +584,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6ce1b01b9a..cf2a4d3284 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ 
-3136,6 +3136,50 @@ DO_ZZI(UMIN, umin) =20 #undef DO_ZZI =20 +/* + *** SVE Floating Point Multiply-Add Indexed Group + */ + +static void trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t in= sn) +{ + static gen_helper_gvec_4_ptr * const fns[3] =3D { + gen_helper_gvec_fmla_idx_h, + gen_helper_gvec_fmla_idx_s, + gen_helper_gvec_fmla_idx_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, a->index * 2 + a->sub, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + +/* + *** SVE Floating Point Multiply Indexed Group + */ + +static void trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] =3D { + gen_helper_gvec_fmul_idx_h, + gen_helper_gvec_fmul_idx_s, + gen_helper_gvec_fmul_idx_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->index, fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index ad5c29cdd5..e711a3217d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -24,6 +24,22 @@ #include "fpu/softfloat.h" =20 =20 +/* Note that vector data is stored in host-endian 64-bit chunks, + so addressing units smaller than that needs a host-endian fixup. 
*/ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + /* Floating-point trigonometric starting value. * See the ARM ARM pseudocode function FPTrigSMul. */ @@ -92,3 +108,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) =20 #endif #undef DO_3OP + +/* For the indexed ops, SVE applies the index per 128-bit vector segment. + * For AdvSIMD, there is of course only one such vector segment. + */ + +#define DO_MUL_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc)= \ +{ = \ + intptr_t i, j, oprsz =3D simd_oprsz(desc), segment =3D 16 / sizeof(TYP= E); \ + intptr_t idx =3D simd_data(desc); = \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; = \ + for (i =3D 0; i < oprsz / sizeof(TYPE); i +=3D segment) { = \ + TYPE mm =3D m[H(i + idx)]; = \ + for (j =3D 0; j < segment; j++) { = \ + d[i + j] =3D TYPE##_mul(n[i + j], mm, stat); = \ + } = \ + } = \ +} + +DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) +DO_MUL_IDX(gvec_fmul_idx_s, float32, H4) +DO_MUL_IDX(gvec_fmul_idx_d, float64, ) + +#undef DO_MUL_IDX + +#define DO_FMLA_IDX(NAME, TYPE, H) = \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, = \ + void *stat, uint32_t desc) = \ +{ = \ + intptr_t i, j, oprsz =3D simd_oprsz(desc), segment =3D 16 / sizeof(TYP= E); \ + TYPE op1_neg =3D extract32(desc, SIMD_DATA_SHIFT, 1); = \ + intptr_t idx =3D desc >> (SIMD_DATA_SHIFT + 1); = \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm, *a =3D va; = \ + op1_neg <<=3D (8 * sizeof(TYPE) - 1); = \ + for (i =3D 0; i < oprsz / sizeof(TYPE); i +=3D segment) { = \ + TYPE mm =3D m[H(i + idx)]; = \ + for (j =3D 0; j < segment; j++) { = \ + d[i + j] =3D TYPE##_muladd(n[i + j] ^ op1_neg, = \ + mm, a[i + j], 0, stat); = \ + } = \ + } = \ +} + +DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2) 
+DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4) +DO_FMLA_IDX(gvec_fmla_idx_d, float64, ) + +#undef DO_FMLA_IDX diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 258d14b729..d16e733aa3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -30,6 +30,7 @@ %preg4_5 5:4 %size_23 23:2 %dtype_23_13 23:2 13:2 +%index3_22_19 22:1 19:2 =20 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=3Dtszimm_esz @@ -720,6 +721,24 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_= i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s =20 +### SVE FP Multiply-Add Indexed Group + +# SVE floating-point multiply-add (indexed) +FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ + ra=3D%reg_movprfx index=3D%index3_22_19 esz=3D1 +FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ + ra=3D%reg_movprfx esz=3D2 +FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ + ra=3D%reg_movprfx esz=3D3 + +### SVE FP Multiply Indexed Group + +# SVE floating-point multiply (indexed) +FMUL_zzx 01100100 0.1 .. 
rm:3 001000 rn:5 rd:5 \ + index=3D%index3_22_19 esz=3D1 +FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=3D2 +FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3D3 + ### SVE FP Accumulating Reduction Group =20 # SVE floating-point serial reduction (predicated) --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895276756317.0768622175782; Sat, 17 Feb 2018 11:21:16 -0800 (PST) Received: from localhost ([::1]:49490 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en82o-000826-MC for importer@patchew.org; Sat, 17 Feb 2018 14:21:10 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40989) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7AW-0001pi-Ly for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:05 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7AV-0002FS-4i for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:04 -0500 Received: from mail-pg0-x242.google.com ([2607:f8b0:400e:c05::242]:42014) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7AU-0002Ev-Sd for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:03 -0500 Received: by mail-pg0-x242.google.com with SMTP id y8so4343546pgr.9 for ; Sat, 17 Feb 2018 10:25:02 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id 
h15sm13466712pfi.56.2018.02.17.10.25.00 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=8A9oZaEU1Q/sjBJ+Ds3RtyUVolRFNfgjtMee6G1xKSw=; b=ILmJkAB/QmiKvyTrvdZRU0xnCcOuTiKT0Up+kSeGeb6kH6WW9ZIl17bj9FrGzt0OkX AqvDcJI3dZDr/DeiohFASo5cXa46yqZ1uCoFvkBeFq5s4U0sjPOb0oVcsZveVgXtZkh2 6GIr76f5UCROG2O6+dJ9UBMwxpDHTIhe7XvGk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=8A9oZaEU1Q/sjBJ+Ds3RtyUVolRFNfgjtMee6G1xKSw=; b=ER8FkX1HWvh/HF4TFPHyYN8Re5IZq/ofHTqzqXi6KUhY85B8f0YsdcCuYvsikq7bpG tc/XDU/LKjU+PlLWv6QxbtxpE3y6u5QVsTvC+PudPLt35cZ7ZnbuHl5P2WdfGCMKkwJZ OOucRtcKGEH+I0HzP+ryV2HXIcWXfc7Lqjfpwzdehyu0mvJxjjcxJrWGWLn8dWRufC6A nw7GaXTM2zTPwYd+su2AQqqToLUPV/XlEWKZAM+2KcbakjEOjryRWNDVxYcam2Cjdrqr VlL/gN8s91B+rW4b2Y8rNfFk5ZmMb3/G+YW27PrCENR8Gk0lekHk3WXqUMcWN7ZbWAqJ ClXQ== X-Gm-Message-State: APf1xPAnZTTBxATkETHivuCaAmYneEkQqMSFum/Xo7gPBjVmYLnGyxBX 0ldGFg3mIMpY2G2oQZ+x5UgkCyNQ6rs= X-Google-Smtp-Source: AH8x227/Aj7lZufyh15ZFay3jwkJKQfMIoEf03/CcX2e3uUGfwb1CCUSuWI+hn+sBHuLVorjePnJ0g== X-Received: by 10.99.125.74 with SMTP id m10mr8493057pgn.354.1518891901572; Sat, 17 Feb 2018 10:25:01 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:16 -0800 Message-Id: <20180217182323.25885-61-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 60/67] target/arm: Implement SVE FP Fast Reduction Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 35 ++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 61 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 55 +++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  8 ++++++
 4 files changed, 159 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 7ada12687b..c07b2245ba 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG,
+                   i64,
ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG,
+                   i64, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
                    i64, i64, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 9378c8f0b2..29deefcd86 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2832,6 +2832,67 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
     return predtest_ones(d, oprsz, esz_mask);
 }
 
+/* Recursive reduction on a function;
+ * C.f. the ARM ARM function ReducePredicated.
+ *
+ * While it would be possible to write this without the DATA temporary,
+ * it is much simpler to process the predicate register this way.
+ * The recursion is bounded to depth 7 (128 fp16 elements), so there's
+ * little to gain with a more complex non-recursive form.
+ */
+#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \
+static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \
+{ \
+    if (n == 1) { \
+        return *data; \
+    } else { \
+        uintptr_t half = n / 2; \
+        TYPE lo = NAME##_reduce(data, status, half); \
+        TYPE hi = NAME##_reduce(data + half, status, half); \
+        return TYPE##_##FUNC(lo, hi, status); \
+    } \
+} \
+uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \
+{ \
+    uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \
+    TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \
+    for (i = 0; i < oprsz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            TYPE nn = *(TYPE *)(vn + H(i)); \
+            *(TYPE *)((void *)data + i) = (pg & 1 ?
nn : IDENT); \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+    for (; i < maxsz; i += sizeof(TYPE)) { \
+        *(TYPE *)((void *)data + i) = IDENT; \
+    } \
+    return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \
+}
+
+DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero)
+DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero)
+DO_REDUCE(sve_faddv_d, float64, , add, float64_zero)
+
+/* Identity is floatN_default_nan, without the function call. */
+DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00)
+DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000)
+DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL)
+
+DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00)
+DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000)
+DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL)
+
+DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity)
+DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity)
+DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity)
+
+DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity))
+DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity))
+DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity))
+
+#undef DO_REDUCE
+
 uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
                              void *status, uint32_t desc)
 {
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index cf2a4d3284..a77ddf0f4b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3180,6 +3180,61 @@ static void trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
     tcg_temp_free_ptr(status);
 }
 
+/*
+ *** SVE Floating Point Fast Reduction Group
+ */
+
+typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr,
+                                  TCGv_ptr, TCGv_i32);
+
+static void do_reduce(DisasContext *s, arg_rpr_esz *a,
+                      gen_helper_fp_reduce *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    unsigned p2vsz = pow2ceil(vsz);
+
TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0));
+    TCGv_ptr t_zn, t_pg, status;
+    TCGv_i64 temp;
+
+    temp = tcg_temp_new_i64();
+    t_zn = tcg_temp_new_ptr();
+    t_pg = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn));
+    tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
+    status = get_fpstatus_ptr(a->esz == MO_16);
+
+    fn(temp, t_zn, t_pg, status, t_desc);
+    tcg_temp_free_ptr(t_zn);
+    tcg_temp_free_ptr(t_pg);
+    tcg_temp_free_ptr(status);
+    tcg_temp_free_i32(t_desc);
+
+    write_fp_dreg(s, a->rd, temp);
+    tcg_temp_free_i64(temp);
+}
+
+#define DO_VPZ(NAME, name) \
+static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_fp_reduce * const fns[3] = { \
+        gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, \
+        gen_helper_sve_##name##_d, \
+    }; \
+    if (a->esz == 0) { \
+        unallocated_encoding(s); \
+        return; \
+    } \
+    do_reduce(s, a, fns[a->esz - 1]); \
+}
+
+DO_VPZ(FADDV, faddv)
+DO_VPZ(FMINNMV, fminnmv)
+DO_VPZ(FMAXNMV, fmaxnmv)
+DO_VPZ(FMINV, fminv)
+DO_VPZ(FMAXV, fmaxv)
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index d16e733aa3..feb8c65e89 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -739,6 +739,14 @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \
 FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2
 FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3
 
+### SVE FP Fast Reduction Group
+
+FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn
+FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn
+FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
+FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
+FMINV 01100101 .. 000 111 001 ... ..... .....
@rd_pg_rn
+
 ### SVE FP Accumulating Reduction Group
 
 # SVE floating-point serial reduction (predicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:17 -0800
Message-Id: <20180217182323.25885-62-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 61/67] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  8 ++++++++
 target/arm/translate-sve.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 target/arm/vec_helper.c    | 20 ++++++++++++++++++++
 target/arm/sve.decode      |  5 +++++
 4 files changed, 76 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index a8d824b085..4bfefe42b2 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -565,6 +565,14 @@ DEF_HELPER_2(dc_zva, void, env, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git
a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index a77ddf0f4b..463ff7b690 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3235,6 +3235,49 @@ DO_VPZ(FMAXNMV, fmaxnmv)
 DO_VPZ(FMINV, fminv)
 DO_VPZ(FMAXV, fmaxv)
 
+/*
+ *** SVE Floating Point Unary Operations - Unpredicated Group
+ */
+
+static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+    tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       status, vsz, vsz, 0, fn);
+    tcg_temp_free_ptr(status);
+}
+
+static void trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_2_ptr * const fns[3] = {
+        gen_helper_gvec_frecpe_h,
+        gen_helper_gvec_frecpe_s,
+        gen_helper_gvec_frecpe_d,
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zz_fp(s, a, fns[a->esz - 1]);
+    }
+}
+
+static void trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_2_ptr * const fns[3] = {
+        gen_helper_gvec_frsqrte_h,
+        gen_helper_gvec_frsqrte_s,
+        gen_helper_gvec_frsqrte_d,
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zz_fp(s, a, fns[a->esz - 1]);
+    }
+}
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index e711a3217d..60dc07cf87 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -40,6 +40,26 @@
 #define H4(x) (x)
 #endif
 
+#define DO_2OP(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc); \
+    TYPE *d = vd, *n = vn; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
+        d[i] = FUNC(n[i], stat); \
+    } \
+}
+
+DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16)
+DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32)
+DO_2OP(gvec_frecpe_d, helper_recpe_f64,
float64)
+
+DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
+DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
+DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
+
+#undef DO_2OP
+
 /* Floating-point trigonometric starting value.
  * See the ARM ARM pseudocode function FPTrigSMul.
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index feb8c65e89..112e85174c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -747,6 +747,11 @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn
 FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn
 FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn
 
+## SVE Floating Point Unary Operations - Unpredicated Group
+
+FRECPE 01100101 .. 001 110 001110 ..... ..... @rd_rn
+FRSQRTE 01100101 .. 001 111 001110 ..... ..... @rd_rn
+
 ### SVE FP Accumulating Reduction Group
 
 # SVE floating-point serial reduction (predicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:18 -0800
Message-Id: <20180217182323.25885-63-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 62/67] target/arm: Implement SVE FP Compare with Zero Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 42 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 45 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 10 ++++++++++
 4 files changed, 138 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index c07b2245ba..696c97648b 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
                    i64, i64, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG,
+                   void,
ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 29deefcd86..6a052ce9ad 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3270,6 +3270,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
 
 #define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0
 #define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0
+#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0
+#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0
 #define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0
 #define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0
 #define
DO_FCMUO(TYPE, X, Y, ST) \
@@ -3293,6 +3295,49 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
 #undef DO_FPCMP_PPZZ_H
 #undef DO_FPCMP_PPZZ
 
+/* One operand floating-point comparison against zero, controlled
+ * by a predicate.
+ */
+#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, \
+                  void *status, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    intptr_t i = opr_sz, j = ((opr_sz - 1) & -64) >> 3; \
+    do { \
+        uint64_t out = 0; \
+        uint64_t pg = *(uint64_t *)(vg + j); \
+        do { \
+            i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+            if ((pg >> (i & 63)) & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                out |= OP(TYPE, nn, 0, status); \
+            } \
+        } while (i & 63); \
+        *(uint64_t *)(vd + j) = out; \
+        j -= 8; \
+    } while (i > 0); \
+}
+
+#define DO_FPCMP_PPZ0_H(NAME, OP) \
+    DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP)
+#define DO_FPCMP_PPZ0_S(NAME, OP) \
+    DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP)
+#define DO_FPCMP_PPZ0_D(NAME, OP) \
+    DO_FPCMP_PPZ0(NAME##_d, float64, , OP)
+
+#define DO_FPCMP_PPZ0_ALL(NAME, OP) \
+    DO_FPCMP_PPZ0_H(NAME, OP) \
+    DO_FPCMP_PPZ0_S(NAME, OP) \
+    DO_FPCMP_PPZ0_D(NAME, OP)
+
+DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE)
+DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT)
+DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE)
+DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
+DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
+DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 463ff7b690..02655bff03 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3278,6 +3278,47 @@ static void trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
     }
 }
 
+/*
+ *** SVE Floating Point Compare with Zero Group
+ */
+
+static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a,
+                      gen_helper_gvec_3_ptr *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+    tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       status, vsz, vsz, 0, fn);
+    tcg_temp_free_ptr(status);
+}
+
+#define DO_PPZ(NAME, name) \
+static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_3_ptr * const fns[3] = { \
+        gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, \
+        gen_helper_sve_##name##_d, \
+    }; \
+    if (a->esz == 0) { \
+        unallocated_encoding(s); \
+        return; \
+    } \
+    do_ppz_fp(s, a, fns[a->esz - 1]); \
+}
+
+DO_PPZ(FCMGE_ppz0, fcmge0)
+DO_PPZ(FCMGT_ppz0, fcmgt0)
+DO_PPZ(FCMLE_ppz0, fcmle0)
+DO_PPZ(FCMLT_ppz0, fcmlt0)
+DO_PPZ(FCMEQ_ppz0, fcmeq0)
+DO_PPZ(FCMNE_ppz0, fcmne0)
+
+#undef DO_PPZ
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 112e85174c..f4505ad0bf 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -141,6 +141,7 @@
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
 @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz
+@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz
 
 # One register operand, with governing predicate, no vector element size
 @rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0
@@ -752,6 +753,15 @@ FMINV 01100101 .. 000 111 001 ... ..... .....
@rd_pg_rn
 FRECPE 01100101 .. 001 110 001110 ..... ..... @rd_rn
 FRSQRTE 01100101 .. 001 111 001110 ..... ..... @rd_rn
 
+### SVE FP Compare with Zero Group
+
+FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn
+FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn
+FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn
+FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn
+FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn
+FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn
+
 ### SVE FP Accumulating Reduction Group
 
 # SVE floating-point serial reduction (predicated)
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:19 -0800
Message-Id: <20180217182323.25885-64-richard.henderson@linaro.org>
In-Reply-To:
 <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 63/67] target/arm: Implement SVE floating-point trig multiply-add coefficient
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  4 +++
 target/arm/sve_helper.c    | 70 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 26 +++++++++++++++++
 target/arm/sve.decode      |  3 ++
 4 files changed, 103 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 696c97648b..ce5fe24dc2 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git
a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6a052ce9ad..53e3516f47 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3338,6 +3338,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) =20 +/* FP Trig Multiply-Add. */ + +void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t = desc) +{ + static const float16 coeff[16] =3D { + 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / sizeof(float16); + intptr_t x =3D simd_data(desc); + float16 *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i++) { + float16 mm =3D m[i]; + intptr_t xx =3D x; + if (float16_is_neg(mm)) { + mm =3D float16_abs(mm); + xx +=3D 8; + } + d[i] =3D float16_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t = desc) +{ + static const float32 coeff[16] =3D { + 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9, + 0x36369d6d, 0x00000000, 0x00000000, 0x00000000, + 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705, + 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / sizeof(float32); + intptr_t x =3D simd_data(desc); + float32 *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i++) { + float32 mm =3D m[i]; + intptr_t xx =3D x; + if (float32_is_neg(mm)) { + mm =3D float32_abs(mm); + xx +=3D 8; + } + d[i] =3D float32_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t = desc) +{ + static const float64 coeff[16] =3D { + 0x3ff0000000000000ull, 0xbfc5555555555543ull, + 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull, + 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull, + 0x3de5d8408868552full, 0x0000000000000000ull, + 0x3ff0000000000000ull, 0xbfe0000000000000ull, + 
0x3fa5555555555536ull, 0xbf56c16c16c13a0bull, + 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull, + 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / sizeof(float64); + intptr_t x =3D simd_data(desc); + float64 *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i++) { + float64 mm =3D m[i]; + intptr_t xx =3D x; + if (float64_is_neg(mm)) { + mm =3D float64_abs(mm); + xx +=3D 8; + } + d[i] =3D float64_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 02655bff03..e185af29e3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3319,6 +3319,32 @@ DO_PPZ(FCMNE_ppz0, fcmne0) =20 #undef DO_PPZ =20 +/* + *** SVE floating-point trig multiply-add coefficient + */ + +static void trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] =3D { + gen_helper_sve_ftmad_h, + gen_helper_sve_ftmad_s, + gen_helper_sve_ftmad_d, + }; + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status; + + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->imm, fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f4505ad0bf..ca54895900 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -804,6 +804,9 @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @r= dn_i1 FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... 
@rdn_i1 =20 +# SVE floating-point trig multiply-add coefficient +FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=3D%reg_movprfx + ### SVE FP Multiply-Add Group =20 # SVE floating-point multiply-accumulate writing addend --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894563321111.75292309017789; Sat, 17 Feb 2018 11:09:23 -0800 (PST) Received: from localhost ([::1]:48446 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7rK-00052r-Fm for importer@patchew.org; Sat, 17 Feb 2018 14:09:18 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41094) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Ae-00021V-N4 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7Ad-0002Li-8S for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:12 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:46094) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7Ad-0002L6-1b for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:11 -0500 Received: by mail-pg0-x243.google.com with SMTP id m1so262275pgp.13 for ; Sat, 17 Feb 2018 10:25:10 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.25.08 
(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/pW1zgtuWTJfjGKD3YWxaK5UQ0fE+1tL0T5+VXhGEbA=; b=KD3mFwNuu0Wh5YtxIVHjM+iS1Dq7EM4MN/PnNHEUVpr1exml7OMtKF6sfATjkMwoym mZNtfX/P4RPWuJ7rVnnfO8rXiDgo2i8fJIt6fcNWZUfn2lDMniMH4bK/68xARR53osVB neelp2qsTMXks/J2JYWi2qkgcP/n0d7IwCd6Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/pW1zgtuWTJfjGKD3YWxaK5UQ0fE+1tL0T5+VXhGEbA=; b=jIM0kKuiVqV1cSBzp6yOF28LQNX7yfiNkSRTGn32qKVvwO9UdJhAV7KE1051rSDA5T 7Y07Ok5BSdGxwNPquFx1pcyjhzwDq6Jaoss3ZrPABgdys0it4RQiPdMZ41JjjZT4CG1s S+nv/uUBkMjlUNblSv0ADzAzWpt9jjvd6iXs7K3Ejf4c5imjB0TNDULSp9SQ3YHKZc3b pcFY0KZ7hT+qrex62yflTktIl4/jltb6UraQexrWjlGvcFbP7p884rZb0o4pd5+8GmQv EyiMhT53ncT4f8b0uRisYG/2LN7vVhfARUjCegllrax3E8npR6Xn1tsMB+Vp6loanPQO INcA== X-Gm-Message-State: APf1xPBAOxmpqrizaSSsR/dM5jgyW7VW24DbRpWNYhlezRVYug3PH/qP 3hdJZEIW2qA77kXVdw67Sc2VwDUf/XA= X-Google-Smtp-Source: AH8x22612U4xWmBj8RZPwRWR2nYD9Gw2VeQnOYxFOEmvF56+GijcoR7XJ7ENsOBDNDy4Pf1ZnbUdzQ== X-Received: by 10.98.67.68 with SMTP id q65mr9808636pfa.129.1518891909777; Sat, 17 Feb 2018 10:25:09 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:20 -0800 Message-Id: <20180217182323.25885-65-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 64/67] target/arm: Implement SVE floating-point convert precision X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++++ target/arm/sve_helper.c | 27 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ce5fe24dc2..bac4bfdc60 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 53e3516f47..9db01ac2f2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3157,6 
+3157,33 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void= *status, uint32_t desc) \ } \ } =20 +static inline float32 float16_to_float32_ieee(float16 f, float_status *s) +{ + return float16_to_float32(f, true, s); +} + +static inline float64 float16_to_float64_ieee(float16 f, float_status *s) +{ + return float16_to_float64(f, true, s); +} + +static inline float16 float32_to_float16_ieee(float32 f, float_status *s) +{ + return float32_to_float16(f, true, s); +} + +static inline float16 float64_to_float16_ieee(float64 f, float_status *s) +{ + return float64_to_float16(f, true, s); +} + +DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, float32_to_float16_ieee) +DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, float16_to_float32_ieee) +DO_ZPZ_FP_D(sve_fcvt_dh, uint64_t, float64_to_float16_ieee) +DO_ZPZ_FP_D(sve_fcvt_hd, uint64_t, float16_to_float64_ieee) +DO_ZPZ_FP_D(sve_fcvt_ds, uint64_t, float64_to_float32) +DO_ZPZ_FP_D(sve_fcvt_sd, uint64_t, float32_to_float64) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e185af29e3..361d545965 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3651,6 +3651,36 @@ static void do_zpz_ptr(DisasContext *s, int rd, int = rn, int pg, tcg_temp_free_ptr(status); } =20 +static void trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh); +} + +static void trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs); +} + +static void trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh); +} + +static void trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, 
false, gen_helper_sve_fcvt_hd); +} + +static void trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds); +} + +static void trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); +} + static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ca54895900..d44cf17fc8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -824,6 +824,14 @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @= rdn_pg_rm_ra =20 ### SVE FP Unary Operations Predicated Group =20 +# SVE floating-point convert precision +FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... 
@rd_pg_rn_e0 --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895451490679.2896443295223; Sat, 17 Feb 2018 11:24:11 -0800 (PST) Received: from localhost ([::1]:49563 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en85i-000286-Jm for importer@patchew.org; Sat, 17 Feb 2018 14:24:10 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41121) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Ag-00024M-QM for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:16 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7Af-0002NF-8w for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:14 -0500 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:35234) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7Ae-0002MW-Ok for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:13 -0500 Received: by mail-pf0-x243.google.com with SMTP id a6so592302pfi.2 for ; Sat, 17 Feb 2018 10:25:12 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.25.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Uw3Coayj2BZrW4P8dkcsk/ENJVJ6Su3WMdNeRaiLH4w=; b=aAw6kceenL51GdB5XgS8nMwyre4AoWxIRVivFI0Ab7mJWqoJTE1ShcX6pAXyMx9iO2 Zd8bearhCeaCnT585r9GoAQyknaf/PNKRv4s/N53Mmr/xxfgCVVSavQ5XS+PuBMbVVpN CNDRSWXljqo+ZQaWm1/pDGx80Yjzg8ws8Fp3c= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Uw3Coayj2BZrW4P8dkcsk/ENJVJ6Su3WMdNeRaiLH4w=; b=D4UMPJrJ1xKofl6cQVaQvGsg0x+OqumvNkhnZbaTr7eCej/XavudtOTAFuSfIDUUdo UD/mo26ZtDMA+oO1MLBE5kMnuO6FePAxO5Cs4p9OWMfJc8Ufzy5cdlYnfaM7vmcCmUnU NqePAkHyT8LHA78MCt9mXn+96PDGrXJok8oSAzR9Tj520bz5QAuzH9LO/177v5/fqFR5 s+T5engkolsW/qfmEjx8Wcl+KeZ7ck5AOZ/N0bVdr9FlBpy2NVUZuQzaK9iBWwAdxjVf H5f1aBSdgYZwZvzo11P0KbQuM8mN4FdIe0BoNB6Y1ORqkkSHs+8gntJGa9a94MAeh/bF Kncg== X-Gm-Message-State: APf1xPBYXI/8NWlJVIR0wqJlCvQ9rYEsTrH2x7kCifj0tJgMRXyQGDAx kCXM3SWl+aDPhnuuxv4+tY+RDAWMTws= X-Google-Smtp-Source: AH8x2247kJSXSHDMCIlR90y+MQzu43g1XOWr89dQy2TyvvWdUPzW3O/gWtR7kgqFLAxqbmPzzF6WOw== X-Received: by 10.99.63.9 with SMTP id m9mr8512676pga.247.1518891911466; Sat, 17 Feb 2018 10:25:11 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:21 -0800 Message-Id: <20180217182323.25885-66-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v2 65/67] target/arm: Implement SVE floating-point convert to integer X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 ++++++++++++++++++++ target/arm/sve_helper.c | 16 +++++++++++ target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.decode | 16 +++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index bac4bfdc60..0f5fea9045 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9db01ac2f2..09f5c77254 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3184,6 +3184,22 @@ DO_ZPZ_FP_D(sve_fcvt_hd, uint64_t, float16_to_float6= 4_ieee) DO_ZPZ_FP_D(sve_fcvt_ds, uint64_t, float64_to_float32) DO_ZPZ_FP_D(sve_fcvt_sd, uint64_t, float32_to_float64) =20 +DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, float16_to_int16_round_to_zero) +DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, float16_to_int32_round_to_zero) +DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, float32_to_int32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_hd, uint64_t, float16_to_int64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_sd, uint64_t, float32_to_int64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_ds, uint64_t, float64_to_int32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_dd, uint64_t, float64_to_int64_round_to_zero) + +DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, float16_to_uint16_round_to_zero) +DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, float16_to_uint32_round_to_zero) +DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, float32_to_uint32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_hd, uint64_t, float16_to_uint64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_sd, uint64_t, float32_to_uint64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_ds, uint64_t, float64_to_uint32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_dd, uint64_t, float64_to_uint64_round_to_zero) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) 
DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 361d545965..bc865dfd15 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3681,6 +3681,76 @@ static void trans_FCVT_sd(DisasContext *s, arg_rpr_e= sz *a, uint32_t insn) do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); } =20 +static void trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh); +} + +static void trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh); +} + +static void trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs); +} + +static void trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs); +} + +static void trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd); +} + +static void trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd); +} + +static void trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss); +} + +static void trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss); +} + +static void trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd); +} + +static void trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, 
gen_helper_sve_fcvtzu_sd); +} + +static void trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds); +} + +static void trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds); +} + +static void trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd); +} + +static void trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); +} + static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d44cf17fc8..92dda3a241 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -832,6 +832,22 @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_= pg_rn_e0 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 =20 +# SVE floating-point convert to integer +FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... 
@rd_pg_rn_e0 +FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 --=20 2.14.3 From nobody Fri Oct 24 09:33:46 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518895649766110.08284482991166; Sat, 17 Feb 2018 11:27:29 -0800 (PST) Received: from localhost ([::1]:50397 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en88p-0005tI-LI for importer@patchew.org; Sat, 17 Feb 2018 14:27:23 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41143) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Ai-00025v-8T for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:19 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7Ag-0002Nl-Rc for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:16 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:39714) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7Ag-0002NM-6v for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:14 -0500 Received: by mail-pg0-x241.google.com with SMTP id w17so4356781pgv.6 for ; Sat, 17 Feb 2018 10:25:14 -0800 (PST) Received: from cloudburst.twiddle.net 
([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.25.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TYEkgvQ2lsVeGMILIUd+qlGVon1fvaQFWVsi7XMK5X8=; b=OENPRN3pDjBKPI8A6Li+DqhGVkcnN9QW73M2G8HKfYT4u72VfuhXjXtj/acxisha3p uld55SKBHwa3xpQbopJgeFGwDMFhZbfngDXiMGwRy4KnqqKGuRZZ13HTiLrtFVU7z1rC /yeQrfM72KiWHSsLT1Rjdp9PZZtjQPOQZQigk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TYEkgvQ2lsVeGMILIUd+qlGVon1fvaQFWVsi7XMK5X8=; b=uhcRc7JCllQfDsulEr6AmpxjvyUzYP0aPrGGyj63N0WU3bV3onufPlpeT+6Jy8jTRk rQDBY+hTeHSPiKgF4gk0CnXHf93N9skQ1VOQoeNj4AnOy/6zTspiFwvy56olPRiJ7+9R Al6gOH4Ch3r71EPePVK8NShP5w5kBeCBY1llsb7QMitt3NMBrqO2PfUY8tGBTbbBhu2K UCJyLNE/JCeXuhpCfu33TGYF5/QB7LidgS76RerSwmuyBS7C8V7aNNNic5hw4h2apYJr LkAf5cOuVA+SSeXb4O3nyJpLvtj22oB0p+jx5kN9dMa1qkLekzNweS04E5hbpe+EoWiq 5UYw== X-Gm-Message-State: APf1xPCVHuyyTKn/HC5vFbEvrLdqLry+stIRlyRhGLm66xf3fA9OgRnV XRS7EPgC2NsFl64TyxREUmo/K2gbx8w= X-Google-Smtp-Source: AH8x225IeetMyVoGMkNGEo3sc7OJ0Cqui40MtciL0ACoQbmU+E7BZhezHYodlBVWtpmW0n5+U2kPTw== X-Received: by 10.99.146.3 with SMTP id o3mr8292115pgd.309.1518891912866; Sat, 17 Feb 2018 10:25:12 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:22 -0800 Message-Id: <20180217182323.25885-67-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::241
Subject: [Qemu-devel] [PATCH v2 66/67] target/arm: Implement SVE floating-point round to integral value
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++
 target/arm/sve_helper.c    |  8 +++++
 target/arm/translate-sve.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  9 ++++++
 4 files changed, 111 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0f5fea9045..749bab0b38 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 09f5c77254..7950710be7 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3200,6 +3200,14 @@ DO_ZPZ_FP_D(sve_fcvtzu_sd, uint64_t, float32_to_uint64_round_to_zero)
 DO_ZPZ_FP_D(sve_fcvtzu_ds, uint64_t, float64_to_uint32_round_to_zero)
 DO_ZPZ_FP_D(sve_fcvtzu_dd, uint64_t, float64_to_uint64_round_to_zero)
 
+DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth)
+DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints)
+DO_ZPZ_FP_D(sve_frint_d, uint64_t, helper_rintd)
+
+DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
+DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
+DO_ZPZ_FP_D(sve_frintx_d, uint64_t, float64_round_to_int)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index bc865dfd15..5f1c4984b8 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3751,6 +3751,86 @@ static void trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
 }
 
+static gen_helper_gvec_3_ptr * const frint_fns[3] = {
+    gen_helper_sve_frint_h,
+    gen_helper_sve_frint_s,
+    gen_helper_sve_frint_d
+};
+
+static void trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16,
+                   frint_fns[a->esz - 1]);
+    }
+}
+
+static void trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_frintx_h,
+        gen_helper_sve_frintx_s,
+        gen_helper_sve_frintx_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
+static void do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i32 tmode;
+    TCGv_ptr status;
+
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tmode = tcg_const_i32(mode);
+    status = get_fpstatus_ptr(a->esz == MO_16);
+    gen_helper_set_rmode(tmode, tmode, status);
+
+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       status, vsz, vsz, 0, frint_fns[a->esz - 1]);
+
+    gen_helper_set_rmode(tmode, tmode, status);
+    tcg_temp_free_i32(tmode);
+    tcg_temp_free_ptr(status);
+}
+
+static void trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_nearest_even);
+}
+
+static void trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_up);
+}
+
+static void trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_down);
+}
+
+static void trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_to_zero);
+}
+
+static void trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_ties_away);
+}
+
 static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
     do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 92dda3a241..e06c0c5279 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -848,6 +848,15 @@ FCVTZU_sd       01100101 11 011 10 1 101 ... ..... .....   @rd_pg_rn_e0
 FCVTZS_dd       01100101 11 011 11 0 101 ... ..... .....   @rd_pg_rn_e0
 FCVTZU_dd       01100101 11 011 11 1 101 ... ..... .....   @rd_pg_rn_e0
 
+# SVE floating-point round to integral value
+FRINTN          01100101 .. 000 000 101 ... ..... .....        @rd_pg_rn
+FRINTP          01100101 .. 000 001 101 ... ..... .....        @rd_pg_rn
+FRINTM          01100101 .. 000 010 101 ... ..... .....        @rd_pg_rn
+FRINTZ          01100101 .. 000 011 101 ... ..... .....        @rd_pg_rn
+FRINTA          01100101 .. 000 100 101 ... ..... .....        @rd_pg_rn
+FRINTX          01100101 .. 000 110 101 ... ..... .....        @rd_pg_rn
+FRINTI          01100101 .. 000 111 101 ... ..... .....        @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh        01100101 01 010 01 0 101 ... ..... .....   @rd_pg_rn_e0
 SCVTF_sh        01100101 01 010 10 0 101 ... ..... .....   @rd_pg_rn_e0
-- 
2.14.3

From nobody Fri Oct 24 09:33:46 2025
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518894756862859.0720356522505; Sat, 17 Feb 2018 11:12:36 -0800 (PST)
Received: from localhost ([::1]:48476 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7uT-0007yb-Uw for importer@patchew.org; Sat, 17 Feb 2018 14:12:34 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41157) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Aj-00026p-48 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7Ah-0002OR-NX for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:17 -0500
Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:44408) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7Ah-0002O2-Ft for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:15 -0500
Received: by mail-pf0-x242.google.com with SMTP id 17so591534pfw.11 for ; Sat, 17 Feb 2018 10:25:15 -0800 (PST)
Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.25.13
        (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google;
        h=from:to:cc:subject:date:message-id:in-reply-to:references;
        bh=XmEjpPkWqm/YDvgUSl0/MJ3T6b+RSLhnCpKyIM1pO3k=;
        b=f7LQ4NkLsW6JaC7vdaN1PMX9BdoQo+WGQW1dLlsOY6TTGumPC4h44m+Wjnz4hjf9Eb
         a+HCPj45ftQYLfixSJN/Yy71PUCgB3BQGNJt5pdHndmrsh9S97gsvvxKyBdmJFtabTeK
         R4jWTSB85XK/0nULn+6Noc/SAk5AnXqpkIlp0=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references;
        bh=XmEjpPkWqm/YDvgUSl0/MJ3T6b+RSLhnCpKyIM1pO3k=;
        b=hquz5b8IljqyzPU8rCz4kbln/wSHZPrpYoh77YMiGE++iqouV0a9IzuKVdbNI3sngD
         1GKmOF+I/+A+7cCASqK6WC2kSX+rPqLlM+Qq9A0EJevVopKOizp1bqXy1jTF4mwFCTFn
         GT1+HteJs3g1zii4c0lpAMPsmXfMRbl0rPw4rveJ0Ne4YkldL4o2XUK3Akh5Ze4VVN62
         Zv/UYzs1v3Jm9csZzJYlgkdkhlHbnP770XqwMebdwWHo/Sm59uIefbirdNPXyCN69Obt
         eGDTeclzYe8OIZiUkCGy4amYxtW0Zvo+l+xPoCPfk7UMEBXzZYZgDuYbij7K6RWrob46
         P2Ug==
X-Gm-Message-State: APf1xPCCXrX4Og+mGTHWFGPo5TNO5Fhunbq1AISZDIplxIyc5HO49fnp
        kQCp5RoiCexNJgPPA0NzEdiA1rs++m0=
X-Google-Smtp-Source: AH8x224JDlD8oWh1iS75GDEK+0PQ0oJA6niVa2ZVYlsTtVKNK2QzrudRznnHhq7RAQrN7u3H4rHyOQ==
X-Received: by 10.101.72.199 with SMTP id o7mr8233639pgs.303.1518891914266; Sat, 17 Feb 2018 10:25:14 -0800 (PST)
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:23 -0800
Message-Id: <20180217182323.25885-68-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c00::242
Subject: [Qemu-devel] [PATCH v2 67/67] target/arm: Implement SVE floating-point unary operations
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++++++++
 target/arm/sve_helper.c    |  8 ++++++++
 target/arm/translate-sve.c | 28 ++++++++++++++++++++++++++++
 target/arm/sve.decode      |  4 ++++
 4 files changed, 54 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 749bab0b38..5cebc9121d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 7950710be7..4f0985a29e 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3208,6 +3208,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
 DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
 DO_ZPZ_FP_D(sve_frintx_d, uint64_t, float64_round_to_int)
 
+DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
+DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
+DO_ZPZ_FP_D(sve_frecpx_d, uint64_t, helper_frecpx_f64)
+
+DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt)
+DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt)
+DO_ZPZ_FP_D(sve_fsqrt_d, uint64_t, float64_sqrt)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 5f1c4984b8..f1ff033333 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3831,6 +3831,34 @@ static void trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_frint_mode(s, a, float_round_ties_away);
 }
 
+static void trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_frecpx_h,
+        gen_helper_sve_frecpx_s,
+        gen_helper_sve_frecpx_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
+static void trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_fsqrt_h,
+        gen_helper_sve_fsqrt_s,
+        gen_helper_sve_fsqrt_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
 static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
     do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index e06c0c5279..fbd9cf1384 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -857,6 +857,10 @@ FRINTA          01100101 .. 000 100 101 ... ..... .....        @rd_pg_rn
 FRINTX          01100101 .. 000 110 101 ... ..... .....        @rd_pg_rn
 FRINTI          01100101 .. 000 111 101 ... ..... .....        @rd_pg_rn
 
+# SVE floating-point unary operations
+FRECPX          01100101 .. 001 100 101 ... ..... .....        @rd_pg_rn
+FSQRT           01100101 .. 001 101 101 ... ..... .....        @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh        01100101 01 010 01 0 101 ... ..... .....   @rd_pg_rn_e0
 SCVTF_sh        01100101 01 010 10 0 101 ... ..... .....   @rd_pg_rn_e0
-- 
2.14.3