From nobody Mon Feb 9 08:30:36 2026
From: Alex Bennée
To: rth@twiddle.net, cota@braap.org, batuzovk@ispras.ru
Date: Thu, 17 Aug 2017 19:04:04 +0100
Message-Id: <20170817180404.29334-10-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.13.0
In-Reply-To: <20170817180404.29334-1-alex.bennee@linaro.org>
References: <20170817180404.29334-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [RFC PATCH 9/9] target/arm/translate-a64: vectorise smull vD.4s, vN.[48]s, vM.h[]
Cc: Peter Maydell, qemu-arm@nongnu.org, Alex Bennée, qemu-devel@nongnu.org

These instructions show up in the ffmpeg profile from the
ff_simple_idct_put_neon function.

WARNING: this is experimental. It essentially shortcuts to the
vectorised helper for the one instruction that shows up a lot in the
ffmpeg trace; everything else falls through to the normal code
generation.

We also skip the case where rd == rn to avoid having to explicitly
deal with the aliasing in the helper.

Signed-off-by: Alex Bennée
---
 target/arm/helper-a64.c    | 17 +++++++++++
 target/arm/helper-a64.h    |  2 ++
 target/arm/translate-a64.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 17b1edfb5f..ae0f8da5c4 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -538,3 +538,20 @@ uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, uint64_t addr,
 
     return !success;
 }
+
+/* Multiply Long (vector, by element) */
+void HELPER(advsimd_smull_idx_s32)(void *d, void *n, uint32_t m,
+                                   uint32_t simd_data)
+{
+    int opr_elt = GET_SIMD_DATA(OPR_ELT, simd_data);
+    int doff_elt = GET_SIMD_DATA(DOFF_ELT, simd_data);
+    int32_t *rd = (int32_t *) d;
+    int16_t *rn = (int16_t *) n;
+    int16_t rm = (int16_t) m;
+    int i;
+
+    #pragma GCC ivdep
+    for (i = 0; i < opr_elt; ++i) {
+        rd[i] = rn[i + doff_elt] * rm;
+    }
+}
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 6f9eaba533..0bd7942cec 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -44,3 +44,5 @@ DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
 DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
 DEF_HELPER_FLAGS_4(paired_cmpxchg64_le, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
 DEF_HELPER_FLAGS_4(paired_cmpxchg64_be, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
+
+DEF_HELPER_4(advsimd_smull_idx_s32, void, vec, vec, i32, i32)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f474c5008b..3a609e571c 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10466,6 +10466,74 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
     }
 }
 
+typedef void AdvSIMDGenTwoPlusOneVectorFn(TCGv_vec, TCGv_vec, TCGv_i32, TCGv_i32);
+
+/* Handle [U/S]ML[S/A]L instructions
+ *
+ * This is split off from the code below only to aid experimentation.
+ */
+static bool handle_vec_simd_mul_addsub(DisasContext *s, uint32_t insn, int opcode, int size, bool is_q, bool u, int rn, int rm, int rd)
+{
+    /* fprintf(stderr, "%s: %#04x op:%x sz:%d rn:%d rm:%d rd:%d\n", __func__, */
+    /*         insn, opcode, size, rn, rm, rd); */
+
+    if (size == 1) {
+        AdvSIMDGenTwoPlusOneVectorFn *fn = NULL;
+        uint32_t simd_info = 0;
+
+        switch (opcode) {
+        case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+            break;
+        case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+            break;
+        case 0xa: /* SMULL, SMULL2, UMULL, UMULL2 */
+            if (!u) {
+                /* helper assumes no aliasing */
+                if (rd == rn) {
+                    return false;
+                }
+
+                fn = gen_helper_advsimd_smull_idx_s32;
+                simd_info = deposit32(simd_info,
+                                      ADVSIMD_OPR_ELT_SHIFT, ADVSIMD_OPR_ELT_BITS, 4);
+
+                if (is_q) {
+                    simd_info = deposit32(simd_info,
+                                          ADVSIMD_DOFF_ELT_SHIFT, ADVSIMD_DOFF_ELT_BITS, 4);
+                }
+            }
+            break;
+        default:
+            break;
+        }
+
+        /* assert(fn); */
+
+        if (fn) {
+            TCGv_i32 tcg_idx, tcg_simd_info;
+            int h = extract32(insn, 11, 1);
+            int lm = extract32(insn, 20, 2);
+            int index = h << 2 | lm;
+
+            if (!fp_access_check(s)) {
+                return true; /* exception generated, nothing more to do */
+            }
+
+            tcg_idx = tcg_temp_new_i32();
+            tcg_simd_info = tcg_const_i32(simd_info);
+            read_vec_element_i32(s, tcg_idx, rm, index, size);
+
+            fn(cpu_V[rd], cpu_V[rn], tcg_idx, tcg_simd_info);
+
+            tcg_temp_free_i32(tcg_simd_info);
+            tcg_temp_free_i32(tcg_idx);
+            return true;
+        }
+    }
+
+    return false;
+}
+
 /* C3.6.13 AdvSIMD scalar x indexed element
  *  31 30  29 28       24 23  22 21  20  19  16 15 12  11  10 9    5 4    0
  * +-----+---+-----------+------+---+---+------+-----+---+---+------+------+
@@ -10518,6 +10586,10 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
+        /* Shortcut if we have a vectorised helper */
+        if (handle_vec_simd_mul_addsub(s, insn, opcode, size, is_q, u, rn, rm, rd)) {
+            return;
+        }
         is_long = true;
         break;
     case 0x3: /* SQDMLAL, SQDMLAL2 */
-- 
2.13.0
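
For anyone wanting to poke at the helper's arithmetic outside of QEMU,
here is a minimal standalone sketch of what the patch computes. It is
an illustration under stated assumptions, not the code in the series:
smull_by_element(), insn_bits() and decode_h_index() are hypothetical
names, the element count and source offset are passed as plain ints
instead of being packed into simd_data, and insn_bits() only stands in
for QEMU's extract32().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the helper loop in the patch: multiply
 * opr_elt 16-bit elements of rn, starting at element doff_elt, by the
 * scalar rm and store the widened 32-bit products in rd.
 * Like the helper, it assumes rd and rn do not alias.
 */
void smull_by_element(int32_t *rd, const int16_t *rn, int16_t rm,
                      int opr_elt, int doff_elt)
{
    int i;

    assert(opr_elt >= 0 && doff_elt >= 0);

    for (i = 0; i < opr_elt; ++i) {
        /* both int16_t operands promote to int, so nothing is truncated */
        rd[i] = rn[i + doff_elt] * rm;
    }
}

/* Hypothetical local bitfield extract, standing in for extract32(). */
uint32_t insn_bits(uint32_t insn, int start, int length)
{
    return (insn >> start) & ((1u << length) - 1);
}

/*
 * Element index for the 16-bit indexed forms, built the same way as in
 * the patch: H is bit 11, L:M are bits 21:20, index = H:L:M.
 */
int decode_h_index(uint32_t insn)
{
    int h = insn_bits(insn, 11, 1);
    int lm = insn_bits(insn, 20, 2);

    return h << 2 | lm;
}
```

Calling smull_by_element() with opr_elt = 4 and doff_elt = 0 mirrors
the SMULL case (lower four halfwords of vN); doff_elt = 4 mirrors what
the patch sets up when is_q is true, i.e. SMULL2 reading the upper
four halfwords.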