From: Alex Bennée
To: qemu-arm@nongnu.org
Date: Tue, 27 Feb 2018 14:38:36 +0000
Message-Id: <20180227143852.11175-16-alex.bennee@linaro.org>
In-Reply-To: <20180227143852.11175-1-alex.bennee@linaro.org>
References: <20180227143852.11175-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v4 15/31] arm/translate-a64: add FP16 x2 ops for simd_indexed
Cc: Alex Bennée, richard.henderson@linaro.org, qemu-devel@nongnu.org, Peter Maydell

A bunch of the vectorised bitwise operations just operate on larger
chunks at a time. We can do the same for the new half-precision
operations by introducing some TWOHALFOP helpers which take a pair of
half-precision values packed into a 32-bit word and operate on both
halves at once.

Hopefully all this hoop jumping will get simpler once we have
generically vectorised helpers here.

Signed-off-by: Alex Bennée
Reviewed-by: Richard Henderson

---
v2
  - checkpatch fixes
---
 target/arm/helper-a64.c    | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/helper-a64.h    | 10 ++++++++++
 target/arm/translate-a64.c | 26 +++++++++++++++++++++-----
 3 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 8fdbe034f3..4d5ae96d8f 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -629,8 +629,32 @@ ADVSIMD_HALFOP(max)
 ADVSIMD_HALFOP(minnum)
 ADVSIMD_HALFOP(maxnum)
 
+#define ADVSIMD_TWOHALFOP(name)                                         \
+uint32_t ADVSIMD_HELPER(name, 2h)(uint32_t two_a, uint32_t two_b, void *fpstp) \
+{                                                                       \
+    float16  a1, a2, b1, b2;                                            \
+    uint32_t r1, r2;                                                    \
+    float_status *fpst = fpstp;                                         \
+    a1 = extract32(two_a, 0, 16);                                       \
+    a2 = extract32(two_a, 16, 16);                                      \
+    b1 = extract32(two_b, 0, 16);                                       \
+    b2 = extract32(two_b, 16, 16);                                      \
+    r1 = float16_ ## name(a1, b1, fpst);                                \
+    r2 = float16_ ## name(a2, b2, fpst);                                \
+    return deposit32(r1, 16, 16, r2);                                   \
+}
+
+ADVSIMD_TWOHALFOP(add)
+ADVSIMD_TWOHALFOP(sub)
+ADVSIMD_TWOHALFOP(mul)
+ADVSIMD_TWOHALFOP(div)
+ADVSIMD_TWOHALFOP(min)
+ADVSIMD_TWOHALFOP(max)
+ADVSIMD_TWOHALFOP(minnum)
+ADVSIMD_TWOHALFOP(maxnum)
+
 /* Data processing - scalar floating-point and advanced SIMD */
-float16 HELPER(advsimd_mulxh)(float16 a, float16 b, void *fpstp)
+static float16 float16_mulx(float16 a, float16 b, void *fpstp)
 {
     float_status *fpst = fpstp;
 
@@ -646,6 +670,9 @@ float16 HELPER(advsimd_mulxh)(float16 a, float16 b, void *fpstp)
     return float16_mul(a, b, fpst);
 }
 
+ADVSIMD_HALFOP(mulx)
+ADVSIMD_TWOHALFOP(mulx)
+
 /* fused multiply-accumulate */
 float16 HELPER(advsimd_muladdh)(float16 a, float16 b, float16 c, void *fpstp)
 {
@@ -653,6 +680,23 @@ float16 HELPER(advsimd_muladdh)(float16 a, float16 b, float16 c, void *fpstp)
     return float16_muladd(a, b, c, 0, fpst);
 }
 
+uint32_t HELPER(advsimd_muladd2h)(uint32_t two_a, uint32_t two_b,
+                                  uint32_t two_c, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    float16 a1, a2, b1, b2, c1, c2;
+    uint32_t r1, r2;
+    a1 = extract32(two_a, 0, 16);
+    a2 = extract32(two_a, 16, 16);
+    b1 = extract32(two_b, 0, 16);
+    b2 = extract32(two_b, 16, 16);
+    c1 = extract32(two_c, 0, 16);
+    c2 = extract32(two_c, 16, 16);
+    r1 = float16_muladd(a1, b1, c1, 0, fpst);
+    r2 = float16_muladd(a2, b2, c2, 0, fpst);
+    return deposit32(r1, 16, 16, r2);
+}
+
 /*
  * Floating point comparisons produce an integer result. Softfloat
  * routines return float_relation types which we convert to the 0/-1
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 79012eee9d..003ffa582f 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -65,3 +65,13 @@ DEF_HELPER_3(advsimd_acge_f16, i32, f16, f16, ptr)
 DEF_HELPER_3(advsimd_acgt_f16, i32, f16, f16, ptr)
 DEF_HELPER_3(advsimd_mulxh, f16, f16, f16, ptr)
 DEF_HELPER_4(advsimd_muladdh, f16, f16, f16, f16, ptr)
+DEF_HELPER_3(advsimd_add2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_sub2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_mul2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_div2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_max2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_min2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_maxnum2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_minnum2h, i32, i32, i32, ptr)
+DEF_HELPER_3(advsimd_mulx2h, i32, i32, i32, ptr)
+DEF_HELPER_4(advsimd_muladd2h, i32, i32, i32, i32, ptr)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 6a264bc134..3487c0430f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11417,8 +11417,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                          * multiply-add */
                         tcg_gen_xori_i32(tcg_op, tcg_op, 0x80008000);
                     }
-                    gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
-                                               tcg_res, fpst);
+                    if (is_scalar) {
+                        gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx,
+                                                   tcg_res, fpst);
+                    } else {
+                        gen_helper_advsimd_muladd2h(tcg_res, tcg_op, tcg_idx,
+                                                    tcg_res, fpst);
+                    }
                     break;
                 case 2:
                     if (opcode == 0x5) {
@@ -11437,10 +11442,21 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                 switch (size) {
                 case 1:
                     if (u) {
-                        gen_helper_advsimd_mulxh(tcg_res, tcg_op, tcg_idx,
-                                                 fpst);
+                        if (is_scalar) {
+                            gen_helper_advsimd_mulxh(tcg_res, tcg_op,
+                                                     tcg_idx, fpst);
+                        } else {
+                            gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
+                                                      tcg_idx, fpst);
+                        }
                     } else {
-                        g_assert_not_reached();
+                        if (is_scalar) {
+                            gen_helper_advsimd_mulh(tcg_res, tcg_op,
+                                                    tcg_idx, fpst);
+                        } else {
+                            gen_helper_advsimd_mul2h(tcg_res, tcg_op,
+                                                     tcg_idx, fpst);
+                        }
                     }
                     break;
                 case 2:
-- 
2.15.1
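
For anyone reading the 2h helpers in this patch without a QEMU tree to hand, the extract / operate / deposit pattern they use can be sketched in standalone C roughly as follows. This is an illustration only, not QEMU code: sketch_extract32(), sketch_deposit32(), lane_op, lane_add_u16() and two_lane_apply() are hypothetical names, the two bit-field functions are simplified stand-ins for QEMU's extract32()/deposit32(), and a wrapping 16-bit integer add stands in for the softfloat float16_* routines (so there is no float_status argument).

```c
#include <stdint.h>

/*
 * Simplified stand-in for QEMU's extract32(): return the 'length'-bit
 * field of 'value' starting at bit 'start'. Assumes 0 < length < 32.
 */
static inline uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1u);
}

/*
 * Simplified stand-in for QEMU's deposit32(): return 'value' with the
 * 'length'-bit field at bit 'start' replaced by 'fieldval'.
 * Assumes 0 < length < 32.
 */
static inline uint32_t sketch_deposit32(uint32_t value, int start, int length,
                                        uint32_t fieldval)
{
    uint32_t mask = ((1u << length) - 1u) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical per-lane operation type; plays the role of float16_add() etc. */
typedef uint16_t (*lane_op)(uint16_t a, uint16_t b);

/* Placeholder lane operation: wrapping 16-bit integer add, NOT a float16 add. */
static inline uint16_t lane_add_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/*
 * Apply 'op' independently to the low and high 16-bit lanes of two packed
 * 32-bit words and repack the two results, mirroring the shape of the
 * ADVSIMD_TWOHALFOP helper bodies.
 */
static inline uint32_t two_lane_apply(lane_op op, uint32_t two_a,
                                      uint32_t two_b)
{
    uint16_t a1 = (uint16_t)sketch_extract32(two_a, 0, 16);
    uint16_t a2 = (uint16_t)sketch_extract32(two_a, 16, 16);
    uint16_t b1 = (uint16_t)sketch_extract32(two_b, 0, 16);
    uint16_t b2 = (uint16_t)sketch_extract32(two_b, 16, 16);
    uint32_t r1 = op(a1, b1);   /* low lane result, upper 16 bits are zero */
    uint32_t r2 = op(a2, b2);   /* high lane result */

    return sketch_deposit32(r1, 16, 16, r2);
}
```

Because each lane is extracted and computed separately before being repacked, nothing carries from the low lane into the high one. That is the property that lets a single helper call handle two elements of the vector.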