From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:05 -0800
Message-Id: <20180217182323.25885-50-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 49/67] target/arm: Implement SVE FP Multiply-Add Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 16 ++++++++++++++
 target/arm/sve_helper.c    | 53 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 17 +++++++++++++++
 4 files changed, 127 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 84d0a8978c..a95f077c7f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index d80babfae7..6622275b44 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2948,6 +2948,59 @@ DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64)
 #undef DO_ZPZ_FP
 #undef DO_ZPZ_FP_D
 
+/* 4-operand predicated multiply-add.  This requires 7 operands to pass
+ * "properly", so we need to encode some of the registers into DESC.
+ */
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32);
+
+#define DO_FMLA(NAME, N, H, NEG1, NEG3)                                     \
+void HELPER(NAME)(CPUARMState *env, void *vg, uint32_t desc)                \
+{                                                                           \
+    intptr_t i = 0, opr_sz = simd_oprsz(desc);                              \
+    unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);                      \
+    unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);                  \
+    unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);                 \
+    unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);                 \
+    void *vd = &env->vfp.zregs[rd];                                         \
+    void *vn = &env->vfp.zregs[rn];                                         \
+    void *vm = &env->vfp.zregs[rm];                                         \
+    void *va = &env->vfp.zregs[ra];                                         \
+    do {                                                                    \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));                     \
+        do {                                                                \
+            if (likely(pg & 1)) {                                           \
+                float##N e1 = *(uint##N##_t *)(vn + H(i));                  \
+                float##N e2 = *(uint##N##_t *)(vm + H(i));                  \
+                float##N e3 = *(uint##N##_t *)(va + H(i));                  \
+                float##N r;                                                 \
+                if (NEG1) e1 = float##N##_chs(e1);                          \
+                if (NEG3) e3 = float##N##_chs(e3);                          \
+                r = float##N##_muladd(e1, e2, e3, 0, &env->vfp.fp_status);  \
+                *(uint##N##_t *)(vd + H(i)) = r;                            \
+            }                                                               \
+            i += sizeof(float##N), pg >>= sizeof(float##N);                 \
+        } while (i & 15);                                                   \
+    } while (i < opr_sz);                                                   \
+}
+
+DO_FMLA(sve_fmla_zpzzz_h, 16, H1_2, 0, 0)
+DO_FMLA(sve_fmla_zpzzz_s, 32, H1_4, 0, 0)
+DO_FMLA(sve_fmla_zpzzz_d, 64, , 0, 0)
+
+DO_FMLA(sve_fmls_zpzzz_h, 16, H1_2, 1, 0)
+DO_FMLA(sve_fmls_zpzzz_s, 32, H1_4, 1, 0)
+DO_FMLA(sve_fmls_zpzzz_d, 64, , 1, 0)
+
+DO_FMLA(sve_fnmla_zpzzz_h, 16, H1_2, 1, 1)
+DO_FMLA(sve_fnmla_zpzzz_s, 32, H1_4, 1, 1)
+DO_FMLA(sve_fnmla_zpzzz_d, 64, , 1, 1)
+
+DO_FMLA(sve_fnmls_zpzzz_h, 16, H1_2, 0, 1)
+DO_FMLA(sve_fnmls_zpzzz_s, 32, H1_4, 0, 1)
+DO_FMLA(sve_fnmls_zpzzz_d, 64, , 0, 1)
+
+#undef DO_FMLA
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 1692980d20..3124368fb5 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3208,6 +3208,47 @@ DO_FP3(FMULX, fmulx)
 
 #undef DO_FP3
 
+typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
+
+static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    unsigned desc;
+    TCGv_i32 t_desc;
+    TCGv_ptr pg = tcg_temp_new_ptr();
+
+    /* We would need 7 operands to pass these arguments "properly".
+     * So we encode all the register numbers into the descriptor.
+     */
+    desc = deposit32(a->rd, 5, 5, a->rn);
+    desc = deposit32(desc, 10, 5, a->rm);
+    desc = deposit32(desc, 15, 5, a->ra);
+    desc = simd_desc(vsz, vsz, desc);
+
+    t_desc = tcg_const_i32(desc);
+    tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg));
+    fn(cpu_env, pg, t_desc);
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(pg);
+}
+
+#define DO_FMLA(NAME, name)                                                 \
+static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn)  \
+{                                                                           \
+    static gen_helper_sve_fmla * const fns[4] = {                           \
+        NULL, gen_helper_sve_##name##_h,                                    \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d                \
+    };                                                                      \
+    do_fmla(s, a, fns[a->esz]);                                             \
+}
+
+DO_FMLA(FMLA_zpzzz, fmla_zpzzz)
+DO_FMLA(FMLS_zpzzz, fmls_zpzzz)
+DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz)
+DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz)
+
+#undef DO_FMLA
+
 /*
  *** SVE Floating Point Unary Operations Prediated Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 1a13c603ff..817833f96e 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -129,6 +129,8 @@
                 &rprrr_esz ra=%reg_movprfx
 @rdn_pg_ra_rm   ........ esz:2 . rm:5  ... pg:3 ra:5 rd:5 \
                 &rprrr_esz rn=%reg_movprfx
+@rdn_pg_rm_ra   ........ esz:2 . ra:5  ... pg:3 rm:5 rd:5 \
+                &rprrr_esz rn=%reg_movprfx
 
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5       &rpr_esz
@@ -709,6 +711,21 @@ FMULX           01100101 .. 00 1010 100 ... ..... .....    @rdn_pg_rm
 FDIV            01100101 .. 00 1100 100 ... ..... .....    @rdm_pg_rn # FDIVR
 FDIV            01100101 .. 00 1101 100 ... ..... .....    @rdn_pg_rm
 
+### SVE FP Multiply-Add Group
+
+# SVE floating-point multiply-accumulate writing addend
+FMLA_zpzzz      01100101 .. 1 ..... 000 ... ..... .....         @rda_pg_rn_rm
+FMLS_zpzzz      01100101 .. 1 ..... 001 ... ..... .....         @rda_pg_rn_rm
+FNMLA_zpzzz     01100101 .. 1 ..... 010 ... ..... .....         @rda_pg_rn_rm
+FNMLS_zpzzz     01100101 .. 1 ..... 011 ... ..... .....         @rda_pg_rn_rm
+
+# SVE floating-point multiply-accumulate writing multiplicand
+# FMAD, FMSB, FNMAD, FNMSB
+FMLA_zpzzz      01100101 .. 1 ..... 100 ... ..... .....         @rdn_pg_rm_ra
+FMLS_zpzzz      01100101 .. 1 ..... 101 ... ..... .....         @rdn_pg_rm_ra
+FNMLA_zpzzz     01100101 .. 1 ..... 110 ... ..... .....         @rdn_pg_rm_ra
+FNMLS_zpzzz     01100101 .. 1 ..... 111 ... ..... .....         @rdn_pg_rm_ra
+
 ### SVE FP Unary Operations Predicated Group
 
 # SVE integer convert to floating-point
-- 
2.14.3
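
For readers following the descriptor trick, here is a minimal standalone
sketch (not QEMU code) of the two ideas the helper relies on: packing four
5-bit register numbers into the data field of a 32-bit descriptor, and
walking a predicate that carries one bit per vector byte. All names here
(DEMO_DATA_SHIFT, demo_pack_regs, demo_extract, demo_fmla_s) are
hypothetical, and the shift value is an assumption rather than QEMU's
definition. The sketch uses plain float arithmetic instead of softfloat's
fused float32_muladd and ignores the host-endian H macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SIMD_DATA_SHIFT; QEMU's real value may differ. */
#define DEMO_DATA_SHIFT 10

/* Pack four 5-bit register numbers above the low DEMO_DATA_SHIFT bits. */
static uint32_t demo_pack_regs(unsigned rd, unsigned rn,
                               unsigned rm, unsigned ra)
{
    assert(rd < 32 && rn < 32 && rm < 32 && ra < 32);
    uint32_t field = rd | (rn << 5) | (rm << 10) | (ra << 15);
    return field << DEMO_DATA_SHIFT;
}

/* Return LEN bits of VALUE starting at bit START (0 < LEN < 32). */
static uint32_t demo_extract(uint32_t value, unsigned start, unsigned len)
{
    return (value >> start) & ((1u << len) - 1);
}

/*
 * Predicated d = (neg3 ? -a : a) + (neg1 ? -n : n) * m on 32-bit elements.
 * PRED holds one bit per vector byte; an element is active when the bit
 * for its lowest-addressed byte is set.  Inactive elements of D are left
 * untouched.  Plain arithmetic is used here, not a fused multiply-add.
 */
static void demo_fmla_s(float *d, const float *n, const float *m,
                        const float *a, const uint8_t *pred,
                        size_t bytes, int neg1, int neg3)
{
    for (size_t i = 0; i < bytes; i += sizeof(float)) {
        if ((pred[i / 8] >> (i % 8)) & 1) {
            size_t e = i / sizeof(float);
            float e1 = neg1 ? -n[e] : n[e];
            float e3 = neg3 ? -a[e] : a[e];
            d[e] = e1 * m[e] + e3;
        }
    }
}
```

With a 16-byte vector and a predicate of {0x01, 0x10}, only elements 0 and 3
are written. The four register numbers take 20 bits, which is why the patch
asserts at build time that SIMD_DATA_SHIFT + 20 fits in 32.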