From: Bharata B Rao
To: qemu-devel@nongnu.org
Cc: rth@twiddle.net, qemu-ppc@nongnu.org, Bharata B Rao, nikunj@linux.vnet.ibm.com, david@gibson.dropbear.id.au
Date: Wed, 15 Feb 2017 12:07:16 +0530
Message-Id: <1487140636-19955-1-git-send-email-bharata@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions

xsmaddqp: VSX Scalar Multiply-Add Quad-Precision
xsmaddqpo: VSX Scalar Multiply-Add Quad-Precision using round to Odd
xsnmaddqp: VSX Scalar Negative Multiply-Add Quad-Precision
xsnmaddqpo: VSX Scalar Negative Multiply-Add Quad-Precision using round to Odd
xsmsubqp: VSX Scalar Multiply-Subtract Quad-Precision
xsmsubqpo: VSX Scalar Multiply-Subtract Quad-Precision using round to Odd
xsnmsubqp: VSX Scalar Negative Multiply-Subtract Quad-Precision
xsnmsubqpo: VSX Scalar Negative Multiply-Subtract Quad-Precision using round to Odd

Signed-off-by: Bharata B Rao
---
 target/ppc/fpu_helper.c             | 69 +++++++++++++++++++++++++++++++++++++
 target/ppc/helper.h                 |  4 +++
 target/ppc/translate/vsx-impl.inc.c |  4 +++
 target/ppc/translate/vsx-ops.inc.c  |  4 +++
 4 files changed, 81 insertions(+)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 58aee64..201cafd 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2425,6 +2425,75 @@ VSX_MADD(xvnmaddmsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0, 0)
 VSX_MADD(xvnmsubasp, 4, float32, VsrW(i), NMSUB_FLGS, 1, 0, 0)
 VSX_MADD(xvnmsubmsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0, 0)
 
+/*
+ * Quadruple-precision version of multiply and add/subtract.
+ *
+ * This implementation is not 100% accurate as we truncate the
+ * intermediate result of multiplication and then add/subtract
+ * separately.
+ *
+ * TODO: When float128_muladd() becomes available, switch this
+ * implementation to use that instead of separate float128_mul()
+ * followed by float128_add().
+ */
+#define VSX_MADD_QP(op, maddflgs)                                             \
+void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
+{                                                                             \
+    ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
+                                                                              \
+    getVSR(rA(opcode) + 32, &xa, env);                                        \
+    getVSR(rB(opcode) + 32, &xb, env);                                        \
+    getVSR(rD(opcode) + 32, &xt_in, env);                                     \
+                                                                              \
+    xt_out = xt_in;                                                           \
+    helper_reset_fpstatus(env);                                               \
+    float_status tstat = env->fp_status;                                      \
+    if (unlikely(Rc(opcode) != 0)) {                                          \
+        tstat.float_rounding_mode = float_round_to_odd;                       \
+    }                                                                         \
+    set_float_exception_flags(0, &tstat);                                     \
+    xt_out.f128 = float128_mul(xa.f128, xt_in.f128, &tstat);                  \
+                                                                              \
+    if (maddflgs & float_muladd_negate_c) {                                   \
+        xb.VsrD(0) ^= 0x8000000000000000;                                     \
+    }                                                                         \
+    xt_out.f128 = float128_add(xt_out.f128, xb.f128, &tstat);                 \
+    env->fp_status.float_exception_flags |= tstat.float_exception_flags;      \
+                                                                              \
+    if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {         \
+        if (float128_is_signaling_nan(xa.f128, &tstat) ||                     \
+            float128_is_signaling_nan(xt_in.f128, &tstat) ||                  \
+            float128_is_signaling_nan(xb.f128, &tstat)) {                     \
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);            \
+            tstat.float_exception_flags &= ~float_flag_invalid;               \
+        }                                                                     \
+        if ((float128_is_infinity(xa.f128) && float128_is_zero(xt_in.f128)) ||\
+            (float128_is_zero(xa.f128) && float128_is_infinity(xt_in.f128))) {\
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXIMZ, 1);             \
+            tstat.float_exception_flags &= ~float_flag_invalid;               \
+        }                                                                     \
+        if ((tstat.float_exception_flags & float_flag_invalid) &&             \
+            ((float128_is_infinity(xa.f128) ||                                \
+              float128_is_infinity(xt_in.f128)) &&                            \
+             float128_is_infinity(xb.f128))) {                                \
+            float_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, 1);             \
+        }                                                                     \
+    }                                                                         \
+                                                                              \
+    helper_compute_fprf_float128(env, xt_out.f128);                           \
+    if ((maddflgs & float_muladd_negate_result) &&                            \
+        !float128_is_any_nan(xt_out.f128)) {                                  \
+        xt_out.VsrD(0) ^= 0x8000000000000000;                                 \
+    }                                                                         \
+    putVSR(rD(opcode) + 32, &xt_out, env);                                    \
+    float_check_status(env);                                                  \
+}
+
+VSX_MADD_QP(xsmaddqp, MADD_FLGS)
+VSX_MADD_QP(xsmsubqp, MSUB_FLGS)
+VSX_MADD_QP(xsnmaddqp, NMADD_FLGS)
+VSX_MADD_QP(xsnmsubqp, NMSUB_FLGS)
+
 /* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
  *   op    - instruction mnemonic
  *   cmp   - comparison operation
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6d77661..eade946 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -480,12 +480,16 @@ DEF_HELPER_2(xssqrtsp, void, env, i32)
 DEF_HELPER_2(xsrsqrtesp, void, env, i32)
 DEF_HELPER_2(xsmaddasp, void, env, i32)
 DEF_HELPER_2(xsmaddmsp, void, env, i32)
+DEF_HELPER_2(xsmaddqp, void, env, i32)
 DEF_HELPER_2(xsmsubasp, void, env, i32)
 DEF_HELPER_2(xsmsubmsp, void, env, i32)
+DEF_HELPER_2(xsmsubqp, void, env, i32)
 DEF_HELPER_2(xsnmaddasp, void, env, i32)
 DEF_HELPER_2(xsnmaddmsp, void, env, i32)
+DEF_HELPER_2(xsnmaddqp, void, env, i32)
 DEF_HELPER_2(xsnmsubasp, void, env, i32)
 DEF_HELPER_2(xsnmsubmsp, void, env, i32)
+DEF_HELPER_2(xsnmsubqp, void, env, i32)
 
 DEF_HELPER_2(xvadddp, void, env, i32)
 DEF_HELPER_2(xvsubdp, void, env, i32)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7f12908..0a96e6b 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -853,12 +853,16 @@ GEN_VSX_HELPER_2(xssqrtsp, 0x16, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsrsqrtesp, 0x14, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmaddasp, 0x04, 0x00, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmaddmsp, 0x04, 0x01, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmaddqp, 0x04, 0x0C, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsmsubasp, 0x04, 0x02, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsmsubmsp, 0x04, 0x03, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsmsubqp, 0x04, 0x0D, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsnmaddasp, 0x04, 0x10, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmaddmsp, 0x04, 0x11, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmaddqp, 0x04, 0x0E, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xsnmsubasp, 0x04, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xsnmsubmsp, 0x04, 0x13, 0, PPC2_VSX207)
+GEN_VSX_HELPER_2(xsnmsubqp, 0x04, 0x0F, 0, PPC2_ISA300)
 GEN_VSX_HELPER_2(xscvsxdsp, 0x10, 0x13, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xscvuxdsp, 0x10, 0x12, 0, PPC2_VSX207)
 GEN_VSX_HELPER_2(xststdcsp, 0x14, 0x12, 0, PPC2_ISA300)
diff --git a/target/ppc/translate/vsx-ops.inc.c b/target/ppc/translate/vsx-ops.inc.c
index 5030c4a..e770fab 100644
--- a/target/ppc/translate/vsx-ops.inc.c
+++ b/target/ppc/translate/vsx-ops.inc.c
@@ -237,12 +237,16 @@ GEN_XX2FORM(xssqrtsp, 0x16, 0x00, PPC2_VSX207),
 GEN_XX2FORM(xsrsqrtesp, 0x14, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xsmaddasp, 0x04, 0x00, PPC2_VSX207),
 GEN_XX3FORM(xsmaddmsp, 0x04, 0x01, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsmaddqp, 0x04, 0x0C, 0x0),
 GEN_XX3FORM(xsmsubasp, 0x04, 0x02, PPC2_VSX207),
 GEN_XX3FORM(xsmsubmsp, 0x04, 0x03, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsmsubqp, 0x04, 0x0D, 0x0),
 GEN_XX3FORM(xsnmaddasp, 0x04, 0x10, PPC2_VSX207),
 GEN_XX3FORM(xsnmaddmsp, 0x04, 0x11, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsnmaddqp, 0x04, 0x0E, 0x0),
 GEN_XX3FORM(xsnmsubasp, 0x04, 0x12, PPC2_VSX207),
 GEN_XX3FORM(xsnmsubmsp, 0x04, 0x13, PPC2_VSX207),
+GEN_VSX_XFORM_300(xsnmsubqp, 0x04, 0x0F, 0x0),
 GEN_XX2FORM(xscvsxdsp, 0x10, 0x13, PPC2_VSX207),
 GEN_XX2FORM(xscvuxdsp, 0x10, 0x12, PPC2_VSX207),
 
-- 
2.7.4