From: Taylor Simpson <tsimpson@quicinc.com>
To: qemu-devel@nongnu.org
Subject: [RFC PATCH v2 57/67] Hexagon HVX import semantics
Date: Fri, 28 Feb 2020 10:43:53 -0600
Message-Id: <1582908244-304-58-git-send-email-tsimpson@quicinc.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>
References: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>
Cc: riku.voipio@iki.fi, richard.henderson@linaro.org, laurent@vivier.eu,
    Taylor Simpson <tsimpson@quicinc.com>, philmd@redhat.com,
    aleksandar.m.mail@gmail.com

Imported from the Hexagon architecture library
    imported/allext.idef        Top level file for all extensions
    imported/mmvec/ext.idef     HVX instruction definitions

Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/imported/allext.idef    |   25 +
 target/hexagon/imported/allidefs.def   |    1 +
 target/hexagon/imported/mmvec/ext.idef | 2780 ++++++++++++++++++++++++++++++++
 3 files changed, 2806 insertions(+)
 create mode 100644 target/hexagon/imported/allext.idef
 create mode 100644 target/hexagon/imported/mmvec/ext.idef

diff --git a/target/hexagon/imported/allext.idef b/target/hexagon/imported/allext.idef
new file mode 100644
index 0000000..26db774
--- /dev/null
+++ b/target/hexagon/imported/allext.idef
@@ -0,0 +1,25 @@
+/*
+ * Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Top level file for all instruction set extensions
+ */
+#define EXTNAME mmvec
+#define EXTSTR "mmvec"
+#include "mmvec/ext.idef"
+#undef EXTNAME
+#undef EXTSTR
diff --git a/target/hexagon/imported/allidefs.def b/target/hexagon/imported/allidefs.def
index d2f13a2..03f3f1b 100644
--- a/target/hexagon/imported/allidefs.def
+++ b/target/hexagon/imported/allidefs.def
@@ -89,3 +89,4 @@
 #include "shift.idef"
 #include "system.idef"
 #include "subinsns.idef"
+#include "allext.idef"
diff --git a/target/hexagon/imported/mmvec/ext.idef b/target/hexagon/imported/mmvec/ext.idef
new file mode 100644
index 0000000..0bc6e9f
--- /dev/null
+++ b/target/hexagon/imported/mmvec/ext.idef
@@ -0,0 +1,2780 @@
+/*
+ * Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/******************************************************************************
+ *
+ * HOYA: MULTI MEDIA INSTRUCITONS
+ *
+ ******************************************************************************/
+
+#ifndef EXTINSN
+#define EXTINSN Q6INSN
+#define __SELF_DEF_EXTINSN 1
+#endif
+
+#ifndef NO_MMVEC
+
+#define DO_FOR_EACH_CODE(WIDTH, CODE) \
+{ \
+    fHIDE(int i;) \
+    fVFOREACH(WIDTH, i) {\
+        CODE ;\
+    } \
+}
+
+
+
+
+#define ITERATOR_INSN_ANY_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+
+#define ITERATOR_INSN2_ANY_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_ANY_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA_DV,A_NOTE_ANY2_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+#define ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+#define ITERATOR_INSN_SHIFT_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS,A_NOTE_SHIFT_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+
+#define ITERATOR_INSN_SHIFT_SLOT_VV_LATE(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS,A_CVI_VV_LATE,A_NOTE_SHIFT_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_SHIFT_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_SHIFT_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_PERMUTE_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_NOTE_DEPRECATED,A_EXTENSION,A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+#define ITERATOR_INSN2_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS,A_NOTE_PERMUTE_RESOURCE,A_NOTE_SHIFT_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_DEP(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_NOTE_DEPRECATED,A_EXTENSION,A_CVI,A_CVI_VP_VS,A_NOTE_PERMUTE_RESOURCE,A_NOTE_SHIFT_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_MPY_SLOT(WIDTH,TAG, SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, \
+ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_NOTE_MPY_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, \
+ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_NOTE_MPY_RESOURCE,A_CVI_LATE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_MPY_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_MPY_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN2_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+#define ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_NOTE_MPYDV_RESOURCE), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+
+
+#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC2(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_CVI_VX_VSRC0_IS_DST,A_NOTE_MPYDV_RESOURCE), DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_SLOT2_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_CVI_EARLY,A_RESTRICT_SLOT2ONLY,A_NOTE_MPYDV_RESOURCE), DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define VEC_DEF_MAPPING2(TAG, SYNTAX1_mapped, SYNTAX1, SYNTAX2_mapped, SYNTAX2) \
+DEF_CVI_MAPPING(V6_##TAG, SYNTAX1_mapped, SYNTAX1) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX2_mapped, SYNTAX1_mapped)
+
+
+#define ITERATOR_INSN_VHISTLIKE(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT,A_CVI_REQUIRES_TMPLOAD), \
+DESCR, fHIDE(mmvector_t input;) input = fTMPVDATA(); DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+
+
+
+/*******************************************************************************************
+*
+* MMVECTOR MEMORY OPERATIONS - NO NAPALI V1
+*
+********************************************************************************************/
+
+
+
+#define ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_NOTE_MPYDV_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+
+#define ITERATOR_INSN_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS,A_NOTE_SHIFT_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+#define ITERATOR_INSN_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+
+#define ITERATOR_INSN_MPY_SLOT_NOV1(WIDTH,TAG, SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, \
+ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_NOTE_MPY_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN_PERMUTE_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_PERMUTE_SLOTT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DEPT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_NOTE_DEPRECATED,A_EXTENSION,A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+
+#define ITERATOR_INSN2_PERMUTE_SLOT_DEPT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT_DEP_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS,A_NOTE_PERMUTE_RESOURCE,A_NOTE_SHIFT_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_DEPT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \
+EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_NOTE_DEPRECATED,A_EXTENSION,A_CVI,A_CVI_VP_VS,A_NOTE_PERMUTE_RESOURCE,A_NOTE_SHIFT_RESOURCE,A_NOTE_NONAPALIV1), \
+DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
+
+#define ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \
+ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) \
+DEF_CVI_MAPPING(V6_##TAG##_alt, SYNTAX, SYNTAX2)
+
+#define NARROWING_SHIFT_NOV1(ITERSIZE,TAG,DSTM,DSTTYPE,SRCTYPE,SYNOPTS,SATFUNC,RNDFUNC,SHAMTMASK) \
+ITERATOR_INSN_SHIFT_SLOT_NOV1(ITERSIZE,TAG, \
+"Vd32." #DSTTYPE "=vasr(Vu32." #SRCTYPE ",Vv32."
#SRCTYPE ",Rt8)" #SYNOP= TS, \ +"Vector shift right and shuffle", \ + fHIDE(fRT8NOTE())\ + fHIDE(int )shamt =3D RtV & SHAMTMASK; \ + DSTM(0,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VvV.SRCTYPE[i],shamt) >> shamt))= ; \ + DSTM(1,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuV.SRCTYPE[i],shamt) >> shamt))) + +#define MMVEC_AVGS_NOV1(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32= ."#SRC")", "Vector Average "DESCR, = VdV.DEST[i] =3D fVAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE##rnd, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32= ."#SRC"):rnd", "Vector Average % Round"DESCR, = VdV.DEST[i] =3D fVAVGSRND( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vnavg##TYPE, "Vd3= 2=3Dvnavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvnavg(Vu32."#SRC",Vv3= 2."#SRC")", "Vector Negative Average "DESCR, = VdV.DEST[i] =3D fVNAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) + + #define MMVEC_AVGU_NOV1(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32.= "#SRC")", "Vector Average "DESCR, = VdV.DEST[i] =3D fVAVGU( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE##rnd, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32.= "#SRC"):rnd", "Vector Average % Round"DESCR, = VdV.DEST[i] =3D fVAVGURND(WIDTH, VuV.SRC[i], VvV.SRC[i])) + + + +/*************************************************************************= ***************** +* +* MMVECTOR MEMORY OPERATIONS +* +**************************************************************************= *****************/ + +#define MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,BEH) \ +EXTINSN(V6_##TAG##_pi, SYNTAXA "(Rx32++#s3)" NT SYNTAXB,ATTRIB,DESCR,= { fEA_REG(RxV); BEH; fPM_I(RxV,VEC_SCALE(siV)); }) \ +EXTINSN(V6_##TAG##_ai, SYNTAXA "(Rt32+#s4)" NT SYNTAXB,ATTRIB,DESCR,{= fEA_RI(RtV,VEC_SCALE(siV)); BEH;}) \ +EXTINSN(V6_##TAG##_ppu, SYNTAXA "(Rx32++Mu2)" NT SYNTAXB,ATTRIB,DESCR= ,{ fEA_REG(RxV); BEH; fPM_M(RxV,MuV); }) \ + + +#define MMVEC_COND_EACH_EA_TRUE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAX= P,BEH) \ +EXTINSN(V6_##TAG##_pred_pi, "if (" #SYNTAXP "4) " SYNTAXA "(Rx32++#s3= )" NT SYNTAXB, ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_REG(RxV); BEH= ; fPM_I(RxV,siV*fVECSIZE()); } else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_pred_ai, "if (" #SYNTAXP "4) " SYNTAXA "(Rt32+#s4)= " NT SYNTAXB, ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_RI(RtV,siV*fV= ECSIZE()); BEH;} else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_pred_ppu, "if (" #SYNTAXP "4) " SYNTAXA "(Rx32++Mu2= )" NT SYNTAXB,ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_REG(RxV); BEH= ; fPM_M(RxV,MuV); } else {CANCEL;}}) \ + +#define MMVEC_COND_EACH_EA_FALSE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTA= XP,BEH) \ +EXTINSN(V6_##TAG##_npred_pi, "if (!" #SYNTAXP "4) " SYNTAXA "(Rx32++#s= 3)" NT SYNTAXB,ATTRIB,DESCR,{ if (fLSBOLDNOT(SYNTAXP##V)) { fEA_REG(RxV); B= EH; fPM_I(RxV,siV*fVECSIZE()); } else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_npred_ai, "if (!" #SYNTAXP "4) " SYNTAXA "(Rt32+#s4= )" NT SYNTAXB,ATTRIB,DESCR, { if (fLSBOLDNOT(SYNTAXP##V)) { fEA_RI(RtV,siV*= fVECSIZE()); BEH;} else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_npred_ppu, "if (!" 
#SYNTAXP "4) " SYNTAXA "(Rx32++Mu= 2)" NT SYNTAXB,ATTRIB,DESCR,{ if (fLSBOLDNOT(SYNTAXP##V)) { fEA_REG(RxV); B= EH; fPM_M(RxV,MuV); } else {CANCEL;}}) + +#define MMVEC_COND_EACH_EA(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH= ) \ +MMVEC_COND_EACH_EA_TRUE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH) \ +MMVEC_COND_EACH_EA_FALSE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH) + + +#define VEC_SCALE(X) X*fVECSIZE() + + +#define MMVEC_LD(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,"V= d32=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDC(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG##_cur,DESCR,ATTRI= B,NT,"Vd32.cur=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDT(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG##_tmp,DESCR,ATTRI= B,NT,"Vd32.tmp=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDU(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,"= Vd32=3Dvmemu","",fLOADMMVU(EA,VdV)) + + +#define MMVEC_STQ(TAG,DESCR,ATTRIB,NT) \ +MMVEC_EACH_EA(TAG##_qpred,DESCR,ATTRIB,NT,"if (Qv4) vmem","=3DVs32",fSTORE= MMVQ(EA,VsV,QvV)) \ +MMVEC_EACH_EA(TAG##_nqpred,DESCR,ATTRIB,NT,"if (!Qv4) vmem","=3DVs32",fSTO= REMMVNQ(EA,VsV,QvV)) + +DEF_CVI_MAPPING(V6_stup0, "if (Pv4) vmemu(Rt32)=3DVs32", "if (Pv4) vmemu= (Rt32+#0)=3DVs32") +DEF_CVI_MAPPING(V6_stunp0, "if (!Pv4) vmemu(Rt32)=3DVs32", "if (!Pv4) vme= mu(Rt32+#0)=3DVs32") + + + +/**************************************************************** +* MAPPING FOR VMEMs +****************************************************************/ + +#define ATTR_VMEM A_EXTENSION,A_CVI,A_CVI_VM,A_NOTE_VMEM,A_NOTE_ANY_RESOUR= CE,A_VMEM +#define ATTR_VMEMU A_EXTENSION,A_CVI,A_CVI_VM,A_NOTE_VMEM,A_NOTE_PERMUTE_R= ESOURCE,A_CVI_VP,A_VMEMU + + +MMVEC_LD(vL32b, "Aligned Vector Load", ATTRIBS(ATTR_VMEM,A_LOAD,A_= CVI_VA),) +MMVEC_LDC(vL32b, "Aligned Vector Load Cur", ATTRIBS(ATTR_VMEM,A_LOAD,A_CV= I_NEW,A_CVI_VA),) +MMVEC_LDT(vL32b, "Aligned Vector Load Tmp", ATTRIBS(ATTR_VMEM,A_LOAD,A_CV= I_TMP),) + +MMVEC_COND_EACH_EA(vL32b,"Conditional Aligned Vector Load",ATTRIBS(ATTR_VM= EM,A_LOAD,A_CVI_VA),,"Vd32=3Dvmem",,Pv,fLOADMMV(EA,VdV);) +MMVEC_COND_EACH_EA(vL32b_cur,"Conditional Aligned Vector Load Cur",ATTRIBS= (ATTR_VMEM,A_LOAD,A_CVI_VA,A_CVI_NEW),,"Vd32.cur=3Dvmem",,Pv,fLOADMMV(EA,Vd= V);) +MMVEC_COND_EACH_EA(vL32b_tmp,"Conditional Aligned Vector Load Tmp",ATTRIBS= (ATTR_VMEM,A_LOAD,A_CVI_TMP),,"Vd32.tmp=3Dvmem",,Pv,fLOADMMV(EA,VdV);) + +MMVEC_EACH_EA(vS32b,"Aligned Vector Store",ATTRIBS(ATTR_VMEM,A_STORE,A_RES= TRICT_SLOT0ONLY,A_CVI_VA),,"vmem","=3DVs32",fSTOREMMV(EA,VsV)) +MMVEC_COND_EACH_EA(vS32b,"Aligned Vector Store",ATTRIBS(ATTR_VMEM,A_STORE,= A_RESTRICT_SLOT0ONLY,A_CVI_VA),,"vmem","=3DVs32",Pv,fSTOREMMV(EA,VsV)) + + +MMVEC_STQ(vS32b, "Aligned Vector Store", ATTRIBS(ATTR_VMEM,A_STORE,A= _RESTRICT_SLOT0ONLY,A_CVI_VA),) + +MMVEC_LDU(vL32Ub, "Unaligned Vector Load", ATTRIBS(ATTR_VMEMU,A_LOAD,A= _RESTRICT_NOSLOT1),) + +MMVEC_EACH_EA(vS32Ub,"Unaligned Vector Store",ATTRIBS(ATTR_VMEMU,A_STORE,A= _RESTRICT_NOSLOT1),,"vmemu","=3DVs32",fSTOREMMVU(EA,VsV)) + +MMVEC_COND_EACH_EA(vS32Ub,"Unaligned Vector Store",ATTRIBS(ATTR_VMEMU,A_ST= ORE,A_RESTRICT_NOSLOT1),,"vmemu","=3DVs32",Pv,fSTOREMMVU(EA,VsV)) + +MMVEC_EACH_EA(vS32b_new,"Aligned Vector Store New",ATTRIBS(ATTR_VMEM,A_STO= RE,A_CVI_NEW,A_RESTRICT_SINGLE_MEM_FIRST,A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY= ),,"vmem","=3DOs8.new",fSTOREMMV(EA,fNEWVREG(OsN))) + +// V65 store relase, zero byte store +MMVEC_EACH_EA(vS32b_srls,"Aligned Vector Scatter Release",ATTRIBS(ATTR_VME= 
M,A_STORE,A_CVI_SCATTER_RELEASE,A_CVI_NEW,A_RESTRICT_SLOT0ONLY),,"vmem",":s= catter_release",fSTORERELEASE(EA,0)) + + + +MMVEC_COND_EACH_EA(vS32b_new,"Aligned Vector Store New",ATTRIBS(ATTR_VMEM,= A_STORE,A_CVI_NEW,A_RESTRICT_SINGLE_MEM_FIRST,A_DOTNEWVALUE,A_RESTRICT_SLOT= 0ONLY),,"vmem","=3DOs8.new",Pv,fSTOREMMV(EA,fNEWVREG(OsN))) + + +// Loads +DEF_CVI_MAPPING(V6_ld0, "Vd32=3Dvmem(Rt32)", "Vd32=3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldu0, "Vd32=3Dvmemu(Rt32)", "Vd32=3Dvmemu(Rt32+#0)") + +DEF_CVI_MAPPING(V6_ldp0, "if (Pv4) Vd32=3Dvmem(Rt32)", "if (Pv4) Vd32= =3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldnp0, "if (!Pv4) Vd32=3Dvmem(Rt32)", "if (!Pv4) Vd32= =3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldcp0, "if (Pv4) Vd32.cur=3Dvmem(Rt32)", "if (Pv4) Vd3= 2.cur=3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldtp0, "if (Pv4) Vd32.tmp=3Dvmem(Rt32)", "if (Pv4) Vd3= 2.tmp=3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldcnp0, "if (!Pv4) Vd32.cur=3Dvmem(Rt32)", "if (!Pv4) = Vd32.cur=3Dvmem(Rt32+#0)") +DEF_CVI_MAPPING(V6_ldtnp0, "if (!Pv4) Vd32.tmp=3Dvmem(Rt32)", "if (!Pv4) = Vd32.tmp=3Dvmem(Rt32+#0)") + + + + +// Stores +DEF_CVI_MAPPING(V6_st0, "vmem(Rt32)=3DVs32", "vmem(Rt32+#0)=3DVs32") +DEF_CVI_MAPPING(V6_stn0, "vmem(Rt32)=3DOs8.new", "vmem(Rt32+#0)=3DOs8.n= ew") +DEF_CVI_MAPPING(V6_stq0, "if (Qv4) vmem(Rt32)=3DVs32", "if (Qv4) vme= m(Rt32+#0)=3DVs32") +DEF_CVI_MAPPING(V6_stnq0, "if (!Qv4) vmem(Rt32)=3DVs32", "if (!Qv4) v= mem(Rt32+#0)=3DVs32") +DEF_CVI_MAPPING(V6_stp0, "if (Pv4) vmem(Rt32)=3DVs32", "if (Pv4) vme= m(Rt32+#0)=3DVs32") +DEF_CVI_MAPPING(V6_stnp0, "if (!Pv4) vmem(Rt32)=3DVs32", "if (!Pv4) v= mem(Rt32+#0)=3DVs32") + + +DEF_CVI_MAPPING(V6_stu0, "vmemu(Rt32)=3DVs32", "vmemu(Rt32+#0)=3DVs32") + + + +/*************************************************************************= ***************** +* +* MMVECTOR MEMORY OPERATIONS - NON TEMPORAL +* +**************************************************************************= *****************/ + +#define ATTR_VMEM_NT A_EXTENSION,A_CVI,A_CVI_VM,A_NOTE_VMEM,A_NT_VMEM,A_NO= TE_NT_VMEM,A_NOTE_ANY_RESOURCE,A_VMEM + +MMVEC_EACH_EA(vS32b_nt,"Aligned Vector Store - Non temporal",ATTRIBS(ATTR_= VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt","vmem","=3DVs32",fSTOR= EMMV(EA,VsV)) +MMVEC_COND_EACH_EA(vS32b_nt,"Aligned Vector Store - Non temporal",ATTRIBS(= ATTR_VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt","vmem","=3DVs32",= Pv,fSTOREMMV(EA,VsV)) + +MMVEC_EACH_EA(vS32b_nt_new,"Aligned Vector Store New - Non temporal",ATTRI= BS(ATTR_VMEM_NT,A_STORE,A_CVI_NEW,A_RESTRICT_SINGLE_MEM_FIRST,A_DOTNEWVALUE= ,A_RESTRICT_SLOT0ONLY),":nt","vmem","=3DOs8.new",fSTOREMMV(EA,fNEWVREG(OsN)= )) +MMVEC_COND_EACH_EA(vS32b_nt_new,"Aligned Vector Store New - Non temporal",= ATTRIBS(ATTR_VMEM_NT,A_STORE,A_CVI_NEW,A_RESTRICT_SINGLE_MEM_FIRST,A_DOTNEW= VALUE,A_RESTRICT_SLOT0ONLY),":nt","vmem","=3DOs8.new",Pv,fSTOREMMV(EA,fNEWV= REG(OsN))) + + +MMVEC_STQ(vS32b_nt, "Aligned Vector Store - Non temporal", ATTRIBS(A= TTR_VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt") + +MMVEC_LD(vL32b_nt, "Aligned Vector Load - Non temporal", ATTRIBS(AT= TR_VMEM_NT,A_LOAD,A_CVI_VA),":nt") +MMVEC_LDC(vL32b_nt, "Aligned Vector Load Cur - Non temporal", ATTRIBS(ATT= R_VMEM_NT,A_LOAD,A_CVI_NEW,A_CVI_VA),":nt") +MMVEC_LDT(vL32b_nt, "Aligned Vector Load Tmp - Non temporal", ATTRIBS(ATT= R_VMEM_NT,A_LOAD,A_CVI_TMP),":nt") + +MMVEC_COND_EACH_EA(vL32b_nt,"Conditional Aligned Vector Load",ATTRIBS(ATTR= _VMEM_NT,A_CVI_VA),,"Vd32=3Dvmem",":nt",Pv,fLOADMMV(EA,VdV);) +MMVEC_COND_EACH_EA(vL32b_nt_cur,"Conditional Aligned 
Vector Load Cur",ATTR= IBS(ATTR_VMEM_NT,A_CVI_VA,A_CVI_NEW),,"Vd32.cur=3Dvmem",":nt",Pv,fLOADMMV(E= A,VdV);) +MMVEC_COND_EACH_EA(vL32b_nt_tmp,"Conditional Aligned Vector Load Tmp",ATTR= IBS(ATTR_VMEM_NT,A_CVI_TMP),,"Vd32.tmp=3Dvmem",":nt",Pv,fLOADMMV(EA,VdV);) + + +// Loads +DEF_CVI_MAPPING(V6_ldnt0, "Vd32=3Dvmem(Rt32):nt", "Vd32=3Dvmem(Rt32+#0)= :nt") +DEF_CVI_MAPPING(V6_ldpnt0, "if (Pv4) Vd32=3Dvmem(Rt32):nt", "if (Pv4) V= d32=3Dvmem(Rt32+#0):nt") +DEF_CVI_MAPPING(V6_ldnpnt0, "if (!Pv4) Vd32=3Dvmem(Rt32):nt", "if (!Pv4= ) Vd32=3Dvmem(Rt32+#0):nt") +DEF_CVI_MAPPING(V6_ldcpnt0, "if (Pv4) Vd32.cur=3Dvmem(Rt32):nt", "if (Pv4= ) Vd32.cur=3Dvmem(Rt32+#0):nt") +DEF_CVI_MAPPING(V6_ldtpnt0, "if (Pv4) Vd32.tmp=3Dvmem(Rt32):nt", "if (Pv4= ) Vd32.tmp=3Dvmem(Rt32+#0):nt") +DEF_CVI_MAPPING(V6_ldcnpnt0, "if (!Pv4) Vd32.cur=3Dvmem(Rt32):nt", "if (!= Pv4) Vd32.cur=3Dvmem(Rt32+#0):nt") +DEF_CVI_MAPPING(V6_ldtnpnt0, "if (!Pv4) Vd32.tmp=3Dvmem(Rt32):nt", "if (!= Pv4) Vd32.tmp=3Dvmem(Rt32+#0):nt") + + +// Stores +DEF_CVI_MAPPING(V6_stnt0, "vmem(Rt32):nt=3DVs32", "vmem(Rt32+#0):nt= =3DVs32") +DEF_CVI_MAPPING(V6_stnnt0, "vmem(Rt32):nt=3DOs8.new", "vmem(Rt32+#0):nt= =3DOs8.new") +DEF_CVI_MAPPING(V6_stqnt0, "if (Qv4) vmem(Rt32):nt=3DVs32", "if (Qv4= ) vmem(Rt32+#0):nt=3DVs32") +DEF_CVI_MAPPING(V6_stnqnt0, "if (!Qv4) vmem(Rt32):nt=3DVs32", "if (!Q= v4) vmem(Rt32+#0):nt=3DVs32") +DEF_CVI_MAPPING(V6_stpnt0, "if (Pv4) vmem(Rt32):nt=3DVs32", "if (Pv4= ) vmem(Rt32+#0):nt=3DVs32") +DEF_CVI_MAPPING(V6_stnpnt0, "if (!Pv4) vmem(Rt32):nt=3DVs32", "if (!P= v4) vmem(Rt32+#0):nt=3DVs32") + + + +#undef VEC_SCALE + + +/*************************************************** + * Vector Alignment + ************************************************/ + +#define VALIGNB(SHIFT) \ + fHIDE(int i;) \ + for(i =3D 0; i < fVBYTES(); i++) {\ + VdV.ub[i] =3D (i+SHIFT>=3DfVBYTES()) ? 
VuV.ub[i+SHIFT-fVBYTES()] := VvV.ub[i+SHIFT];\ + } + +EXTINSN(V6_valignb, "Vd32=3Dvalign(Vu32,Vv32,Rt8)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE,A_NOTE_RT8),"Align Two vectors by Rt= 8 as control", +{ + unsigned shift =3D RtV & (fVBYTES()-1); + VALIGNB(shift) +}) +EXTINSN(V6_vlalignb, "Vd32=3Dvlalign(Vu32,Vv32,Rt8)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE,A_NOTE_RT8),"Align Two vectors by Rt= 8 as control", +{ + unsigned shift =3D fVBYTES() - (RtV & (fVBYTES()-1)); + VALIGNB(shift) +}) +EXTINSN(V6_valignbi, "Vd32=3Dvalign(Vu32,Vv32,#u3)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE),"Align Two vectors by #u3 as contro= l", +{ + VALIGNB(uiV) +}) +EXTINSN(V6_vlalignbi,"Vd32=3Dvlalign(Vu32,Vv32,#u3)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE),"Align Two vectors by #u3 as contro= l", +{ + unsigned shift =3D fVBYTES() - uiV; + VALIGNB(shift) +}) + +EXTINSN(V6_vror, "Vd32=3Dvror(Vu32,Rt32)", ATTRIBS(A_EXTENSION,A_CVI,A_CVI= _VP,A_NOTE_PERMUTE_RESOURCE), +"Align Two vectors by Rt32 as control", +{ + fHIDE(int k;) + for (k=3D0;k> (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasl##TYPE, "Vd32=3Dvasl" #TYPE "(Vu32,Rt= 32)","Vd32."#TYPE"=3Dvasl(Vu32."#TYPE",Rt32)", "Vector arithmetic s= hift left " DESC, VdV.TYPE[i] =3D (VuV.TYPE[i] << (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vlsr##TYPE, "Vd32=3Dvlsr" #TYPE "(Vu32,Rt= 32)","Vd32.u"#TYPE"=3Dvlsr(Vu32.u"#TYPE",Rt32)", "Vector logical shif= t right " DESC, VdV.u##TYPE[i] =3D (VuV.u##TYPE[i] >> (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasr##TYPE##v,"Vd32=3Dvasr" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvasr(Vu32."#TYPE",Vv32."#TYPE")", "Vector arithmetic s= hift right " DESC, VdV.TYPE[i] =3D fBIDIR_ASHIFTR(VuV.TYPE[i], fSXTN= ((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasl##TYPE##v,"Vd32=3Dvasl" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvasl(Vu32."#TYPE",Vv32."#TYPE")", "Vector arithmetic s= hift left " DESC, VdV.TYPE[i] =3D fBIDIR_ASHIFTL(VuV.TYPE[i], fSXT= N((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vlsr##TYPE##v,"Vd32=3Dvlsr" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvlsr(Vu32."#TYPE",Vv32."#TYPE")", "Vector logical shif= t right " DESC, VdV.u##TYPE[i] =3D fBIDIR_LSHIFTR(VuV.u##TYPE[i], fS= XTN((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ + +V_SHIFT(w, "word", 32,5,4_4) +V_SHIFT(h, "halfword", 16,4,2_2) + +ITERATOR_INSN_SHIFT_SLOT(8,vlsrb,"Vd32.ub=3Dvlsr(Vu32.ub,Rt32)","vec log s= hift right bytes", VdV.b[i] =3D VuV.ub[i] >> (RtV & 0x7)) + +ITERATOR_INSN2_SHIFT_SLOT(32,vrotr,"Vd32=3Dvrotr(Vu32,Vv32)","Vd32.uw=3Dvr= otr(Vu32.uw,Vv32.uw)","Vector word rotate right", VdV.uw[i] =3D ((VuV.uw[i]= >> (VvV.uw[i] & 0x1f)) | (VuV.uw[i] << (32 - (VvV.uw[i] & 0x1f))))) + +/********************************************************************* + * MMVECTOR SHIFT AND PERMUTE + * ******************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC(32,vasr_into,"Vxx32=3Dvasrinto(Vu32= ,Vv32)","Vxx32.w=3Dvasrinto(Vu32.w,Vv32.w)","ASR vector 1 elements and over= lay dropping bits to MSB of vector 2 elements", + fHIDE(int64_t ) shift =3D (fSE32_64(VuV.w[i]) << 32); + fHIDE(int64_t ) mask =3D (((fSE32_64(VxxV.v[0].w[i])) << 32) | fZE32_= 64(VxxV.v[0].w[i])); + fHIDE(int64_t) lomask =3D (((fSE32_64(1)) << 32) - 1); + fHIDE(int ) count =3D -(0x40 & VvV.w[i]) + (VvV.w[i] & 0x3f); + fHIDE(int64_t ) result =3D (count =3D=3D -0x40) ? 0 : (((count < 0) ? 
= ((shift << -(count)) | (mask & (lomask << -(count)))) : ((shift >> count) |= (mask & (lomask >> count))))); + VxxV.v[1].w[i] =3D ((result >> 32) & 0xffffffff); + VxxV.v[0].w[i] =3D (result & 0xffffffff)) + +#define NEW_NARROWING_SHIFT 1 + +#if NEW_NARROWING_SHIFT +#define NARROWING_SHIFT(ITERSIZE,TAG,DSTM,DSTTYPE,SRCTYPE,SYNOPTS,SATFUNC,= RNDFUNC,SHAMTMASK) \ +ITERATOR_INSN_SHIFT_SLOT(ITERSIZE,TAG, \ +"Vd32." #DSTTYPE "=3Dvasr(Vu32." #SRCTYPE ",Vv32." #SRCTYPE ",Rt8)" #SYNOP= TS, \ +"Vector shift right and shuffle", \ + fHIDE(fRT8NOTE())\ + fHIDE(int )shamt =3D RtV & SHAMTMASK; \ + DSTM(0,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VvV.SRCTYPE[i],shamt) >> shamt))= ; \ + DSTM(1,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuV.SRCTYPE[i],shamt) >> shamt))) + + + + + +/* WORD TO HALF*/ + +NARROWING_SHIFT(32,vasrwh,fSETHALF,h,w,,fECHO,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasrwhsat,fSETHALF,h,w,:sat,fVSATH,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasrwhrndsat,fSETHALF,h,w,:rnd:sat,fVSATH,fVROUND,0xF) +NARROWING_SHIFT(32,vasrwuhrndsat,fSETHALF,uh,w,:rnd:sat,fVSATUH,fVROUND,0x= F) +NARROWING_SHIFT(32,vasrwuhsat,fSETHALF,uh,w,:sat,fVSATUH,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasruwuhrndsat,fSETHALF,uh,uw,:rnd:sat,fVSATUH,fVROUND,= 0xF) + +NARROWING_SHIFT_NOV1(32,vasruwuhsat,fSETHALF,uh,uw,:sat,fVSATUH,fVNOROUND,= 0xF) +NARROWING_SHIFT(16,vasrhubsat,fSETBYTE,ub,h,:sat,fVSATUB,fVNOROUND,0x7) +NARROWING_SHIFT(16,vasrhubrndsat,fSETBYTE,ub,h,:rnd:sat,fVSATUB,fVROUND,0x= 7) +NARROWING_SHIFT(16,vasrhbsat,fSETBYTE,b,h,:sat,fVSATB,fVNOROUND,0x7) +NARROWING_SHIFT(16,vasrhbrndsat,fSETBYTE,b,h,:rnd:sat,fVSATB,fVROUND,0x7) + +NARROWING_SHIFT_NOV1(16,vasruhubsat,fSETBYTE,ub,uh,:sat,fVSATUB,fVNOROUND,= 0x7) +NARROWING_SHIFT_NOV1(16,vasruhubrndsat,fSETBYTE,ub,uh,:rnd:sat,fVSATUB,fVR= OUND,0x7) + +#else +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwh,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8)","Vd32.h= =3Dvasr(Vu32.w,Vv32.w,Rt8)", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fSETHALF(0,VdV.w[i], (VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1,VdV.w[i], (VuV.w[i] >> (RtV & 0xF)))) + + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwhsat,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8):sat",= "Vd32.h=3Dvasr(Vu32.w,Vv32.w,Rt8):sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fSETHALF(0,VdV.w[i], fVSATH(VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1,VdV.w[i], fVSATH(VuV.w[i] >> (RtV & 0xF)))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwhrndsat,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8):rn= d:sat","Vd32.h=3Dvasr(Vu32.w,Vv32.w,Rt8):rnd:sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATH( (VvV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),4= _8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATH( (VuV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),4= _8) ) >> shamt))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwuhrndsat,"Vd32=3Dvasrwuh(Vu32,Vv32,Rt8):= rnd:sat","Vd32.uh=3Dvasr(Vu32.w,Vv32.w,Rt8):rnd:sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATUH( (VvV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),= 4_8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATUH( (VuV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),= 4_8) ) >> shamt))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwuhsat,"Vd32=3Dvasrwuh(Vu32,Vv32,Rt8):sat= ","Vd32.uh=3Dvasr(Vu32.w,Vv32.w,Rt8):sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fSETHALF(0, VdV.uw[i], fVSATUH(VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1, VdV.uw[i], 
fVSATUH(VuV.w[i] >> (RtV & 0xF)))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasruwuhrndsat,"Vd32=3Dvasruwuh(Vu32,Vv32,Rt8= ):rnd:sat","Vd32.uh=3Dvasr(Vu32.uw,Vv32.uw,Rt8):rnd:sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(fRT8NOTE())\ + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATUH( (VvV.uw[i] + fBIDIR_ASHIFTL(1,(shamt-1)= ,4_8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATUH( (VuV.uw[i] + fBIDIR_ASHIFTL(1,(shamt-1)= ,4_8) ) >> shamt))) +#endif + + + +ITERATOR_INSN2_SHIFT_SLOT(32,vroundwh,"Vd32=3Dvroundwh(Vu32,Vv32):sat","Vd= 32.h=3Dvround(Vu32.w,Vv32.w):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETHALF(0, VdV.uw[i], fVSATH((VvV.w[i] + fCONSTLL(0x8000)) >> 16)); + fSETHALF(1, VdV.uw[i], fVSATH((VuV.w[i] + fCONSTLL(0x8000)) >> 16))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vroundwuh,"Vd32=3Dvroundwuh(Vu32,Vv32):sat","= Vd32.uh=3Dvround(Vu32.w,Vv32.w):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETHALF(0, VdV.uw[i], fVSATUH((VvV.w[i] + fCONSTLL(0x8000)) >> 16)); + fSETHALF(1, VdV.uw[i], fVSATUH((VuV.w[i] + fCONSTLL(0x8000)) >> 16))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vrounduwuh,"Vd32=3Dvrounduwuh(Vu32,Vv32):sat"= ,"Vd32.uh=3Dvround(Vu32.uw,Vv32.uw):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETHALF(0, VdV.uw[i], fVSATUH((VvV.uw[i] + fCONSTLL(0x8000)) >> 16)); + fSETHALF(1, VdV.uw[i], fVSATUH((VuV.uw[i] + fCONSTLL(0x8000)) >> 16))) + + + + + +/* HALF TO BYTE*/ + +ITERATOR_INSN2_SHIFT_SLOT(16,vroundhb,"Vd32=3Dvroundhb(Vu32,Vv32):sat","Vd= 32.b=3Dvround(Vu32.h,Vv32.h):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETBYTE(0, VdV.uh[i], fVSATB((VvV.h[i] + 0x80) >> 8)); + fSETBYTE(1, VdV.uh[i], fVSATB((VuV.h[i] + 0x80) >> 8))) + +ITERATOR_INSN2_SHIFT_SLOT(16,vroundhub,"Vd32=3Dvroundhub(Vu32,Vv32):sat","= Vd32.ub=3Dvround(Vu32.h,Vv32.h):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETBYTE(0, VdV.uh[i], fVSATUB((VvV.h[i] + 0x80) >> 8)); + fSETBYTE(1, VdV.uh[i], fVSATUB((VuV.h[i] + 0x80) >> 8))) + +ITERATOR_INSN2_SHIFT_SLOT(16,vrounduhub,"Vd32=3Dvrounduhub(Vu32,Vv32):sat"= ,"Vd32.ub=3Dvround(Vu32.uh,Vv32.uh):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETBYTE(0, VdV.uh[i], fVSATUB((VvV.uh[i] + 0x80) >> 8)); + fSETBYTE(1, VdV.uh[i], fVSATUB((VuV.uh[i] + 0x80) >> 8))) + + +ITERATOR_INSN2_SHIFT_SLOT(32,vaslw_acc,"Vx32+=3Dvaslw(Vu32,Rt32)","Vx32.w+= =3Dvasl(Vu32.w,Rt32)", +"Vector shift add word", + VxV.w[i] +=3D (VuV.w[i] << (RtV & (32-1)))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrw_acc,"Vx32+=3Dvasrw(Vu32,Rt32)","Vx32.w+= =3Dvasr(Vu32.w,Rt32)", +"Vector shift add word", + VxV.w[i] +=3D (VuV.w[i] >> (RtV & (32-1)))) + +ITERATOR_INSN2_SHIFT_SLOT_NOV1(16,vaslh_acc,"Vx32+=3Dvaslh(Vu32,Rt32)","Vx= 32.h+=3Dvasl(Vu32.h,Rt32)", +"Vector shift add halfword", + VxV.h[i] +=3D (VuV.h[i] << (RtV & (16-1)))) + +ITERATOR_INSN2_SHIFT_SLOT_NOV1(16,vasrh_acc,"Vx32+=3Dvasrh(Vu32,Rt32)","Vx= 32.h+=3Dvasr(Vu32.h,Rt32)", +"Vector shift add halfword", + VxV.h[i] +=3D (VuV.h[i] >> (RtV & (16-1)))) + +/************************************************************************** +* +* MMVECTOR ELEMENT-WISE ARITHMETIC +* +**************************************************************************/ + +/************************************************************************** +* MACROS GO IN MACROS.DEF NOT HERE!!! 
+**************************************************************************/ + + +#define MMVEC_ABSDIFF(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_MPY_SLOT(WIDTH, vabsdiff##TYPE, "Vd32=3Dv= absdiff"TYPE2"(Vu32,Vv32)" ,"Vd32."#DEST"=3Dvabsdiff(Vu32."#SRC",Vv32."#SRC= ")" , "Vector Absolute of Difference "DESCR, VdV.DEST[i] =3D (VuV.SRC= [i] > VvV.SRC[i]) ? (VuV.SRC[i] - VvV.SRC[i]) : (VvV.SRC[i] - VuV.SRC[i])) + +#define MMVEC_ADDU_SAT(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE##sat, "Vd32=3Dv= add"TYPE2"(Vu32,Vv32):sat" , "Vd32."#DEST"=3Dvadd(Vu32."#SRC",Vv32."#SRC= "):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] =3D fVUAD= DSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##sat_dv, "Vdd32=3D= vadd"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=3Dvadd(Vuu32."#SRC",Vvv32."#= SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] =3D = fVUADDSAT(WIDTH, VuuV.v[0].SRC[i],VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] =3D = fVUADDSAT(WIDTH, VuuV.v[1].SRC[i],VvvV.v[1].SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE##sat, "Vd32=3Dv= sub"TYPE2"(Vu32,Vv32):sat", "Vd32."#DEST"=3Dvsub(Vu32."#SRC",Vv32."#SRC= "):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] =3D fVUSU= BSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##sat_dv, "Vdd32=3D= vsub"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=3Dvsub(Vuu32."#SRC",Vvv32."#= SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] =3D = fVUSUBSAT(WIDTH, VuuV.v[0].SRC[i],VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] =3D = fVUSUBSAT(WIDTH, VuuV.v[1].SRC[i],VvvV.v[1].SRC[i]))\ + +#define MMVEC_ADDS_SAT(TYPE,TYPE2,DESCR, WIDTH,DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE##sat, "Vd32=3Dv= add"TYPE2"(Vu32,Vv32):sat" , "Vd32."#DEST"=3Dvadd(Vu32."#SRC",Vv32."#SRC= "):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] =3D fVSAD= DSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##sat_dv, "Vdd32=3D= vadd"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=3Dvadd(Vuu32."#SRC",Vvv32."#= SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] =3D = fVSADDSAT(WIDTH, VuuV.v[0].SRC[i], VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] =3D= fVSADDSAT(WIDTH, VuuV.v[1].SRC[i], VvvV.v[1].SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE##sat, "Vd32=3Dv= sub"TYPE2"(Vu32,Vv32):sat", "Vd32."#DEST"=3Dvsub(Vu32."#SRC",Vv32."#SRC= "):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] =3D fVSSU= BSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##sat_dv, "Vdd32=3D= vsub"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=3Dvsub(Vuu32."#SRC",Vvv32."#= SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] =3D = fVSSUBSAT(WIDTH, VuuV.v[0].SRC[i], VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] =3D= fVSSUBSAT(WIDTH, VuuV.v[1].SRC[i], VvvV.v[1].SRC[i]))\ + +#define MMVEC_AVGU(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE, "Vd32=3Dv= avg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32."#SRC= ")", "Vector Average "DESCR, Vd= V.DEST[i] =3D fVAVGU( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE##rnd, "Vd32=3Dv= avg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32."#SRC= "):rnd", "Vector Average % Round"DESCR, Vd= V.DEST[i] =3D fVAVGURND(WIDTH, VuV.SRC[i], VvV.SRC[i])) + + + +#define MMVEC_AVGS(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE, "Vd32=3Dv= avg"TYPE2"(Vu32,Vv32)", 
"Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32."#SR= C")", "Vector Average "DESCR, = VdV.DEST[i] =3D fVAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE##rnd, "Vd32=3Dv= avg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32."#SR= C"):rnd", "Vector Average % Round"DESCR, = VdV.DEST[i] =3D fVAVGSRND( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vnavg##TYPE, "Vd32=3Dv= navg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvnavg(Vu32."#SRC",Vv32."#S= RC")", "Vector Negative Average "DESCR, = VdV.DEST[i] =3D fVNAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) + + + + + + + +#define MMVEC_ADDWRAP(TYPE,TYPE2, DESCR, WIDTH , DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE, "Vd32=3Dvadd"T= YPE2"(Vu32,Vv32)" , "Vd32."#DEST"=3Dvadd(Vu32."#SRC",Vv32."#SRC")", = "Vector Add "DESCR, VdV.DEST[i] =3D VuV.SRC[i] + VvV.SRC[i])\ +ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE, "Vd32=3Dvsub"T= YPE2"(Vu32,Vv32)" , "Vd32."#DEST"=3Dvsub(Vu32."#SRC",Vv32."#SRC")", = "Vector Sub "DESCR, VdV.DEST[i] =3D VuV.SRC[i] - VvV.SRC[i])\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##_dv, "Vdd32=3Dvadd"= TYPE2"(Vuu32,Vvv32)" , "Vdd32."#DEST"=3Dvadd(Vuu32."#SRC",Vvv32."#SRC")", = "Double Vector Add "DESCR, VddV.v[0].DEST[i] =3D VuuV.v[0].SRC[i] + VvvV.= v[0].SRC[i]; VddV.v[1].DEST[i] =3D VuuV.v[1].SRC[i] + VvvV.v[1].SRC[i])\ +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##_dv, "Vdd32=3Dvsub"= TYPE2"(Vuu32,Vvv32)" , "Vdd32."#DEST"=3Dvsub(Vuu32."#SRC",Vvv32."#SRC")", = "Double Vector Sub "DESCR, VddV.v[0].DEST[i] =3D VuuV.v[0].SRC[i] - VvvV.= v[0].SRC[i]; VddV.v[1].DEST[i] =3D VuuV.v[1].SRC[i] - VvvV.v[1].SRC[i]) \ + + + + + +/* Wrapping Adds */ +MMVEC_ADDWRAP(b, "b", "Byte", 8, b, b); +MMVEC_ADDWRAP(h, "h", "Halfword", 16, h, h); +MMVEC_ADDWRAP(w, "w", "Word", 32, w, w); + +/* Saturating Adds */ +MMVEC_ADDU_SAT(ub, "ub", "Unsigned Byte", 8, ub, ub); +MMVEC_ADDU_SAT(uh, "uh", "Unsigned Halfword", 16, uh, uh); +MMVEC_ADDU_SAT(uw, "uw", "Unsigned word", 32, uw, uw); +MMVEC_ADDS_SAT(b, "b", "byte", 8, b, b); +MMVEC_ADDS_SAT(h, "h", "Halfword", 16, h, h); +MMVEC_ADDS_SAT(w, "w", "Word", 32, w, w); + + +/* Averaging Instructions */ +MMVEC_AVGU(ub,"ub", "Unsigned Byte", 8, ub, ub); +MMVEC_AVGU(uh,"uh", "Unsigned Halfword", 16, uh, uh); +MMVEC_AVGU_NOV1(uw,"uw", "Unsigned Word", 32, uw, uw); +MMVEC_AVGS_NOV1(b, "b", "Byte", 8, b, b); +MMVEC_AVGS(h, "h", "Halfword", 16, h, h); +MMVEC_AVGS(w, "w", "Word", 32, w, w); + + +/* Absolute Difference */ +MMVEC_ABSDIFF(ub,"ub", "Unsigned Byte", 8, ub, ub); +MMVEC_ABSDIFF(uh,"uh", "Unsigned Halfword", 16, uh, uh); +MMVEC_ABSDIFF(h,"h", "Halfword", 16, uh, h); +MMVEC_ABSDIFF(w,"w", "Word", 32, uw, w); + +ITERATOR_INSN2_ANY_SLOT(8,vnavgub, "Vd32=3Dvnavgub(Vu32,Vv32)", "Vd32.b=3D= vnavg(Vu32.ub,Vv32.ub)", +"Vector Negative Average Unsigned Byte", VdV.b[i] =3D fVNAVGU(8, VuV.ub[= i], VvV.ub[i])) + +ITERATOR_INSN_ANY_SLOT(32,vaddcarrysat,"Vd32.w=3Dvadd(Vu32.w,Vv32.w,Qs4):c= arry:sat","add w/carry and saturate", +VdV.w[i] =3D fVSATW(VuV.w[i]+VvV.w[i]+fGETQBIT(QsV,i*4))) + +ITERATOR_INSN_ANY_SLOT(32,vaddcarry,"Vd32.w=3Dvadd(Vu32.w,Vv32.w,Qx4):carr= y","add w/carry", +VdV.w[i] =3D VuV.w[i]+VvV.w[i]+fGETQBIT(QxV,i*4); +fSETQBITS(QxV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],VvV.w[i],fGETQBIT(QxV,= i*4)))) + +ITERATOR_INSN_ANY_SLOT(32,vsubcarry,"Vd32.w=3Dvsub(Vu32.w,Vv32.w,Qx4):carr= y","add w/carry", +VdV.w[i] =3D VuV.w[i]+~VvV.w[i]+fGETQBIT(QxV,i*4); +fSETQBITS(QxV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],~VvV.w[i],fGETQBIT(QxV= ,i*4)))) + 
+ITERATOR_INSN_ANY_SLOT(32,vaddcarryo,"Vd32.w,Qe4=3Dvadd(Vu32.w,Vv32.w):car= ry","add w/carry out-only", +VdV.w[i] =3D VuV.w[i]+VvV.w[i]; +fSETQBITS(QeV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],VvV.w[i],0))) + +ITERATOR_INSN_ANY_SLOT(32,vsubcarryo,"Vd32.w,Qe4=3Dvsub(Vu32.w,Vv32.w):car= ry","subtract w/carry out-only", +VdV.w[i] =3D VuV.w[i]+~VvV.w[i]+1; +fSETQBITS(QeV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],~VvV.w[i],1))) + + +ITERATOR_INSN_ANY_SLOT(32,vsatdw,"Vd32.w=3Dvsatdw(Vu32.w,Vv32.w)","Saturat= e from 64-bits (higher 32-bits come from first vector) to 32-bits",VdV.w[i]= =3D fVSATDW(VuV.w[i],VvV.w[i])) + + +#define MMVEC_ADDSAT_MIX(TAGEND,SATF,WIDTH,DEST,SRC1,SRC2)\ +ITERATOR_INSN_ANY_SLOT(WIDTH, vadd##TAGEND,"Vd32."#DEST"=3Dvadd(Vu32."#SRC= 1",Vv32."#SRC2"):sat", "Vector Add mixed", VdV.DEST[i] =3D SATF(VuV.SRC= 1[i] + VvV.SRC2[i]))\ +ITERATOR_INSN_ANY_SLOT(WIDTH, vsub##TAGEND,"Vd32."#DEST"=3Dvsub(Vu32."#SRC= 1",Vv32."#SRC2"):sat", "Vector Sub mixed", VdV.DEST[i] =3D SATF(VuV.SRC= 1[i] - VvV.SRC2[i]))\ + +MMVEC_ADDSAT_MIX(ububb_sat,fVSATUB,8,ub,ub,b) + +/**************************** +* WIDENING +****************************/ + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vaddubh,"Vdd32=3Dvaddub(Vu32,Vv32)",= "Vdd32.h=3Dvadd(Vu32.ub,Vv32.ub)", +"Vector addition with widen into two vectors", + VddV.v[0].h[i] =3D fZE8_16(fGETUBYTE(0, VuV.uh[i])) + fZE8_16(fGETUBYT= E(0, VvV.uh[i])); + VddV.v[1].h[i] =3D fZE8_16(fGETUBYTE(1, VuV.uh[i])) + fZE8_16(fGETUBYT= E(1, VvV.uh[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vsububh,"Vdd32=3Dvsubub(Vu32,Vv32)",= "Vdd32.h=3Dvsub(Vu32.ub,Vv32.ub)", +"Vector subtraction with widen into two vectors", + VddV.v[0].h[i] =3D fZE8_16(fGETUBYTE(0, VuV.uh[i])) - fZE8_16(fGETUBYT= E(0, VvV.uh[i])); + VddV.v[1].h[i] =3D fZE8_16(fGETUBYTE(1, VuV.uh[i])) - fZE8_16(fGETUBYT= E(1, VvV.uh[i]))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vaddhw,"Vdd32=3Dvaddh(Vu32,Vv32)","V= dd32.w=3Dvadd(Vu32.h,Vv32.h)", +"Vector addition with widen into two vectors", + VddV.v[0].w[i] =3D fGETHALF(0, VuV.w[i]) + fGETHALF(0, VvV.w[i]); + VddV.v[1].w[i] =3D fGETHALF(1, VuV.w[i]) + fGETHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vsubhw,"Vdd32=3Dvsubh(Vu32,Vv32)","V= dd32.w=3Dvsub(Vu32.h,Vv32.h)", +"Vector subtraction with widen into two vectors", + VddV.v[0].w[i] =3D fGETHALF(0, VuV.w[i]) - fGETHALF(0, VvV.w[i]); + VddV.v[1].w[i] =3D fGETHALF(1, VuV.w[i]) - fGETHALF(1, VvV.w[i])) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vadduhw,"Vdd32=3Dvadduh(Vu32,Vv32)",= "Vdd32.w=3Dvadd(Vu32.uh,Vv32.uh)", +"Vector addition with widen into two vectors", + VddV.v[0].w[i] =3D fZE16_32(fGETUHALF(0, VuV.uw[i])) + fZE16_32(fGETUH= ALF(0, VvV.uw[i])); + VddV.v[1].w[i] =3D fZE16_32(fGETUHALF(1, VuV.uw[i])) + fZE16_32(fGETUH= ALF(1, VvV.uw[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vsubuhw,"Vdd32=3Dvsubuh(Vu32,Vv32)",= "Vdd32.w=3Dvsub(Vu32.uh,Vv32.uh)", +"Vector subtraction with widen into two vectors", + VddV.v[0].w[i] =3D fZE16_32(fGETUHALF(0, VuV.uw[i])) - fZE16_32(fGETUH= ALF(0, VvV.uw[i])); + VddV.v[1].w[i] =3D fZE16_32(fGETUHALF(1, VuV.uw[i])) - fZE16_32(fGETUH= ALF(1, VvV.uw[i]))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vaddhw_acc,"Vxx32+=3Dvaddh(Vu32,Vv32= )","Vxx32.w+=3Dvadd(Vu32.h,Vv32.h)", +"Vector addition with widen into two vectors", + VxxV.v[0].w[i] +=3D fGETHALF(0, VuV.w[i]) + fGETHALF(0, VvV.w[i]); + VxxV.v[1].w[i] +=3D fGETHALF(1, VuV.w[i]) + fGETHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vadduhw_acc,"Vxx32+=3Dvadduh(Vu32,Vv= 
32)","Vxx32.w+=3Dvadd(Vu32.uh,Vv32.uh)", +"Vector addition with widen into two vectors", + VxxV.v[0].w[i] +=3D fGETUHALF(0, VuV.w[i]) + fGETUHALF(0, VvV.w[i]); + VxxV.v[1].w[i] +=3D fGETUHALF(1, VuV.w[i]) + fGETUHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vaddubh_acc,"Vxx32+=3Dvaddub(Vu32,Vv= 32)","Vxx32.h+=3Dvadd(Vu32.ub,Vv32.ub)", +"Vector addition with widen into two vectors", + VxxV.v[0].h[i] +=3D fGETUBYTE(0, VuV.h[i]) + fGETUBYTE(0, VvV.h[i]); + VxxV.v[1].h[i] +=3D fGETUBYTE(1, VuV.h[i]) + fGETUBYTE(1, VvV.h[i])) + + +DEF_CVI_MAPPING(V6_vd0, "Vd32=3D#0", "Vd32=3Dvxor(V31,V31)") +DEF_CVI_MAPPING(V6_vdd0, "Vdd32=3D#0", "Vdd32.w=3Dvsub(V31:30.w,V31:30.w= )") + + +/**************************** +* Conditional +****************************/ + +#define CONDADDSUB(WIDTH,TAGEND,LHSYN,RHSYN,DESCR,LHBEH,RHBEH) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vadd##TAGEND##q,"if (Qv4."#TAGEND") "LHSYN"+= =3D"RHSYN,"if (Qv4) "LHSYN"+=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV,i,= LHBEH+RHBEH,LHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vsub##TAGEND##q,"if (Qv4."#TAGEND") "LHSYN"-= =3D"RHSYN,"if (Qv4) "LHSYN"-=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV,i,= LHBEH-RHBEH,LHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vadd##TAGEND##nq,"if (!Qv4."#TAGEND") "LHSYN= "+=3D"RHSYN,"if (!Qv4) "LHSYN"+=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV= ,i,LHBEH,LHBEH+RHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vsub##TAGEND##nq,"if (!Qv4."#TAGEND") "LHSYN= "-=3D"RHSYN,"if (!Qv4) "LHSYN"-=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV= ,i,LHBEH,LHBEH-RHBEH)) \ + +CONDADDSUB(8,b,"Vx32.b","Vu32.b","Conditional add/sub Byte",VxV.ub[i],VuV.= ub[i]) +CONDADDSUB(16,h,"Vx32.h","Vu32.h","Conditional add/sub Half",VxV.h[i],VuV.= h[i]) +CONDADDSUB(32,w,"Vx32.w","Vu32.w","Conditional add/sub Word",VxV.w[i],VuV.= w[i]) + +/***************************************************** + ABSOLUTE VALUES +*****************************************************/ +// V65 +ITERATOR_INSN2_ANY_SLOT_NOV1(8,vabsb, "Vd32=3Dvabsb(Vu32)", "Vd= 32.b=3Dvabs(Vu32.b)", "Vector absolute value of bytes", VdV.b[i] = =3D fABS(VuV.b[i])) +ITERATOR_INSN2_ANY_SLOT_NOV1(8,vabsb_sat, "Vd32=3Dvabsb(Vu32):sat", "Vd= 32.b=3Dvabs(Vu32.b):sat", "Vector absolute value of bytes", VdV.b[i] = =3D fVSATB(fABS(fSE8_16(VuV.b[i])))) + + +ITERATOR_INSN2_ANY_SLOT(16,vabsh, "Vd32=3Dvabsh(Vu32)", "Vd32.h= =3Dvabs(Vu32.h)", "Vector absolute value of halfwords", VdV.h[i] = =3D fABS(VuV.h[i])) +ITERATOR_INSN2_ANY_SLOT(16,vabsh_sat, "Vd32=3Dvabsh(Vu32):sat", "Vd32.h= =3Dvabs(Vu32.h):sat", "Vector absolute value of halfwords", VdV.h[i] = =3D fVSATH(fABS(fSE16_32(VuV.h[i])))) +ITERATOR_INSN2_ANY_SLOT(32,vabsw, "Vd32=3Dvabsw(Vu32)", "Vd32.w= =3Dvabs(Vu32.w)", "Vector absolute value of words", VdV.w[i] = =3D fABS(VuV.w[i])) +ITERATOR_INSN2_ANY_SLOT(32,vabsw_sat, "Vd32=3Dvabsw(Vu32):sat", "Vd32.w= =3Dvabs(Vu32.w):sat", "Vector absolute value of words", VdV.w[i] = =3D fVSATW(fABS(fSE32_64(VuV.w[i])))) + + +DEF_CVI_MAPPING(V6_vabsub_alt, "Vd32.ub=3Dvabs(Vu32.b)", "Vd32.b=3Dvabs(= Vu32.b)") +DEF_CVI_MAPPING(V6_vabsuh_alt, "Vd32.uh=3Dvabs(Vu32.h)", "Vd32.h=3Dvabs(= Vu32.h)") +DEF_CVI_MAPPING(V6_vabsuw_alt, "Vd32.uw=3Dvabs(Vu32.w)", "Vd32.w=3Dvabs(= Vu32.w)") + + + + +/************************************************************************** + * MMVECTOR MULTIPLICATIONS + * ***********************************************************************= */ + + +/* Byte by Byte */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybv,"Vdd32=3Dvmpyb(Vu32,Vv32)","V= dd32.h=3Dvmpy(Vu32.b,Vv32.b)", +"Vector absolute value of 
words", + VddV.v[0].h[i] =3D fMPY8SS(fGETBYTE(0, VuV.h[i]), fGETBYTE(0, VvV.h[i= ])); + VddV.v[1].h[i] =3D fMPY8SS(fGETBYTE(1, VuV.h[i]), fGETBYTE(1, VvV.h[i= ]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybv_acc,"Vxx32+=3Dvmpyb(Vu32,Vv32= )","Vxx32.h+=3Dvmpy(Vu32.b,Vv32.b)", +"Vector absolute value of words", + VxxV.v[0].h[i] +=3D fMPY8SS(fGETBYTE(0, VuV.h[i]), fGETBYTE(0, VvV.h[= i])); + VxxV.v[1].h[i] +=3D fMPY8SS(fGETBYTE(1, VuV.h[i]), fGETBYTE(1, VvV.h[= i]))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyubv,"Vdd32=3Dvmpyub(Vu32,Vv32)",= "Vdd32.uh=3Dvmpy(Vu32.ub,Vv32.ub)", +"Vector absolute value of words", + VddV.v[0].uh[i] =3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE(0, VvV= .uh[i]) ); + VddV.v[1].uh[i] =3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE(1, VvV= .uh[i]) )) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyubv_acc,"Vxx32+=3Dvmpyub(Vu32,Vv= 32)","Vxx32.uh+=3Dvmpy(Vu32.ub,Vv32.ub)", +"Vector absolute value of words", + VxxV.v[0].uh[i] +=3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE(0, Vv= V.uh[i]) ); + VxxV.v[1].uh[i] +=3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE(1, Vv= V.uh[i]) )) + + + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybusv,"Vdd32=3Dvmpybus(Vu32,Vv32)= ","Vdd32.h=3Dvmpy(Vu32.ub,Vv32.b)", +"Vector absolute value of words", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE(0, VvV.h= [i])); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE(1, VvV.h= [i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybusv_acc,"Vxx32+=3Dvmpybus(Vu32,= Vv32)","Vxx32.h+=3Dvmpy(Vu32.ub,Vv32.b)", +"Vector absolute value of words", + VxxV.v[0].h[i] +=3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE(0, VvV.= h[i])); + VxxV.v[1].h[i] +=3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE(1, VvV.= h[i]))) + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabusv,"Vdd32=3Dvmpabus(Vuu32,Vvv3= 2)","Vdd32.h=3Dvmpa(Vuu32.ub,Vvv32.b)", +"Vertical Byte Multiply", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0, = VvvV.v[0].uh[i])) + fMPY8US(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETBYTE(0, VvvV= .v[1].uh[i])); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(1, = VvvV.v[0].uh[i])) + fMPY8US(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(1, VvvV= .v[1].uh[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabuuv,"Vdd32=3Dvmpabuu(Vuu32,Vvv3= 2)","Vdd32.h=3Dvmpa(Vuu32.ub,Vvv32.ub)", +"Vertical Byte Multiply", + VddV.v[0].h[i] =3D fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(0,= VvvV.v[0].uh[i])) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(0, Vv= vV.v[1].uh[i])); + VddV.v[1].h[i] =3D fMPY8UU(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETUBYTE(1,= VvvV.v[0].uh[i])) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(1, Vv= vV.v[1].uh[i]))) + + + + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhv,"Vdd32=3Dvmpyh(Vu32,Vv32)","V= dd32.w=3Dvmpy(Vu32.h,Vv32.h)", +"Vector by Vector Halfword Multiply", + VddV.v[0].w[i] =3D fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, VvV.w[i= ])); + VddV.v[1].w[i] =3D fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, VvV.w[i= ]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhv_acc,"Vxx32+=3Dvmpyh(Vu32,Vv32= )","Vxx32.w+=3Dvmpy(Vu32.h,Vv32.h)", +"Vector by Vector Halfword Multiply", + VxxV.v[0].w[i] +=3D fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, VvV.w[= i])); + VxxV.v[1].w[i] +=3D fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, VvV.w[= i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuhv,"Vdd32=3Dvmpyuh(Vu32,Vv32)",= "Vdd32.uw=3Dvmpy(Vu32.uh,Vv32.uh)", +"Vector by Vector Unsigned Halfword Multiply", + VddV.v[0].uw[i] =3D 
fMPY16UU(fGETUHALF(0, VuV.uw[i]), fGETUHALF(0, VvV= .uw[i])); + VddV.v[1].uw[i] =3D fMPY16UU(fGETUHALF(1, VuV.uw[i]), fGETUHALF(1, VvV= .uw[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuhv_acc,"Vxx32+=3Dvmpyuh(Vu32,Vv= 32)","Vxx32.uw+=3Dvmpy(Vu32.uh,Vv32.uh)", +"Vector by Vector Unsigned Halfword Multiply", + VxxV.v[0].uw[i] +=3D fMPY16UU(fGETUHALF(0, VuV.uw[i]), fGETUHALF(0, Vv= V.uw[i])); + VxxV.v[1].uw[i] +=3D fMPY16UU(fGETUHALF(1, VuV.uw[i]), fGETUHALF(1, Vv= V.uw[i]))) + + + +/* Vector by Vector */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyhvsrs,"Vd32=3Dvmpyh(Vu32,Vv32):<= <1:rnd:sat","Vd32.h=3Dvmpy(Vu32.h,Vv32.h):<<1:rnd:sat", +"Vector halfword multiply with round, shift, and sat16", + VdV.h[i] =3D fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(VuV.h[i],VvV.h[i= ] )<<1)))))) + + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhus, "Vdd32=3Dvmpyhus(Vu32,Vv32)= ","Vdd32.w=3Dvmpy(Vu32.h,Vv32.uh)", +"Vector by Vector Halfword Multiply", + VddV.v[0].w[i] =3D fMPY16SU(fGETHALF(0, VuV.w[i]), fGETUHALF(0, VvV.uw= [i])); + VddV.v[1].w[i] =3D fMPY16SU(fGETHALF(1, VuV.w[i]), fGETUHALF(1, VvV.uw= [i]))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhus_acc, "Vxx32+=3Dvmpyhus(Vu32,= Vv32)","Vxx32.w+=3Dvmpy(Vu32.h,Vv32.uh)", +"Vector by Vector Halfword Multiply", + VxxV.v[0].w[i] +=3D fMPY16SU(fGETHALF(0, VuV.w[i]), fGETUHALF(0, VvV.u= w[i])); + VxxV.v[1].w[i] +=3D fMPY16SU(fGETHALF(1, VuV.w[i]), fGETUHALF(1, VvV.u= w[i]))) + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyih,"Vd32=3Dvmpyih(Vu32,Vv32)","V= d32.h=3Dvmpyi(Vu32.h,Vv32.h)", +"Vector by Vector Halfword Multiply", + VdV.h[i] =3D fMPY16SS(VuV.h[i], VvV.h[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyih_acc,"Vx32+=3Dvmpyih(Vu32,Vv32= )","Vx32.h+=3Dvmpyi(Vu32.h,Vv32.h)", +"Vector by Vector Halfword Multiply", + VxV.h[i] +=3D fMPY16SS(VuV.h[i], VvV.h[i])) + + + +/* 32x32 high half / frac */ + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyewuh,"Vd32=3Dvmpyewuh(Vu32,Vv32)= ","Vd32.w=3Dvmpye(Vu32.w,Vv32.uh)", +"Vector by Vector Halfword Multiply", +VdV.w[i] =3D fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) >> 16) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh,"Vd32=3Dvmpyowh(Vu32,Vv32):<= <1:sat","Vd32.w=3Dvmpyo(Vu32.w,Vv32.h):<<1:sat", +"Vector by Vector Halfword Multiply", +VdV.w[i] =3D fVSATW((((fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) >> 14) = + 0) >> 1))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_rnd,"Vd32=3Dvmpyowh(Vu32,Vv3= 2):<<1:rnd:sat","Vd32.w=3Dvmpyo(Vu32.w,Vv32.h):<<1:rnd:sat", +"Vector by Vector Halfword Multiply", +VdV.w[i] =3D fVSATW((((fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) >> 14) = + 1) >> 1))) + +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32,vmpyewuh_64,"Vdd32=3Dvmpye(Vu32.w,Vv3= 2.uh)", +"Word times Halfword Multiply, 64-bit result", + fHIDE(size8s_t prod;) + prod =3D fMPY32SU(VuV.w[i],fGETUHALF(0,VvV.w[i])); + VddV.v[1].w[i] =3D prod >> 16; + VddV.v[0].w[i] =3D prod << 16) + +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_64_acc,"Vxx32+=3Dvmpyo(Vu32.w= ,Vv32.h)", +"Word times Halfword Multiply, 64-bit result", + fHIDE(size8s_t prod;) + prod =3D fMPY32SS(VuV.w[i],fGETHALF(1,VvV.w[i])) + fSE32_64(VxxV.v[1].w[= i]); + VxxV.v[1].w[i] =3D prod >> 16; + fSETHALF(0, VxxV.v[0].w[i], VxxV.v[0].w[i] >> 16); + fSETHALF(1, VxxV.v[0].w[i], prod & 0x0000ffff)) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_sacc,"Vx32+=3Dvmpyowh(Vu32,V= v32):<<1:sat:shift","Vx32.w+=3Dvmpyo(Vu32.w,Vv32.h):<<1:sat:shift", +"Vector by Vector Halfword Multiply", +IV1DEAD() VxV.w[i] =3D fVSATW(((((VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF= (1, 
VvV.w[i]))) >> 14) + 0) >> 1))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_rnd_sacc,"Vx32+=3Dvmpyowh(Vu= 32,Vv32):<<1:rnd:sat:shift","Vx32.w+=3Dvmpyo(Vu32.w,Vv32.h):<<1:rnd:sat:shi= ft", +"Vector by Vector Halfword Multiply", +IV1DEAD() VxV.w[i] =3D fVSATW(((((VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF= (1, VvV.w[i]))) >> 14) + 1) >> 1))) + +/* For 32x32 integer / low half */ + +ITERATOR_INSN_MPY_SLOT(32,vmpyieoh,"Vd32.w=3Dvmpyieo(Vu32.h,Vv32.h)","Odd/= Even multiply for 32x32 low half", + VdV.w[i] =3D (fGETHALF(0,VuV.w[i])*fGETHALF(1,VvV.w[i])) << 16) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewuh,"Vd32=3Dvmpyiewuh(Vu32,Vv3= 2)","Vd32.w=3Dvmpyie(Vu32.w,Vv32.uh)", +"Vector by Vector Word by Halfword Multiply", +IV1DEAD() VdV.w[i] =3D fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) ) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiowh,"Vd32=3Dvmpyiowh(Vu32,Vv32)= ","Vd32.w=3Dvmpyio(Vu32.w,Vv32.h)", +"Vector by Vector Word by Halfword Multiply", +IV1DEAD() VdV.w[i] =3D fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) ) + +/* Add back these... */ + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewh_acc,"Vx32+=3Dvmpyiewh(Vu32,= Vv32)","Vx32.w+=3Dvmpyie(Vu32.w,Vv32.h)", +"Vector by Vector Word by Halfword Multiply", +VxV.w[i] =3D VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF(0, VvV.w[i])) ) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewuh_acc,"Vx32+=3Dvmpyiewuh(Vu3= 2,Vv32)","Vx32.w+=3Dvmpyie(Vu32.w,Vv32.uh)", +"Vector by Vector Word by Halfword Multiply", +VxV.w[i] =3D VxV.w[i] + fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) ) + + + + + + + +/* Vector by Scalar */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyub,"Vdd32=3Dvmpyub(Vu32,Rt32)","= Vdd32.uh=3Dvmpy(Vu32.ub,Rt32.ub)", +"Vector absolute value of words", + VddV.v[0].uh[i] =3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE((2*i+0= )%4, RtV)); + VddV.v[1].uh[i] =3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE((2*i+1= )%4, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyub_acc,"Vxx32+=3Dvmpyub(Vu32,Rt3= 2)","Vxx32.uh+=3Dvmpy(Vu32.ub,Rt32.ub)", +"Vector absolute value of words", + VxxV.v[0].uh[i] +=3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE((2*i+0= )%4, RtV)); + VxxV.v[1].uh[i] +=3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE((2*i+1= )%4, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybus,"Vdd32=3Dvmpybus(Vu32,Rt32)"= ,"Vdd32.h=3Dvmpy(Vu32.ub,Rt32.b)", +"Vector absolute value of words", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE((2*i+0)%= 4, RtV)); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE((2*i+1)%= 4, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybus_acc,"Vxx32+=3Dvmpybus(Vu32,R= t32)","Vxx32.h+=3Dvmpy(Vu32.ub,Rt32.b)", +"Vector absolute value of words", + VxxV.v[0].h[i] +=3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE((2*i+0)%= 4, RtV)); + VxxV.v[1].h[i] +=3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE((2*i+1)%= 4, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabus,"Vdd32=3Dvmpabus(Vuu32,Rt32)= ","Vdd32.h=3Dvmpa(Vuu32.ub,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0, = RtV)) + fMPY16SS(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETBYTE(1, RtV)); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(2, = RtV)) + fMPY16SS(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(3, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabus_acc,"Vxx32+=3Dvmpabus(Vuu32,= Rt32)","Vxx32.h+=3Dvmpa(Vuu32.ub,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].h[i] +=3D fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0,= RtV)) + fMPY16SS(fGETUBYTE(0, 
VuuV.v[1].uh[i]), fGETBYTE(1, RtV)); + VxxV.v[1].h[i] +=3D fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(2,= RtV)) + fMPY16SS(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(3, RtV))) + +// V65 + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(16,vmpabuu,"Vdd32=3Dvmpabuu(Vuu32,= Rt32)","Vdd32.h=3Dvmpa(Vuu32.ub,Rt32.ub)", +"Vertical Byte Multiply", + VddV.v[0].uh[i] =3D fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(0= , RtV)) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(1, RtV)); + VddV.v[1].uh[i] =3D fMPY8UU(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETUBYTE(2= , RtV)) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(3, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(16,vmpabuu_acc,"Vxx32+=3Dvmpabuu(V= uu32,Rt32)","Vxx32.h+=3Dvmpa(Vuu32.ub,Rt32.ub)", +"Vertical Byte Multiply", + VxxV.v[0].uh[i] +=3D fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(= 0, RtV)) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(1, RtV)); + VxxV.v[1].uh[i] +=3D fMPY8UU(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETUBYTE(= 2, RtV)) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(3, RtV))) + + + + +/* Half by Byte */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpahb,"Vdd32=3Dvmpahb(Vuu32,Rt32)",= "Vdd32.w=3Dvmpa(Vuu32.h,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].w[i] =3D fMPY16SS(fGETHALF(0, VuuV.v[0].w[i]), fSE8_16(fGETB= YTE(0, RtV))) + fMPY16SS(fGETHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1, R= tV))); + VddV.v[1].w[i] =3D fMPY16SS(fGETHALF(1, VuuV.v[0].w[i]), fSE8_16(fGETB= YTE(2, RtV))) + fMPY16SS(fGETHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3, R= tV)))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpahb_acc,"Vxx32+=3Dvmpahb(Vuu32,Rt= 32)","Vxx32.w+=3Dvmpa(Vuu32.h,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].w[i] +=3D fMPY16SS(fGETHALF(0, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(0, RtV))) + fMPY16SS(fGETHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1, = RtV))); + VxxV.v[1].w[i] +=3D fMPY16SS(fGETHALF(1, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(2, RtV))) + fMPY16SS(fGETHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3, = RtV)))) + +/* Half by Byte */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpauhb,"Vdd32=3Dvmpauhb(Vuu32,Rt32)= ","Vdd32.w=3Dvmpa(Vuu32.uh,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].w[i] =3D fMPY16US(fGETUHALF(0, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(0, RtV))) + fMPY16US(fGETUHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1,= RtV))); + VddV.v[1].w[i] =3D fMPY16US(fGETUHALF(1, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(2, RtV))) + fMPY16US(fGETUHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3,= RtV)))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpauhb_acc,"Vxx32+=3Dvmpauhb(Vuu32,= Rt32)","Vxx32.w+=3Dvmpa(Vuu32.uh,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].w[i] +=3D fMPY16US(fGETUHALF(0, VuuV.v[0].w[i]), fSE8_16(fGE= TBYTE(0, RtV))) + fMPY16US(fGETUHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1= , RtV))); + VxxV.v[1].w[i] +=3D fMPY16US(fGETUHALF(1, VuuV.v[0].w[i]), fSE8_16(fGE= TBYTE(2, RtV))) + fMPY16US(fGETUHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3= , RtV)))) + + + + + + + +/* Half by Half */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyh,"Vdd32=3Dvmpyh(Vu32,Rt32)","Vd= d32.w=3Dvmpy(Vu32.h,Rt32.h)", +"Vector absolute value of words", + VddV.v[0].w[i] =3D fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, RtV)); + VddV.v[1].w[i] =3D fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(32,vmpyh_acc,"Vxx32+=3Dvmpyh(Vu32,= Rt32)","Vxx32.w+=3Dvmpy(Vu32.h,Rt32.h)", +"Vector even halfwords with scalar lower halfword multiply with shift and = sat32", + VxxV.v[0].w[i] =3D fCAST8s(VxxV.v[0].w[i]) + 
fMPY16SS(fGETHALF(0, VuV= .w[i]), fGETHALF(0, RtV)); + VxxV.v[1].w[i] =3D fCAST8s(VxxV.v[1].w[i]) + fMPY16SS(fGETHALF(1, VuV= .w[i]), fGETHALF(1, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhsat_acc,"Vxx32+=3Dvmpyh(Vu32,Rt= 32):sat","Vxx32.w+=3Dvmpy(Vu32.h,Rt32.h):sat", +"Vector even halfwords with scalar lower halfword multiply with shift and = sat32", + VxxV.v[0].w[i] =3D fVSATW(fCAST8s(VxxV.v[0].w[i]) + fMPY16SS(fGETHALF= (0, VuV.w[i]), fGETHALF(0, RtV))); + VxxV.v[1].w[i] =3D fVSATW(fCAST8s(VxxV.v[1].w[i]) + fMPY16SS(fGETHALF= (1, VuV.w[i]), fGETHALF(1, RtV)))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhss,"Vd32=3Dvmpyh(Vu32,Rt32):<<1= :sat","Vd32.h=3Dvmpy(Vu32.h,Rt32.h):<<1:sat", +"Vector halfword by halfword multiply, shift by 1, and take upper 16 msb", + fSETHALF(0,VdV.w[i],fVSATH(fGETHALF(1,fVSAT((fMPY16SS(fGETHALF(0= ,VuV.w[i]),fGETHALF(0,RtV))<<1))))); + fSETHALF(1,VdV.w[i],fVSATH(fGETHALF(1,fVSAT((fMPY16SS(fGETHALF(1= ,VuV.w[i]),fGETHALF(1,RtV))<<1))))); +) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhsrs,"Vd32=3Dvmpyh(Vu32,Rt32):<<= 1:rnd:sat","Vd32.h=3Dvmpy(Vu32.h,Rt32.h):<<1:rnd:sat", +"Vector halfword with scalar halfword multiply with round, shift, and sat1= 6", + fSETHALF(0,VdV.w[i],fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(fGETHA= LF(0,VuV.w[i]),fGETHALF(0,RtV))<<1)))))); + fSETHALF(1,VdV.w[i],fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(fGETHA= LF(1,VuV.w[i]),fGETHALF(1,RtV))<<1)))))); +) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuh,"Vdd32=3Dvmpyuh(Vu32,Rt32)","= Vdd32.uw=3Dvmpy(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VddV.v[0].uw[i] =3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV)); + VddV.v[1].uw[i] =3D fMPY16UU(fGETUHALF(1, VuV.uw[i]),fGETUHALF(1,RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuh_acc,"Vxx32+=3Dvmpyuh(Vu32,Rt3= 2)","Vxx32.uw+=3Dvmpy(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VxxV.v[0].uw[i] +=3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV)= ); + VxxV.v[1].uw[i] +=3D fMPY16UU(fGETUHALF(1, VuV.uw[i]),fGETUHALF(1,RtV)= )) + + + + +/******************************************** +* HALF BY BYTE +********************************************/ +ITERATOR_INSN2_MPY_SLOT(16,vmpyihb,"Vd32=3Dvmpyihb(Vu32,Rt32)","Vd32.h=3Dv= mpyi(Vu32.h,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VdV.h[i] =3D fMPY16SS(VuV.h[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(16,vmpyihb_acc,"Vx32+=3Dvmpyihb(Vu32,Rt32)","Vx32.= h+=3Dvmpyi(Vu32.h,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VxV.h[i] +=3D fMPY16SS(VuV.h[i], fGETBYTE(i % 4, RtV) )) + + +/******************************************** +* WORD BY BYTE +********************************************/ +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwb,"Vd32=3Dvmpyiwb(Vu32,Rt32)","Vd32.w=3Dv= mpyi(Vu32.w,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwb_acc,"Vx32+=3Dvmpyiwb(Vu32,Rt32)","Vx32.= w+=3Dvmpyi(Vu32.w,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwub,"Vd32=3Dvmpyiwub(Vu32,Rt32)","Vd32.w= =3Dvmpyi(Vu32.w,Rt32.ub)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETUBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwub_acc,"Vx32+=3Dvmpyiwub(Vu32,Rt32)","Vx3= 2.w+=3Dvmpyi(Vu32.w,Rt32.ub)", +"Vector word by byte multiply, keep lower result", 
+VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETUBYTE(i % 4, RtV) )) + + +/******************************************** +* WORD BY HALF +********************************************/ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiwh,"Vd32=3Dvmpyiwh(Vu32,Rt32)",= "Vd32.w=3Dvmpyi(Vu32.w,Rt32.h)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETHALF(i % 2, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiwh_acc,"Vx32+=3Dvmpyiwh(Vu32,Rt= 32)","Vx32.w+=3Dvmpyi(Vu32.w,Rt32.h)", +"Vector word by byte multiply, keep lower result", +VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETHALF(i % 2, RtV))) + + + + + + + + + + + + + + + + + + + +/************************************************************************** + * MMVECTOR LOGICAL OPERATIONS + * ***********************************************************************= */ +ITERATOR_INSN_ANY_SLOT(16,vand,"Vd32=3Dvand(Vu32,Vv32)", "Vector Logical A= nd", VdV.uh[i] =3D VuV.uh[i] & VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vor, "Vd32=3Dvor(Vu32,Vv32)", "Vector Logical O= r", VdV.uh[i] =3D VuV.uh[i] | VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vxor,"Vd32=3Dvxor(Vu32,Vv32)", "Vector Logical X= OR", VdV.uh[i] =3D VuV.uh[i] ^ VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vnot,"Vd32=3Dvnot(Vu32)", "Vector Logical NO= T", VdV.uh[i] =3D ~VuV.uh[i]) + + + + + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandqrt, +"Vd32.ub=3Dvand(Qu4.ub,Rt32.ub)", "Vd32=3Dvand(Qu4,Rt32)", "Insert Predica= te into Vector", + VdV.ub[i] =3D fGETQBIT(QuV,i) ? fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandqrt_acc, +"Vx32.ub|=3Dvand(Qu4.ub,Rt32.ub)", "Vx32|=3Dvand(Qu4,Rt32)", "Insert Pred= icate into Vector", + VxV.ub[i] |=3D (fGETQBIT(QuV,i)) ? fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandnqrt, +"Vd32.ub=3Dvand(!Qu4.ub,Rt32.ub)", "Vd32=3Dvand(!Qu4,Rt32)", "Insert Predi= cate into Vector", + VdV.ub[i] =3D !fGETQBIT(QuV,i) ? fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandnqrt_acc, +"Vx32.ub|=3Dvand(!Qu4.ub,Rt32.ub)", "Vx32|=3Dvand(!Qu4,Rt32)", "Insert Pr= edicate into Vector", + VxV.ub[i] |=3D !(fGETQBIT(QuV,i)) ? fGETUBYTE(i % 4, RtV) : 0) + + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandvrt, +"Qd4.ub=3Dvand(Vu32.ub,Rt32.ub)", "Qd4=3Dvand(Vu32,Rt32)", "Insert into Pr= edicate", + fSETQBIT(QdV,i,((VuV.ub[i] & fGETUBYTE(i % 4, RtV)) !=3D 0) ? 1 : 0)) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandvrt_acc, +"Qx4.ub|=3Dvand(Vu32.ub,Rt32.ub)", "Qx4|=3Dvand(Vu32,Rt32)", "Insert into = Predicate ", + fSETQBIT(QxV,i,fGETQBIT(QxV,i)|(((VuV.ub[i] & fGETUBYTE(i % 4, RtV)) != =3D 0) ? 1 : 0))) + +ITERATOR_INSN_ANY_SLOT(8,vandvqv,"Vd32=3Dvand(Qv4,Vu32)","Mask off bytes", +VdV.b[i] =3D fGETQBIT(QvV,i) ? VuV.b[i] : 0) +ITERATOR_INSN_ANY_SLOT(8,vandvnqv,"Vd32=3Dvand(!Qv4,Vu32)","Mask off bytes= ", +VdV.b[i] =3D !fGETQBIT(QvV,i) ? VuV.b[i] : 0) + + + /*************************************************** + * Compare Vector with Vector + ***************************************************/ +#define VCMP(DEST, ASRC, ASRCOP, CMP, N, SRC, MASK, WIDTH) \ +{ \ + for(fHIDE(int) i =3D 0; i < fVBYTES(); i +=3D WIDTH) { \ + fSETQBITS(DEST,WIDTH,MASK,i,ASRC ASRCOP ((VuV.SRC[i/WIDTH] CMP VvV.SRC[i= /WIDTH]) ? MASK : 0)); \ + } \ + } + +#define MMVEC_CMPEQMAP(T,T2,T3) \ +DEF_CVI_MAPPING(V6_MAP_eq##T, "Qd4=3Dvcmp.eq(Vu32." T2 ",Vv32." T2 ")= ", "Qd4=3Dvcmp.eq(Vu32." T3 ",Vv32." T3 ")") \ +DEF_CVI_MAPPING(V6_MAP_eq##T##_and,"Qx4&=3Dvcmp.eq(Vu32." T2 ",Vv32." T2 "= )", "Qx4&=3Dvcmp.eq(Vu32." T3 ",Vv32." T3 ")") \ +DEF_CVI_MAPPING(V6_MAP_eq##T##_ior,"Qx4|=3Dvcmp.eq(Vu32." T2 ",Vv32." 
T2 "= )", "Qx4|=3Dvcmp.eq(Vu32." T3 ",Vv32." T3 ")") \ +DEF_CVI_MAPPING(V6_MAP_eq##T##_xor,"Qx4^=3Dvcmp.eq(Vu32." T2 ",Vv32." T2 "= )", "Qx4^=3Dvcmp.eq(Vu32." T3 ",Vv32." T3 ")") + +#define MMVEC_CMPGT(TYPE,TYPE2,TYPE3,DESCR,N,MASK,WIDTH,SRC) \ +EXTINSN(V6_vgt##TYPE, "Qd4=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")"= , ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" greater = than", \ + VCMP(QdV, , , >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_and, "Qx4&=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" greater = than with predicate-and", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), &, >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_or, "Qx4|=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" greater = than with predicate-or", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), |, >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_xor, "Qx4^=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" greater = than with predicate-xor", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), ^, >, N, SRC, MASK, WIDTH)) + +#define MMVEC_CMP(TYPE,TYPE2,TYPE3,DESCR,N,MASK, WIDTH, SRC)\ +MMVEC_CMPGT(TYPE,TYPE2,TYPE3,DESCR,N,MASK,WIDTH,SRC) \ +EXTINSN(V6_veq##TYPE, "Qd4=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")"= , ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" equal to= ", \ + VCMP(QdV, , , =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_and, "Qx4&=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" equalto = with predicate-and", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), &, =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_or, "Qx4|=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" equalto = with predicate-or", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), |, =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_xor, "Qx4^=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_NOTE_ANY_RESOURCE), DESCR" equalto = with predicate-xor", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), ^, =3D=3D, N, SRC, MASK, WIDTH)) + + +MMVEC_CMP(w,"w","","Vector Word Compare ", fVELEM(32), 0xF, 4, w) +MMVEC_CMP(h,"h","","Vector Half Compare ", fVELEM(16), 0x3, 2, h) +MMVEC_CMP(b,"b","","Vector Half Compare ", fVELEM(8), 0x1, 1, b) +MMVEC_CMPGT(uw,"uw","","Vector Unsigned Half Compare ", fVELEM(32), 0xF, 4= ,uw) +MMVEC_CMPGT(uh,"uh","","Vector Unsigned Half Compare ", fVELEM(16), 0x3, 2= ,uh) +MMVEC_CMPGT(ub,"ub","","Vector Unsigned Byte Compare ", fVELEM(8), 0x1, 1= ,ub) + +MMVEC_CMPEQMAP(uw,"uw","w") +MMVEC_CMPEQMAP(uh,"uh","h") +MMVEC_CMPEQMAP(ub,"ub","b") + + + +/*************************************************** +* Predicate Operations +***************************************************/ + +EXTINSN(V6_pred_scalar2, "Qd4=3Dvsetq(Rt32)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE), "Set Vector Predicate ", +{ + fHIDE(int i;) + for(i =3D 0; i < fVBYTES(); i++) fSETQBIT(QdV,i,(i < (RtV & (fVBYTES()= -1))) ? 1 : 0); +}) + +EXTINSN(V6_pred_scalar2v2, "Qd4=3Dvsetq2(Rt32)", ATTRIBS(A_EXTENSI= ON,A_CVI,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE), "Set Vector Predicate ", +{ + fHIDE(int i;) + for(i =3D 0; i < fVBYTES(); i++) fSETQBIT(QdV,i,(i <=3D ((RtV-1) & (fV= BYTES()-1))) ? 
1 : 0); +}) + + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, shuffeqw, "Qd4.h=3Dvshuffe(Qs4.w,Qt4.= w)","Shrink Predicate", fSETQBIT(QdV,i, (i & 2) ? fGETQBIT(QsV,i-2) : fGETQ= BIT(QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, shuffeqh, "Qd4.b=3Dvshuffe(Qs4.h,Qt4.= h)","Shrink Predicate", fSETQBIT(QdV,i, (i & 1) ? fGETQBIT(QsV,i-1) : fGETQ= BIT(QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_or, "Qd4=3Dor(Qs4,Qt4)","Vector = Predicate Or", fSETQBIT(QdV,i,fGETQBIT(QsV,i) || fGETQBIT(QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_and, "Qd4=3Dand(Qs4,Qt4)","Vecto= r Predicate And", fSETQBIT(QdV,i,fGETQBIT(QsV,i) && fGETQBIT(QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_xor, "Qd4=3Dxor(Qs4,Qt4)","Vecto= r Predicate Xor", fSETQBIT(QdV,i,fGETQBIT(QsV,i) ^ fGETQBIT(QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_or_n, "Qd4=3Dor(Qs4,!Qt4)","Vect= or Predicate Or with not", fSETQBIT(QdV,i,fGETQBIT(QsV,i) || !fGETQBIT(QtV,= i) ) ); +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_and_n, "Qd4=3Dand(Qs4,!Qt4)","Ve= ctor Predicate And with not", fSETQBIT(QdV,i,fGETQBIT(QsV,i) && !fGETQBIT(= QtV,i) ) ); +ITERATOR_INSN_ANY_SLOT(8, pred_not, "Qd4=3Dnot(Qs4)","Vector Predicate Not= ", fSETQBIT(QdV,i,!fGETQBIT(QsV,i) ) ); + + + +EXTINSN(V6_vcmov, "if (Ps4) Vd32=3DVu32", ATTRIBS(A_EXTENSION,A_CVI,A_CV= I_VA,A_NOTE_ANY_RESOURCE), "Conditional Mov", +{ +if (fLSBOLD(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VdV.ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vncmov, "if (!Ps4) Vd32=3DVu32", ATTRIBS(A_EXTENSION,A_CVI,A_= CVI_VA,A_NOTE_ANY_RESOURCE), "Conditional Mov", +{ +if (fLSBOLDNOT(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VdV.ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vccombine, "if (Ps4) Vdd32=3Dvcombine(Vu32,Vv32)", ATTRIBS(A_E= XTENSION,A_CVI,A_CVI_VA_DV,A_NOTE_ANY2_RESOURCE), "Conditional Combine", +{ +if (fLSBOLD(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vnccombine, "if (!Ps4) Vdd32=3Dvcombine(Vu32,Vv32)", ATTRIBS(A= _EXTENSION,A_CVI,A_CVI_VA_DV,A_NOTE_ANY2_RESOURCE), "Conditional Combine", +{ +if (fLSBOLDNOT(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + + + +ITERATOR_INSN_ANY_SLOT(8,vmux,"Vd32=3Dvmux(Qt4,Vu32,Vv32)", +"Vector Select Element 8-bit", + VdV.ub[i] =3D fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]) + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8,vswap,"Vdd32=3Dvswap(Qt4,Vu32,Vv32)", +"Vector Swap Element 8-bit", + VddV.v[0].ub[i] =3D fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]; + VddV.v[1].ub[i] =3D !fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]) + + +/*************************************************************************= ** +* +* MMVECTOR SORTING +* +**************************************************************************= **/ + +#define MMVEC_SORT(TYPE,TYPE2,DESCR,ELEMENTSIZE,SRC)\ +ITERATOR_INSN2_ANY_SLOT(ELEMENTSIZE,vmax##TYPE, "Vd32=3Dvmax" TYPE2 "(Vu32= ,Vv32)", "Vd32."#SRC"=3Dvmax(Vu32."#SRC",Vv32."#SRC")", "Vector " DESCR " m= ax", VdV.SRC[i] =3D (VuV.SRC[i] > VvV.SRC[i]) ? VuV.SRC[i] : VvV.SRC[i]) \ +ITERATOR_INSN2_ANY_SLOT(ELEMENTSIZE,vmin##TYPE, "Vd32=3Dvmin" TYPE2 "(Vu32= ,Vv32)", "Vd32."#SRC"=3Dvmin(Vu32."#SRC",Vv32."#SRC")", "Vector " DESCR " m= in", VdV.SRC[i] =3D (VuV.SRC[i] < VvV.SRC[i]) ? 
VuV.SRC[i] : VvV.SRC[i]) + +MMVEC_SORT(b,"b", "signed byte", 8, b); +MMVEC_SORT(ub,"ub", "unsigned byte", 8, ub); +MMVEC_SORT(uh,"uh", "unsigned halfword",16, uh); +MMVEC_SORT(h, "h", "halfword", 16, h); +MMVEC_SORT(w, "w", "word", 32, w); + + + + + + + + + +/************************************************************* +* SHUFFLES +****************************************************************/ + +ITERATOR_INSN2_ANY_SLOT(16,vsathub,"Vd32=3Dvsathub(Vu32,Vv32)","Vd32.ub=3D= vsat(Vu32.h,Vv32.h)", +"Saturate and pack 32 halfwords to 32 unsigned bytes, and interleave them", + fSETBYTE(0, VdV.uh[i], fVSATUB(VvV.h[i])); + fSETBYTE(1, VdV.uh[i], fVSATUB(VuV.h[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vsatwh,"Vd32=3Dvsatwh(Vu32,Vv32)","Vd32.h=3Dvsa= t(Vu32.w,Vv32.w)", +"Saturate and pack 16 words to 16 halfwords, and interleave them", + fSETHALF(0, VdV.w[i], fVSATH(VvV.w[i])); + fSETHALF(1, VdV.w[i], fVSATH(VuV.w[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vsatuwuh,"Vd32=3Dvsatuwuh(Vu32,Vv32)","Vd32.uh= =3Dvsat(Vu32.uw,Vv32.uw)", +"Saturate and pack 16 words to 16 halfwords, and interleave them", + fSETHALF(0, VdV.w[i], fVSATUH(VvV.uw[i])); + fSETHALF(1, VdV.w[i], fVSATUH(VuV.uw[i]))) + +ITERATOR_INSN2_ANY_SLOT(16,vshuffeb,"Vd32=3Dvshuffeb(Vu32,Vv32)","Vd32.b= =3Dvshuffe(Vu32.b,Vv32.b)", +"Shuffle half words with in a lane", + fSETBYTE(0, VdV.uh[i], fGETUBYTE(0, VvV.uh[i])); + fSETBYTE(1, VdV.uh[i], fGETUBYTE(0, VuV.uh[i]))) + +ITERATOR_INSN2_ANY_SLOT(16,vshuffob,"Vd32=3Dvshuffob(Vu32,Vv32)","Vd32.b= =3Dvshuffo(Vu32.b,Vv32.b)", +"Shuffle half words with in a lane", + fSETBYTE(0, VdV.uh[i], fGETUBYTE(1, VvV.uh[i])); + fSETBYTE(1, VdV.uh[i], fGETUBYTE(1, VuV.uh[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vshufeh,"Vd32=3Dvshuffeh(Vu32,Vv32)","Vd32.h=3D= vshuffe(Vu32.h,Vv32.h)", +"Shuffle half words with in a lane", + fSETHALF(0, VdV.uw[i], fGETUHALF(0, VvV.uw[i])); + fSETHALF(1, VdV.uw[i], fGETUHALF(0, VuV.uw[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vshufoh,"Vd32=3Dvshuffoh(Vu32,Vv32)","Vd32.h=3D= vshuffo(Vu32.h,Vv32.h)", +"Shuffle half words with in a lane", + fSETHALF(0, VdV.uw[i], fGETUHALF(1, VvV.uw[i])); + fSETHALF(1, VdV.uw[i], fGETUHALF(1, VuV.uw[i]))) + + + + +/************************************************************************** +* Double Vector Shuffles +**************************************************************************/ + +DEF_CVI_MAPPING(V6_vtran2x2_map, "vtrans2x2(Vy32,Vx32,Rt32)","vshuff(Vy32,= Vx32,Rt32)") + + +EXTINSN(V6_vshuff, "vshuff(Vy32,Vx32,Rt32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS,A_CVI_EARLY,A_NOTE_PERMUTE_RESOURCE,= A_NOTE_SHIFT_RESOURCE), +"2x2->2x2 transpose, for multiple data sizes, inplace", +{ + fHIDE(int offset;) + for (offset=3D1; offset2x2 transpose for multiple data sizes", +{ + fHIDE(int offset;) + VddV.v[0] =3D VvV; + VddV.v[1] =3D VuV; + for (offset=3D1; offset>1; offset>0; offset>>=3D1) { + if ( RtV & offset) { + fHIDE(int k;) \ + fVFOREACH(8, k) {\ + if (!( k & offset)) { + fSWAPB(VddV.v[1].ub[k], VddV.v[0].ub[k+offset]); + } + } + } + } + }) + +/*************************************************************************= */ + + + +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(32,vshufoeh,"Vdd32=3Dvshuffoeh(Vu32,Vv3= 2)","Vdd32.h=3Dvshuffoe(Vu32.h,Vv32.h)", +"Vector Shuffle half words", + fSETHALF(0, VddV.v[0].uw[i], fGETUHALF(0, VvV.uw[i])); + fSETHALF(1, VddV.v[0].uw[i], fGETUHALF(0, VuV.uw[i])); + fSETHALF(0, VddV.v[1].uw[i], fGETUHALF(1, VvV.uw[i])); + fSETHALF(1, VddV.v[1].uw[i], fGETUHALF(1, VuV.uw[i]))) + 
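/*
 * Illustrative reference for the vshuffoeh semantics defined above: a minimal
 * standalone C sketch, assuming 128-byte HVX vectors and the little-endian
 * halfword packing implied by fGETUHALF/fSETHALF.  The names below (VBYTES,
 * vec_t, vshufoeh_ref) are illustrative only, not part of the imported
 * architecture library.  Even halfwords of Vu/Vv are interleaved into the low
 * result vector, odd halfwords into the high result vector.
 */
#if 0
#include <stdint.h>

#define VBYTES 128                      /* assumed vector length in bytes */

typedef union {
    uint16_t uh[VBYTES / 2];
    uint32_t uw[VBYTES / 4];
} vec_t;

static void vshufoeh_ref(vec_t *d0, vec_t *d1, const vec_t *vu, const vec_t *vv)
{
    for (int i = 0; i < VBYTES / 4; i++) {          /* one iteration per word lane */
        d0->uh[2 * i]     = vv->uh[2 * i];          /* even halfword of Vv */
        d0->uh[2 * i + 1] = vu->uh[2 * i];          /* even halfword of Vu */
        d1->uh[2 * i]     = vv->uh[2 * i + 1];      /* odd halfword of Vv  */
        d1->uh[2 * i + 1] = vu->uh[2 * i + 1];      /* odd halfword of Vu  */
    }
}
#endif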
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(16,vshufoeb,"Vdd32=3Dvshuffoeb(Vu32,Vv3= 2)","Vdd32.b=3Dvshuffoe(Vu32.b,Vv32.b)", +"Vector Shuffle bytes", + fSETBYTE(0, VddV.v[0].uh[i], fGETUBYTE(0, VvV.uh[i])); + fSETBYTE(1, VddV.v[0].uh[i], fGETUBYTE(0, VuV.uh[i])); + fSETBYTE(0, VddV.v[1].uh[i], fGETUBYTE(1, VvV.uh[i])); + fSETBYTE(1, VddV.v[1].uh[i], fGETUBYTE(1, VuV.uh[i]))) + + +/*************************************************************** +* Deal +***************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT(32, vdealh, "Vd32=3Dvdealh(Vu32)", "Vd32.h=3Dv= deal(Vu32.h)", +"Deal Halfwords", + VdV.uh[i ] =3D fGETUHALF(0, VuV.uw[i]); + VdV.uh[i+fVELEM(32)] =3D fGETUHALF(1, VuV.uw[i])) + +ITERATOR_INSN2_PERMUTE_SLOT(16, vdealb, "Vd32=3Dvdealb(Vu32)", "Vd32.b=3Dv= deal(Vu32.b)", +"Deal Halfwords", + VdV.ub[i ] =3D fGETUBYTE(0, VuV.uh[i]); + VdV.ub[i+fVELEM(16)] =3D fGETUBYTE(1, VuV.uh[i])) + +ITERATOR_INSN2_PERMUTE_SLOT(32, vdealb4w, "Vd32=3Dvdealb4w(Vu32,Vv32)", "= Vd32.b=3Dvdeale(Vu32.b,Vv32.b)", +"Deal Two Vectors Bytes", + VdV.ub[0+i ] =3D fGETUBYTE(0, VvV.uw[i]); + VdV.ub[fVELEM(32)+i ] =3D fGETUBYTE(2, VvV.uw[i]); + VdV.ub[2*fVELEM(32)+i] =3D fGETUBYTE(0, VuV.uw[i]); + VdV.ub[3*fVELEM(32)+i] =3D fGETUBYTE(2, VuV.uw[i])) + +/*************************************************************** +* shuffle +***************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT(32, vshuffh, "Vd32=3Dvshuffh(Vu32)", "Vd32.h= =3Dvshuff(Vu32.h)", +"Deal Halfwords", + fSETHALF(0, VdV.uw[i], VuV.uh[i]); + fSETHALF(1, VdV.uw[i], VuV.uh[i+fVELEM(32)])) + +ITERATOR_INSN2_PERMUTE_SLOT(16, vshuffb, "Vd32=3Dvshuffb(Vu32)", "Vd32.b= =3Dvshuff(Vu32.b)", +"Deal Halfwords", + fSETBYTE(0, VdV.uh[i], VuV.ub[i]); + fSETBYTE(1, VdV.uh[i], VuV.ub[i+fVELEM(16)])) + + + + + +/*********************************************************** +* INSERT AND EXTRACT +*********************************************************/ +EXTINSN(V6_extractw, "Rd32=3Dvextract(Vu32,Rs32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_RESTRICT_NOPACKET,A_CVI_EXTRACT,A_NOT= E_NOPACKET,A_MEMLIKE,A_RESTRICT_SLOT0ONLY), +"Extract an element from a vector to scalar", +fHIDE(warn("RdN=3D%d VuN=3D%d RsN=3D%d RsV=3D0x%08x widx=3D%d",RdN,VuN,RsN= ,RsV,((RsV & (fVBYTES()-1)) >> 2));) +RdV =3D VuV.uw[ (RsV & (fVBYTES()-1)) >> 2]; +fHIDE(warn("RdV=3D0x%08x",RdV);)) + +DEF_CVI_MAPPING(V6_extractw_alt,"Rd32.w=3Dvextract(Vu32,Rs32)","Rd32=3Dvex= tract(Vu32,Rs32)") + +EXTINSN(V6_vinsertwr, "Vx32.w=3Dvinsert(Rt32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_CVI_LATE,A_NOTE_MPY_RESOURCE), +"Insert Word Scalar into Vector", +VxV.uw[0] =3D RtV;) + + + + +ITERATOR_INSN_MPY_SLOT_LATE(32,lvsplatw, "Vd32=3Dvsplat(Rt32)", "Replicate= s scalar accross words in vector", VdV.uw[i] =3D RtV) + +ITERATOR_INSN_MPY_SLOT_LATE(16,lvsplath, "Vd32.h=3Dvsplat(Rt32)", "Replica= tes scalar accross halves in vector", VdV.uh[i] =3D RtV) + +ITERATOR_INSN_MPY_SLOT_LATE(8,lvsplatb, "Vd32.b=3Dvsplat(Rt32)", "Replicat= es scalar accross bytes in vector", VdV.ub[i] =3D RtV) + + +DEF_CVI_MAPPING(V6_vassignp,"Vdd32=3DVuu32","Vdd32=3Dvcombine(Vuu.H32,Vuu.= L32)") + +ITERATOR_INSN_ANY_SLOT(32,vassign,"Vd32=3DVu32","Copy a vector",VdV.w[i]= =3DVuV.w[i]) + + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8,vcombine,"Vdd32=3Dvcombine(Vu32,Vv32)", +"Vector assign, Any two to Vector Pair", + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]) + + + +/////////////////////////////////////////////////////////////////////////// + + 
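/*
 * Illustrative reference for the vdeal/vshuff permutations defined in the Deal
 * and shuffle sections above: a minimal standalone C sketch, assuming 128-byte
 * HVX vectors.  The names below (VBYTES, WORDS, HALVES, *_ref) are illustrative
 * only, not part of the imported architecture library.  vdealh de-interleaves
 * even/odd halfwords into the low/high halves of the result; vshuffh
 * re-interleaves them, so the two operations compose to the identity.
 */
#if 0
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VBYTES 128              /* assumed vector length in bytes */
#define HALVES (VBYTES / 2)     /* 16-bit lanes per vector        */
#define WORDS  (VBYTES / 4)     /* 32-bit lanes per vector        */

/* vdealh: even halfwords of Vu go to the low half of Vd, odd halfwords to the high half */
static void vdealh_ref(uint16_t d[HALVES], const uint16_t u[HALVES])
{
    for (int i = 0; i < WORDS; i++) {
        d[i]         = u[2 * i];
        d[i + WORDS] = u[2 * i + 1];
    }
}

/* vshuffh: interleave the low and high halves of Vu back into word lanes */
static void vshuffh_ref(uint16_t d[HALVES], const uint16_t u[HALVES])
{
    for (int i = 0; i < WORDS; i++) {
        d[2 * i]     = u[i];
        d[2 * i + 1] = u[i + WORDS];
    }
}

int main(void)
{
    uint16_t v[HALVES], dealt[HALVES], back[HALVES];

    for (int i = 0; i < HALVES; i++) {
        v[i] = (uint16_t)i;
    }
    vdealh_ref(dealt, v);
    vshuffh_ref(back, dealt);
    assert(memcmp(v, back, sizeof(v)) == 0);   /* deal then shuffle round-trips */
    return 0;
}
#endif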
+/********************************************************* +* GENERAL PERMUTE NETWORKS +*********************************************************/ + + +EXTINSN(V6_vdelta, "Vd32=3Dvdelta(Vu32,Vv32)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VP,A_NOTE_PERMUTE_RESOURCE), +"Reverse Benes Butterfly network ", +{ + fHIDE(int offset;) + fHIDE(int k;) + for (offset=3DfVBYTES(); (offset>>=3D1)>0; ) { + for (k =3D 0; k>3; \ + unsigned char element =3D value & 7; \ + READ_EXT_VREG(regno,tmp,0); \ + tmp.uh[(128/16)*lane+(element)]++; \ + WRITE_EXT_VREG(regno,tmp,EXT_NEW); \ + } \ + } + +#define fHISTQ(INPUTVEC,QVAL) \ + fUARCH_NOTE_PUMP_4X(); \ + fHIDE(int lane;) \ + fHIDE(mmvector_t tmp;) \ + fVFOREACH(128, lane) { \ + for (fHIDE(int )i=3D0; i<128/8; ++i) { \ + unsigned char value =3D INPUTVEC.ub[(128/8)*lane+i]; \ + unsigned char regno =3D value>>3; \ + unsigned char element =3D value & 7; \ + READ_EXT_VREG(regno,tmp,0); \ + if (fGETQBIT(QVAL,128/8*lane+i)) tmp.uh[(128/16)*lane+(element)]++; \ + WRITE_EXT_VREG(regno,tmp,EXT_NEW); \ + } \ + } + + + +EXTINSN(V6_vhist, "vhist",ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT,A_CVI_REQU= IRES_TMPLOAD), "vhist instruction",{ fHIDE(mmvector_t inputVec;) inputVec= =3DfTMPVDATA(); fHIST(inputVec); }) +EXTINSN(V6_vhistq, "vhist(Qv4)",ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT,A_CV= I_REQUIRES_TMPLOAD), "vhist instruction",{ fHIDE(mmvector_t inputVec;) inpu= tVec=3DfTMPVDATA(); fHISTQ(inputVec,QvV); }) + +#undef fHIST +#undef fHISTQ + + +/* **** WEIGHTED HISTOGRAM **** */ + + +#if 1 +#define WHIST(EL,MASK,BSHIFT,COND,SATF) \ + fHIDE(unsigned int) bucket =3D fGETUBYTE(0,input.h[i]); \ + fHIDE(unsigned int) weight =3D fGETUBYTE(1,input.h[i]); \ + fHIDE(unsigned int) vindex =3D (bucket >> 3) & 0x1F; \ + fHIDE(unsigned int) elindex =3D ((i>>BSHIFT) & (~MASK)) | ((bucket>>BSHIF= T) & MASK); \ + fHIDE(mmvector_t tmp;) \ + READ_EXT_VREG(vindex,tmp,0); \ + COND tmp.EL[elindex] =3D SATF(tmp.EL[elindex] + weight); \ + WRITE_EXT_VREG(vindex,tmp,EXT_NEW); \ + fUARCH_NOTE_PUMP_2X(); + +ITERATOR_INSN_VHISTLIKE(16,vwhist256,"vwhist256","vector weighted histogra= m halfword counters", WHIST(uh,7,0,,)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256q,"vwhist256(Qv4)","vector weighted hi= stogram halfword counters", WHIST(uh,7,0,if (fGETQBIT(QvV,2*i)),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256_sat,"vwhist256:sat","vector weighted = histogram halfword counters", WHIST(uh,7,0,,fVSATUH)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256q_sat,"vwhist256(Qv4):sat","vector wei= ghted histogram halfword counters", WHIST(uh,7,0,if (fGETQBIT(QvV,2*i)),fVS= ATUH)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128,"vwhist128","vector weighted histogra= m word counters", WHIST(uw,3,1,,)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128q,"vwhist128(Qv4)","vector weighted hi= stogram word counters", WHIST(uw,3,1,if (fGETQBIT(QvV,2*i)),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128m,"vwhist128(#u1)","vector weighted hi= stogram word counters", WHIST(uw,3,1,if ((bucket & 1) =3D=3D uiV),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128qm,"vwhist128(Qv4,#u1)","vector weight= ed histogram word counters", WHIST(uw,3,1,if (((bucket & 1) =3D=3D uiV) && = fGETQBIT(QvV,2*i)),)) + + +#endif + + + +/* ****** lookup table instructions ***********= */ + +/* Use low bits from idx to choose next-bigger elements from vector, then = use LSB from idx to choose odd or even element */ + +ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvb,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,Rt8)",= "vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV 
& 0x7; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VdV.b[i] =3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV.= h[idx % fVELEM(16)]) : 0) + + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(8,vlutvvb_oracc,"Vx32.b|=3Dvlut32(Vu= 32.b,Vv32.b,Rt8)","vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV & 0x7; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VxV.b[i] |=3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV= .h[idx % fVELEM(16)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh,"Vdd32.h=3Dvlut16(Vu32.b,= Vv32.h,Rt8)","vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV & 0xF; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VddV.v[0].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VddV.v[1].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_oracc,"Vxx32.h|=3Dvlut16(= Vu32.b,Vv32.h,Rt8)","vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D fGETUBYTE(0,RtV) & 0xF; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VxxV.v[0].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VxxV.v[1].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvbi,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,#u3)"= ,"vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0x7; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VdV.b[i] =3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV.= h[idx % fVELEM(16)]) : 0) + + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(8,vlutvvb_oracci,"Vx32.b|=3Dvlut32(V= u32.b,Vv32.b,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0x7; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VxV.b[i] |=3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV= .h[idx % fVELEM(16)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwhi,"Vdd32.h=3Dvlut16(Vu32.b= ,Vv32.h,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0xF; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VddV.v[0].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VddV.v[1].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_oracci,"Vxx32.h|=3Dvlut16= (Vu32.b,Vv32.h,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0xF; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VxxV.v[0].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VxxV.v[1].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? 
fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvb_nm,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,Rt8= ):nomatch","vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int oddhalf;) fHIDE(int matchval;) + matchval =3D RtV & 0x7; + oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; + idx =3D VuV.ub[i]; + idx =3D (idx&0x1F) | (matchval<<5); + VdV.b[i] =3D fGETBYTE(oddhalf,VvV.h[idx % fVELEM(16)])) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_nm,"Vdd32.h=3Dvlut16(Vu32= .b,Vv32.h,Rt8):nomatch","vector-vector table lookup", +fHIDE(fRT8NOTE()) +fHIDE(unsigned int idx;) fHIDE(int oddhalf;) fHIDE(int matchval;) + matchval =3D RtV & 0xF; + oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; + idx =3D fGETUBYTE(0,VuV.uh[i]); + idx =3D (idx&0x0F) | (matchval<<4); + VddV.v[0].h[i] =3D fGETHALF(oddhalf,VvV.w[idx % fVELEM(32)]); + idx =3D fGETUBYTE(1,VuV.uh[i]); + idx =3D (idx&0x0F) | (matchval<<4); + VddV.v[1].h[i] =3D fGETHALF(oddhalf,VvV.w[idx % fVELEM(32)])) + + + + +/*************************************************************************= ***** +NON LINEAR - V65 + *************************************************************************= *****/ + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpahhsat,"Vx32.h=3Dvmpa(Vx32.h,Vu32.h,R= tt32.h):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( ( fMPY16SS(VxV.h[i],VuV.h[i])<<1) + (fGETHALF(( = (VuV.h[i]>>14)&0x3), RttV )<<15))>>16)) + + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpauhuhsat,"Vx32.h=3Dvmpa(Vx32.h,Vu32.u= h,Rtt32.uh):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( fMPY16SU(VxV.h[i],VuV.uh[i]) + (fGETUHALF(((VuV= .uh[i]>>14)&0x3), RttV )<<15))>>16)) + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpsuhuhsat,"Vx32.h=3Dvmps(Vx32.h,Vu32.u= h,Rtt32.uh):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( fMPY16SU(VxV.h[i],VuV.uh[i]) - (fGETUHALF(((VuV= .uh[i]>>14)&0x3), RttV )<<15))>>16)) + + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vlut4,"Vd32.h=3Dvlut4(Vu32.uh,Rtt32.h)",= "4 entry lookup table", + VdV.h[i]=3D fGETHALF( ((VuV.h[i]>>14)&0x3), RttV )) + + + +/*************************************************************************= ***** +V65 + *************************************************************************= *****/ + +ITERATOR_INSN_MPY_SLOT_NOV1(32,vmpyuhe,"Vd32.uw=3Dvmpye(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VdV.uw[i] =3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV))) + + +ITERATOR_INSN_MPY_SLOT_NOV1(32,vmpyuhe_acc,"Vx32.uw+=3Dvmpye(Vu32.uh,Rt32.= uh)", +"Vector even halfword unsigned multiply by scalar", + VxV.uw[i] +=3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV))) + + + + +/*************************************************************************= ***** + Vecror HI/LOW accessors + *************************************************************************= *****/ +DEF_CVI_MAPPING(V6_hi, "Vd32=3Dhi(Vss32)","Vd32=3DVss.H32") +DEF_CVI_MAPPING(V6_lo, "Vd32=3Dlo(Vss32)","Vd32=3DVss.L32") + + + + +EXTINSN(V6_vgathermw, "vtmp.w=3Dvgather(Rt32,Mu2,Vv32.w).w", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_EA_PAGECROSS,A= _MEMSIZE_4B,A_MEMLIKE,A_CVI_GATHER_ADDR_4B,A_NOTE_ANY_RESOURCE), "Gather Wo= rds", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_GATHER_WORD(EA, VvV.uw[i], i,MuV); + } + fGATHER_FINISH() +}) +EXTINSN(V6_vgathermh, 
"vtmp.h=3Dvgather(Rt32,Mu2,Vv32.h).h", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_EA_PAGECROSS,A= _MEMSIZE_2B,A_MEMLIKE,A_CVI_GATHER_ADDR_2B,A_NOTE_ANY_RESOURCE), "Gather ha= lfwords", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_GATHER_HALFWORD(EA, VvV.uh[i], i,MuV); + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vgathermhw, "vtmp.h=3Dvgather(Rt32,Mu2,Vvv32.w).h", ATTRIBS(A_= EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA_DV,A_CVI_VM,A_CVI_TMP_DST,A_EA_PAGECR= OSS,A_MEMSIZE_2B,A_MEMLIKE,A_CVI_GATHER_ADDR_4B,A_NOTE_ANY_RESOURCE), "Gath= er halfwords", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_GATHER_HALFWORD_DV(EA, VvvV.v[j].uw[i], (2*i+j),i,j= ,MuV); + } + } + fGATHER_FINISH() +}) + + +EXTINSN(V6_vgathermwq, "if (Qs4) vtmp.w=3Dvgather(Rt32,Mu2,Vv32.w).w", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_EA_P= AGECROSS,A_MEMSIZE_4B,A_MEMLIKE,A_CVI_GATHER_ADDR_4B,A_NOTE_ANY_RESOURCE), = "Gather Words", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_GATHER_WORDQ(EA, VvV.uw[i], i,QsV,MuV); + } + fGATHER_FINISH() +}) +EXTINSN(V6_vgathermhq, "if (Qs4) vtmp.h=3Dvgather(Rt32,Mu2,Vv32.h).h", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_EA_P= AGECROSS,A_MEMSIZE_2B,A_MEMLIKE,A_CVI_GATHER_ADDR_2B,A_NOTE_ANY_RESOURCE), = "Gather halfwords", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_GATHER_HALFWORDQ(EA, VvV.uh[i], i,QsV,MuV); + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vgathermhwq, "if (Qs4) vtmp.h=3Dvgather(Rt32,Mu2,Vvv32.w).h", = ATTRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA_DV,A_CVI_VM,A_CVI_TMP_DST,A= _EA_PAGECROSS,A_MEMSIZE_2B,A_MEMLIKE,A_CVI_GATHER_ADDR_4B,A_NOTE_ANY_RESOUR= CE), "Gather halfwords", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_GATHER_HALFWORDQ_DV(EA, VvvV.v[j].uw[i], (2*i+j),i,= j,QsV,MuV); + } + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vscattermw , "vscatter(Rt32,Mu2,Vv32.w).w=3DVw32", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_MEMSIZE_4B,A= _CVI_GATHER_ADDR_4B,A_MEMLIKE,A_NOTE_ANY_RESOURCE), "Scatter Words", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_WORD(EA, VvV.uw[i], VwV,i,MuV); + } + fSCATTER_FINISH(0) +}) + + + +EXTINSN(V6_vscattermh , "vscatter(Rt32,Mu2,Vv32.h).h=3DVw32", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_MEMSIZE_2B,A= _CVI_GATHER_ADDR_2B,A_MEMLIKE,A_NOTE_ANY_RESOURCE), 
"Scatter halfWords", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_HALFWORD(EA,VvV.uh[i],VwV,i,MuV); + } + fSCATTER_FINISH(0) +}) + + +EXTINSN(V6_vscattermw_add, "vscatter(Rt32,Mu2,Vv32.w).w+=3DVw32", ATTRIBS= (A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_MEMSIZE= _4B,A_CVI_GATHER_ADDR_4B,A_MEMLIKE,A_CVI_SCATTER_WORD_ACC,A_NOTE_ANY_RESOUR= CE), "Scatter Words-Add", +{ + fHIDE(int i;) + fHIDE(int ALIGNMENT=3D4;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D (RtV+fVALIGN(VvV.uw[i],ALIGNMENT)); + fVLOG_VTCM_WORD_INCREMENT(EA,VvV.uw[i],VwV,i,ALIGNMENT,MuV); + } + fHIDE(fLOG_SCATTER_OP(4);) + fSCATTER_FINISH(1) +}) + +EXTINSN(V6_vscattermh_add, "vscatter(Rt32,Mu2,Vv32.h).h+=3DVw32", ATTRIBS= (A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_MEMSIZE= _2B,A_CVI_GATHER_ADDR_2B,A_CVI_SCATTER_ACC,A_MEMLIKE,A_NOTE_ANY_RESOURCE), = "Scatter halfword-Add", +{ + fHIDE(int i;) + fHIDE(int ALIGNMENT=3D2;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D (RtV+fVALIGN(VvV.uh[i],ALIGNMENT)); + fVLOG_VTCM_HALFWORD_INCREMENT(EA,VvV.uh[i],VwV,i,ALIGNMENT,MuV); + } + fHIDE(fLOG_SCATTER_OP(2);) + fSCATTER_FINISH(1) +}) + + +EXTINSN(V6_vscattermwq, "if (Qs4) vscatter(Rt32,Mu2,Vv32.w).w=3DVw32", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_ME= MSIZE_4B,A_CVI_GATHER_ADDR_4B,A_MEMLIKE,A_NOTE_ANY_RESOURCE), "Scatter Word= s conditional", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_WORDQ(EA,VvV.uw[i], VwV,i,QsV,MuV); + } + fSCATTER_FINISH(0) +}) + +EXTINSN(V6_vscattermhq, "if (Qs4) vscatter(Rt32,Mu2,Vv32.h).h=3DVw32", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_EA_PAGECROSS,A_ME= MSIZE_2B,A_CVI_GATHER_ADDR_2B,A_MEMLIKE,A_NOTE_ANY_RESOURCE), "Scatter Half= Words conditional", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_HALFWORDQ(EA,VvV.uh[i],VwV,i,QsV,MuV); + } + fSCATTER_FINISH(0) +}) + + + + +EXTINSN(V6_vscattermhw , "vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32", ATTRIBS(A_= EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_EA_PAGECROSS,A_MEMSIZE= _2B,A_CVI_GATHER_ADDR_4B,A_MEMLIKE,A_NOTE_ANY2_RESOURCE), "Scatter Words", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_HALFWORD_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+j),i,j,MuV); + } + } + fSCATTER_FINISH(0) +}) + + + +EXTINSN(V6_vscattermhwq, "if (Qs4) vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32", = ATTRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_EA_PAGECROSS= ,A_MEMSIZE_2B,A_CVI_GATHER_ADDR_4B,A_MEMLIKE,A_NOTE_ANY2_RESOURCE), "Scatte= r halfwords conditional", +{ + fHIDE(int i;) + fHIDE(int 
j;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_HALFWORDQ_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+j),QsV,i,j= ,MuV); + } + } + fSCATTER_FINISH(0) +}) + +EXTINSN(V6_vscattermhw_add, "vscatter(Rt32,Mu2,Vvv32.w).h+=3DVw32", ATTRI= BS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_EA_PAGECROSS,A_ME= MSIZE_2B,A_CVI_GATHER_ADDR_4B,A_CVI_SCATTER_ACC,A_MEMLIKE,A_NOTE_ANY2_RESOU= RCE), "Scatter halfwords-add", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int ALIGNMENT=3D2;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV + fVALIGN(VvvV.v[j].uw[i],ALIGNMENT);; + fVLOG_VTCM_HALFWORD_INCREMENT_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+= j),i,j,ALIGNMENT,MuV); + } + } + fHIDE(fLOG_SCATTER_OP(2);) + fSCATTER_FINISH(1) +}) +DEF_CVI_MAPPING(V6_vscattermw_alt, "vscatter(Rt32,Mu2,Vv32.w)=3DVw32.w", = "vscatter(Rt32,Mu2,Vv32.w).w=3DVw32") +DEF_CVI_MAPPING(V6_vscattermwh_alt, "vscatter(Rt32,Mu2,Vvv32.w)=3DVw32.h"= , "vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32") +DEF_CVI_MAPPING(V6_vscattermh_alt, "vscatter(Rt32,Mu2,Vv32.h)=3DVw32.h", = "vscatter(Rt32,Mu2,Vv32.h).h=3DVw32") + +DEF_CVI_MAPPING(V6_vscattermw_add_alt, "vscatter(Rt32,Mu2,Vv32.w)+=3DVw32= .w", "vscatter(Rt32,Mu2,Vv32.w).w+=3DVw32") +DEF_CVI_MAPPING(V6_vscattermwh_add_alt, "vscatter(Rt32,Mu2,Vvv32.w)+=3DVw= 32.h", "vscatter(Rt32,Mu2,Vvv32.w).h+=3DVw32") +DEF_CVI_MAPPING(V6_vscattermh_add_alt, "vscatter(Rt32,Mu2,Vv32.h)+=3DVw32= .h", "vscatter(Rt32,Mu2,Vv32.h).h+=3DVw32") + +DEF_CVI_MAPPING(V6_vscattermwq_alt, "if (Qs4) vscatter(Rt32,Mu2,Vv32.w)= =3DVw32.w", "if (Qs4) vscatter(Rt32,Mu2,Vv32.w).w=3DVw32") +DEF_CVI_MAPPING(V6_vscattermwhq_alt, "if (Qs4) vscatter(Rt32,Mu2,Vvv32.w)= =3DVw32.h", "if (Qs4) vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32") +DEF_CVI_MAPPING(V6_vscattermhq_alt, "if (Qs4) vscatter(Rt32,Mu2,Vv32.h)= =3DVw32.h", "if (Qs4) vscatter(Rt32,Mu2,Vv32.h).h=3DVw32") + +EXTINSN(V6_vprefixqb,"Vd32.b=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS,A_CVI_EARLY,A_NOTE_SHIFT_RESOURCE), "parallel prefix sum of Q i= nto byte", +{ + fHIDE(int i;) + fHIDE(size1u_t acc =3D 0;) + fVFOREACH(8, i) { + acc +=3D fGETQBIT(QvV,i); + VdV.ub[i] =3D acc; + } + } ) +EXTINSN(V6_vprefixqh,"Vd32.h=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS,A_CVI_EARLY,A_NOTE_SHIFT_RESOURCE), "parallel prefix sum of Q i= nto halfwords", +{ + fHIDE(int i;) + fHIDE(size2u_t acc =3D 0;) + fVFOREACH(16, i) { + acc +=3D fGETQBIT(QvV,i*2+0); + acc +=3D fGETQBIT(QvV,i*2+1); + VdV.uh[i] =3D acc; + } + } ) +EXTINSN(V6_vprefixqw,"Vd32.w=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS,A_CVI_EARLY,A_NOTE_SHIFT_RESOURCE), "parallel prefix sum of Q i= nto words", +{ + fHIDE(int i;) + fHIDE(size4u_t acc =3D 0;) + fVFOREACH(32, i) { + acc +=3D fGETQBIT(QvV,i*4+0); + acc +=3D fGETQBIT(QvV,i*4+1); + acc +=3D fGETQBIT(QvV,i*4+2); + acc +=3D fGETQBIT(QvV,i*4+3); + VdV.uw[i] =3D acc; + } + } ) + + + + + +/*************************************************************************= ***** + DEBUG Vector/Register Printing + *************************************************************************= *****/ + +#define PRINT_VU(TYPE, TYPE2, COUNT)\ + int i; \ + size4u_t vec_len =3D fVBYTES();\ + fprintf(stdout,"V%2d: ",VuN); \ + for (i=3D0;i>COUNT;i++) { \ + 
fprintf(stdout,TYPE2 " ", VuV.TYPE[i]); \ + }; \ + fprintf(stdout,"\\n"); \ + fflush(stdout);\ + +EXTINSN(V6_pv32, "pv32(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(uw, "%08x", 2); }) +EXTINSN(V6_pv32d, "pv32d(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(w, "%10d", 2); }) +EXTINSN(V6_pv32du, "pv32du(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(uw, "%10u", 2); }) +EXTINSN(V6_pv64d, "pv64d(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(ud, "%lli", 4); }) +EXTINSN(V6_pv16, "pv16(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(uh, "%04x", 1); }) +EXTINSN(V6_pv16d, "pv16d(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(h, "%5d", 1); }) +EXTINSN(V6_pv8d, "pv8d(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(ub, "%3u", 0); }) +EXTINSN(V6_pv8, "pv8(Vu32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a Vector", { PRINT_VU(ub, "%02x", 0); }) +EXTINSN(V6_preg, "preg(Ru32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a scalar", { printf("R%02d=3D0x%08x\\n", RuN, RuV);}) +EXTINSN(V6_pregd, "pregd(Ru32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a scalar", { printf("R%02d=3D%10d\\n", RuN, RuV); }) +EXTINSN(V6_pregf, "pregf(Ru32)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print a scalar", { printf("R%02d=3D%f\\n", RuN, (float)RuV); }) + +EXTINSN(V6_pz, "pz(Zu2)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAKEINSN)= ,"Print a scalar", { + fprintf(stdout,"Z%d:\\n", ZuN); + for(int m=3D0, l=3D0; l < fVBYTES()/4/8; l++) + { + fprintf(stdout,"\\t"); + for(int k =3D 0; k < 8; k++,m++) + fprintf(stdout,"%x ", ZuV.w[m]); + fprintf(stdout,"\\n"); + } + fprintf(stdout,"\\n"); + fflush(stdout); +}) + + + + +EXTINSN(V6_ppred, "ppred(Qs4)", ATTRIBS(A_EXTENSION,A_CVI,A_VDBG,A_FAK= EINSN),"Print Predicates", + int j; + fprintf(stdout,"Q%d: [",QsN); + for (j =3D 0; j < fVBYTES()-1; j++){ + fprintf(stdout,"%1x,", fGETQBIT(QsV,j)); + } + fprintf(stdout,"%1x", fGETQBIT(QsV,j)); + fprintf(stdout,"]\\n"); + fflush(stdout); +) + + + + +#undef ATTR_VMEM +#undef ATTR_VMEMU +#undef ATTR_VMEM_NT + +#endif /* NO_MMVEC */ + +#ifdef __SELF_DEF_EXTINSN +#undef EXTINSN +#undef __SELF_DEF_EXTINSN +#endif --=20 2.7.4