From nobody Wed May 15 23:08:21 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034145399199.57502205339608; Tue, 12 Oct 2021 03:22:25 -0700 (PDT)
Received: from localhost ([::1]:57284 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maEvQ-0003q1-72 for importer@patchew.org; Tue, 12 Oct 2021 06:22:24 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:50258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEku-0000Vv-N2 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:32 -0400
Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64080) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEks-0006xP-9z for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:32 -0400
Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700
Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg04-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:22 -0700
Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 39012105B; Tue, 12 Oct 2021 05:11:22 -0500 (CDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033490; x=1665569490;
  h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding;
  bh=oDbj3QZGWSbZPYsQeOoIRAL7I0gMXESaNXtRe7Fn8wI=;
  b=wvOEjtVptwUeXaiSR54JkvWSzP48f/SUKQbmF9Yox0DNhrnd+aL3C1m8
   Nfgousv21wc78SofSOTgbw0DhEdj1H6cgOm9C97cHy3QRRqt9dqhwi1Hi
   twF/k0tuiMTgHygxUlCLpnmEBNfk/xsZOVJacp1ActDFLv4b4PNzmdTnl
   4=;
X-QCInternal: smtphost
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 01/30] Hexagon HVX (target/hexagon) README
Date: Tue, 12 Oct 2021 05:10:39 -0500
Message-Id: <1634033468-23566-2-git-send-email-tsimpson@quicinc.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com
X-Spam_score_int: -40
X-Spam_score: -4.1
X-Spam_bar: ----
X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1634034146079100001

Signed-off-by: Taylor Simpson
---
 target/hexagon/README | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/README b/target/hexagon/README
index b0b2435..372e247 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -1,9 +1,13 @@
 Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
-processor(DSP).
+processor(DSP). We also support Hexagon Vector eXtensions (HVX). HVX
+is a wide vector coprocessor designed for high performance computer vision,
+image processing, machine learning, and other workloads.
 
 The following versions of the Hexagon core are supported
     Scalar core: v67
     https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
+    HVX extension: v66
+    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual
 
 We presented an overview of the project at the 2019 KVM Forum.
     https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
@@ -124,6 +128,71 @@ There are also cases where we brute force the TCG code generation.
 Instructions with multiple definitions are examples. These require
 special handling because qemu helpers can only return a single value.
 
+For HVX vectors, the generator behaves slightly differently. The wide vectors
+won't fit in a TCGv or TCGv_i64, so we pass TCGv_ptr variables to pass the
+address to helper functions. Here's an example for an HVX vector-add-word
+instruction.
+    static void generate_V6_vaddw(
+                    CPUHexagonState *env,
+                    DisasContext *ctx,
+                    Insn *insn,
+                    Packet *pkt)
+    {
+        const int VdN = insn->regno[0];
+        const intptr_t VdV_off =
+            ctx_future_vreg_off(ctx, VdN, 1, true);
+        TCGv_ptr VdV = tcg_temp_local_new_ptr();
+        tcg_gen_addi_ptr(VdV, cpu_env, VdV_off);
+        const int VuN = insn->regno[1];
+        const intptr_t VuV_off =
+            vreg_src_off(ctx, VuN);
+        TCGv_ptr VuV = tcg_temp_local_new_ptr();
+        const int VvN = insn->regno[2];
+        const intptr_t VvV_off =
+            vreg_src_off(ctx, VvN);
+        TCGv_ptr VvV = tcg_temp_local_new_ptr();
+        tcg_gen_addi_ptr(VuV, cpu_env, VuV_off);
+        tcg_gen_addi_ptr(VvV, cpu_env, VvV_off);
+        TCGv slot = tcg_constant_tl(insn->slot);
+        gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot);
+        tcg_temp_free(slot);
+        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
+        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
+        tcg_temp_free_ptr(VdV);
+        tcg_temp_free_ptr(VuV);
+        tcg_temp_free_ptr(VvV);
+    }
+
+Notice that we also generate a variable named <operand>_off for each operand of
+the instruction. This makes it easy to override the instruction semantics with
+functions from tcg-op-gvec.h. Here's the override for this instruction.
+    #define fGEN_TCG_V6_vaddw(SHORTCODE) \
+        tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
+                         sizeof(MMVector), sizeof(MMVector))
+
+Finally, we notice that the override doesn't use the TCGv_ptr variables, so
+we don't generate them when an override is present. Here is what we generate
+when the override is present.
+    static void generate_V6_vaddw(
+                    CPUHexagonState *env,
+                    DisasContext *ctx,
+                    Insn *insn,
+                    Packet *pkt)
+    {
+        const int VdN = insn->regno[0];
+        const intptr_t VdV_off =
+            ctx_future_vreg_off(ctx, VdN, 1, true);
+        const int VuN = insn->regno[1];
+        const intptr_t VuV_off =
+            vreg_src_off(ctx, VuN);
+        const int VvN = insn->regno[2];
+        const intptr_t VvV_off =
+            vreg_src_off(ctx, VvN);
+        fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } });
+        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
+        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
+    }
+
 In addition to instruction semantics, we use a generator to create the decode
 tree. This generation is also a two step process. The first step is to run
 target/hexagon/gen_dectree_import.c to produce
@@ -140,6 +209,7 @@ runtime information for each thread and contains stuff like the GPR and
 predicate registers.
 
 macros.h
+mmvec/macros.h
 
 The Hexagon arch lib relies heavily on macros for the instruction semantics.
 This is a great advantage for qemu because we can override them for different
@@ -203,6 +273,15 @@ During runtime, the following fields in CPUHexagonState (see cpu.h) are used
     pred_written            boolean indicating if predicate was written
     mem_log_stores          record of the stores (indexed by slot)
 
+For Hexagon Vector eXtensions (HVX), the following fields are used
+    VRegs                   Vector registers
+    future_VRegs            Registers to be stored during packet commit
+    tmp_VRegs               Temporary registers *not* stored during commit
+    VRegs_updated           Mask of predicated vector writes
+    QRegs                   Q (vector predicate) registers
+    future_QRegs            Registers to be stored during packet commit
+    QRegs_updated           Mask of predicated vector writes
+
 *** Debugging ***
 
 You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 02/30] Hexagon HVX (target/hexagon) add Hexagon Vector eXtensions (HVX) to core
Date: Tue, 12 Oct 2021 05:10:40 -0500
Message-Id: <1634033468-23566-3-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org

HVX is a set of wide vector instructions. Machine state includes
    vector registers (VRegs)
    vector predicate registers (QRegs)
    temporary registers for intermediate values
    store buffer (masked stores and scatter/gather)

Acked-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/cpu.h            | 35 ++++++++++++++++-
 target/hexagon/hex_arch_types.h |  5 +++
 target/hexagon/insn.h           |  3 ++
 target/hexagon/internal.h       |  3 ++
 target/hexagon/mmvec/mmvec.h    | 83 +++++++++++++++++++++++++++++++++++++++++
 target/hexagon/cpu.c            | 78 ++++++++++++++++++++++++++++++++++++--
 6 files changed, 201 insertions(+), 6 deletions(-)
 create mode 100644 target/hexagon/mmvec/mmvec.h

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index f7d0438..e696699 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -26,6 +26,7 @@ typedef struct CPUHexagonState CPUHexagonState;
 #include "qemu-common.h"
 #include "exec/cpu-defs.h"
 #include "hex_regs.h"
+#include "mmvec/mmvec.h"
 
 #define NUM_PREGS 4
 #define TOTAL_PER_THREAD_REGS 64
@@ -34,6 +35,7 @@ typedef struct CPUHexagonState CPUHexagonState;
 #define STORES_MAX 2
 #define REG_WRITES_MAX 32
 #define PRED_WRITES_MAX 5 /* 4 insns + endloop */
+#define VSTORES_MAX 2
 
 #define TYPE_HEXAGON_CPU "hexagon-cpu"
 
@@ -52,6 +54,13 @@ typedef struct {
     uint64_t data64;
 } MemLog;
 
+typedef struct {
+    target_ulong va;
+    int size;
+    DECLARE_BITMAP(mask, MAX_VEC_SIZE_BYTES / 8) QEMU_ALIGNED(16);
+    MMVector data QEMU_ALIGNED(16);
+} VStoreLog;
+
 #define EXEC_STATUS_OK 0x0000
 #define EXEC_STATUS_STOP 0x0002
 #define EXEC_STATUS_REPLAY 0x0010
@@ -64,6 +73,9 @@ typedef struct {
 #define CLEAR_EXCEPTION (env->status &= (~EXEC_STATUS_EXCEPTION))
 #define SET_EXCEPTION (env->status |= EXEC_STATUS_EXCEPTION)
 
+/* Maximum number of vector temps in a packet */
+#define VECTOR_TEMPS_MAX 4
+
 struct CPUHexagonState {
     target_ulong gpr[TOTAL_PER_THREAD_REGS];
     target_ulong pred[NUM_PREGS];
@@ -97,8 +109,27 @@ struct CPUHexagonState {
     target_ulong llsc_val;
     uint64_t llsc_val_i64;
 
-    target_ulong is_gather_store_insn;
-    target_ulong gather_issued;
+    MMVector VRegs[NUM_VREGS] QEMU_ALIGNED(16);
+    MMVector future_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
+    MMVector tmp_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
+
+    VRegMask VRegs_updated;
+
+    MMQReg QRegs[NUM_QREGS] QEMU_ALIGNED(16);
+    MMQReg future_QRegs[NUM_QREGS] QEMU_ALIGNED(16);
+    QRegMask QRegs_updated;
+
+    /* Temporaries used within instructions */
+    MMVectorPair VuuV QEMU_ALIGNED(16);
+    MMVectorPair VvvV QEMU_ALIGNED(16);
+    MMVectorPair VxxV QEMU_ALIGNED(16);
+    MMVector vtmp QEMU_ALIGNED(16);
+    MMQReg qtmp QEMU_ALIGNED(16);
+
+    VStoreLog vstore[VSTORES_MAX];
+    target_ulong vstore_pending[VSTORES_MAX];
+    bool vtcm_pending;
+    VTCMStoreLog vtcm_log;
 };
 
 #define HEXAGON_CPU_CLASS(klass) \
diff --git a/target/hexagon/hex_arch_types.h b/target/hexagon/hex_arch_types.h
index d721e1f..78ad607 100644
--- a/target/hexagon/hex_arch_types.h
+++ b/target/hexagon/hex_arch_types.h
@@ -19,6 +19,7 @@
 #define HEXAGON_ARCH_TYPES_H
 
 #include "qemu/osdep.h"
+#include "mmvec/mmvec.h"
 #include "qemu/int128.h"
 
 /*
@@ -35,4 +36,8 @@ typedef uint64_t size8u_t;
 typedef int64_t size8s_t;
 typedef Int128 size16s_t;
 
+typedef MMVector mmvector_t;
+typedef MMVectorPair mmvector_pair_t;
+typedef MMQReg mmqret_t;
+
 #endif
diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index 2e34591..aa26389 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -67,6 +67,9 @@ struct Packet {
     bool pkt_has_store_s0;
     bool pkt_has_store_s1;
 
+    bool pkt_has_hvx;
+    Insn *vhist_insn;
+
     Insn insn[INSTRUCTIONS_MAX];
 };
 
diff --git a/target/hexagon/internal.h b/target/hexagon/internal.h
index 6b20aff..82ac304 100644
--- a/target/hexagon/internal.h
+++ b/target/hexagon/internal.h
@@ -31,6 +31,9 @@
 
 int hexagon_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int hexagon_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
+
+void hexagon_debug_vreg(CPUHexagonState *env, int regnum);
+void hexagon_debug_qreg(CPUHexagonState *env, int regnum);
 void hexagon_debug(CPUHexagonState *env);
 
 extern const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS];
diff --git a/target/hexagon/mmvec/mmvec.h b/target/hexagon/mmvec/mmvec.h
new file mode 100644
index 0000000..6196c52
--- /dev/null
+++ b/target/hexagon/mmvec/mmvec.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_MMVEC_H
+#define HEXAGON_MMVEC_H
+
+#define MAX_VEC_SIZE_LOGBYTES 7
+#define MAX_VEC_SIZE_BYTES (1 << MAX_VEC_SIZE_LOGBYTES)
+
+#define NUM_VREGS 32
+#define NUM_QREGS 4
+
+typedef uint32_t VRegMask; /* at least NUM_VREGS bits */
+typedef uint32_t QRegMask; /* at least NUM_QREGS bits */
+
+#define VECTOR_SIZE_BYTE (fVECSIZE())
+
+typedef union {
+    uint64_t ud[MAX_VEC_SIZE_BYTES / 8];
+    int64_t d[MAX_VEC_SIZE_BYTES / 8];
+    uint32_t uw[MAX_VEC_SIZE_BYTES / 4];
+    int32_t w[MAX_VEC_SIZE_BYTES / 4];
+    uint16_t uh[MAX_VEC_SIZE_BYTES / 2];
+    int16_t h[MAX_VEC_SIZE_BYTES / 2];
+    uint8_t ub[MAX_VEC_SIZE_BYTES / 1];
+    int8_t b[MAX_VEC_SIZE_BYTES / 1];
+} MMVector;
+
+typedef union {
+    uint64_t ud[2 * MAX_VEC_SIZE_BYTES / 8];
+    int64_t d[2 * MAX_VEC_SIZE_BYTES / 8];
+    uint32_t uw[2 * MAX_VEC_SIZE_BYTES / 4];
+    int32_t w[2 * MAX_VEC_SIZE_BYTES / 4];
+    uint16_t uh[2 * MAX_VEC_SIZE_BYTES / 2];
+    int16_t h[2 * MAX_VEC_SIZE_BYTES / 2];
+    uint8_t ub[2 * MAX_VEC_SIZE_BYTES / 1];
+    int8_t b[2 * MAX_VEC_SIZE_BYTES / 1];
+    MMVector v[2];
+} MMVectorPair;
+
+typedef union {
+    uint64_t ud[MAX_VEC_SIZE_BYTES / 8 / 8];
+    int64_t d[MAX_VEC_SIZE_BYTES / 8 / 8];
+    uint32_t uw[MAX_VEC_SIZE_BYTES / 4 / 8];
+    int32_t w[MAX_VEC_SIZE_BYTES / 4 / 8];
+    uint16_t uh[MAX_VEC_SIZE_BYTES / 2 / 8];
+    int16_t h[MAX_VEC_SIZE_BYTES / 2 / 8];
+    uint8_t ub[MAX_VEC_SIZE_BYTES / 1 / 8];
+    int8_t b[MAX_VEC_SIZE_BYTES / 1 / 8];
+} MMQReg;
+
+typedef struct {
+    MMVector data;
+    DECLARE_BITMAP(mask, MAX_VEC_SIZE_BYTES);
+    int size;
+    target_ulong va[MAX_VEC_SIZE_BYTES];
+    bool op;
+    int op_size;
+} VTCMStoreLog;
+
+
+/* Types of vector register assignment */
+typedef enum {
+    EXT_DFL, /* Default */
+    EXT_NEW, /* New - value used in the same packet */
+    EXT_TMP /* Temp - value used but not stored to register */
+} VRegWriteType;
+
+#endif
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index 3338365..989bd76 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -113,7 +113,66 @@ static void print_reg(FILE *f, CPUHexagonState *env, int regnum)
                  hexagon_regnames[regnum], value);
 }
 
-static void hexagon_dump(CPUHexagonState *env, FILE *f)
+static void print_vreg(FILE *f, CPUHexagonState *env, int regnum,
+                       bool skip_if_zero)
+{
+    if (skip_if_zero) {
+        bool nonzero_found = false;
+        for (int i = 0; i < MAX_VEC_SIZE_BYTES; i++) {
+            if (env->VRegs[regnum].ub[i] != 0) {
+                nonzero_found = true;
+                break;
+            }
+        }
+        if (!nonzero_found) {
+            return;
+        }
+    }
+
+    qemu_fprintf(f, " v%d = ( ", regnum);
+    qemu_fprintf(f, "0x%02x", env->VRegs[regnum].ub[MAX_VEC_SIZE_BYTES - 1]);
+    for (int i = MAX_VEC_SIZE_BYTES - 2; i >= 0; i--) {
+        qemu_fprintf(f, ", 0x%02x", env->VRegs[regnum].ub[i]);
+    }
+    qemu_fprintf(f, " )\n");
+}
+
+void hexagon_debug_vreg(CPUHexagonState *env, int regnum)
+{
+    print_vreg(stdout, env, regnum, false);
+}
+
+static void print_qreg(FILE *f, CPUHexagonState *env, int regnum,
+                       bool skip_if_zero)
+{
+    if (skip_if_zero) {
+        bool nonzero_found = false;
+        for (int i = 0; i < MAX_VEC_SIZE_BYTES / 8; i++) {
+            if (env->QRegs[regnum].ub[i] != 0) {
+                nonzero_found = true;
+                break;
+            }
+        }
+        if (!nonzero_found) {
+            return;
+        }
+    }
+
+    qemu_fprintf(f, " q%d = ( ", regnum);
+    qemu_fprintf(f, "0x%02x",
+                 env->QRegs[regnum].ub[MAX_VEC_SIZE_BYTES / 8 - 1]);
+    for (int i = MAX_VEC_SIZE_BYTES / 8 - 2; i >= 0; i--) {
+        qemu_fprintf(f, ", 0x%02x", env->QRegs[regnum].ub[i]);
+    }
+    qemu_fprintf(f, " )\n");
+}
+
+void hexagon_debug_qreg(CPUHexagonState *env, int regnum)
+{
+    print_qreg(stdout, env, regnum, false);
+}
+
+static void hexagon_dump(CPUHexagonState *env, FILE *f, int flags)
 {
     HexagonCPU *cpu = env_archcpu(env);
 
@@ -159,6 +218,17 @@ static void hexagon_dump(CPUHexagonState *env, FILE *f)
     print_reg(f, env, HEX_REG_CS1);
 #endif
     qemu_fprintf(f, "}\n");
+
+    if (flags & CPU_DUMP_FPU) {
+        qemu_fprintf(f, "Vector Registers = {\n");
+        for (int i = 0; i < NUM_VREGS; i++) {
+            print_vreg(f, env, i, true);
+        }
+        for (int i = 0; i < NUM_QREGS; i++) {
+            print_qreg(f, env, i, true);
+        }
+        qemu_fprintf(f, "}\n");
+    }
 }
 
 static void hexagon_dump_state(CPUState *cs, FILE *f, int flags)
@@ -166,12 +236,12 @@ static void hexagon_dump_state(CPUState *cs, FILE *f, int flags)
     HexagonCPU *cpu = HEXAGON_CPU(cs);
     CPUHexagonState *env = &cpu->env;
 
-    hexagon_dump(env, f);
+    hexagon_dump(env, f, flags);
 }
 
 void hexagon_debug(CPUHexagonState *env)
 {
-    hexagon_dump(env, stdout);
+    hexagon_dump(env, stdout, CPU_DUMP_FPU);
 }
 
 static void hexagon_cpu_set_pc(CPUState *cs, vaddr value)
@@ -292,7 +362,7 @@ static void hexagon_cpu_class_init(ObjectClass *c, void *data)
     cc->set_pc = hexagon_cpu_set_pc;
     cc->gdb_read_register = hexagon_gdb_read_register;
     cc->gdb_write_register = hexagon_gdb_write_register;
-    cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS;
+    cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS + NUM_VREGS + NUM_QREGS;
     cc->gdb_stop_before_watchpoint = true;
     cc->disas_set_info = hexagon_cpu_disas_set_info;
     cc->tcg_ops = &hexagon_tcg_ops;
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 03/30] Hexagon HVX (target/hexagon) register names
Date: Tue, 12 Oct 2021 05:10:41 -0500
Message-Id: <1634033468-23566-4-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org

Reviewed-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/hex_regs.h | 1 +
 target/hexagon/cpu.c      | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/hex_regs.h b/target/hexagon/hex_regs.h
index f291911..e1b3149 100644
--- a/target/hexagon/hex_regs.h
+++ b/target/hexagon/hex_regs.h
@@ -76,6 +76,7 @@ enum {
     /* Use reserved control registers for qemu execution counts */
     HEX_REG_QEMU_PKT_CNT = 52,
     HEX_REG_QEMU_INSN_CNT = 53,
+    HEX_REG_QEMU_HVX_CNT = 54,
     HEX_REG_UTIMERLO = 62,
     HEX_REG_UTIMERHI = 63,
 };
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index 989bd76..3bd3f10 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -59,7 +59,7 @@ const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
    "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31",
    "sa0", "lc0", "sa1", "lc1", "p3_0", "c5", "m0", "m1",
    "usr", "pc", "ugp", "gp", "cs0", "cs1", "c14", "c15",
-   "c16", "c17", "c18", "c19", "pkt_cnt", "insn_cnt", "c22", "c23",
+   "c16", "c17", "c18", "c19", "pkt_cnt", "insn_cnt", "hvx_cnt", "c23",
    "c24", "c25", "c26", "c27", "c28", "c29", "c30", "c31",
 };
 
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 04/30] Hexagon HVX (target/hexagon) instruction attributes
Date: Tue, 12 Oct 2021 05:10:42 -0500
Message-Id: <1634033468-23566-5-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org

Acked-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/attribs_def.h.inc | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 3815509..4138a7a 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -41,6 +41,27 @@ DEF_ATTRIB(STORE, "Stores to memory", "", "")
 DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
 DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
 
+/* V6 Vector attributes */
+DEF_ATTRIB(CVI, "Executes on the HVX extension", "", "")
+
+DEF_ATTRIB(CVI_NEW, "New value memory instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VM, "Memory instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VP, "Permute instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VP_VS, "Double vector permute/shft insn executes on HVX", "", "")
+DEF_ATTRIB(CVI_VX, "Multiply instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VX_DV, "Double vector multiply insn executes on HVX", "", "")
+DEF_ATTRIB(CVI_VS, "Shift instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VS_VX, "Permute/shift and multiply insn executes on HVX", "", "")
+DEF_ATTRIB(CVI_VA, "ALU instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_VA_DV, "Double vector alu instruction executes on HVX", "", "")
+DEF_ATTRIB(CVI_4SLOT, "Consumes all the vector execution resources", "", "")
+DEF_ATTRIB(CVI_TMP, "Transient Memory Load not written to register", "", "")
+DEF_ATTRIB(CVI_GATHER, "CVI Gather operation", "", "")
+DEF_ATTRIB(CVI_SCATTER, "CVI Scatter operation", "", "")
+DEF_ATTRIB(CVI_SCATTER_RELEASE, "CVI Store Release for scatter", "", "")
+DEF_ATTRIB(CVI_TMP_DST, "CVI instruction that doesn't write a register", "", "")
+DEF_ATTRIB(CVI_SLOT23, "Can execute in slot 2 or slot 3 (HVX)", "", "")
+
 
 /* Change-of-flow attributes */
 DEF_ATTRIB(JUMP, "Jump-type instruction", "", "")
@@ -86,6 +107,7 @@ DEF_ATTRIB(HWLOOP1_END, "Ends HW loop1", "", "")
 DEF_ATTRIB(DCZEROA, "dczeroa type", "", "")
 DEF_ATTRIB(ICFLUSHOP, "icflush op type", "", "")
 DEF_ATTRIB(DCFLUSHOP, "dcflush op type", "", "")
+DEF_ATTRIB(L2FLUSHOP, "l2flush op type", "", "")
 DEF_ATTRIB(DCFETCH, "dcfetch type", "", "")
 
 DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 05/30] Hexagon HVX (target/hexagon) macros
Date: Tue, 12 Oct 2021 05:10:43 -0500
Message-Id: <1634033468-23566-6-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org

macros to interface with the generator
macros referenced in instruction semantics

Acked-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/macros.h | 22 +++
target/hexagon/mmvec/macros.h | 354 ++++++++++++++++++++++++++++++++++++++= ++++ 2 files changed, 376 insertions(+) create mode 100644 target/hexagon/mmvec/macros.h diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h index 44e9b85..4421285 100644 --- a/target/hexagon/macros.h +++ b/target/hexagon/macros.h @@ -266,6 +266,10 @@ static inline void gen_pred_cancel(TCGv pred, int slot= _num) =20 #define fNEWREG_ST(VAL) (VAL) =20 +#define fVSATUVALN(N, VAL) \ + ({ \ + (((int)(VAL)) < 0) ? 0 : ((1LL << (N)) - 1); \ + }) #define fSATUVALN(N, VAL) \ ({ \ fSET_OVERFLOW(); \ @@ -276,10 +280,16 @@ static inline void gen_pred_cancel(TCGv pred, int slo= t_num) fSET_OVERFLOW(); \ ((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \ }) +#define fVSATVALN(N, VAL) \ + ({ \ + ((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \ + }) #define fZXTN(N, M, VAL) (((N) !=3D 0) ? extract64((VAL), 0, (N)) : 0LL) #define fSXTN(N, M, VAL) (((N) !=3D 0) ? sextract64((VAL), 0, (N)) : 0LL) #define fSATN(N, VAL) \ ((fSXTN(N, 64, VAL) =3D=3D (VAL)) ? (VAL) : fSATVALN(N, VAL)) +#define fVSATN(N, VAL) \ + ((fSXTN(N, 64, VAL) =3D=3D (VAL)) ? (VAL) : fVSATVALN(N, VAL)) #define fADDSAT64(DST, A, B) \ do { \ uint64_t __a =3D fCAST8u(A); \ @@ -302,12 +312,18 @@ static inline void gen_pred_cancel(TCGv pred, int slo= t_num) DST =3D __sum; \ } \ } while (0) +#define fVSATUN(N, VAL) \ + ((fZXTN(N, 64, VAL) =3D=3D (VAL)) ? (VAL) : fVSATUVALN(N, VAL)) #define fSATUN(N, VAL) \ ((fZXTN(N, 64, VAL) =3D=3D (VAL)) ? 
(VAL) : fSATUVALN(N, VAL)) #define fSATH(VAL) (fSATN(16, VAL)) #define fSATUH(VAL) (fSATUN(16, VAL)) +#define fVSATH(VAL) (fVSATN(16, VAL)) +#define fVSATUH(VAL) (fVSATUN(16, VAL)) #define fSATUB(VAL) (fSATUN(8, VAL)) #define fSATB(VAL) (fSATN(8, VAL)) +#define fVSATUB(VAL) (fVSATUN(8, VAL)) +#define fVSATB(VAL) (fVSATN(8, VAL)) #define fIMMEXT(IMM) (IMM =3D IMM) #define fMUST_IMMEXT(IMM) fIMMEXT(IMM) =20 @@ -414,6 +430,8 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val,= int shift) #define fCAST4s(A) ((int32_t)(A)) #define fCAST8u(A) ((uint64_t)(A)) #define fCAST8s(A) ((int64_t)(A)) +#define fCAST2_2s(A) ((int16_t)(A)) +#define fCAST2_2u(A) ((uint16_t)(A)) #define fCAST4_4s(A) ((int32_t)(A)) #define fCAST4_4u(A) ((uint32_t)(A)) #define fCAST4_8s(A) ((int64_t)((int32_t)(A))) @@ -511,7 +529,9 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val,= int shift) #define fPM_M(REG, MVAL) do { REG =3D REG + (MVAL); } while (0) #endif #define fSCALE(N, A) (((int64_t)(A)) << N) +#define fVSATW(A) fVSATN(32, ((long long)A)) #define fSATW(A) fSATN(32, ((long long)A)) +#define fVSAT(A) fVSATN(32, (A)) #define fSAT(A) fSATN(32, (A)) #define fSAT_ORIG_SHL(A, ORIG_REG) \ ((((int32_t)((fSAT(A)) ^ ((int32_t)(ORIG_REG)))) < 0) \ @@ -648,12 +668,14 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv va= l, int shift) fSETBIT(j, DST, VAL); \ } \ } while (0) +#define fCOUNTONES_2(VAL) ctpop16(VAL) #define fCOUNTONES_4(VAL) ctpop32(VAL) #define fCOUNTONES_8(VAL) ctpop64(VAL) #define fBREV_8(VAL) revbit64(VAL) #define fBREV_4(VAL) revbit32(VAL) #define fCL1_8(VAL) clo64(VAL) #define fCL1_4(VAL) clo32(VAL) +#define fCL1_2(VAL) (clz32(~(uint16_t)(VAL) & 0xffff) - 16) #define fINTERLEAVE(ODD, EVEN) interleave(ODD, EVEN) #define fDEINTERLEAVE(MIXED) deinterleave(MIXED) #define fHIDE(A) A diff --git a/target/hexagon/mmvec/macros.h b/target/hexagon/mmvec/macros.h new file mode 100644 index 0000000..eff882c --- /dev/null +++ b/target/hexagon/mmvec/macros.h @@ -0,0 +1,354 @@ +/* + 
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#ifndef HEXAGON_MMVEC_MACROS_H +#define HEXAGON_MMVEC_MACROS_H + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "arch.h" +#include "mmvec/system_ext_mmvec.h" + +#ifndef QEMU_GENERATE +#define VdV (*(MMVector *)(VdV_void)) +#define VsV (*(MMVector *)(VsV_void)) +#define VuV (*(MMVector *)(VuV_void)) +#define VvV (*(MMVector *)(VvV_void)) +#define VwV (*(MMVector *)(VwV_void)) +#define VxV (*(MMVector *)(VxV_void)) +#define VyV (*(MMVector *)(VyV_void)) + +#define VddV (*(MMVectorPair *)(VddV_void)) +#define VuuV (*(MMVectorPair *)(VuuV_void)) +#define VvvV (*(MMVectorPair *)(VvvV_void)) +#define VxxV (*(MMVectorPair *)(VxxV_void)) + +#define QeV (*(MMQReg *)(QeV_void)) +#define QdV (*(MMQReg *)(QdV_void)) +#define QsV (*(MMQReg *)(QsV_void)) +#define QtV (*(MMQReg *)(QtV_void)) +#define QuV (*(MMQReg *)(QuV_void)) +#define QvV (*(MMQReg *)(QvV_void)) +#define QxV (*(MMQReg *)(QxV_void)) +#endif + +#define LOG_VTCM_BYTE(VA, MASK, VAL, IDX) \ + do { \ + env->vtcm_log.data.ub[IDX] =3D (VAL); \ + if (MASK) { \ + set_bit((IDX), env->vtcm_log.mask); \ + } else { \ + clear_bit((IDX), env->vtcm_log.mask); \ + } \ + env->vtcm_log.va[IDX] =3D (VA); \ + } while (0) + +#define fNOTQ(VAL) \ + ({ \ + MMQReg _ret; \ + int _i_; \ + for (_i_ =3D 0; _i_ < 
fVECSIZE() / 64; _i_++) { \ + _ret.ud[_i_] =3D ~VAL.ud[_i_]; \ + } \ + _ret;\ + }) +#define fGETQBITS(REG, WIDTH, MASK, BITNO) \ + ((MASK) & (REG.w[(BITNO) >> 5] >> ((BITNO) & 0x1f))) +#define fGETQBIT(REG, BITNO) fGETQBITS(REG, 1, 1, BITNO) +#define fGENMASKW(QREG, IDX) \ + (((fGETQBIT(QREG, (IDX * 4 + 0)) ? 0xFF : 0x0) << 0) | \ + ((fGETQBIT(QREG, (IDX * 4 + 1)) ? 0xFF : 0x0) << 8) | \ + ((fGETQBIT(QREG, (IDX * 4 + 2)) ? 0xFF : 0x0) << 16) | \ + ((fGETQBIT(QREG, (IDX * 4 + 3)) ? 0xFF : 0x0) << 24)) +#define fGETNIBBLE(IDX, SRC) (fSXTN(4, 8, (SRC >> (4 * IDX)) & 0xF)) +#define fGETCRUMB(IDX, SRC) (fSXTN(2, 8, (SRC >> (2 * IDX)) & 0x3)) +#define fGETCRUMB_SYMMETRIC(IDX, SRC) \ + ((fGETCRUMB(IDX, SRC) >=3D 0 ? (2 - fGETCRUMB(IDX, SRC)) \ + : fGETCRUMB(IDX, SRC))) +#define fGENMASKH(QREG, IDX) \ + (((fGETQBIT(QREG, (IDX * 2 + 0)) ? 0xFF : 0x0) << 0) | \ + ((fGETQBIT(QREG, (IDX * 2 + 1)) ? 0xFF : 0x0) << 8)) +#define fGETMASKW(VREG, QREG, IDX) (VREG.w[IDX] & fGENMASKW((QREG), IDX)) +#define fGETMASKH(VREG, QREG, IDX) (VREG.h[IDX] & fGENMASKH((QREG), IDX)) +#define fCONDMASK8(QREG, IDX, YESVAL, NOVAL) \ + (fGETQBIT(QREG, IDX) ? 
(YESVAL) : (NOVAL)) +#define fCONDMASK16(QREG, IDX, YESVAL, NOVAL) \ + ((fGENMASKH(QREG, IDX) & (YESVAL)) | \ + (fGENMASKH(fNOTQ(QREG), IDX) & (NOVAL))) +#define fCONDMASK32(QREG, IDX, YESVAL, NOVAL) \ + ((fGENMASKW(QREG, IDX) & (YESVAL)) | \ + (fGENMASKW(fNOTQ(QREG), IDX) & (NOVAL))) +#define fSETQBITS(REG, WIDTH, MASK, BITNO, VAL) \ + do { \ + uint32_t __TMP =3D (VAL); \ + REG.w[(BITNO) >> 5] &=3D ~((MASK) << ((BITNO) & 0x1f)); \ + REG.w[(BITNO) >> 5] |=3D (((__TMP) & (MASK)) << ((BITNO) & 0x1f));= \ + } while (0) +#define fSETQBIT(REG, BITNO, VAL) fSETQBITS(REG, 1, 1, BITNO, VAL) +#define fVBYTES() (fVECSIZE()) +#define fVALIGN(ADDR, LOG2_ALIGNMENT) (ADDR =3D ADDR & ~(LOG2_ALIGNMENT - = 1)) +#define fVLASTBYTE(ADDR, LOG2_ALIGNMENT) (ADDR =3D ADDR | (LOG2_ALIGNMENT = - 1)) +#define fVELEM(WIDTH) ((fVECSIZE() * 8) / WIDTH) +#define fVECLOGSIZE() (7) +#define fVECSIZE() (1 << fVECLOGSIZE()) +#define fSWAPB(A, B) do { uint8_t tmp =3D A; A =3D B; B =3D tmp; } while (= 0) +#define fV_AL_CHECK(EA, MASK) \ + if ((EA) & (MASK)) { \ + warn("aligning misaligned vector. EA=3D%08x", (EA)); \ + } +#define fSCATTER_INIT(REGION_START, LENGTH, ELEMENT_SIZE) \ + mem_vector_scatter_init(env, slot, REGION_START, LENGTH, ELEMENT_SIZE) +#define fGATHER_INIT(REGION_START, LENGTH, ELEMENT_SIZE) \ + mem_vector_gather_init(env, REGION_START, LENGTH, ELEMENT_SIZE) +#define fSCATTER_FINISH(OP) +#define fGATHER_FINISH() +#define fLOG_SCATTER_OP(SIZE) \ + do { \ + env->vtcm_log.op =3D true; \ + env->vtcm_log.op_size =3D SIZE; \ + } while (0) +#define fVLOG_VTCM_WORD_INCREMENT(EA, OFFSET, INC, IDX, ALIGNMENT, LEN) \ + do { \ + int log_byte =3D 0; \ + target_ulong va =3D EA; \ + target_ulong va_high =3D EA + LEN; \ + for (int i0 =3D 0; i0 < 4; i0++) { \ + log_byte =3D (va + i0) <=3D va_high; \ + LOG_VTCM_BYTE(va + i0, log_byte, INC. 
ub[4 * IDX + i0], \ + 4 * IDX + i0); \ + } \ + } while (0) +#define fVLOG_VTCM_HALFWORD_INCREMENT(EA, OFFSET, INC, IDX, ALIGNMENT, LEN= ) \ + do { \ + int log_byte =3D 0; \ + target_ulong va =3D EA; \ + target_ulong va_high =3D EA + LEN; \ + for (int i0 =3D 0; i0 < 2; i0++) { \ + log_byte =3D (va + i0) <=3D va_high; \ + LOG_VTCM_BYTE(va + i0, log_byte, INC.ub[2 * IDX + i0], \ + 2 * IDX + i0); \ + } \ + } while (0) + +#define fVLOG_VTCM_HALFWORD_INCREMENT_DV(EA, OFFSET, INC, IDX, IDX2, IDX_H= , \ + ALIGNMENT, LEN) \ + do { \ + int log_byte =3D 0; \ + target_ulong va =3D EA; \ + target_ulong va_high =3D EA + LEN; \ + for (int i0 =3D 0; i0 < 2; i0++) { \ + log_byte =3D (va + i0) <=3D va_high; \ + LOG_VTCM_BYTE(va + i0, log_byte, INC.ub[2 * IDX + i0], \ + 2 * IDX + i0); \ + } \ + } while (0) + +/* NOTE - Will this always be tmp_VRegs[0]; */ +#define GATHER_FUNCTION(EA, OFFSET, IDX, LEN, ELEMENT_SIZE, BANK_IDX, QVAL= ) \ + do { \ + int i0; \ + target_ulong va =3D EA; \ + target_ulong va_high =3D EA + LEN; \ + uintptr_t ra =3D GETPC(); \ + int log_bank =3D 0; \ + int log_byte =3D 0; \ + for (i0 =3D 0; i0 < ELEMENT_SIZE; i0++) { \ + log_byte =3D ((va + i0) <=3D va_high) && QVAL; \ + log_bank |=3D (log_byte << i0); \ + uint8_t B; \ + B =3D cpu_ldub_data_ra(env, EA + i0, ra); \ + env->tmp_VRegs[0].ub[ELEMENT_SIZE * IDX + i0] =3D B; \ + LOG_VTCM_BYTE(va + i0, log_byte, B, ELEMENT_SIZE * IDX + i0); \ + } \ + } while (0) +#define fVLOG_VTCM_GATHER_WORD(EA, OFFSET, IDX, LEN) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, 1); \ + } while (0) +#define fVLOG_VTCM_GATHER_HALFWORD(EA, OFFSET, IDX, LEN) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, 1); \ + } while (0) +#define fVLOG_VTCM_GATHER_HALFWORD_DV(EA, OFFSET, IDX, IDX2, IDX_H, LEN) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), 1); \ + } while (0) +#define fVLOG_VTCM_GATHER_WORDQ(EA, OFFSET, IDX, Q, LEN) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, \ + 
fGETQBIT(QsV, 4 * IDX + i0)); \ + } while (0) +#define fVLOG_VTCM_GATHER_HALFWORDQ(EA, OFFSET, IDX, Q, LEN) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, \ + fGETQBIT(QsV, 2 * IDX + i0)); \ + } while (0) +#define fVLOG_VTCM_GATHER_HALFWORDQ_DV(EA, OFFSET, IDX, IDX2, IDX_H, Q, LE= N) \ + do { \ + GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), \ + fGETQBIT(QsV, 2 * IDX + i0)); \ + } while (0) +#define SCATTER_OP_WRITE_TO_MEM(TYPE) \ + do { \ + uintptr_t ra =3D GETPC(); \ + for (int i =3D 0; i < env->vtcm_log.size; i +=3D sizeof(TYPE)) { \ + if (test_bit(i, env->vtcm_log.mask)) { \ + TYPE dst =3D 0; \ + TYPE inc =3D 0; \ + for (int j =3D 0; j < sizeof(TYPE); j++) { \ + uint8_t val; \ + val =3D cpu_ldub_data_ra(env, env->vtcm_log.va[i + j],= ra); \ + dst |=3D val << (8 * j); \ + inc |=3D env->vtcm_log.data.ub[j + i] << (8 * j); \ + clear_bit(j + i, env->vtcm_log.mask); \ + env->vtcm_log.data.ub[j + i] =3D 0; \ + } \ + dst +=3D inc; \ + for (int j =3D 0; j < sizeof(TYPE); j++) { \ + cpu_stb_data_ra(env, env->vtcm_log.va[i + j], \ + (dst >> (8 * j)) & 0xFF, ra); \ + } \ + } \ + } \ + } while (0) +#define SCATTER_OP_PROBE_MEM(TYPE, MMU_IDX, RETADDR) \ + do { \ + for (int i =3D 0; i < env->vtcm_log.size; i +=3D sizeof(TYPE)) { \ + if (test_bit(i, env->vtcm_log.mask)) { \ + for (int j =3D 0; j < sizeof(TYPE); j++) { \ + probe_read(env, env->vtcm_log.va[i + j], 1, \ + MMU_IDX, RETADDR); \ + probe_write(env, env->vtcm_log.va[i + j], 1, \ + MMU_IDX, RETADDR); \ + } \ + } \ + } \ + } while (0) +#define SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, ELEM_SIZE, BANK_IDX, QVAL, = IN) \ + do { \ + int i0; \ + target_ulong va =3D EA; \ + target_ulong va_high =3D EA + LEN; \ + int log_bank =3D 0; \ + int log_byte =3D 0; \ + for (i0 =3D 0; i0 < ELEM_SIZE; i0++) { \ + log_byte =3D ((va + i0) <=3D va_high) && QVAL; \ + log_bank |=3D (log_byte << i0); \ + LOG_VTCM_BYTE(va + i0, log_byte, IN.ub[ELEM_SIZE * IDX + i0], \ + ELEM_SIZE * IDX + i0); \ + } \ + } while (0) 
+#define fVLOG_VTCM_HALFWORD(EA, OFFSET, IN, IDX, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, 1, IN); \ + } while (0) +#define fVLOG_VTCM_WORD(EA, OFFSET, IN, IDX, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, 1, IN); \ + } while (0) +#define fVLOG_VTCM_HALFWORDQ(EA, OFFSET, IN, IDX, Q, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, \ + fGETQBIT(QsV, 2 * IDX + i0), IN); \ + } while (0) +#define fVLOG_VTCM_WORDQ(EA, OFFSET, IN, IDX, Q, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, \ + fGETQBIT(QsV, 4 * IDX + i0), IN); \ + } while (0) +#define fVLOG_VTCM_HALFWORD_DV(EA, OFFSET, IN, IDX, IDX2, IDX_H, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, \ + (2 * IDX2 + IDX_H), 1, IN); \ + } while (0) +#define fVLOG_VTCM_HALFWORDQ_DV(EA, OFFSET, IN, IDX, Q, IDX2, IDX_H, LEN) \ + do { \ + SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), \ + fGETQBIT(QsV, 2 * IDX + i0), IN); \ + } while (0) +#define fSTORERELEASE(EA, TYPE) \ + do { \ + fV_AL_CHECK(EA, fVECSIZE() - 1); \ + } while (0) +#ifdef QEMU_GENERATE +#define fLOADMMV(EA, DST) gen_vreg_load(ctx, DST##_off, EA, true) +#endif +#ifdef QEMU_GENERATE +#define fLOADMMVU(EA, DST) gen_vreg_load(ctx, DST##_off, EA, false) +#endif +#ifdef QEMU_GENERATE +#define fSTOREMMV(EA, SRC) \ + gen_vreg_store(ctx, insn, pkt, EA, SRC##_off, insn->slot, true) +#endif +#ifdef QEMU_GENERATE +#define fSTOREMMVQ(EA, SRC, MASK) \ + gen_vreg_masked_store(ctx, EA, SRC##_off, MASK##_off, insn->slot, fals= e) +#endif +#ifdef QEMU_GENERATE +#define fSTOREMMVNQ(EA, SRC, MASK) \ + gen_vreg_masked_store(ctx, EA, SRC##_off, MASK##_off, insn->slot, true) +#endif +#ifdef QEMU_GENERATE +#define fSTOREMMVU(EA, SRC) \ + gen_vreg_store(ctx, insn, pkt, EA, SRC##_off, insn->slot, false) +#endif +#define fVFOREACH(WIDTH, VAR) for (VAR =3D 0; VAR < fVELEM(WIDTH); VAR++) +#define fVARRAY_ELEMENT_ACCESS(ARRAY, TYPE, INDEX) \ + ARRAY.v[(INDEX) / (fVECSIZE() / 
(sizeof(ARRAY.TYPE[0])))].TYPE[(INDEX)= % \ + (fVECSIZE() / (sizeof(ARRAY.TYPE[0])))] + +#define fVSATDW(U, V) fVSATW(((((long long)U) << 32) | fZXTN(32, 64, V))) +#define fVASL_SATHI(U, V) fVSATW(((U) << 1) | ((V) >> 31)) +#define fVUADDSAT(WIDTH, U, V) \ + fVSATUN(WIDTH, fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V)) +#define fVSADDSAT(WIDTH, U, V) \ + fVSATN(WIDTH, fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V)) +#define fVUSUBSAT(WIDTH, U, V) \ + fVSATUN(WIDTH, fZXTN(WIDTH, 2 * WIDTH, U) - fZXTN(WIDTH, 2 * WIDTH, V)) +#define fVSSUBSAT(WIDTH, U, V) \ + fVSATN(WIDTH, fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V)) +#define fVAVGU(WIDTH, U, V) \ + ((fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V)) >> 1) +#define fVAVGURND(WIDTH, U, V) \ + ((fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1) +#define fVNAVGU(WIDTH, U, V) \ + ((fZXTN(WIDTH, 2 * WIDTH, U) - fZXTN(WIDTH, 2 * WIDTH, V)) >> 1) +#define fVNAVGURNDSAT(WIDTH, U, V) \ + fVSATUN(WIDTH, ((fZXTN(WIDTH, 2 * WIDTH, U) - \ + fZXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1)) +#define fVAVGS(WIDTH, U, V) \ + ((fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V)) >> 1) +#define fVAVGSRND(WIDTH, U, V) \ + ((fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1) +#define fVNAVGS(WIDTH, U, V) \ + ((fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V)) >> 1) +#define fVNAVGSRND(WIDTH, U, V) \ + ((fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1) +#define fVNAVGSRNDSAT(WIDTH, U, V) \ + fVSATN(WIDTH, ((fSXTN(WIDTH, 2 * WIDTH, U) - \ + fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1)) +#define fVNOROUND(VAL, SHAMT) VAL +#define fVNOSAT(VAL) VAL +#define fVROUND(VAL, SHAMT) \ + ((VAL) + (((SHAMT) > 0) ? 
(1LL << ((SHAMT) - 1)) : 0)) +#define fCARRY_FROM_ADD32(A, B, C) \ + (((fZXTN(32, 64, A) + fZXTN(32, 64, B) + C) >> 32) & 1) +#define fUARCH_NOTE_PUMP_4X() +#define fUARCH_NOTE_PUMP_2X() + +#define IV1DEAD() +#endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034527055271.4046952346431; Tue, 12 Oct 2021 03:28:47 -0700 (PDT) Received: from localhost ([::1]:38006 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maF1Z-0001aF-JP for importer@patchew.org; Tue, 12 Oct 2021 06:28:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50312) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEkx-0000YR-0S for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:35 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64084) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEkt-0006yI-B0 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:34 -0400 Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg04-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 46B521418; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033491; 
x=1665569491; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fd+VuoQRBI3tIYQAM2UnprTR7ca4vFJcp8Ye5N2CzwE=; b=WO9daGIXObF0D9fcKLOKwTHyJXdMtdFqKEVIJd2dtUQmN1ucHgV1XEai Xiz+KgbKJoJ+7aJbbFos11gRWvIxFoafa7Wrg0kc/gX9iRfUsnjkZR+v9 +jfnyPLhJv/J+W9vz9b1fIbfFnmBMomp7kMIP5iLK5kcItC8OInUfsoTL w=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 06/30] Hexagon HVX (target/hexagon) import macro definitions Date: Tue, 12 Oct 2021 05:10:44 -0500 Message-Id: <1634033468-23566-7-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034528036100001 Imported from the Hexagon architecture library 
imported/allext_macros.def Top level macro include for all extens= ions imported/macros.def Scalar core macros (some HVX here) imported/mmvec/macros.def HVX macro definitions The macro definition files specify instruction attributes that are applied to each instruction that references the macro. Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/imported/allext_macros.def | 25 + target/hexagon/imported/macros.def | 88 ++++ target/hexagon/imported/mmvec/macros.def | 842 ++++++++++++++++++++++++++= ++++ 3 files changed, 955 insertions(+) create mode 100644 target/hexagon/imported/allext_macros.def create mode 100755 target/hexagon/imported/mmvec/macros.def diff --git a/target/hexagon/imported/allext_macros.def b/target/hexagon/imp= orted/allext_macros.def new file mode 100644 index 0000000..9c91199 --- /dev/null +++ b/target/hexagon/imported/allext_macros.def @@ -0,0 +1,25 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + +/* + * Top level file for all instruction set extensions + */ +#define EXTNAME mmvec +#define EXTSTR "mmvec" +#include "mmvec/macros.def" +#undef EXTNAME +#undef EXTSTR diff --git a/target/hexagon/imported/macros.def b/target/hexagon/imported/m= acros.def index 32ed3bf..e23f915 100755 --- a/target/hexagon/imported/macros.def +++ b/target/hexagon/imported/macros.def @@ -177,6 +177,12 @@ DEF_MACRO( ) =20 DEF_MACRO( + fVSATUVALN, + ({ ((VAL) < 0) ? 0 : ((1LL<<(N))-1);}), + () +) + +DEF_MACRO( fSATUVALN, ({fSET_OVERFLOW(); ((VAL) < 0) ? 0 : ((1LL<<(N))-1);}), () @@ -189,6 +195,12 @@ DEF_MACRO( ) =20 DEF_MACRO( + fVSATVALN, + ({((VAL) < 0) ? (-(1LL<<((N)-1))) : ((1LL<<((N)-1))-1);}), + () +) + +DEF_MACRO( fZXTN, /* macro name */ ((VAL) & ((1LL<<(N))-1)), /* attribs */ @@ -205,6 +217,11 @@ DEF_MACRO( ((fSXTN(N,64,VAL) =3D=3D (VAL)) ? (VAL) : fSATVALN(N,VAL)), () ) +DEF_MACRO( + fVSATN, + ((fSXTN(N,64,VAL) =3D=3D (VAL)) ? (VAL) : fVSATVALN(N,VAL)), + () +) =20 DEF_MACRO( fADDSAT64, @@ -235,6 +252,12 @@ DEF_MACRO( ) =20 DEF_MACRO( + fVSATUN, + ((fZXTN(N,64,VAL) =3D=3D (VAL)) ? (VAL) : fVSATUVALN(N,VAL)), + () +) + +DEF_MACRO( fSATUN, ((fZXTN(N,64,VAL) =3D=3D (VAL)) ? 
(VAL) : fSATUVALN(N,VAL)), () @@ -254,6 +277,19 @@ DEF_MACRO( ) =20 DEF_MACRO( + fVSATH, + (fVSATN(16,VAL)), + () +) + +DEF_MACRO( + fVSATUH, + (fVSATUN(16,VAL)), + () +) + + +DEF_MACRO( fSATUB, (fSATUN(8,VAL)), () @@ -265,6 +301,20 @@ DEF_MACRO( ) =20 =20 +DEF_MACRO( + fVSATUB, + (fVSATUN(8,VAL)), + () +) +DEF_MACRO( + fVSATB, + (fVSATN(8,VAL)), + () +) + + + + /*************************************/ /* immediate extension */ /*************************************/ @@ -557,6 +607,18 @@ DEF_MACRO( ) =20 DEF_MACRO( + fCAST2_2s, /* macro name */ + ((size2s_t)(A)), + /* optional attributes */ +) + +DEF_MACRO( + fCAST2_2u, /* macro name */ + ((size2u_t)(A)), + /* optional attributes */ +) + +DEF_MACRO( fCAST4_4s, /* macro name */ ((size4s_t)(A)), /* optional attributes */ @@ -876,6 +938,11 @@ DEF_MACRO( (((size8s_t)(A))<. + */ + +DEF_MACRO(fDUMPQ, + do { + printf(STR ":" #REG ": 0x%016llx\n",REG.ud[0]); + } while (0), + () +) + +DEF_MACRO(fUSE_LOOKUP_ADDRESS_BY_REV, + PROC->arch_proc_options->mmvec_use_full_va_for_lookup, + () +) + +DEF_MACRO(fUSE_LOOKUP_ADDRESS, + 1, + () +) + +DEF_MACRO(fNOTQ, + ({mmqreg_t _ret =3D {0}; int _i_; for (_i_ =3D 0; _i_ < fVECSIZE()/64; _i= _++) _ret.ud[_i_] =3D ~VAL.ud[_i_]; _ret;}), + () +) + +DEF_MACRO(fGETQBITS, + ((MASK) & (REG.w[(BITNO)>>5] >> ((BITNO) & 0x1f))), + () +) + +DEF_MACRO(fGETQBIT, + fGETQBITS(REG,1,1,BITNO), + () +) + +DEF_MACRO(fGENMASKW, + (((fGETQBIT(QREG,(IDX*4+0)) ? 0xFF : 0x0) << 0) + |((fGETQBIT(QREG,(IDX*4+1)) ? 0xFF : 0x0) << 8) + |((fGETQBIT(QREG,(IDX*4+2)) ? 0xFF : 0x0) << 16) + |((fGETQBIT(QREG,(IDX*4+3)) ? 0xFF : 0x0) << 24)), + () +) +DEF_MACRO(fGET10BIT, + { + COE =3D (((((fGETUBYTE(3,VAL) >> (2 * POS)) & 3) << 8) | fGETUBYTE(POS,V= AL)) << 6); + COE >>=3D 6; + }, + () +) + +DEF_MACRO(fVMAX, + (X>Y) ? 
X : Y, + () +) + + +DEF_MACRO(fGETNIBBLE, + ( fSXTN(4,8,(SRC >> (4*IDX)) & 0xF) ), + () +) + +DEF_MACRO(fGETCRUMB, + ( fSXTN(2,8,(SRC >> (2*IDX)) & 0x3) ), + () +) + +DEF_MACRO(fGETCRUMB_SYMMETRIC, + ( (fGETCRUMB(IDX,SRC)>=3D0 ? (2-fGETCRUMB(IDX,SRC)) : fGETCRUMB(IDX,SR= C) ) ), + () +) + +#define ZERO_OFFSET_2B + + +DEF_MACRO(fGENMASKH, + (((fGETQBIT(QREG,(IDX*2+0)) ? 0xFF : 0x0) << 0) + |((fGETQBIT(QREG,(IDX*2+1)) ? 0xFF : 0x0) << 8)), + () +) + +DEF_MACRO(fGETMASKW, + (VREG.w[IDX] & fGENMASKW((QREG),IDX)), + () +) + +DEF_MACRO(fGETMASKH, + (VREG.h[IDX] & fGENMASKH((QREG),IDX)), + () +) + +DEF_MACRO(fCONDMASK8, + (fGETQBIT(QREG,IDX) ? (YESVAL) : (NOVAL)), + () +) + +DEF_MACRO(fCONDMASK16, + ((fGENMASKH(QREG,IDX) & (YESVAL)) | (fGENMASKH(fNOTQ(QREG),IDX) & (NOVAL)= )), + () +) + +DEF_MACRO(fCONDMASK32, + ((fGENMASKW(QREG,IDX) & (YESVAL)) | (fGENMASKW(fNOTQ(QREG),IDX) & (NOVAL)= )), + () +) + + +DEF_MACRO(fSETQBITS, + do { + size4u_t __TMP =3D (VAL); + REG.w[(BITNO)>>5] &=3D ~((MASK) << ((BITNO) & 0x1f)); + REG.w[(BITNO)>>5] |=3D (((__TMP) & (MASK)) << ((BITNO) & 0x1f)); + } while (0), + () +) + +DEF_MACRO(fSETQBIT, + fSETQBITS(REG,1,1,BITNO,VAL), + () +) + +DEF_MACRO(fVBYTES, + (fVECSIZE()), + () +) + +DEF_MACRO(fVHALVES, + (fVECSIZE()/2), + () +) + +DEF_MACRO(fVWORDS, + (fVECSIZE()/4), + () +) + +DEF_MACRO(fVDWORDS, + (fVECSIZE()/8), + () +) + +DEF_MACRO(fVALIGN, + ( ADDR =3D ADDR & ~(LOG2_ALIGNMENT-1)), + () +) + +DEF_MACRO(fVLASTBYTE, + ( ADDR =3D ADDR | (LOG2_ALIGNMENT-1)), + () +) + + +DEF_MACRO(fVELEM, + ((fVECSIZE()*8)/WIDTH), + () +) + +DEF_MACRO(fVECLOGSIZE, + (mmvec_current_veclogsize(thread)), + () +) + +DEF_MACRO(fVECSIZE, + (1<VRegs_updated & (((VRegMask)1)<future_VRegs[VNUM] : mmvec_zero_vector()), + (A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY) +) + +DEF_MACRO( + fV_AL_CHECK, + if ((EA) & (MASK)) { + warn("aligning misaligned vector. 
PC=3D%08x EA=3D%08x",thread->Regs[REG_= PC],(EA)); + }, + () +) +DEF_MACRO(fSCATTER_INIT, + { + mem_vector_scatter_init(thread, insn, REGION_START, LENGTH, ELEMENT_= SIZE); + if (EXCEPTION_DETECTED) return; + }, + (A_STORE,A_MEMLIKE,A_RESTRICT_SLOT0ONLY) +) + +DEF_MACRO(fGATHER_INIT, + { + mem_vector_gather_init(thread, insn, REGION_START, LENGTH, ELEMENT_S= IZE); + if (EXCEPTION_DETECTED) return; + }, + (A_LOAD,A_MEMLIKE,A_RESTRICT_SLOT1ONLY) +) + +DEF_MACRO(fSCATTER_FINISH, + { + if (EXCEPTION_DETECTED) return; + mem_vector_scatter_finish(thread, insn, OP); + }, + () +) + +DEF_MACRO(fGATHER_FINISH, + { + if (EXCEPTION_DETECTED) return; + mem_vector_gather_finish(thread, insn); + }, + () +) + + +DEF_MACRO(CHECK_VTCM_PAGE, + { + int slot =3D insn->slot; + paddr_t pa =3D thread->mem_access[slot].paddr+OFFSET; + pa =3D pa & ~(ALIGNMENT-1); + FLAG =3D (pa < (thread->mem_access[slot].paddr+LENGTH)); + }, + () +) +DEF_MACRO(COUNT_OUT_OF_BOUNDS, + { + if (!FLAG) + { + THREAD2STRUCT->vtcm_log.oob_access +=3D SIZE; + warn("Scatter/Gather out of bounds of region"); + } + }, + () +) + +DEF_MACRO(fLOG_SCATTER_OP, + { + // Log the size and indicate that the extension ext.c file needs t= o increment right before memory write + THREAD2STRUCT->vtcm_log.op =3D 1; + THREAD2STRUCT->vtcm_log.op_size =3D SIZE; + }, + () +) + + + +DEF_MACRO(fVLOG_VTCM_WORD_INCREMENT, + { + int slot =3D insn->slot; + int log_bank =3D 0; + int log_byte =3D0; + paddr_t pa =3D thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMEN= T-1)); + paddr_t pa_high =3D thread->mem_access[slot].paddr+LEN; + for(int i0 =3D 0; i0 < 4; i0++) + { + log_byte =3D ((OFFSET>=3D0)&&((pa+i0)<=3Dpa_high)); + log_bank |=3D (log_byte<slot; + int log_bank =3D 0; + int log_byte =3D 0; + paddr_t pa =3D thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMEN= T-1)); + paddr_t pa_high =3D thread->mem_access[slot].paddr+LEN; + for(int i0 =3D 0; i0 < 2; i0++) { + log_byte =3D ((OFFSET>=3D0)&&((pa+i0)<=3Dpa_high)); + log_bank |=3D 
(log_byte<slot; + int log_bank =3D 0; + int log_byte =3D 0; + paddr_t pa =3D thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMEN= T-1)); + paddr_t pa_high =3D thread->mem_access[slot].paddr+LEN; + for(int i0 =3D 0; i0 < 2; i0++) { + log_byte =3D ((OFFSET>=3D0)&&((pa+i0)<=3Dpa_high)); + log_bank |=3D (log_byte<slot; + int i0; + paddr_t pa =3D thread->mem_access[slot].paddr+OFFSET; + paddr_t pa_high =3D thread->mem_access[slot].paddr+LEN; + int log_bank =3D 0; + int log_byte =3D 0; + for(i0 =3D 0; i0 < ELEMENT_SIZE; i0++) + { + log_byte =3D ((OFFSET>=3D0)&&((pa+i0)<=3Dpa_high)) && QVAL; + log_bank |=3D (log_byte<system_ptr, thread->thre= adId, thread->mem_access[slot].paddr+OFFSET+i0); + THREAD2STRUCT->tmp_VRegs[0].ub[ELEMENT_SIZE*IDX+i0] =3D B; + LOG_VTCM_BYTE(pa+i0,log_byte,B,ELEMENT_SIZE*IDX+i0); + } + LOG_VTCM_BANK(pa, log_bank,BANK_IDX); +}, +() +) + + + +DEF_MACRO(fVLOG_VTCM_GATHER_WORD, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 4, IDX, 1); + }, + () +) +DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORD, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, IDX, 1); + }, + () +) +DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORD_DV, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), 1); + }, + () +) +DEF_MACRO(fVLOG_VTCM_GATHER_WORDQ, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 4, IDX, fGETQBIT(QsV,4*IDX+i0)); + }, + () +) +DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORDQ, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, IDX, fGETQBIT(QsV,2*IDX+i0)); + }, + () +) + +DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORDQ_DV, + { + GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), fGETQBIT(QsV,2*ID= X+i0)); + }, + () +) + + +DEF_MACRO(DEBUG_LOG_ADDR, + { + + if (thread->processor_ptr->arch_proc_options->mmvec_network_addr_l= og2) + { + + int slot =3D insn->slot; + paddr_t pa =3D thread->mem_access[slot].paddr+OFFSET; + } + }, + () +) + + + + + + + +DEF_MACRO(SCATTER_OP_WRITE_TO_MEM, + { + for (int i =3D 0; i < mmvecx->vtcm_log.size; i+=3Dsizeof(TYPE)) + { + if ( mmvecx->vtcm_log.mask.ub[i] !=3D 0) { + TYPE dst =3D 
0; + TYPE inc =3D 0; + for(int j =3D 0; j < sizeof(TYPE); j++) { + dst |=3D (sim_mem_read1(thread->system_ptr, thread->th= readId, mmvecx->vtcm_log.pa[i+j]) << (8*j)); + inc |=3D mmvecx->vtcm_log.data.ub[j+i] << (8*j); + + mmvecx->vtcm_log.mask.ub[j+i] =3D 0; + mmvecx->vtcm_log.data.ub[j+i] =3D 0; + mmvecx->vtcm_log.offsets.ub[j+i] =3D 0; + } + dst +=3D inc; + for(int j =3D 0; j < sizeof(TYPE); j++) { + sim_mem_write1(thread->system_ptr,thread->threadId, mm= vecx->vtcm_log.pa[i+j], (dst >> (8*j))& 0xFF ); + } + } + + } + }, + () +) + +DEF_MACRO(SCATTER_FUNCTION, +{ + int slot =3D insn->slot; + int i0; + paddr_t pa =3D thread->mem_access[slot].paddr+OFFSET; + paddr_t pa_high =3D thread->mem_access[slot].paddr+LEN; + int log_bank =3D 0; + int log_byte =3D 0; + for(i0 =3D 0; i0 < ELEMENT_SIZE; i0++) { + log_byte =3D ((OFFSET>=3D0)&&((pa+i0)<=3Dpa_high)) && QVAL; + log_bank |=3D (log_byte<processor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fVFETCH_AL, + { + fV_AL_CHECK(EA,fVECSIZE()-1); + mem_fetch_vector(thread, insn, EA&~(fVECSIZE()-1), insn->slot, fVECSIZ= E()); + }, + (A_LOAD,A_MEMLIKE) +) + + +DEF_MACRO(fLOADMMV_AL, + { + fV_AL_CHECK(EA,ALIGNMENT-1); + thread->last_pkt->double_access_vec =3D 0; + mem_load_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot,= LEN, &DST.ub[0], LEN, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_LOAD,A_MEMLIKE) +) + +DEF_MACRO(fLOADMMV, + fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST), + () +) + +DEF_MACRO(fLOADMMVQ, + do { + int __i; + fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST); + fVFOREACH(8,__i) if (!fGETQBIT(QVAL,__i)) DST.b[__i] =3D 0; + } while (0), + () +) + +DEF_MACRO(fLOADMMVNQ, + do { + int __i; + fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST); + fVFOREACH(8,__i) if (fGETQBIT(QVAL,__i)) DST.b[__i] =3D 0; + } while (0), + () +) + +DEF_MACRO(fLOADMMVU_AL, + { + size4u_t size2 =3D (EA)&(ALIGNMENT-1); + size4u_t size1 =3D LEN-size2; + thread->last_pkt->double_access_vec =3D 1; + 
mem_load_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot *= / 1, size2, &DST.ub[size1], size2, fUSE_LOOKUP_ADDRESS()); + mem_load_vector_oddva(thread, insn, EA, EA,/* slot */ 0, size1, &DST.u= b[0], size1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_LOAD,A_MEMLIKE) +) + +DEF_MACRO(fLOADMMVU, + { + /* if address happens to be aligned, only do aligned load */ + thread->last_pkt->pkt_has_vtcm_access =3D 0; + thread->last_pkt->pkt_access_count =3D 0; + if ( (EA & (fVECSIZE()-1)) =3D=3D 0) { + thread->last_pkt->pkt_has_vmemu_access =3D 0; + thread->last_pkt->double_access =3D 0; + + fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST); + } else { + thread->last_pkt->pkt_has_vmemu_access =3D 1; + thread->last_pkt->double_access =3D 1; + + fLOADMMVU_AL(EA,fVECSIZE(),fVECSIZE(),DST); + } + }, + () +) + +DEF_MACRO(fSTOREMMV_AL, + { + fV_AL_CHECK(EA,ALIGNMENT-1); + mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot= , LEN, &SRC.ub[0], 0, 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMV, + fSTOREMMV_AL(EA,fVECSIZE(),fVECSIZE(),SRC), + () +) + +DEF_MACRO(fSTOREMMVQ_AL, + do { + mmvector_t maskvec; + int i; + for (i =3D 0; i < fVECSIZE(); i++) maskvec.ub[i] =3D fGETQBIT(MASK,i); + mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, L= EN, &SRC.ub[0], &maskvec.ub[0], 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->proce= ssor_ptr)); + } while (0), + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMVQ, + fSTOREMMVQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK), + () +) + +DEF_MACRO(fSTOREMMVNQ_AL, + { + mmvector_t maskvec; + int i; + for (i =3D 0; i < fVECSIZE(); i++) maskvec.ub[i] =3D fGETQBIT(MASK,i); + fV_AL_CHECK(EA,ALIGNMENT-1); + mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, L= EN, &SRC.ub[0], &maskvec.ub[0], 1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->proce= ssor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMVNQ, + 
fSTOREMMVNQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK), + () +) + +DEF_MACRO(fSTOREMMVU_AL, + { + size4u_t size1 =3D ALIGNMENT-((EA)&(ALIGNMENT-1)); + size4u_t size2; + if (size1>LEN) size1 =3D LEN; + size2 =3D LEN-size1; + mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot = */ 1, size2, &SRC.ub[size1], 0, 0, fUSE_LOOKUP_ADDRESS()); + mem_store_vector_oddva(thread, insn, EA, EA, /* slot */ 0, size1, &SRC= .ub[0], 0, 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMVU, + { + thread->last_pkt->pkt_has_vtcm_access =3D 0; + thread->last_pkt->pkt_access_count =3D 0; + if ( (EA & (fVECSIZE()-1)) =3D=3D 0) { + thread->last_pkt->double_access =3D 0; + fSTOREMMV_AL(EA,fVECSIZE(),fVECSIZE(),SRC); + } else { + thread->last_pkt->double_access =3D 1; + thread->last_pkt->pkt_has_vmemu_access =3D 1; + fSTOREMMVU_AL(EA,fVECSIZE(),fVECSIZE(),SRC); + } + }, + () +) + +DEF_MACRO(fSTOREMMVQU_AL, + { + size4u_t size1 =3D ALIGNMENT-((EA)&(ALIGNMENT-1)); + size4u_t size2; + mmvector_t maskvec; + int i; + for (i =3D 0; i < fVECSIZE(); i++) maskvec.ub[i] =3D fGETQBIT(MASK,i); + if (size1>LEN) size1 =3D LEN; + size2 =3D LEN-size1; + mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(),/* slot */ 1= , size2, &SRC.ub[size1], &maskvec.ub[size1], 0, fUSE_LOOKUP_ADDRESS()); + mem_store_vector_oddva(thread, insn, EA, /* slot */ 0, size1, &SRC.ub[0],= &maskvec.ub[0], 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMVQU, + { + thread->last_pkt->pkt_has_vtcm_access =3D 0; + thread->last_pkt->pkt_access_count =3D 0; + if ( (EA & (fVECSIZE()-1)) =3D=3D 0) { + thread->last_pkt->double_access =3D 0; + fSTOREMMVQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK); + } else { + thread->last_pkt->double_access =3D 1; + thread->last_pkt->pkt_has_vmemu_access =3D 1; + fSTOREMMVQU_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK); + } + }, + () +) + +DEF_MACRO(fSTOREMMVNQU_AL, + { + size4u_t size1 
=3D ALIGNMENT-((EA)&(ALIGNMENT-1)); + size4u_t size2; + mmvector_t maskvec; + int i; + for (i =3D 0; i < fVECSIZE(); i++) maskvec.ub[i] =3D fGETQBIT(MASK,i); + if (size1>LEN) size1 =3D LEN; + size2 =3D LEN-size1; + mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot */ = 1, size2, &SRC.ub[size1], &maskvec.ub[size1], 1, fUSE_LOOKUP_ADDRESS()); + mem_store_vector_oddva(thread, insn, EA, EA, /* slot */ 0, size1, &SRC.ub= [0], &maskvec.ub[0], 1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr)); + }, + (A_STORE,A_MEMLIKE) +) + +DEF_MACRO(fSTOREMMVNQU, + { + thread->last_pkt->pkt_has_vtcm_access =3D 0; + thread->last_pkt->pkt_access_count =3D 0; + if ( (EA & (fVECSIZE()-1)) =3D=3D 0) { + thread->last_pkt->double_access =3D 0; + fSTOREMMVNQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK); + } else { + thread->last_pkt->double_access =3D 1; + thread->last_pkt->pkt_has_vmemu_access =3D 1; + fSTOREMMVNQU_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK); + } + }, + () +) + + + + +DEF_MACRO(fVFOREACH, + for (VAR =3D 0; VAR < fVELEM(WIDTH); VAR++), + /* NOTHING */ +) + +DEF_MACRO(fVARRAY_ELEMENT_ACCESS, + ARRAY.v[(INDEX) / (fVECSIZE()/(sizeof(ARRAY.TYPE[0])))].TYPE[(INDEX) %= (fVECSIZE()/(sizeof(ARRAY.TYPE[0])))], + () +) + +DEF_MACRO(fVNEWCANCEL, + do { THREAD2STRUCT->VRegs_select &=3D ~(1<<(REGNUM)); } while (0), + () +) + +DEF_MACRO(fTMPVDATA, + mmvec_vtmp_data(thread), + (A_CVI) +) + +DEF_MACRO(fVSATDW, + fVSATW( ( ( ((long long)U)<<32 ) | fZXTN(32,64,V) ) ), + /* attribs */ +) + +DEF_MACRO(fVASL_SATHI, + fVSATW(((U)<<1) | ((V)>>31)), + /* attribs */ +) + +DEF_MACRO(fVUADDSAT, + fVSATUN( WIDTH, fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V)), + /* attribs */ +) + +DEF_MACRO(fVSADDSAT, + fVSATN( WIDTH, fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V)), + /* attribs */ +) + +DEF_MACRO(fVUSUBSAT, + fVSATUN( WIDTH, fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V)), + /* attribs */ +) + +DEF_MACRO(fVSSUBSAT, + fVSATN( WIDTH, fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 
2*WIDTH, V)),
+    /* attribs */
+)
+
+DEF_MACRO(fVAVGU,
+    ((fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V))>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVAVGURND,
+    ((fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V)+1)>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVNAVGU,
+    ((fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V))>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVNAVGURNDSAT,
+    fVSATUN(WIDTH,((fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V)+1)>>1)),
+    /* attribs */
+)
+
+DEF_MACRO(fVAVGS,
+    ((fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V))>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVAVGSRND,
+    ((fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V)+1)>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVNAVGS,
+    ((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V))>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVNAVGSRND,
+    ((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V)+1)>>1),
+    /* attribs */
+)
+
+DEF_MACRO(fVNAVGSRNDSAT,
+    fVSATN(WIDTH,((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V)+1)>>1)),
+    /* attribs */
+)
+
+
+DEF_MACRO(fVNOROUND,
+    VAL,
+    /* NOTHING */
+)
+DEF_MACRO(fVNOSAT,
+    VAL,
+    /* NOTHING */
+)
+
+DEF_MACRO(fVROUND,
+    ((VAL) + (((SHAMT)>0)?(1LL<<((SHAMT)-1)):0)),
+    /* NOTHING */
+)
+
+DEF_MACRO(fCARRY_FROM_ADD32,
+    (((fZXTN(32,64,A)+fZXTN(32,64,B)+C) >> 32) & 1),
+    /* NOTHING */
+)
+
+DEF_MACRO(fUARCH_NOTE_PUMP_4X,
+    ,
+    ()
+)
+
+DEF_MACRO(fUARCH_NOTE_PUMP_2X,
+    ,
+    ()
+)
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org
Subject: [PATCH v4 07/30] Hexagon HVX (target/hexagon) semantics generator
Date: Tue, 12 Oct 2021 05:10:45 -0500
Message-Id: <1634033468-23566-8-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>

Add HVX support to the semantics generator

Acked-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/gen_semantics.c | 33 +++++++++++++++++++++++++++++++++
 target/hexagon/hex_common.py   | 13 +++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/target/hexagon/gen_semantics.c b/target/hexagon/gen_semantics.c
index c5fccec..4a2bdd7 100644
--- a/target/hexagon/gen_semantics.c
+++ b/target/hexagon/gen_semantics.c
@@ -44,6 +44,11 @@ int main(int argc, char *argv[])
  *         Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
  *         "Add 32-bit registers",
  *         { RdV=RsV+RtV;})
+ *     HVX instructions have the following form
+ *         EXTINSN(V6_vinsertwr, "Vx32.w=vinsert(Rt32)",
+ *         ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX),
+ *         "Insert Word Scalar into Vector",
+ *         VxV.uw[0] = RtV;)
  */
 #define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
     do { \
@@ -59,8 +64,23 @@ int main(int argc, char *argv[])
                          ")\n", \
                 #TAG, STRINGIZE(ATTRIBS)); \
     } while (0);
+#define EXTINSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
+    do { \
+        fprintf(outfile, "SEMANTICS( \\\n" \
+                         "    \"%s\", \\\n" \
+                         "    %s, \\\n" \
+                         "    \"\"\"%s\"\"\" \\\n" \
+                         ")\n", \
+                #TAG, STRINGIZE(BEH), STRINGIZE(SEM)); \
+        fprintf(outfile, "ATTRIBUTES( \\\n" \
+                         "    \"%s\", \\\n" \
+                         "    \"%s\" \\\n" \
+                         ")\n", \
+                #TAG, STRINGIZE(ATTRIBS)); \
+    } while (0);
 #include "imported/allidefs.def"
 #undef Q6INSN
+#undef EXTINSN
 
 /*
  * Process the macro definitions
@@ -83,6 +103,19 @@ int main(int argc, char *argv[])
 #include "imported/macros.def"
 #undef DEF_MACRO
 
+/*
+ * Process the macros for HVX
+ */
+#define DEF_MACRO(MNAME, BEH, ATTRS) \
+    fprintf(outfile, "MACROATTRIB( \\\n" \
+                     "    \"%s\", \\\n" \
+                     "    \"\"\"%s\"\"\", \\\n" \
+                     "    \"%s\" \\\n" \
+                     ")\n", \
+            #MNAME, STRINGIZE(BEH), STRINGIZE(ATTRS));
+#include "imported/allext_macros.def"
+#undef DEF_MACRO
+
     fclose(outfile);
     return 0;
 }
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index b3b5340..47fb628 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -143,6 +143,9 @@ def compute_tag_immediates(tag):
 ##          P                predicate register
 ##          R                GPR register
 ##          M                modifier register
+##          Q                HVX predicate vector
+##          V                HVX vector register
+##          O                HVX new vector register
 ##      regid can be one of the following
 ##          d, e             destination register
 ##          dd               destination register pair
@@ -178,6 +181,9 @@ def is_readwrite(regid):
 def is_scalar_reg(regtype):
     return regtype in "RPC"
 
+def is_hvx_reg(regtype):
+    return regtype in "VQ"
+
 def is_old_val(regtype, regid, tag):
     return regtype+regid+'V' in semdict[tag]
 
@@ -201,6 +207,13 @@ def need_ea(tag):
 def skip_qemu_helper(tag):
     return tag in overrides.keys()
 
+def is_tmp_result(tag):
+    return ('A_CVI_TMP' in attribdict[tag] or
+            'A_CVI_TMP_DST' in attribdict[tag])
+
+def is_new_result(tag):
+    return ('A_CVI_NEW' in attribdict[tag])
+
 def
imm_name(immlett):
     return "%siV" % immlett
 
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org
Subject: [PATCH v4 08/30] Hexagon HVX (target/hexagon) semantics generator - part 2
Date: Tue, 12 Oct 2021 05:10:46 -0500
Message-Id: <1634033468-23566-9-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>

Acked-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
 target/hexagon/gen_helper_funcs.py  | 112 ++++++++++++++--
 target/hexagon/gen_helper_protos.py |  16 ++-
 target/hexagon/gen_tcg_funcs.py     | 254
++++++++++++++++++++++++++++++++++--
 3 files changed, 360 insertions(+), 22 deletions(-)

diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index 2b1c5d8..ac5ce10 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -48,12 +48,26 @@ def gen_helper_arg_pair(f,regtype,regid,regno):
     if regno >= 0 : f.write(", ")
     f.write("int64_t %s%sV" % (regtype,regid))
 
+def gen_helper_arg_ext(f,regtype,regid,regno):
+    if regno > 0 : f.write(", ")
+    f.write("void *%s%sV_void" % (regtype,regid))
+
+def gen_helper_arg_ext_pair(f,regtype,regid,regno):
+    if regno > 0 : f.write(", ")
+    f.write("void *%s%sV_void" % (regtype,regid))
+
 def gen_helper_arg_opn(f,regtype,regid,i,tag):
     if (hex_common.is_pair(regid)):
-        gen_helper_arg_pair(f,regtype,regid,i)
+        if (hex_common.is_hvx_reg(regtype)):
+            gen_helper_arg_ext_pair(f,regtype,regid,i)
+        else:
+            gen_helper_arg_pair(f,regtype,regid,i)
     elif (hex_common.is_single(regid)):
         if hex_common.is_old_val(regtype, regid, tag):
-            gen_helper_arg(f,regtype,regid,i)
+            if (hex_common.is_hvx_reg(regtype)):
+                gen_helper_arg_ext(f,regtype,regid,i)
+            else:
+                gen_helper_arg(f,regtype,regid,i)
         elif hex_common.is_new_val(regtype, regid, tag):
             gen_helper_arg_new(f,regtype,regid,i)
         else:
@@ -72,25 +86,67 @@ def gen_helper_dest_decl_pair(f,regtype,regid,regno,subfield=""):
     f.write("    int64_t %s%sV%s = 0;\n" % \
         (regtype,regid,subfield))
 
+def gen_helper_dest_decl_ext(f,regtype,regid):
+    if (regtype == "Q"):
+        f.write("    /* %s%sV is *(MMQReg *)(%s%sV_void) */\n" % \
+            (regtype,regid,regtype,regid))
+    else:
+        f.write("    /* %s%sV is *(MMVector *)(%s%sV_void) */\n" % \
+            (regtype,regid,regtype,regid))
+
+def gen_helper_dest_decl_ext_pair(f,regtype,regid,regno):
+    f.write("    /* %s%sV is *(MMVectorPair *)(%s%sV_void) */\n" % \
+        (regtype,regid,regtype, regid))
+
 def gen_helper_dest_decl_opn(f,regtype,regid,i):
     if (hex_common.is_pair(regid)):
-        gen_helper_dest_decl_pair(f,regtype,regid,i)
+        if (hex_common.is_hvx_reg(regtype)):
+            gen_helper_dest_decl_ext_pair(f,regtype,regid, i)
+        else:
+            gen_helper_dest_decl_pair(f,regtype,regid,i)
     elif (hex_common.is_single(regid)):
-        gen_helper_dest_decl(f,regtype,regid,i)
+        if (hex_common.is_hvx_reg(regtype)):
+            gen_helper_dest_decl_ext(f,regtype,regid)
+        else:
+            gen_helper_dest_decl(f,regtype,regid,i)
     else:
         print("Bad register parse: ",regtype,regid,toss,numregs)
 
+def gen_helper_src_var_ext(f,regtype,regid):
+    if (regtype == "Q"):
+        f.write("    /* %s%sV is *(MMQReg *)(%s%sV_void) */\n" % \
+            (regtype,regid,regtype,regid))
+    else:
+        f.write("    /* %s%sV is *(MMVector *)(%s%sV_void) */\n" % \
+            (regtype,regid,regtype,regid))
+
+def gen_helper_src_var_ext_pair(f,regtype,regid,regno):
+    f.write("    /* %s%sV%s is *(MMVectorPair *)(%s%sV%s_void) */\n" % \
+        (regtype,regid,regno,regtype,regid,regno))
+
 def gen_helper_return(f,regtype,regid,regno):
     f.write("    return %s%sV;\n" % (regtype,regid))
 
 def gen_helper_return_pair(f,regtype,regid,regno):
     f.write("    return %s%sV;\n" % (regtype,regid))
 
+def gen_helper_dst_write_ext(f,regtype,regid):
+    return
+
+def gen_helper_dst_write_ext_pair(f,regtype,regid):
+    return
+
 def gen_helper_return_opn(f, regtype, regid, i):
     if (hex_common.is_pair(regid)):
-        gen_helper_return_pair(f,regtype,regid,i)
+        if (hex_common.is_hvx_reg(regtype)):
+            gen_helper_dst_write_ext_pair(f,regtype,regid)
+        else:
+            gen_helper_return_pair(f,regtype,regid,i)
     elif (hex_common.is_single(regid)):
-        gen_helper_return(f,regtype,regid,i)
+        if (hex_common.is_hvx_reg(regtype)):
+            gen_helper_dst_write_ext(f,regtype,regid)
+        else:
+            gen_helper_return(f,regtype,regid,i)
     else:
         print("Bad register parse: ",regtype,regid,toss,numregs)
 
@@ -129,14 +185,20 @@ def gen_helper_function(f, tag, tagregs, tagimms):
                 % (tag, tag))
     else:
         ## The return type of the function is the type of the destination
-        ## register
+        ## register (if scalar)
         i=0
         for regtype,regid,toss,numregs in regs:
             if (hex_common.is_written(regid)):
                 if (hex_common.is_pair(regid)):
-                    gen_helper_return_type_pair(f,regtype,regid,i)
+                    if (hex_common.is_hvx_reg(regtype)):
+                        continue
+                    else:
+                        gen_helper_return_type_pair(f,regtype,regid,i)
                 elif (hex_common.is_single(regid)):
-                    gen_helper_return_type(f,regtype,regid,i)
+                    if (hex_common.is_hvx_reg(regtype)):
+                        continue
+                    else:
+                        gen_helper_return_type(f,regtype,regid,i)
                 else:
                     print("Bad register parse: ",regtype,regid,toss,numregs)
             i += 1
@@ -145,16 +207,37 @@ def gen_helper_function(f, tag, tagregs, tagimms):
             f.write("void")
         f.write(" HELPER(%s)(CPUHexagonState *env" % tag)
 
+        ## Arguments include the vector destination operands
         i = 1
+        for regtype,regid,toss,numregs in regs:
+            if (hex_common.is_written(regid)):
+                if (hex_common.is_pair(regid)):
+                    if (hex_common.is_hvx_reg(regtype)):
+                        gen_helper_arg_ext_pair(f,regtype,regid,i)
+                    else:
+                        continue
+                elif (hex_common.is_single(regid)):
+                    if (hex_common.is_hvx_reg(regtype)):
+                        gen_helper_arg_ext(f,regtype,regid,i)
+                    else:
+                        # This is the return value of the function
+                        continue
+                else:
+                    print("Bad register parse: ",regtype,regid,toss,numregs)
+                i += 1
 
         ## Arguments to the helper function are the source regs and immediates
         for regtype,regid,toss,numregs in regs:
             if (hex_common.is_read(regid)):
+                if (hex_common.is_hvx_reg(regtype) and
+                    hex_common.is_readwrite(regid)):
+                    continue
                 gen_helper_arg_opn(f,regtype,regid,i,tag)
                 i += 1
         for immlett,bits,immshift in imms:
             gen_helper_arg_imm(f,immlett)
             i += 1
+
         if hex_common.need_slot(tag):
             if i > 0: f.write(", ")
             f.write("uint32_t slot")
@@ -173,6 +256,17 @@ def gen_helper_function(f, tag, tagregs, tagimms):
                 gen_helper_dest_decl_opn(f,regtype,regid,i)
             i += 1
 
+        for regtype,regid,toss,numregs in regs:
+            if (hex_common.is_read(regid)):
+                if (hex_common.is_pair(regid)):
+                    if (hex_common.is_hvx_reg(regtype)):
+                        gen_helper_src_var_ext_pair(f,regtype,regid,i)
+                elif
(hex_common.is_single(regid)): + if (hex_common.is_hvx_reg(regtype)): + gen_helper_src_var_ext(f,regtype,regid) + else: + print("Bad register parse: ",regtype,regid,toss,numreg= s) + if 'A_FPOP' in hex_common.attribdict[tag]: f.write(' arch_fpop_start(env);\n'); =20 diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helpe= r_protos.py index ea41007..229ef8d 100755 --- a/target/hexagon/gen_helper_protos.py +++ b/target/hexagon/gen_helper_protos.py @@ -94,19 +94,33 @@ def gen_helper_prototype(f, tag, tagregs, tagimms): f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag)) =20 ## Generate the qemu DEF_HELPER type for each result + ## Iterate over this list twice + ## - Emit the scalar result + ## - Emit the vector result i=3D0 for regtype,regid,toss,numregs in regs: if (hex_common.is_written(regid)): - gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, = i) + if (not hex_common.is_hvx_reg(regtype)): + gen_def_helper_opn(f, tag, regtype, regid, toss, numre= gs, i) i +=3D 1 =20 ## Put the env between the outputs and inputs f.write(', env' ) i +=3D 1 =20 + # Second pass + for regtype,regid,toss,numregs in regs: + if (hex_common.is_written(regid)): + if (hex_common.is_hvx_reg(regtype)): + gen_def_helper_opn(f, tag, regtype, regid, toss, numre= gs, i) + i +=3D 1 + ## Generate the qemu type for each input operand (regs and immedia= tes) for regtype,regid,toss,numregs in regs: if (hex_common.is_read(regid)): + if (hex_common.is_hvx_reg(regtype) and + hex_common.is_readwrite(regid)): + continue gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, = i) i +=3D 1 for immlett,bits,immshift in imms: diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs= .py index ca8a801..48bcf89 100755 --- a/target/hexagon/gen_tcg_funcs.py +++ b/target/hexagon/gen_tcg_funcs.py @@ -119,10 +119,95 @@ def genptr_decl(f, tag, regtype, regid, regno): (regtype, regid, regtype, regid)) else: print("Bad register parse: ", regtype, regid) + elif (regtype 
=3D=3D "V"): + if (regid in {"dd"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" %\ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" %\ + (regtype, regid)) + if (hex_common.is_tmp_result(tag)): + f.write(" ctx_tmp_vreg_off(ctx, %s%sN, 2, true);\n"= % \ + (regtype, regid)) + else: + f.write(" ctx_future_vreg_off(ctx, %s%sN," % \ + (regtype, regid)) + f.write(" 2, true);\n") + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + elif (regid in {"uu", "vv", "xx"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" %\ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" % \ + (regtype, regid)) + f.write(" offsetof(CPUHexagonState, %s%sV);\n" % \ + (regtype, regid)) + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + elif (regid in {"s", "u", "v", "w"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" % \ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" % \ + (regtype, regid)) + f.write(" vreg_src_off(ctx, %s%sN);\n" % \ + (regtype, regid)) + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + elif (regid in {"d", "x", "y"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" % \ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" % \ + (regtype, regid)) + if (hex_common.is_tmp_result(tag)): + f.write(" ctx_tmp_vreg_off(ctx, %s%sN, 1, true);\n"= % \ + (regtype, regid)) + else: + f.write(" ctx_future_vreg_off(ctx, %s%sN," %\ + (regtype, regid)) + f.write(" 1, true);\n"); + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D 
tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "Q"): + if (regid in {"d", "e", "x"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" % \ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" % \ + (regtype, regid)) + f.write(" offsetof(CPUHexagonState,\n") + f.write(" future_QRegs[%s%sN]);\n" % \ + (regtype, regid)) + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + elif (regid in {"s", "t", "u", "v"}): + f.write(" const int %s%sN =3D insn->regno[%d];\n" % \ + (regtype, regid, regno)) + f.write(" const intptr_t %s%sV_off =3D\n" %\ + (regtype, regid)) + f.write(" offsetof(CPUHexagonState, QRegs[%s%sN]);\n" %= \ + (regtype, regid)) + if (not hex_common.skip_qemu_helper(tag)): + f.write(" TCGv_ptr %s%sV =3D tcg_temp_new_ptr();\n" % \ + (regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 -def genptr_decl_new(f,regtype,regid,regno): +def genptr_decl_new(f, tag, regtype, regid, regno): if (regtype =3D=3D "N"): if (regid in {"s", "t"}): f.write(" TCGv %s%sN =3D hex_new_value[insn->regno[%d]];\n"= % \ @@ -135,6 +220,21 @@ def genptr_decl_new(f,regtype,regid,regno): (regtype, regid, regno)) else: print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "O"): + if (regid =3D=3D "s"): + f.write(" const intptr_t %s%sN_num =3D insn->regno[%d];\n" = % \ + (regtype, regid, regno)) + if (hex_common.skip_qemu_helper(tag)): + f.write(" const intptr_t %s%sN_off =3D\n" % \ + (regtype, regid)) + f.write(" ctx_future_vreg_off(ctx, %s%sN_num," % \ + (regtype, regid)) + f.write(" 1, true);\n") + else: + f.write(" TCGv %s%sN =3D 
tcg_constant_tl(%s%sN_num); /*= HERE */\n" % \ + (regtype, regid, regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 @@ -145,7 +245,7 @@ def genptr_decl_opn(f, tag, regtype, regid, toss, numre= gs, i): if hex_common.is_old_val(regtype, regid, tag): genptr_decl(f,tag, regtype, regid, i) elif hex_common.is_new_val(regtype, regid, tag): - genptr_decl_new(f,regtype,regid,i) + genptr_decl_new(f, tag, regtype, regid, i) else: print("Bad register parse: ",regtype,regid,toss,numregs) else: @@ -159,7 +259,7 @@ def genptr_decl_imm(f,immlett): f.write(" int %s =3D insn->immed[%d];\n" % \ (hex_common.imm_name(immlett), i)) =20 -def genptr_free(f,regtype,regid,regno): +def genptr_free(f, tag, regtype, regid, regno): if (regtype =3D=3D "R"): if (regid in {"dd", "ss", "tt", "xx", "yy"}): f.write(" tcg_temp_free_i64(%s%sV);\n" % (regtype, regid)) @@ -182,33 +282,51 @@ def genptr_free(f,regtype,regid,regno): elif (regtype =3D=3D "M"): if (regid !=3D "u"): print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "V"): + if (regid in {"dd", "uu", "vv", "xx", \ + "d", "s", "u", "v", "w", "x", "y"}): + if (not hex_common.skip_qemu_helper(tag)): + f.write(" tcg_temp_free_ptr(%s%sV);\n" % \ + (regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "Q"): + if (regid in {"d", "e", "s", "t", "u", "v", "x"}): + if (not hex_common.skip_qemu_helper(tag)): + f.write(" tcg_temp_free_ptr(%s%sV);\n" % \ + (regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 -def genptr_free_new(f,regtype,regid,regno): +def genptr_free_new(f, tag, regtype, regid, regno): if (regtype =3D=3D "N"): if (regid not in {"s", "t"}): print("Bad register parse: ", regtype, regid) elif (regtype =3D=3D "P"): if (regid not in {"t", "u", "v"}): print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "O"): + if 
(regid !=3D "s"): + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 def genptr_free_opn(f,regtype,regid,i,tag): if (hex_common.is_pair(regid)): - genptr_free(f,regtype,regid,i) + genptr_free(f, tag, regtype, regid, i) elif (hex_common.is_single(regid)): if hex_common.is_old_val(regtype, regid, tag): - genptr_free(f,regtype,regid,i) + genptr_free(f, tag, regtype, regid, i) elif hex_common.is_new_val(regtype, regid, tag): - genptr_free_new(f,regtype,regid,i) + genptr_free_new(f, tag, regtype, regid, i) else: print("Bad register parse: ",regtype,regid,toss,numregs) else: print("Bad register parse: ",regtype,regid,toss,numregs) =20 -def genptr_src_read(f,regtype,regid): +def genptr_src_read(f, tag, regtype, regid): if (regtype =3D=3D "R"): if (regid in {"ss", "tt", "xx", "yy"}): f.write(" tcg_gen_concat_i32_i64(%s%sV, hex_gpr[%s%sN],\n" = % \ @@ -238,6 +356,47 @@ def genptr_src_read(f,regtype,regid): elif (regtype =3D=3D "M"): if (regid !=3D "u"): print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "V"): + if (regid in {"uu", "vv", "xx"}): + f.write(" tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \ + (regtype, regid)) + f.write(" vreg_src_off(ctx, %s%sN),\n" % \ + (regtype, regid)) + f.write(" sizeof(MMVector), sizeof(MMVector));\n") + f.write(" tcg_gen_gvec_mov(MO_64,\n") + f.write(" %s%sV_off + sizeof(MMVector),\n" % \ + (regtype, regid)) + f.write(" vreg_src_off(ctx, %s%sN ^ 1),\n" % \ + (regtype, regid)) + f.write(" sizeof(MMVector), sizeof(MMVector));\n") + elif (regid in {"s", "u", "v", "w"}): + if (not hex_common.skip_qemu_helper(tag)): + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + elif (regid in {"x", "y"}): + f.write(" tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \ + (regtype, regid)) + f.write(" vreg_src_off(ctx, %s%sN),\n" % \ + (regtype, regid)) + f.write(" sizeof(MMVector), sizeof(MMVector));\n") + if (not hex_common.skip_qemu_helper(tag)): 
+ f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "Q"): + if (regid in {"s", "t", "u", "v"}): + if (not hex_common.skip_qemu_helper(tag)): + f.write(" tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\= n" % \ + (regtype, regid, regtype, regid)) + elif (regid in {"x"}): + f.write(" tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \ + (regtype, regid)) + f.write(" offsetof(CPUHexagonState, QRegs[%s%sN]),\n" %= \ + (regtype, regid)) + f.write(" sizeof(MMQReg), sizeof(MMQReg));\n") + else: + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 @@ -248,15 +407,18 @@ def genptr_src_read_new(f,regtype,regid): elif (regtype =3D=3D "P"): if (regid not in {"t", "u", "v"}): print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "O"): + if (regid !=3D "s"): + print("Bad register parse: ", regtype, regid) else: print("Bad register parse: ", regtype, regid) =20 def genptr_src_read_opn(f,regtype,regid,tag): if (hex_common.is_pair(regid)): - genptr_src_read(f,regtype,regid) + genptr_src_read(f, tag, regtype, regid) elif (hex_common.is_single(regid)): if hex_common.is_old_val(regtype, regid, tag): - genptr_src_read(f,regtype,regid) + genptr_src_read(f, tag, regtype, regid) elif hex_common.is_new_val(regtype, regid, tag): genptr_src_read_new(f,regtype,regid) else: @@ -334,11 +496,68 @@ def genptr_dst_write(f, tag, regtype, regid): else: print("Bad register parse: ", regtype, regid) =20 +def genptr_dst_write_ext(f, tag, regtype, regid, newv=3D"0"): + if (regtype =3D=3D "V"): + if (regid in {"dd", "xx", "yy"}): + if ('A_CONDEXEC' in hex_common.attribdict[tag]): + is_predicated =3D "true" + else: + is_predicated =3D "false" + f.write(" gen_log_vreg_write_pair(ctx, %s%sV_off, %s%sN, " = % \ + (regtype, regid, regtype, regid)) + f.write("%s, insn->slot, %s);\n" % \ + (newv, is_predicated)) + f.write(" 
ctx_log_vreg_write_pair(ctx, %s%sN, %s,\n" % \ + (regtype, regid, newv)) + f.write(" %s);\n" % (is_predicated)) + elif (regid in {"d", "x", "y"}): + if ('A_CONDEXEC' in hex_common.attribdict[tag]): + is_predicated =3D "true" + else: + is_predicated =3D "false" + f.write(" gen_log_vreg_write(ctx, %s%sV_off, %s%sN, %s, " %= \ + (regtype, regid, regtype, regid, newv)) + f.write("insn->slot, %s);\n" % \ + (is_predicated)) + f.write(" ctx_log_vreg_write(ctx, %s%sN, %s, %s);\n" % \ + (regtype, regid, newv, is_predicated)) + else: + print("Bad register parse: ", regtype, regid) + elif (regtype =3D=3D "Q"): + if (regid in {"d", "e", "x"}): + if ('A_CONDEXEC' in hex_common.attribdict[tag]): + is_predicated =3D "true" + else: + is_predicated =3D "false" + f.write(" gen_log_qreg_write(%s%sV_off, %s%sN, %s, " % \ + (regtype, regid, regtype, regid, newv)) + f.write("insn->slot, %s);\n" % (is_predicated)) + f.write(" ctx_log_qreg_write(ctx, %s%sN, %s);\n" % \ + (regtype, regid, is_predicated)) + else: + print("Bad register parse: ", regtype, regid) + else: + print("Bad register parse: ", regtype, regid) + def genptr_dst_write_opn(f,regtype, regid, tag): if (hex_common.is_pair(regid)): - genptr_dst_write(f, tag, regtype, regid) + if (hex_common.is_hvx_reg(regtype)): + if (hex_common.is_tmp_result(tag)): + genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP") + else: + genptr_dst_write_ext(f, tag, regtype, regid) + else: + genptr_dst_write(f, tag, regtype, regid) elif (hex_common.is_single(regid)): - genptr_dst_write(f, tag, regtype, regid) + if (hex_common.is_hvx_reg(regtype)): + if (hex_common.is_new_result(tag)): + genptr_dst_write_ext(f, tag, regtype, regid, "EXT_NEW") + elif (hex_common.is_tmp_result(tag)): + genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP") + else: + genptr_dst_write_ext(f, tag, regtype, regid, "EXT_DFL") + else: + genptr_dst_write(f, tag, regtype, regid) else: print("Bad register parse: ",regtype,regid,toss,numregs) =20 @@ -409,13 +628,24 @@ def 
gen_tcg_func(f, tag, regs, imms): ## If there is a scalar result, it is the return type for regtype,regid,toss,numregs in regs: if (hex_common.is_written(regid)): + if (hex_common.is_hvx_reg(regtype)): + continue gen_helper_call_opn(f, tag, regtype, regid, toss, numregs,= i) i +=3D 1 if (i > 0): f.write(", ") f.write("cpu_env") i=3D1 for regtype,regid,toss,numregs in regs: + if (hex_common.is_written(regid)): + if (not hex_common.is_hvx_reg(regtype)): + continue + gen_helper_call_opn(f, tag, regtype, regid, toss, numregs,= i) + i +=3D 1 + for regtype,regid,toss,numregs in regs: if (hex_common.is_read(regid)): + if (hex_common.is_hvx_reg(regtype) and + hex_common.is_readwrite(regid)): + continue gen_helper_call_opn(f, tag, regtype, regid, toss, numregs,= i) i +=3D 1 for immlett,bits,immshift in imms: --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035750648692.4867936102253; Tue, 12 Oct 2021 03:49:10 -0700 (PDT) Received: from localhost ([::1]:34586 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFLJ-00028c-M5 for importer@patchew.org; Tue, 12 Oct 2021 06:49:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50472) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElG-0000la-P1 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:55 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64080) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElE-0006xP-31 for qemu-devel@nongnu.org; 
Tue, 12 Oct 2021 06:11:54 -0400 Received: from unknown (HELO ironmsg03-sd.qualcomm.com) ([10.53.140.143]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg03-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 4E90814A8; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033512; x=1665569512; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ITRDH6jVME6ePg7atny8ShXneIJSliVfuzvjirIwLtM=; b=iICc80ew4FoL/fSiKA/9CMIRXgHVbbblRrHseLeb4hcWwbnIDWM8pr7r c8U4iMBqDfSAuF7v0gyJcxlkroUcEu1FNVRQUPE0lVtDrlLNtSr20J1zA iNq2aZSjz3cggcDhHFkB9hlndcdpOl9/klSqhMZyygqlVRuur6fwtjxdf 8=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 09/30] Hexagon HVX (target/hexagon) C preprocessor for decode tree Date: Tue, 12 Oct 2021 05:10:47 -0500 Message-Id: <1634033468-23566-10-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, 
RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035753113100001 Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_dectree_import.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/target/hexagon/gen_dectree_import.c b/target/hexagon/gen_dectr= ee_import.c index 5b7ecfc..ee35467 100644 --- a/target/hexagon/gen_dectree_import.c +++ b/target/hexagon/gen_dectree_import.c @@ -40,6 +40,11 @@ const char * const opcode_names[] =3D { * Q6INSN(A2_add,"Rd32=3Dadd(Rs32,Rt32)",ATTRIBS(), * "Add 32-bit registers", * { RdV=3DRsV+RtV;}) + * HVX instructions have the following form + * EXTINSN(V6_vinsertwr, "Vx32.w=3Dvinsert(Rt32)", + * ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_CVI_LATE), + * "Insert Word Scalar into Vector", + * VxV.uw[0] =3D RtV;) */ const char * const opcode_syntax[XX_LAST_OPCODE] =3D { #define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \ @@ -105,6 +110,14 @@ static const char *get_opcode_enc(int opcode) =20 static const char *get_opcode_enc_class(int opcode) { + const char *tmp =3D opcode_encodings[opcode].encoding; + if (tmp =3D=3D NULL) { + const char *test =3D "V6_"; /* HVX */ + const char *name =3D opcode_names[opcode]; + if (strncmp(name, test, strlen(test)) =3D=3D 0) { + return "EXT_mmvec"; + } + } return opcode_enc_class_names[opcode_encodings[opcode].enc_class]; } =20 --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034236776280.03953331758566; Tue, 12 Oct 2021 03:23:56 -0700 (PDT) Received: from localhost ([::1]:33468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maEwt-0006pP-PS for importer@patchew.org; Tue, 12 Oct 2021 06:23:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50276) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEkv-0000Y7-PL for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:34 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64100) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEkt-0007Fq-0l for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:33 -0400 Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg04-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 5140714D9; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033490; x=1665569490; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bGp/KPbKIVKnNpYwnYk/XT/23zzIap9g1cRDBEVKqWQ=; b=F5vqcdNVLOXWjipiqod8sXhsknGALjBO98jtLsvmrgxv8PxY3Xn42RAe ljHT+InegwFcVB88KkRC1IPEwJ3Sxj7UFy47gmFecquWd7ZKOBDarg4jo C77Ixv+Lo/zucjnrEj/+8s/oz5QHIB3JHQy9bp8vzjkQC2u/++0C7bt1/ s=; X-QCInternal: smtphost From: Taylor Simpson To: 
qemu-devel@nongnu.org Subject: [PATCH v4 10/30] Hexagon HVX (target/hexagon) instruction utility functions Date: Tue, 12 Oct 2021 05:10:48 -0500 Message-Id: <1634033468-23566-11-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034239028100001 Functions to support scatter/gather Add new file to target/hexagon/meson.build Signed-off-by: Taylor Simpson --- target/hexagon/mmvec/system_ext_mmvec.h | 29 +++++++++++++++ target/hexagon/mmvec/system_ext_mmvec.c | 66 +++++++++++++++++++++++++++++= ++++ target/hexagon/meson.build | 1 + 3 files changed, 96 insertions(+) create mode 100644 target/hexagon/mmvec/system_ext_mmvec.h create mode 100644 
target/hexagon/mmvec/system_ext_mmvec.c diff --git a/target/hexagon/mmvec/system_ext_mmvec.h b/target/hexagon/mmvec= /system_ext_mmvec.h new file mode 100644 index 0000000..2963061 --- /dev/null +++ b/target/hexagon/mmvec/system_ext_mmvec.h @@ -0,0 +1,29 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#ifndef HEXAGON_SYSTEM_EXT_MMVEC_H +#define HEXAGON_SYSTEM_EXT_MMVEC_H + +void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int slot); +void mem_vector_scatter_init(CPUHexagonState *env, int slot, + target_ulong base_vaddr, int length, + int element_size); +void mem_vector_gather_init(CPUHexagonState *env, + target_ulong base_vaddr, int length, + int element_size); + +#endif diff --git a/target/hexagon/mmvec/system_ext_mmvec.c b/target/hexagon/mmvec= /system_ext_mmvec.c new file mode 100644 index 0000000..9de1a25 --- /dev/null +++ b/target/hexagon/mmvec/system_ext_mmvec.c @@ -0,0 +1,66 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "mmvec/system_ext_mmvec.h" + +void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int slot) +{ + size_t size =3D sizeof(MMVector); + + env->vstore_pending[slot] =3D 1; + env->vstore[slot].va =3D vaddr; + env->vstore[slot].size =3D size; + memcpy(&env->vstore[slot].data.ub[0], &env->tmp_VRegs[0], size); + + /* On a gather store, overwrite the store mask to emulate dropped gath= ers */ + bitmap_copy(env->vstore[slot].mask, env->vtcm_log.mask, size); +} + +void mem_vector_scatter_init(CPUHexagonState *env, int slot, + target_ulong base_vaddr, + int length, int element_size) +{ + int i; + + for (i =3D 0; i < sizeof(MMVector); i++) { + env->vtcm_log.data.ub[i] =3D 0; + } + bitmap_zero(env->vtcm_log.mask, MAX_VEC_SIZE_BYTES); + + env->vtcm_pending =3D true; + env->vtcm_log.op =3D false; + env->vtcm_log.op_size =3D 0; + env->vtcm_log.size =3D sizeof(MMVector); +} + +void mem_vector_gather_init(CPUHexagonState *env, + target_ulong base_vaddr, + int length, int element_size) +{ + int i; + + for (i =3D 0; i < sizeof(MMVector); i++) { + env->vtcm_log.data.ub[i] =3D 0; + env->vtcm_log.va[i] =3D 0; + env->tmp_VRegs[0].ub[i] =3D 0; + } + bitmap_zero(env->vtcm_log.mask, MAX_VEC_SIZE_BYTES); + env->vtcm_log.op =3D false; + env->vtcm_log.op_size =3D 0; +} diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build index c6d858f..0bfaa41 100644 --- a/target/hexagon/meson.build +++ b/target/hexagon/meson.build @@ -174,6 +174,7 @@ hexagon_ss.add(files( 'printinsn.c', 'arch.c', 'fma_emu.c', + 'mmvec/system_ext_mmvec.c', )) =20 target_arch 
+=3D {'hexagon': hexagon_ss} --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034808086929.4624957233768; Tue, 12 Oct 2021 03:33:28 -0700 (PDT) Received: from localhost ([::1]:42408 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maF66-0004nK-NV for importer@patchew.org; Tue, 12 Oct 2021 06:33:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50372) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl0-0000aP-J1 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:40 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64084) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEkx-0006yI-A4 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:38 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 53BE614DD; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033495; x=1665569495; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zDEi7v7WMqd9gW84fqKaNv8xkilLQNuceISmGJOqKaE=; 
b=GCPXR44+hAm+/Lt2xRoIF++wBqMqUyZDIvyUR+y6MwEonN5P0pA68ScH wEd65ZYokLzaed6hCCulYMPjP7/vOYxLznPniKT9WYepasJWxa8Qjtn0b BkBdGTNC/StGq54clz4JKJ4k0g19HhcXbHMbD/+hnMVeWxpc7ndLXSLxb 8=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 11/30] Hexagon HVX (target/hexagon) helper functions Date: Tue, 12 Oct 2021 05:10:49 -0500 Message-Id: <1634033468-23566-12-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034809083100001 Probe and commit vector stores (masked and scatter/gather) Log vector register writes Add the execution counters to the debug log Histogram instructions Signed-off-by: Taylor Simpson Reviewed-by: Richard Henderson 
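For readers following the "commit vector stores (masked ...)" item above, here is a minimal sketch of the idea, written in Python to match the generator scripts in this series. It is not the helper itself: PendingStore and commit_masked_store are hypothetical names, and the 4-byte vector is an assumption chosen to keep the example small. Only bytes whose mask bit is set reach memory, and the pending flag is cleared once the store is committed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PendingStore:
    """Hypothetical stand-in for one logged vector store slot."""
    pending: bool
    va: int            # base address of the store
    data: bytes        # logged vector bytes
    mask: List[bool]   # one enable bit per byte of data


def commit_masked_store(memory: bytearray, store: PendingStore) -> None:
    """Write each byte whose mask bit is set, then clear the pending flag."""
    if not store.pending:
        return
    store.pending = False
    for j, byte in enumerate(store.data):
        if store.mask[j]:
            memory[store.va + j] = byte


# Illustration: a 4-byte "vector" with only the first and last bytes enabled.
memory = bytearray(16)
store = PendingStore(pending=True, va=8,
                     data=bytes([0x11, 0x22, 0x33, 0x44]),
                     mask=[True, False, False, True])
commit_masked_store(memory, store)
```

Committing byte by byte keeps the masked and unmasked cases on one path, at the cost of one store per enabled byte; an unmasked store is simply a mask with every bit set.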
--- target/hexagon/helper.h | 16 +++ target/hexagon/op_helper.c | 282 +++++++++++++++++++++++++++++++++++++++++= +++- 2 files changed, 296 insertions(+), 2 deletions(-) diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h index 89de2a3..c89aa4e 100644 --- a/target/hexagon/helper.h +++ b/target/hexagon/helper.h @@ -23,6 +23,8 @@ DEF_HELPER_1(debug_start_packet, void, env) DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, void, env, int= , int) DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int) DEF_HELPER_2(commit_store, void, env, int) +DEF_HELPER_3(gather_store, void, env, i32, int) +DEF_HELPER_1(commit_hvx_stores, void, env) DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32) DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32) DEF_HELPER_3(sfrecipa, i64, env, f32, f32) @@ -90,4 +92,18 @@ DEF_HELPER_4(sffms_lib, f32, env, f32, f32, f32) DEF_HELPER_3(dfmpyfix, f64, env, f64, f64) DEF_HELPER_4(dfmpyhh, f64, env, f64, f64, f64) =20 +/* Histogram instructions */ +DEF_HELPER_1(vhist, void, env) +DEF_HELPER_1(vhistq, void, env) +DEF_HELPER_1(vwhist256, void, env) +DEF_HELPER_1(vwhist256q, void, env) +DEF_HELPER_1(vwhist256_sat, void, env) +DEF_HELPER_1(vwhist256q_sat, void, env) +DEF_HELPER_1(vwhist128, void, env) +DEF_HELPER_1(vwhist128q, void, env) +DEF_HELPER_2(vwhist128m, void, env, s32) +DEF_HELPER_2(vwhist128qm, void, env, s32) + DEF_HELPER_2(probe_pkt_scalar_store_s0, void, env, int) +DEF_HELPER_2(probe_hvx_stores, void, env, int) +DEF_HELPER_3(probe_pkt_scalar_hvx_stores, void, env, int, int) diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c index af32de4..a67a148 100644 --- a/target/hexagon/op_helper.c +++ b/target/hexagon/op_helper.c @@ -27,6 +27,8 @@ #include "arch.h" #include "hex_arch_types.h" #include "fma_emu.h" +#include "mmvec/mmvec.h" +#include "mmvec/macros.h" =20 #define SF_BIAS 127 #define SF_MANTBITS 23 @@ -164,6 +166,57 @@ void 
HELPER(commit_store)(CPUHexagonState *env, int sl= ot_num) } } =20 +void HELPER(gather_store)(CPUHexagonState *env, uint32_t addr, int slot) +{ + mem_gather_store(env, addr, slot); +} + +void HELPER(commit_hvx_stores)(CPUHexagonState *env) +{ + uintptr_t ra =3D GETPC(); + int i; + + /* Normal (possibly masked) vector store */ + for (i =3D 0; i < VSTORES_MAX; i++) { + if (env->vstore_pending[i]) { + env->vstore_pending[i] =3D 0; + target_ulong va =3D env->vstore[i].va; + int size =3D env->vstore[i].size; + for (int j =3D 0; j < size; j++) { + if (test_bit(j, env->vstore[i].mask)) { + cpu_stb_data_ra(env, va + j, env->vstore[i].data.ub[j]= , ra); + } + } + } + } + + /* Scatter store */ + if (env->vtcm_pending) { + env->vtcm_pending =3D false; + if (env->vtcm_log.op) { + /* Need to perform the scatter read/modify/write at commit tim= e */ + if (env->vtcm_log.op_size =3D=3D 2) { + SCATTER_OP_WRITE_TO_MEM(uint16_t); + } else if (env->vtcm_log.op_size =3D=3D 4) { + /* Word Scatter +=3D */ + SCATTER_OP_WRITE_TO_MEM(uint32_t); + } else { + g_assert_not_reached(); + } + } else { + for (i =3D 0; i < env->vtcm_log.size; i++) { + if (test_bit(i, env->vtcm_log.mask)) { + cpu_stb_data_ra(env, env->vtcm_log.va[i], + env->vtcm_log.data.ub[i], ra); + clear_bit(i, env->vtcm_log.mask); + env->vtcm_log.data.ub[i] =3D 0; + } + + } + } + } +} + static void print_store(CPUHexagonState *env, int slot) { if (!(env->slot_cancelled & (1 << slot))) { @@ -242,9 +295,10 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, in= t has_st0, int has_st1) HEX_DEBUG_LOG("Next PC =3D " TARGET_FMT_lx "\n", env->next_PC); HEX_DEBUG_LOG("Exec counters: pkt =3D " TARGET_FMT_lx ", insn =3D " TARGET_FMT_lx - "\n", + ", hvx =3D " TARGET_FMT_lx "\n", env->gpr[HEX_REG_QEMU_PKT_CNT], - env->gpr[HEX_REG_QEMU_INSN_CNT]); + env->gpr[HEX_REG_QEMU_INSN_CNT], + env->gpr[HEX_REG_QEMU_HVX_CNT]); =20 } =20 @@ -393,6 +447,65 @@ void HELPER(probe_pkt_scalar_store_s0)(CPUHexagonState= *env, int mmu_idx) probe_store(env, 
0, mmu_idx);
 }
 
+void HELPER(probe_hvx_stores)(CPUHexagonState *env, int mmu_idx)
+{
+    uintptr_t retaddr = GETPC();
+    int i;
+
+    /* Normal (possibly masked) vector store */
+    for (i = 0; i < VSTORES_MAX; i++) {
+        if (env->vstore_pending[i]) {
+            target_ulong va = env->vstore[i].va;
+            int size = env->vstore[i].size;
+            for (int j = 0; j < size; j++) {
+                if (test_bit(j, env->vstore[i].mask)) {
+                    probe_write(env, va + j, 1, mmu_idx, retaddr);
+                }
+            }
+        }
+    }
+
+    /* Scatter store */
+    if (env->vtcm_pending) {
+        if (env->vtcm_log.op) {
+            /* Need to perform the scatter read/modify/write at commit time */
+            if (env->vtcm_log.op_size == 2) {
+                SCATTER_OP_PROBE_MEM(size2u_t, mmu_idx, retaddr);
+            } else if (env->vtcm_log.op_size == 4) {
+                /* Word Scatter += */
+                SCATTER_OP_PROBE_MEM(size4u_t, mmu_idx, retaddr);
+            } else {
+                g_assert_not_reached();
+            }
+        } else {
+            for (int i = 0; i < env->vtcm_log.size; i++) {
+                if (test_bit(i, env->vtcm_log.mask)) {
+                    probe_write(env, env->vtcm_log.va[i], 1, mmu_idx, retaddr);
+                }
+
+            }
+        }
+    }
+}
+
+void HELPER(probe_pkt_scalar_hvx_stores)(CPUHexagonState *env, int mask,
+                                         int mmu_idx)
+{
+    bool has_st0 = (mask >> 0) & 1;
+    bool has_st1 = (mask >> 1) & 1;
+    bool has_hvx_stores = (mask >> 2) & 1;
+
+    if (has_st0) {
+        probe_store(env, 0, mmu_idx);
+    }
+    if (has_st1) {
+        probe_store(env, 1, mmu_idx);
+    }
+    if (has_hvx_stores) {
+        HELPER(probe_hvx_stores)(env, mmu_idx);
+    }
+}
+
 /*
  * mem_noshuf
  * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
@@ -1181,6 +1294,171 @@ float64 HELPER(dfmpyhh)(CPUHexagonState *env, float64 RxxV,
     return RxxV;
 }
 
+/* Histogram instructions */
+
+void HELPER(vhist)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int lane = 0; lane < 8; lane++) {
+        for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
+            unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
+            unsigned char regno = value >> 3;
+            unsigned char element = value & 7;
+
+            env->VRegs[regno].uh[(sizeof(MMVector) / 16) * lane + element]++;
+        }
+    }
+}
+
+void HELPER(vhistq)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int lane = 0; lane < 8; lane++) {
+        for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
+            unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
+            unsigned char regno = value >> 3;
+            unsigned char element = value & 7;
+
+            if (fGETQBIT(env->qtmp, sizeof(MMVector) / 8 * lane + i)) {
+                env->VRegs[regno].uh[
+                    (sizeof(MMVector) / 16) * lane + element]++;
+            }
+        }
+    }
+}
+
+void HELPER(vwhist256)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        env->VRegs[vindex].uh[elindex] =
+            env->VRegs[vindex].uh[elindex] + weight;
+    }
+}
+
+void HELPER(vwhist256q)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uh[elindex] =
+                env->VRegs[vindex].uh[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist256_sat)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        env->VRegs[vindex].uh[elindex] =
+            fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
+    }
+}
+
+void HELPER(vwhist256q_sat)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uh[elindex] =
+                fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
+        }
+    }
+}
+
+void HELPER(vwhist128)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        env->VRegs[vindex].uw[elindex] =
+            env->VRegs[vindex].uw[elindex] + weight;
+    }
+}
+
+void HELPER(vwhist128q)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist128m)(CPUHexagonState *env, int32_t uiV)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if ((bucket & 1) == uiV) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist128qm)(CPUHexagonState *env, int32_t uiV)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if (((bucket & 1) == uiV) && fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
 static void cancel_slot(CPUHexagonState *env, uint32_t slot)
 {
     HEX_DEBUG_LOG("Slot %d cancelled\n", slot);
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 12/30] Hexagon HVX (target/hexagon) TCG generation
Date: Tue, 12 Oct 2021 05:10:50 -0500
Message-Id: <1634033468-23566-13-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com,
 richard.henderson@linaro.org, f4bug@amsat.org

Signed-off-by: Taylor Simpson
Reviewed-by: Richard Henderson
---
 target/hexagon/translate.h |  61 ++++++++++++
 target/hexagon/genptr.c    |  15 +++
 target/hexagon/translate.c | 243 ++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 315 insertions(+), 4 deletions(-)

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 703fd13..fccfb94 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -29,6 +29,7 @@ typedef struct DisasContext {
     uint32_t mem_idx;
     uint32_t num_packets;
     uint32_t num_insns;
+    uint32_t num_hvx_insns;
     int reg_log[REG_WRITES_MAX];
     int reg_log_idx;
     DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS);
@@ -37,6 +38,20 @@ typedef struct DisasContext {
     DECLARE_BITMAP(pregs_written, NUM_PREGS);
     uint8_t store_width[STORES_MAX];
     bool s1_store_processed;
+    int future_vregs_idx;
+    int future_vregs_num[VECTOR_TEMPS_MAX];
+    int tmp_vregs_idx;
+    int tmp_vregs_num[VECTOR_TEMPS_MAX];
+    int vreg_log[NUM_VREGS];
+    bool vreg_is_predicated[NUM_VREGS];
+    int vreg_log_idx;
+    DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS);
+    DECLARE_BITMAP(vregs_updated, NUM_VREGS);
+    DECLARE_BITMAP(vregs_select, NUM_VREGS);
+    int qreg_log[NUM_QREGS];
+    bool qreg_is_predicated[NUM_QREGS];
+    int qreg_log_idx;
+    bool pre_commit;
 } DisasContext;
 
 static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
@@ -67,6 +82,46 @@ static inline bool is_preloaded(DisasContext *ctx, int num)
     return test_bit(num, ctx->regs_written);
 }
 
+intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
+                             int num, bool alloc_ok);
+intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
+                          int num, bool alloc_ok);
+
+static inline void ctx_log_vreg_write(DisasContext *ctx,
+                                      int rnum, VRegWriteType type,
+                                      bool is_predicated)
+{
+    if (type != EXT_TMP) {
+        ctx->vreg_log[ctx->vreg_log_idx] = rnum;
+        ctx->vreg_is_predicated[ctx->vreg_log_idx] = is_predicated;
+        ctx->vreg_log_idx++;
+
+        set_bit(rnum, ctx->vregs_updated);
+    }
+    if (type == EXT_NEW) {
+        set_bit(rnum, ctx->vregs_select);
+    }
+    if (type == EXT_TMP) {
+        set_bit(rnum, ctx->vregs_updated_tmp);
+    }
+}
+
+static inline void ctx_log_vreg_write_pair(DisasContext *ctx,
+                                           int rnum, VRegWriteType type,
+                                           bool is_predicated)
+{
+    ctx_log_vreg_write(ctx, rnum ^ 0, type, is_predicated);
+    ctx_log_vreg_write(ctx, rnum ^ 1, type, is_predicated);
+}
+
+static inline void ctx_log_qreg_write(DisasContext *ctx,
+                                      int rnum, bool is_predicated)
+{
+    ctx->qreg_log[ctx->qreg_log_idx] = rnum;
+    ctx->qreg_is_predicated[ctx->qreg_log_idx] = is_predicated;
+    ctx->qreg_log_idx++;
+}
+
 extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
 extern TCGv hex_pred[NUM_PREGS];
 extern TCGv hex_next_PC;
@@ -85,6 +140,12 @@ extern TCGv hex_dczero_addr;
 extern TCGv hex_llsc_addr;
 extern TCGv hex_llsc_val;
 extern TCGv_i64 hex_llsc_val_i64;
+extern TCGv hex_VRegs_updated;
+extern TCGv hex_QRegs_updated;
+extern TCGv hex_vstore_addr[VSTORES_MAX];
+extern TCGv hex_vstore_size[VSTORES_MAX];
+extern TCGv hex_vstore_pending[VSTORES_MAX];
 
+bool is_gather_store_insn(Insn *insn, Packet *pkt);
 void process_store(DisasContext *ctx, Packet *pkt, int slot_num);
 #endif
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 4a21fa5..d16ff74 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -165,6 +165,9 @@ static inline void gen_read_ctrl_reg(DisasContext *ctx, const int reg_num,
     } else if (reg_num == HEX_REG_QEMU_INSN_CNT) {
         tcg_gen_addi_tl(dest, hex_gpr[HEX_REG_QEMU_INSN_CNT],
                         ctx->num_insns);
+    } else if (reg_num == HEX_REG_QEMU_HVX_CNT) {
+        tcg_gen_addi_tl(dest, hex_gpr[HEX_REG_QEMU_HVX_CNT],
+                        ctx->num_hvx_insns);
     } else {
         tcg_gen_mov_tl(dest, hex_gpr[reg_num]);
     }
@@ -191,6 +194,12 @@ static inline void gen_read_ctrl_reg_pair(DisasContext *ctx, const int reg_num,
         tcg_gen_concat_i32_i64(dest, pkt_cnt, insn_cnt);
         tcg_temp_free(pkt_cnt);
         tcg_temp_free(insn_cnt);
+    } else if (reg_num == HEX_REG_QEMU_HVX_CNT) {
+        TCGv hvx_cnt = tcg_temp_new();
+        tcg_gen_addi_tl(hvx_cnt, hex_gpr[HEX_REG_QEMU_HVX_CNT],
+                        ctx->num_hvx_insns);
+        tcg_gen_concat_i32_i64(dest, hvx_cnt, hex_gpr[reg_num + 1]);
+        tcg_temp_free(hvx_cnt);
     } else {
         tcg_gen_concat_i32_i64(dest,
             hex_gpr[reg_num],
@@ -226,6 +235,9 @@ static inline void gen_write_ctrl_reg(DisasContext *ctx, int reg_num,
         if (reg_num == HEX_REG_QEMU_INSN_CNT) {
             ctx->num_insns = 0;
         }
+        if (reg_num == HEX_REG_QEMU_HVX_CNT) {
+            ctx->num_hvx_insns = 0;
+        }
     }
 }
 
@@ -247,6 +259,9 @@ static inline void gen_write_ctrl_reg_pair(DisasContext *ctx, int reg_num,
             ctx->num_packets = 0;
             ctx->num_insns = 0;
         }
+        if (reg_num == HEX_REG_QEMU_HVX_CNT) {
+            ctx->num_hvx_insns = 0;
+        }
     }
 }
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 4f05ce3..e33e39c 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -19,6 +19,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "tcg/tcg-op.h"
+#include "tcg/tcg-op-gvec.h"
 #include "exec/cpu_ldst.h"
 #include "exec/log.h"
 #include "internal.h"
@@ -47,11 +48,60 @@ TCGv hex_dczero_addr;
 TCGv hex_llsc_addr;
 TCGv hex_llsc_val;
 TCGv_i64 hex_llsc_val_i64;
+TCGv hex_VRegs_updated;
+TCGv hex_QRegs_updated;
+TCGv hex_vstore_addr[VSTORES_MAX];
+TCGv hex_vstore_size[VSTORES_MAX];
+TCGv hex_vstore_pending[VSTORES_MAX];
 
 static const char * const hexagon_prednames[] = {
   "p0", "p1", "p2", "p3"
 };
 
+intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
+                             int num, bool alloc_ok)
+{
+    intptr_t offset;
+
+    /* See if it is already allocated */
+    for (int i = 0; i < ctx->future_vregs_idx; i++) {
+        if (ctx->future_vregs_num[i] == regnum) {
+            return offsetof(CPUHexagonState, future_VRegs[i]);
+        }
+    }
+
+    g_assert(alloc_ok);
+    offset = offsetof(CPUHexagonState, future_VRegs[ctx->future_vregs_idx]);
+    for (int i = 0; i < num; i++) {
+        ctx->future_vregs_num[ctx->future_vregs_idx + i] = regnum++;
+    }
+    ctx->future_vregs_idx += num;
+    g_assert(ctx->future_vregs_idx <= VECTOR_TEMPS_MAX);
+    return offset;
+}
+
+intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
+                          int num, bool alloc_ok)
+{
+    intptr_t offset;
+
+    /* See if it is already allocated */
+    for (int i = 0; i < ctx->tmp_vregs_idx; i++) {
+        if (ctx->tmp_vregs_num[i] == regnum) {
+            return offsetof(CPUHexagonState, tmp_VRegs[i]);
+        }
+    }
+
+    g_assert(alloc_ok);
+    offset = offsetof(CPUHexagonState, tmp_VRegs[ctx->tmp_vregs_idx]);
+    for (int i = 0; i < num; i++) {
+        ctx->tmp_vregs_num[ctx->tmp_vregs_idx + i] = regnum++;
+    }
+    ctx->tmp_vregs_idx += num;
+    g_assert(ctx->tmp_vregs_idx <= VECTOR_TEMPS_MAX);
+    return offset;
+}
+
 static void gen_exception_raw(int excp)
 {
     gen_helper_raise_exception(cpu_env, tcg_constant_i32(excp));
@@ -63,6 +113,8 @@ static void gen_exec_counters(DisasContext *ctx)
                     hex_gpr[HEX_REG_QEMU_PKT_CNT], ctx->num_packets);
     tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
                     hex_gpr[HEX_REG_QEMU_INSN_CNT], ctx->num_insns);
+    tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_HVX_CNT],
+                    hex_gpr[HEX_REG_QEMU_HVX_CNT], ctx->num_hvx_insns);
 }
 
 static void gen_end_tb(DisasContext *ctx)
@@ -171,11 +223,19 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
     bitmap_zero(ctx->regs_written, TOTAL_PER_THREAD_REGS);
     ctx->preg_log_idx = 0;
     bitmap_zero(ctx->pregs_written, NUM_PREGS);
+    ctx->future_vregs_idx = 0;
+    ctx->tmp_vregs_idx = 0;
+    ctx->vreg_log_idx = 0;
+    bitmap_zero(ctx->vregs_updated_tmp, NUM_VREGS);
+    bitmap_zero(ctx->vregs_updated, NUM_VREGS);
+    bitmap_zero(ctx->vregs_select, NUM_VREGS);
+    ctx->qreg_log_idx = 0;
     for (i = 0; i < STORES_MAX; i++) {
         ctx->store_width[i] = 0;
     }
     tcg_gen_movi_tl(hex_pkt_has_store_s1, pkt->pkt_has_store_s1);
     ctx->s1_store_processed = false;
+    ctx->pre_commit = true;
 
     if (HEX_DEBUG) {
         /* Handy place to set a breakpoint before the packet executes */
@@ -197,6 +257,26 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
     if (need_pred_written(pkt)) {
         tcg_gen_movi_tl(hex_pred_written, 0);
     }
+
+    if (pkt->pkt_has_hvx) {
+        tcg_gen_movi_tl(hex_VRegs_updated, 0);
+        tcg_gen_movi_tl(hex_QRegs_updated, 0);
+    }
+}
+
+bool is_gather_store_insn(Insn *insn, Packet *pkt)
+{
+    if (GET_ATTRIB(insn->opcode, A_CVI_NEW) &&
+        insn->new_value_producer_slot == 1) {
+        /* Look for gather instruction */
+        for (int i = 0; i < pkt->num_insns; i++) {
+            Insn *in = &pkt->insn[i];
+            if (GET_ATTRIB(in->opcode, A_CVI_GATHER) && in->slot == 1) {
+                return true;
+            }
+        }
+    }
+    return false;
 }
 
 /*
@@ -445,10 +525,102 @@ static void process_dczeroa(DisasContext *ctx, Packet *pkt)
     }
 }
 
+static bool pkt_has_hvx_store(Packet *pkt)
+{
+    int i;
+    for (i = 0; i < pkt->num_insns; i++) {
+        int opcode = pkt->insn[i].opcode;
+        if (GET_ATTRIB(opcode, A_CVI) && GET_ATTRIB(opcode, A_STORE)) {
+            return true;
+        }
+    }
+    return false;
+}
+
+static void gen_commit_hvx(DisasContext *ctx, Packet *pkt)
+{
+    int i;
+
+    /*
+     *    for (i = 0; i < ctx->vreg_log_idx; i++) {
+     *        int rnum = ctx->vreg_log[i];
+     *        if (ctx->vreg_is_predicated[i]) {
+     *            if (env->VRegs_updated & (1 << rnum)) {
+     *                env->VRegs[rnum] = env->future_VRegs[rnum];
+     *            }
+     *        } else {
+     *            env->VRegs[rnum] = env->future_VRegs[rnum];
+     *        }
+     *    }
+     */
+    for (i = 0; i < ctx->vreg_log_idx; i++) {
+        int rnum = ctx->vreg_log[i];
+        bool is_predicated = ctx->vreg_is_predicated[i];
+        intptr_t dstoff = offsetof(CPUHexagonState, VRegs[rnum]);
+        intptr_t srcoff = ctx_future_vreg_off(ctx, rnum, 1, false);
+        size_t size = sizeof(MMVector);
+
+        if (is_predicated) {
+            TCGv cmp = tcg_temp_local_new();
+            TCGLabel *label_skip = gen_new_label();
+
+            tcg_gen_andi_tl(cmp, hex_VRegs_updated, 1 << rnum);
+            tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
+            {
+                tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
+            }
+            gen_set_label(label_skip);
+            tcg_temp_free(cmp);
+        } else {
+            tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
+        }
+    }
+
+    /*
+     *    for (i = 0; i < ctx->qreg_log_idx; i++) {
+     *        int rnum = ctx->qreg_log[i];
+     *        if (ctx->qreg_is_predicated[i]) {
+     *            if (env->QRegs_updated & (1 << rnum)) {
+     *                env->QRegs[rnum] = env->future_QRegs[rnum];
+     *            }
+     *        } else {
+     *            env->QRegs[rnum] = env->future_QRegs[rnum];
+     *        }
+     *    }
+     */
+    for (i = 0; i < ctx->qreg_log_idx; i++) {
+        int rnum = ctx->qreg_log[i];
+        bool is_predicated = ctx->qreg_is_predicated[i];
+        intptr_t dstoff = offsetof(CPUHexagonState, QRegs[rnum]);
+        intptr_t srcoff = offsetof(CPUHexagonState, future_QRegs[rnum]);
+        size_t size = sizeof(MMQReg);
+
+        if (is_predicated) {
+            TCGv cmp = tcg_temp_local_new();
+            TCGLabel *label_skip = gen_new_label();
+
+            tcg_gen_andi_tl(cmp, hex_QRegs_updated, 1 << rnum);
+            tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
+            {
+                tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
+            }
+            gen_set_label(label_skip);
+            tcg_temp_free(cmp);
+        } else {
+            tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
+        }
+    }
+
+    if (pkt_has_hvx_store(pkt)) {
+        gen_helper_commit_hvx_stores(cpu_env);
+    }
+}
+
 static void update_exec_counters(DisasContext *ctx, Packet *pkt)
 {
     int num_insns = pkt->num_insns;
     int num_real_insns = 0;
+    int num_hvx_insns = 0;
 
     for (int i = 0; i < num_insns; i++) {
         if (!pkt->insn[i].is_endloop &&
@@ -456,13 +628,18 @@ static void update_exec_counters(DisasContext *ctx, Packet *pkt)
             !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
             num_real_insns++;
         }
+        if (GET_ATTRIB(pkt->insn[i].opcode, A_CVI)) {
+            num_hvx_insns++;
+        }
     }
 
     ctx->num_packets++;
     ctx->num_insns += num_real_insns;
+    ctx->num_hvx_insns += num_hvx_insns;
 }
 
-static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
+static void gen_commit_packet(CPUHexagonState *env, DisasContext *ctx,
+                              Packet *pkt)
 {
     /*
      * If there is more than one store in a packet, make sure they are all OK
@@ -471,6 +648,10 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
      * dczeroa has to be the only store operation in the packet, so we go
      * ahead and process that first.
      *
+     * When there is an HVX store, there can also be a scalar store in either
+     * slot 0 or slot 1, so we create a mask for the helper to indicate what
+     * work to do.
+     *
      * When there are two scalar stores, we probe the one in slot 0.
      *
      * Note that we don't call the probe helper for packets with only one
@@ -479,13 +660,35 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
      */
     bool has_store_s0 = pkt->pkt_has_store_s0;
     bool has_store_s1 = (pkt->pkt_has_store_s1 && !ctx->s1_store_processed);
+    bool has_hvx_store = pkt_has_hvx_store(pkt);
     if (pkt->pkt_has_dczeroa) {
         /*
          * The dczeroa will be the store in slot 0, check that we don't have
-         * a store in slot 1.
+         * a store in slot 1 or an HVX store.
          */
-        g_assert(has_store_s0 && !has_store_s1);
+        g_assert(has_store_s0 && !has_store_s1 && !has_hvx_store);
         process_dczeroa(ctx, pkt);
+    } else if (has_hvx_store) {
+        TCGv mem_idx = tcg_constant_tl(ctx->mem_idx);
+
+        if (!has_store_s0 && !has_store_s1) {
+            gen_helper_probe_hvx_stores(cpu_env, mem_idx);
+        } else {
+            int mask = 0;
+            TCGv mask_tcgv;
+
+            if (has_store_s0) {
+                mask |= (1 << 0);
+            }
+            if (has_store_s1) {
+                mask |= (1 << 1);
+            }
+            if (has_hvx_store) {
+                mask |= (1 << 2);
+            }
+            mask_tcgv = tcg_constant_tl(mask);
+            gen_helper_probe_pkt_scalar_hvx_stores(cpu_env, mask_tcgv, mem_idx);
+        }
     } else if (has_store_s0 && has_store_s1) {
         /*
          * process_store_log will execute the slot 1 store first,
@@ -500,6 +703,9 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
 
     gen_reg_writes(ctx);
     gen_pred_writes(ctx, pkt);
+    if (pkt->pkt_has_hvx) {
+        gen_commit_hvx(ctx, pkt);
+    }
     update_exec_counters(ctx, pkt);
     if (HEX_DEBUG) {
         TCGv has_st0 =
@@ -511,6 +717,11 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
         gen_helper_debug_commit_end(cpu_env, has_st0, has_st1);
     }
 
+    if (pkt->vhist_insn != NULL) {
+        ctx->pre_commit = false;
+        pkt->vhist_insn->generate(env, ctx, pkt->vhist_insn, pkt);
+    }
+
     if (pkt->pkt_has_cof) {
         gen_end_tb(ctx);
     }
@@ -535,7 +746,7 @@ static void decode_and_translate_packet(CPUHexagonState *env, DisasContext *ctx)
         for (i = 0; i < pkt.num_insns; i++) {
             gen_insn(env, ctx, &pkt.insn[i], &pkt);
         }
-        gen_commit_packet(ctx, &pkt);
+        gen_commit_packet(env, ctx, &pkt);
         ctx->base.pc_next += pkt.encod_pkt_size_in_bytes;
     } else {
         gen_exception_end_tb(ctx, HEX_EXCP_INVALID_PACKET);
@@ -550,6 +761,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
     ctx->mem_idx = MMU_USER_IDX;
     ctx->num_packets = 0;
     ctx->num_insns = 0;
+    ctx->num_hvx_insns = 0;
 }
 
 static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
@@ -658,6 +870,9 @@ static char store_addr_names[STORES_MAX][NAME_LEN];
 static char store_width_names[STORES_MAX][NAME_LEN];
 static char store_val32_names[STORES_MAX][NAME_LEN];
 static char store_val64_names[STORES_MAX][NAME_LEN];
+static char vstore_addr_names[VSTORES_MAX][NAME_LEN];
+static char vstore_size_names[VSTORES_MAX][NAME_LEN];
+static char vstore_pending_names[VSTORES_MAX][NAME_LEN];
 
 void hexagon_translate_init(void)
 {
@@ -720,6 +935,10 @@ void hexagon_translate_init(void)
         offsetof(CPUHexagonState, llsc_val), "llsc_val");
     hex_llsc_val_i64 = tcg_global_mem_new_i64(cpu_env,
         offsetof(CPUHexagonState, llsc_val_i64), "llsc_val_i64");
+    hex_VRegs_updated = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, VRegs_updated), "VRegs_updated");
+    hex_QRegs_updated = tcg_global_mem_new(cpu_env,
+        offsetof(CPUHexagonState, QRegs_updated), "QRegs_updated");
     for (i = 0; i < STORES_MAX; i++) {
         snprintf(store_addr_names[i], NAME_LEN, "store_addr_%d", i);
         hex_store_addr[i] = tcg_global_mem_new(cpu_env,
@@ -741,4 +960,20 @@ void hexagon_translate_init(void)
             offsetof(CPUHexagonState, mem_log_stores[i].data64),
             store_val64_names[i]);
     }
+    for (int i = 0; i < VSTORES_MAX; i++) {
+        snprintf(vstore_addr_names[i], NAME_LEN, "vstore_addr_%d", i);
+        hex_vstore_addr[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, vstore[i].va),
+            vstore_addr_names[i]);
+
+        snprintf(vstore_size_names[i], NAME_LEN, "vstore_size_%d", i);
+        hex_vstore_size[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, vstore[i].size),
+            vstore_size_names[i]);
+
+        snprintf(vstore_pending_names[i], NAME_LEN, "vstore_pending_%d", i);
+        hex_vstore_pending[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPUHexagonState, vstore_pending[i]),
+            vstore_pending_names[i]);
+    }
 }
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
From: Taylor Simpson
To: qemu-devel@nongnu.org
Subject: [PATCH v4 13/30] Hexagon HVX (target/hexagon) helper overrides infrastructure
Date: Tue, 12 Oct 2021 05:10:51 -0500
Message-Id: <1634033468-23566-14-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com,
 richard.henderson@linaro.org, f4bug@amsat.org

Build the infrastructure to create overrides for HVX instructions.
We create a new empty file (gen_tcg_hvx.h) that will be populated in
subsequent patches.
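
The generator scripts read every overrides header the same way, so the
new header only has to be passed as one more argument. A minimal sketch
of that idea follows. It assumes overrides are declared as
"#define fGEN_TCG_<tag>(...)" lines; the function collect_overrides, the
tag EXAMPLE_insn and both sample header strings are hypothetical, and
this is not the actual hex_common.py code.

```python
import re

# Assumed convention: an override is a line of the form
#     #define fGEN_TCG_<tag>(...)
OVERRIDE_RE = re.compile(r"#define\s+fGEN_TCG_([A-Za-z0-9_]+)\(")


def collect_overrides(*header_texts):
    """Return the set of instruction tags overridden in any header text."""
    tags = set()
    for text in header_texts:
        for line in text.splitlines():
            match = OVERRIDE_RE.match(line.strip())
            if match:
                tags.add(match.group(1))
    return tags


# Hypothetical stand-ins for the contents of gen_tcg.h and gen_tcg_hvx.h.
gen_tcg_h = "#define fGEN_TCG_EXAMPLE_insn(SHORTCODE) SHORTCODE\n"
gen_tcg_hvx_h = (
    "#ifndef HEXAGON_GEN_TCG_HVX_H\n"
    "#define HEXAGON_GEN_TCG_HVX_H\n"
    "#endif\n"
)

# Both headers feed one shared set, mirroring the two
# read_overrides_file() calls added to each script.
overrides = collect_overrides(gen_tcg_h, gen_tcg_hvx_h)
```

Because the new header contains only its include guard at this point,
it contributes no tags, which is why this patch is not expected to
change the generated output.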

Signed-off-by: Taylor Simpson
Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Richard Henderson
---
 target/hexagon/gen_tcg_hvx.h        | 21 +++++++++++++++++++++
 target/hexagon/genptr.c             |  1 +
 target/hexagon/gen_helper_funcs.py  |  3 ++-
 target/hexagon/gen_helper_protos.py |  3 ++-
 target/hexagon/gen_tcg_funcs.py     |  3 ++-
 target/hexagon/meson.build          | 13 +++++++------
 6 files changed, 35 insertions(+), 9 deletions(-)
 create mode 100644 target/hexagon/gen_tcg_hvx.h

diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h
new file mode 100644
index 0000000..b5c6cad
--- /dev/null
+++ b/target/hexagon/gen_tcg_hvx.h
@@ -0,0 +1,21 @@
+/*
+ *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_GEN_TCG_HVX_H
+#define HEXAGON_GEN_TCG_HVX_H
+
+#endif
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index d16ff74..473438a 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -26,6 +26,7 @@
 #include "macros.h"
 #undef QEMU_GENERATE
 #include "gen_tcg.h"
+#include "gen_tcg_hvx.h"
 
 static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
 {
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index ac5ce10..a446c45 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -286,11 +286,12 @@ def main():
     hex_common.read_semantics_file(sys.argv[1])
     hex_common.read_attribs_file(sys.argv[2])
     hex_common.read_overrides_file(sys.argv[3])
+    hex_common.read_overrides_file(sys.argv[4])
     hex_common.calculate_attribs()
     tagregs = hex_common.get_tagregs()
     tagimms = hex_common.get_tagimms()
 
-    with open(sys.argv[4], 'w') as f:
+    with open(sys.argv[5], 'w') as f:
         for tag in hex_common.tags:
             ## Skip the priv instructions
             if ( "A_PRIV" in hex_common.attribdict[tag] ) :
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
index 229ef8d..3b4e993 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -135,11 +135,12 @@ def main():
     hex_common.read_semantics_file(sys.argv[1])
     hex_common.read_attribs_file(sys.argv[2])
     hex_common.read_overrides_file(sys.argv[3])
+    hex_common.read_overrides_file(sys.argv[4])
     hex_common.calculate_attribs()
     tagregs = hex_common.get_tagregs()
     tagimms = hex_common.get_tagimms()
 
-    with open(sys.argv[4], 'w') as f:
+    with open(sys.argv[5], 'w') as f:
         for tag in hex_common.tags:
             ## Skip the priv instructions
             if ( "A_PRIV" in hex_common.attribdict[tag] ) :
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index 48bcf89..5bee1ca 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -682,11 +682,12 @@ def main():
     hex_common.read_semantics_file(sys.argv[1])
     hex_common.read_attribs_file(sys.argv[2])
     hex_common.read_overrides_file(sys.argv[3])
+    hex_common.read_overrides_file(sys.argv[4])
     hex_common.calculate_attribs()
     tagregs = hex_common.get_tagregs()
     tagimms = hex_common.get_tagimms()
 
-    with open(sys.argv[4], 'w') as f:
+    with open(sys.argv[5], 'w') as f:
         f.write("#ifndef HEXAGON_TCG_FUNCS_H\n")
         f.write("#define HEXAGON_TCG_FUNCS_H\n\n")
 
diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build
index 0bfaa41..a35eb28 100644
--- a/target/hexagon/meson.build
+++ b/target/hexagon/meson.build
@@ -20,6 +20,7 @@ hexagon_ss = ss.source_set()
 hex_common_py = 'hex_common.py'
 attribs_def = meson.current_source_dir() / 'attribs_def.h.inc'
 gen_tcg_h = meson.current_source_dir() / 'gen_tcg.h'
+gen_tcg_hvx_h = meson.current_source_dir() / 'gen_tcg_hvx.h'
 
 #
 #  Step 1
@@ -63,8 +64,8 @@ helper_protos_generated = custom_target(
     'helper_protos_generated.h.inc',
     output: 'helper_protos_generated.h.inc',
     depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
-    command: [python, files('gen_helper_protos.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
+    command: [python, files('gen_helper_protos.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
 )
 hexagon_ss.add(helper_protos_generated)
 
@@ -72,8 +73,8 @@ tcg_funcs_generated = custom_target(
     'tcg_funcs_generated.c.inc',
     output: 'tcg_funcs_generated.c.inc',
     depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
-    command: [python, files('gen_tcg_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
+    command: [python, files('gen_tcg_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
 )
 hexagon_ss.add(tcg_funcs_generated)
 
@@ -90,8 +91,8 @@ helper_funcs_generated = custom_target(
     'helper_funcs_generated.c.inc',
     output: 'helper_funcs_generated.c.inc',
     depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
-    command: [python, files('gen_helper_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
+    command: [python, files('gen_helper_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
 )
 hexagon_ss.add(helper_funcs_generated)
 
-- 
2.7.4

From nobody Wed May 15 23:08:21 2024
ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg02-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 5B16C167F; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033493; x=1665569493; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Oau7xqaEz0ArC6cVJZqDdEH2oQSa8dL507+tY4f6SWQ=; b=jgNRw4KtDhG0GaXFBWpyhoW6Da8vCq3d6G4mOqJRl4XdXroBn9QgDnj4 q+s6in0zojIxkoVHmB+IW9G79KR+Y2jiwPs012mXGfekatDByTtz4URdq /cdxz6w7hdUE2Rq9R1KVMlMS8B3hULcLvlzs39osepj6glgVrzg9U7AdT E=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 14/30] Hexagon HVX (target/hexagon) helper overrides for histogram instructions Date: Tue, 12 Oct 2021 05:10:52 -0500 Message-Id: <1634033468-23566-15-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: 
qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634033991988100001 Signed-off-by: Taylor Simpson Reviewed-by: Richard Henderson --- target/hexagon/gen_tcg_hvx.h | 106 +++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 106 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index b5c6cad..a560504 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -18,4 +18,110 @@ #ifndef HEXAGON_GEN_TCG_HVX_H #define HEXAGON_GEN_TCG_HVX_H =20 +/* + * Histogram instructions + * + * Note that these instructions operate directly on the vector registers + * and therefore happen after commit. 
+ * + * The generate_ function is called twice + * The first time is during the normal TCG generation + * ctx->pre_commit is true + * In the masked cases, we save the mask to the qtmp temporary + * Otherwise, there is nothing to do + * The second call is at the end of gen_commit_packet + * ctx->pre_commit is false + * Generate the call to the helper + */ + +static inline void assert_vhist_tmp(DisasContext *ctx) +{ + /* vhist instructions require exactly one .tmp to be defined */ + g_assert(ctx->tmp_vregs_idx =3D=3D 1); +} + +#define fGEN_TCG_V6_vhist(SHORTCODE) \ + if (!ctx->pre_commit) { \ + assert_vhist_tmp(ctx); \ + gen_helper_vhist(cpu_env); \ + } +#define fGEN_TCG_V6_vhistq(SHORTCODE) \ + do { \ + if (ctx->pre_commit) { \ + intptr_t dstoff =3D offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + } else { \ + assert_vhist_tmp(ctx); \ + gen_helper_vhistq(cpu_env); \ + } \ + } while (0) +#define fGEN_TCG_V6_vwhist256(SHORTCODE) \ + if (!ctx->pre_commit) { \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist256(cpu_env); \ + } +#define fGEN_TCG_V6_vwhist256q(SHORTCODE) \ + do { \ + if (ctx->pre_commit) { \ + intptr_t dstoff =3D offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + } else { \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist256q(cpu_env); \ + } \ + } while (0) +#define fGEN_TCG_V6_vwhist256_sat(SHORTCODE) \ + if (!ctx->pre_commit) { \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist256_sat(cpu_env); \ + } +#define fGEN_TCG_V6_vwhist256q_sat(SHORTCODE) \ + do { \ + if (ctx->pre_commit) { \ + intptr_t dstoff =3D offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + } else { \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist256q_sat(cpu_env); \ + } \ + } while (0) +#define fGEN_TCG_V6_vwhist128(SHORTCODE) \ + if (!ctx->pre_commit) { \ + 
assert_vhist_tmp(ctx); \ + gen_helper_vwhist128(cpu_env); \ + } +#define fGEN_TCG_V6_vwhist128q(SHORTCODE) \ + do { \ + if (ctx->pre_commit) { \ + intptr_t dstoff =3D offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + } else { \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist128q(cpu_env); \ + } \ + } while (0) +#define fGEN_TCG_V6_vwhist128m(SHORTCODE) \ + if (!ctx->pre_commit) { \ + TCGv tcgv_uiV =3D tcg_constant_tl(uiV); \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist128m(cpu_env, tcgv_uiV); \ + } +#define fGEN_TCG_V6_vwhist128qm(SHORTCODE) \ + do { \ + if (ctx->pre_commit) { \ + intptr_t dstoff =3D offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + } else { \ + TCGv tcgv_uiV =3D tcg_constant_tl(uiV); \ + assert_vhist_tmp(ctx); \ + gen_helper_vwhist128qm(cpu_env, tcgv_uiV); \ + } \ + } while (0) + + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034389048802.4678798226229; Tue, 12 Oct 2021 03:26:29 -0700 (PDT) Received: from localhost ([::1]:36378 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maEzL-0000KS-Ra for importer@patchew.org; Tue, 12 Oct 2021 06:26:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50444) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl4-0000bN-Vp for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:43 -0400 Received: from alexa-out-sd-02.qualcomm.com 
([199.106.114.39]:64100) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEl3-0007Fq-BE for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:42 -0400 Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg04-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 5DA6D168E; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033501; x=1665569501; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ObK4YZmk5896JVm5DYR19L0vZgkvxXdQAUzClPnIKF4=; b=o96OelHQHi4NECvR9RcRI7+SitlZazxTP4JVLu2nDByyD3UtIvTi6GTr q4c5KugE/qG3Kyf8vsL1PRAuNN2TttB8nAw2kxb2sGc15jPtezdo825kV K+a0Oy8kE9OayelPBN1k1wNmiLoDoQEDNrGFS23/IuFir5Ox/Z6QEu7Tz k=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 15/30] Hexagon HVX (target/hexagon) helper overrides - vector assign & cmov Date: Tue, 12 Oct 2021 05:10:53 -0500 Message-Id: <1634033468-23566-16-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 
X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034390583100001 Reviewed-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index a560504..916230e 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -124,4 +124,35 @@ static inline void assert_vhist_tmp(DisasContext *ctx) } while (0) =20 =20 +#define fGEN_TCG_V6_vassign(SHORTCODE) \ + tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +/* Vector conditional move */ +#define fGEN_TCG_VEC_CMOV(PRED) \ + do { \ + TCGv lsb =3D tcg_temp_new(); \ + TCGLabel *false_label =3D gen_new_label(); \ + TCGLabel *end_label =3D gen_new_label(); \ + tcg_gen_andi_tl(lsb, PsV, 1); \ + tcg_gen_brcondi_tl(TCG_COND_NE, lsb, PRED, false_label); \ + tcg_temp_free(lsb); \ + tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_gen_br(end_label); \ + gen_set_label(false_label); \ + tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \ + 1 << insn->slot); \ + gen_set_label(end_label); \ + } while (0) + + +/* Vector conditional move (true) */ +#define fGEN_TCG_V6_vcmov(SHORTCODE) \ + fGEN_TCG_VEC_CMOV(1) + +/* 
Vector conditional move (false) */ +#define fGEN_TCG_V6_vncmov(SHORTCODE) \ + fGEN_TCG_VEC_CMOV(0) + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16340351483231019.7549375052456; Tue, 12 Oct 2021 03:39:08 -0700 (PDT) Received: from localhost ([::1]:51088 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFBb-0002N9-5a for importer@patchew.org; Tue, 12 Oct 2021 06:39:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50404) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl3-0000bC-91 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:43 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12899) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEky-0007Vl-TV for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:41 -0400 Received: from unknown (HELO ironmsg02-sd.qualcomm.com) ([10.53.140.142]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg02-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 6008D16D7; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033496; x=1665569496; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=XC6rw9A7F+UcRCM9yFYGKw/EEw6WCO/AGuJcBv03rzs=; b=QIvMnRpocsGm9kssvhJ5mI5xaT1b3KfM3uO7GSbwcbqv9DhJPbEYHQnN W5XJeHdOayGMZq6x3l1jFuOUVw0A+CtbjKDn1vhH1Uyhvf1K1nuuXqINR FWs56hd3dJX883sTVzyTW2ejxXXQ0Ye5mzj8bdXtZvNDqQ6q7Eb/k1KSG Q=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 16/30] Hexagon HVX (target/hexagon) helper overrides - vector add & sub Date: Tue, 12 Oct 2021 05:10:54 -0500 Message-Id: <1634033468-23566-17-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035148942100001 Reviewed-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 50 ++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 50 
insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index 916230e..ac2143e 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -155,4 +155,54 @@ static inline void assert_vhist_tmp(DisasContext *ctx) #define fGEN_TCG_V6_vncmov(SHORTCODE) \ fGEN_TCG_VEC_CMOV(0) =20 +/* Vector add - various forms */ +#define fGEN_TCG_V6_vaddb(SHORTCODE) \ + tcg_gen_gvec_add(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vaddh(SHORTCODE) \ + tcg_gen_gvec_add(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vaddw(SHORTCODE) \ + tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vaddb_dv(SHORTCODE) \ + tcg_gen_gvec_add(MO_8, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + +#define fGEN_TCG_V6_vaddh_dv(SHORTCODE) \ + tcg_gen_gvec_add(MO_16, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + +#define fGEN_TCG_V6_vaddw_dv(SHORTCODE) \ + tcg_gen_gvec_add(MO_32, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + +/* Vector sub - various forms */ +#define fGEN_TCG_V6_vsubb(SHORTCODE) \ + tcg_gen_gvec_sub(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vsubh(SHORTCODE) \ + tcg_gen_gvec_sub(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vsubw(SHORTCODE) \ + tcg_gen_gvec_sub(MO_32, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vsubb_dv(SHORTCODE) \ + tcg_gen_gvec_sub(MO_8, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + +#define fGEN_TCG_V6_vsubh_dv(SHORTCODE) \ + tcg_gen_gvec_sub(MO_16, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + +#define fGEN_TCG_V6_vsubw_dv(SHORTCODE) \ +
tcg_gen_gvec_sub(MO_32, VddV_off, VuuV_off, VvvV_off, \ + sizeof(MMVector) * 2, sizeof(MMVector) * 2) + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035359629289.7461908186317; Tue, 12 Oct 2021 03:42:39 -0700 (PDT) Received: from localhost ([::1]:55108 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFF0-0005Em-Ib for importer@patchew.org; Tue, 12 Oct 2021 06:42:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50492) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElK-0000p9-FF for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:58 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64100) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElF-0007Fq-S5 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:56 -0400 Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg04-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 6281816E0; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033513; x=1665569513; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=l9oi6+pZhvwWghXNvuDv1dHkvzkuofqAFABK6p7kRmc=; b=IYhg+AUIgJKzkTgDwNnHm1T3W0O3iSq4RPylJU+PillK6VB6b+UmaHKm Wu1TjduY0IVZa0HafahMvpmg2vy3CVFp8gmNkEBrHTXKYXXpVfnTcfoW3 dynvDrLSXZmWn9SzU0nqIuhGLp6EzUXQMQ/taU+lM58UqfEibJMk0M0Ad o=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 17/30] Hexagon HVX (target/hexagon) helper overrides - vector shifts Date: Tue, 12 Oct 2021 05:10:55 -0500 Message-Id: <1634033468-23566-18-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035361129100001 Reviewed-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 122 +++++++++++++++++++++++++++++++++++++++= ++++ 1 
file changed, 122 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index ac2143e..e865410 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -205,4 +205,126 @@ static inline void assert_vhist_tmp(DisasContext *ctx) tcg_gen_gvec_sub(MO_32, VddV_off, VuuV_off, VvvV_off, \ sizeof(MMVector) * 2, sizeof(MMVector) * 2) =20 +/* Vector shift right - various forms */ +#define fGEN_TCG_V6_vasrh(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 15); \ + tcg_gen_gvec_sars(MO_16, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vasrh_acc(SHORTCODE) \ + do { \ + intptr_t tmpoff =3D offsetof(CPUHexagonState, vtmp); \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 15); \ + tcg_gen_gvec_sars(MO_16, tmpoff, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_gen_gvec_add(MO_16, VxV_off, VxV_off, tmpoff, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vasrw(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 31); \ + tcg_gen_gvec_sars(MO_32, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vasrw_acc(SHORTCODE) \ + do { \ + intptr_t tmpoff =3D offsetof(CPUHexagonState, vtmp); \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 31); \ + tcg_gen_gvec_sars(MO_32, tmpoff, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_gen_gvec_add(MO_32, VxV_off, VxV_off, tmpoff, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vlsrb(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 7); \ + tcg_gen_gvec_shrs(MO_8, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + 
tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vlsrh(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 15); \ + tcg_gen_gvec_shrs(MO_16, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vlsrw(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 31); \ + tcg_gen_gvec_shrs(MO_32, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +/* Vector shift left - various forms */ +#define fGEN_TCG_V6_vaslb(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 7); \ + tcg_gen_gvec_shls(MO_8, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vaslh(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 15); \ + tcg_gen_gvec_shls(MO_16, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vaslh_acc(SHORTCODE) \ + do { \ + intptr_t tmpoff =3D offsetof(CPUHexagonState, vtmp); \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 15); \ + tcg_gen_gvec_shls(MO_16, tmpoff, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_gen_gvec_add(MO_16, VxV_off, VxV_off, tmpoff, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vaslw(SHORTCODE) \ + do { \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 31); \ + tcg_gen_gvec_shls(MO_32, VdV_off, VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + +#define fGEN_TCG_V6_vaslw_acc(SHORTCODE) \ + do { \ + intptr_t tmpoff =3D offsetof(CPUHexagonState, vtmp); \ + TCGv shift =3D tcg_temp_new(); \ + tcg_gen_andi_tl(shift, RtV, 31); \ + tcg_gen_gvec_shls(MO_32, tmpoff, 
VuV_off, shift, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_gen_gvec_add(MO_32, VxV_off, VxV_off, tmpoff, \ + sizeof(MMVector), sizeof(MMVector)); \ + tcg_temp_free(shift); \ + } while (0) + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034298373457.01934779666647; Tue, 12 Oct 2021 03:24:58 -0700 (PDT) Received: from localhost ([::1]:35020 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maExt-0007qP-Ak for importer@patchew.org; Tue, 12 Oct 2021 06:24:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50368) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl0-0000aO-H0 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:40 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12894) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEky-0007HY-4x for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:37 -0400 Received: from unknown (HELO ironmsg02-sd.qualcomm.com) ([10.53.140.142]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg02-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 64E4316EA; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033496; x=1665569496; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NxDDV113CFZXsxo7zX18MC24YodgmP4D9k+eeDd23K4=; b=Ar1YcpXqLK3BSEb33puHP89dua2rAICFsaXqVCNjDwqMKKddDQJM7Fok 6jY8AJLzJKcdfkFx9+vlmjncxi94dkIIujcCD5UT7x5E/zsrWA1HrV6tR YcSeCbVEoPW7cajrn+bMYwgYlOS4SyLJCcP6wevNRPTrS/0wE39LymR2r E=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 18/30] Hexagon HVX (target/hexagon) helper overrides - vector max/min Date: Tue, 12 Oct 2021 05:10:56 -0500 Message-Id: <1634033468-23566-19-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034300226100001 Reviewed-by: Richard Henderson Signed-off-by: 
Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index e865410..f548404 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -327,4 +327,38 @@ static inline void assert_vhist_tmp(DisasContext *ctx) tcg_temp_free(shift); \ } while (0) =20 +/* Vector max - various forms */ +#define fGEN_TCG_V6_vmaxw(SHORTCODE) \ + tcg_gen_gvec_smax(MO_32, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vmaxh(SHORTCODE) \ + tcg_gen_gvec_smax(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vmaxuh(SHORTCODE) \ + tcg_gen_gvec_umax(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vmaxb(SHORTCODE) \ + tcg_gen_gvec_smax(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vmaxub(SHORTCODE) \ + tcg_gen_gvec_umax(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +/* Vector min - various forms */ +#define fGEN_TCG_V6_vminw(SHORTCODE) \ + tcg_gen_gvec_smin(MO_32, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vminh(SHORTCODE) \ + tcg_gen_gvec_smin(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vminuh(SHORTCODE) \ + tcg_gen_gvec_umin(MO_16, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vminb(SHORTCODE) \ + tcg_gen_gvec_smin(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) +#define fGEN_TCG_V6_vminub(SHORTCODE) \ + tcg_gen_gvec_umin(MO_8, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035631590256.9487638588039; Tue, 12 Oct 2021 03:47:11 -0700 (PDT) Received: from localhost ([::1]:59692 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFJO-0008JP-2t for importer@patchew.org; Tue, 12 Oct 2021 06:47:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50446) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl6-0000bd-7S for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:44 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12894) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEl0-0007HY-Qe for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:43 -0400 Received: from unknown (HELO ironmsg02-sd.qualcomm.com) ([10.53.140.142]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:25 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg02-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 678A11712; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033498; x=1665569498; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=flPwfRbNTm91MihUfAOICNVj2XY+hfsyfSN2ovxtNcQ=; b=Faf9O8JQ08k+fdu/EeE6mi+MnC4n3tdN4GeXL4espAMTPBHoN0JSzaWt 57A5Y5Zq2+SVQMNTamAmgVOPwWuPgwVI0Op8xgAZwll46vOR/APS93WwF lZR8UfEU6zWO3FXxl/eZQmADwyB+IK5CzIrww/o1xHk/cjSvDN1waBEPZ w=; X-QCInternal: smtphost From: Taylor Simpson To: 
qemu-devel@nongnu.org Subject: [PATCH v4 19/30] Hexagon HVX (target/hexagon) helper overrides - vector logical ops Date: Tue, 12 Oct 2021 05:10:57 -0500 Message-Id: <1634033468-23566-20-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035633140100001 Signed-off-by: Taylor Simpson Reviewed-by: Richard Henderson --- target/hexagon/gen_tcg_hvx.h | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index f548404..f53a7f2 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -361,4 +361,46 @@ static inline void assert_vhist_tmp(DisasContext 
*ctx) tcg_gen_gvec_umin(MO_8, VdV_off, VuV_off, VvV_off, \ sizeof(MMVector), sizeof(MMVector)) +/* Vector logical ops */ +#define fGEN_TCG_V6_vxor(SHORTCODE) \ + tcg_gen_gvec_xor(MO_64, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vand(SHORTCODE) \ + tcg_gen_gvec_and(MO_64, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vor(SHORTCODE) \ + tcg_gen_gvec_or(MO_64, VdV_off, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vnot(SHORTCODE) \ + tcg_gen_gvec_not(MO_64, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +/* Q register logical ops */ +#define fGEN_TCG_V6_pred_or(SHORTCODE) \ + tcg_gen_gvec_or(MO_64, QdV_off, QsV_off, QtV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + +#define fGEN_TCG_V6_pred_and(SHORTCODE) \ + tcg_gen_gvec_and(MO_64, QdV_off, QsV_off, QtV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + +#define fGEN_TCG_V6_pred_xor(SHORTCODE) \ + tcg_gen_gvec_xor(MO_64, QdV_off, QsV_off, QtV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + +#define fGEN_TCG_V6_pred_or_n(SHORTCODE) \ + tcg_gen_gvec_orc(MO_64, QdV_off, QsV_off, QtV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + +#define fGEN_TCG_V6_pred_and_n(SHORTCODE) \ + tcg_gen_gvec_andc(MO_64, QdV_off, QsV_off, QtV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + +#define fGEN_TCG_V6_pred_not(SHORTCODE) \ + tcg_gen_gvec_not(MO_64, QdV_off, QsV_off, \ + sizeof(MMQReg), sizeof(MMQReg)) + #endif -- 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034969994191.39630796464985; Tue, 12 Oct 
2021 03:36:09 -0700 (PDT) Received: from localhost ([::1]:46662 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maF8j-0007iY-2U for importer@patchew.org; Tue, 12 Oct 2021 06:36:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50314) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEkx-0000ZQ-Bz for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:36 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12894) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEkv-0007HY-C4 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:35 -0400 Received: from unknown (HELO ironmsg05-sd.qualcomm.com) ([10.53.140.145]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg05-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 69F15173D; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033493; x=1665569493; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tB0sq2Wz2SnBky+DemfXZe2wTSkBuwINTBdlRVp5alQ=; b=lCO6P5nm8xH7gOKJyGAtkY7HiIKKkNLZnYwvjA3+tHBth5jP9zR8DlN6 cA4PoGl9jbl1HxEhGVWHR4g1wWfZdfEvUGcRIpu0ItkRYVrJ3wQT8Jtia UJKIlclOS9i0kzKEMQu8u+L0ozARDXM53rPom57YUuKngCpnP0R3VXAOt o=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 20/30] Hexagon HVX (target/hexagon) helper overrides - vector compares Date: Tue, 12 Oct 2021 05:10:58 -0500 Message-Id: <1634033468-23566-21-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: 
<1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034970320100002 Reviewed-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 103 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 103 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index f53a7f2..32f8e20 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -403,4 +403,107 @@ static inline void assert_vhist_tmp(DisasContext *ctx) tcg_gen_gvec_not(MO_64, QdV_off, QsV_off, \ sizeof(MMQReg), sizeof(MMQReg)) +/* Vector compares */ +#define fGEN_TCG_VEC_CMP(COND, TYPE, SIZE) \ + do { \ + intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \ + tcg_gen_gvec_cmp(COND, TYPE, tmpoff, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + 
vec_to_qvec(SIZE, QdV_off, tmpoff); \ + } while (0) + +#define fGEN_TCG_V6_vgtw(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_32, 4) +#define fGEN_TCG_V6_vgth(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_16, 2) +#define fGEN_TCG_V6_vgtb(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_8, 1) + +#define fGEN_TCG_V6_vgtuw(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_32, 4) +#define fGEN_TCG_V6_vgtuh(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_16, 2) +#define fGEN_TCG_V6_vgtub(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_8, 1) + +#define fGEN_TCG_V6_veqw(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_32, 4) +#define fGEN_TCG_V6_veqh(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_16, 2) +#define fGEN_TCG_V6_veqb(SHORTCODE) \ + fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_8, 1) + +#define fGEN_TCG_VEC_CMP_OP(COND, TYPE, SIZE, OP) \ + do { \ + intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \ + intptr_t qoff = offsetof(CPUHexagonState, qtmp); \ + tcg_gen_gvec_cmp(COND, TYPE, tmpoff, VuV_off, VvV_off, \ + sizeof(MMVector), sizeof(MMVector)); \ + vec_to_qvec(SIZE, qoff, tmpoff); \ + OP(MO_64, QxV_off, QxV_off, qoff, sizeof(MMQReg), sizeof(MMQReg)); \ + } while (0) + +#define fGEN_TCG_V6_vgtw_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgtw_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgtw_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_vgtuw_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgtuw_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgtuw_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_vgth_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgth_or(SHORTCODE) \ + 
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgth_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_vgtuh_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgtuh_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgtuh_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_vgtb_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgtb_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgtb_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_vgtub_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_and) +#define fGEN_TCG_V6_vgtub_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_or) +#define fGEN_TCG_V6_vgtub_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_veqw_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_and) +#define fGEN_TCG_V6_veqw_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_or) +#define fGEN_TCG_V6_veqw_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_veqh_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_and) +#define fGEN_TCG_V6_veqh_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_or) +#define fGEN_TCG_V6_veqh_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_xor) + +#define fGEN_TCG_V6_veqb_and(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_and) +#define fGEN_TCG_V6_veqb_or(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_or) +#define 
fGEN_TCG_V6_veqb_xor(SHORTCODE) \ + fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_xor) + #endif -- 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034969981535.3078450132905; Tue, 12 Oct 2021 03:36:09 -0700 (PDT) Received: from localhost ([::1]:46610 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maF8i-0007gY-VG for importer@patchew.org; Tue, 12 Oct 2021 06:36:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50494) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElK-0000pS-HT for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:58 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64084) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElH-0006yI-Ph for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:58 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 6C4561750; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033515; x=1665569515; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=J1q6N7eroWTogI5P3k1kl74rt4nIu/OpnQAPpp9LGg8=; b=o4eBk01xNM2VEGx7s9HbaPzopNAF3VyTRr6HwzY1/BOSps3kp7+kwLms abMARaqIiNuHEZiqIUj+GK0waHbZCBTql5jkrN3DZ0JormzYx3goAE4WZ rW5bDH4KbGiOMlOgkdW8HIPnblEI2Lbafyao6xpYM4TMaA/kYyeJs2J+3 0=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 21/30] Hexagon HVX (target/hexagon) helper overrides - vector splat and abs Date: Tue, 12 Oct 2021 05:10:59 -0500 Message-Id: <1634033468-23566-22-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034970316100001 Reviewed-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 26 ++++++++++++++++++++++++++ 1 file changed, 
26 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index 32f8e20..435c7b5 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -506,4 +506,30 @@ static inline void assert_vhist_tmp(DisasContext *ctx) #define fGEN_TCG_V6_veqb_xor(SHORTCODE) \ fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_xor) +/* Vector splat - various forms */ +#define fGEN_TCG_V6_lvsplatw(SHORTCODE) \ + tcg_gen_gvec_dup_i32(MO_32, VdV_off, \ + sizeof(MMVector), sizeof(MMVector), RtV) + +#define fGEN_TCG_V6_lvsplath(SHORTCODE) \ + tcg_gen_gvec_dup_i32(MO_16, VdV_off, \ + sizeof(MMVector), sizeof(MMVector), RtV) + +#define fGEN_TCG_V6_lvsplatb(SHORTCODE) \ + tcg_gen_gvec_dup_i32(MO_8, VdV_off, \ + sizeof(MMVector), sizeof(MMVector), RtV) + +/* Vector absolute value - various forms */ +#define fGEN_TCG_V6_vabsb(SHORTCODE) \ + tcg_gen_gvec_abs(MO_8, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vabsh(SHORTCODE) \ + tcg_gen_gvec_abs(MO_16, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)) + +#define fGEN_TCG_V6_vabsw(SHORTCODE) \ + tcg_gen_gvec_abs(MO_32, VdV_off, VuV_off, \ + sizeof(MMVector), sizeof(MMVector)) + #endif -- 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035360689596.7195798832075; Tue, 12 Oct 2021 03:42:40 -0700 (PDT) Received: from localhost ([::1]:55152 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFF1-0005GZ-HK for importer@patchew.org; Tue, 12 Oct 2021 06:42:39 -0400 Received: from eggs.gnu.org 
([2001:470:142:3::10]:50334) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEky-0000a7-Jx for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:39 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12899) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEkv-0007Vl-Vd for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:36 -0400 Received: from unknown (HELO ironmsg05-sd.qualcomm.com) ([10.53.140.145]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg05-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:23 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 6EEE81754; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033494; x=1665569494; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aqvDx7m+N5fY/xFj3g5vlcg+BttOAvYjgqws3lvH7m8=; b=sYzSqN//UnlCtsfb7rlmJTWT7uVHCwfRCHwfSjYuaOT1sEWBlEZnPMsw DCQXEAhb1PPtbhePo0zTphQoWqi0qvcTB7g4bZ2lWVIAxFurk8i6m6UIO Iznfqx43g3RGq2+Z8V/ZqXT+mONxBRmzsyCN+XIbxrASvyhBG0ZJ/MWpi w=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 22/30] Hexagon HVX (target/hexagon) helper overrides - vector loads Date: Tue, 12 Oct 2021 05:11:00 -0500 Message-Id: <1634033468-23566-23-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035361131100002 Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 150 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 150 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index 435c7b5..2d1d778 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -532,4 +532,154 @@ static inline void assert_vhist_tmp(DisasContext *ctx) tcg_gen_gvec_abs(MO_32, VdV_off, VuV_off, \ sizeof(MMVector), sizeof(MMVector)) +/* Vector loads */ +#define fGEN_TCG_V6_vL32b_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32Ub_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_cur_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_tmp_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_cur_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_tmp_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32Ub_ai(SHORTCODE) SHORTCODE 
+#define fGEN_TCG_V6_vL32b_cur_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_tmp_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_cur_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_tmp_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32Ub_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_cur_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_tmp_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_cur_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vL32b_nt_tmp_ppu(SHORTCODE) SHORTCODE + +/* Predicated vector loads */ +#define fGEN_TCG_PRED_VEC_LOAD(GET_EA, PRED, DSTOFF, INC) \ + do { \ + TCGv LSB = tcg_temp_new(); \ + TCGLabel *false_label = gen_new_label(); \ + TCGLabel *end_label = gen_new_label(); \ + GET_EA; \ + PRED; \ + tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \ + tcg_temp_free(LSB); \ + gen_vreg_load(ctx, DSTOFF, EA, true); \ + INC; \ + tcg_gen_br(end_label); \ + gen_set_label(false_label); \ + tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \ + 1 << insn->slot); \ + gen_set_label(end_label); \ + } while (0) + +#define fGEN_TCG_PRED_VEC_LOAD_pred_pi \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + VdV_off, \ + fPM_I(RxV, siV * sizeof(MMVector))) +#define fGEN_TCG_PRED_VEC_LOAD_npred_pi \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + VdV_off, \ + fPM_I(RxV, siV * sizeof(MMVector))) + +#define fGEN_TCG_V6_vL32b_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define fGEN_TCG_V6_vL32b_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi +#define fGEN_TCG_V6_vL32b_cur_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define fGEN_TCG_V6_vL32b_cur_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi +#define fGEN_TCG_V6_vL32b_tmp_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define 
fGEN_TCG_V6_vL32b_tmp_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi +#define fGEN_TCG_V6_vL32b_nt_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define fGEN_TCG_V6_vL32b_nt_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi +#define fGEN_TCG_V6_vL32b_nt_cur_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define fGEN_TCG_V6_vL32b_nt_cur_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi +#define fGEN_TCG_V6_vL32b_nt_tmp_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_pi +#define fGEN_TCG_V6_vL32b_nt_tmp_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_pi + +#define fGEN_TCG_PRED_VEC_LOAD_pred_ai \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + VdV_off, \ + do {} while (0)) +#define fGEN_TCG_PRED_VEC_LOAD_npred_ai \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + VdV_off, \ + do {} while (0)) + +#define fGEN_TCG_V6_vL32b_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai +#define fGEN_TCG_V6_vL32b_cur_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_cur_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai +#define fGEN_TCG_V6_vL32b_tmp_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_tmp_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai +#define fGEN_TCG_V6_vL32b_nt_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_nt_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai +#define fGEN_TCG_V6_vL32b_nt_cur_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_nt_cur_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai +#define fGEN_TCG_V6_vL32b_nt_tmp_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ai +#define fGEN_TCG_V6_vL32b_nt_tmp_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ai + +#define 
fGEN_TCG_PRED_VEC_LOAD_pred_ppu \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + VdV_off, \ + fPM_M(RxV, MuV)) +#define fGEN_TCG_PRED_VEC_LOAD_npred_ppu \ + fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + VdV_off, \ + fPM_M(RxV, MuV)) + +#define fGEN_TCG_V6_vL32b_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu +#define fGEN_TCG_V6_vL32b_cur_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_cur_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu +#define fGEN_TCG_V6_vL32b_tmp_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_tmp_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu +#define fGEN_TCG_V6_vL32b_nt_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_nt_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu +#define fGEN_TCG_V6_vL32b_nt_cur_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_nt_cur_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu +#define fGEN_TCG_V6_vL32b_nt_tmp_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_pred_ppu +#define fGEN_TCG_V6_vL32b_nt_tmp_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_LOAD_npred_ppu + #endif -- 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634036043122412.524737315117; Tue, 12 Oct 2021 03:54:03 -0700 (PDT) Received: from localhost ([::1]:43028 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 
1maFQ1-0007vz-PJ for importer@patchew.org; Tue, 12 Oct 2021 06:54:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50542) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElU-0000zD-4K for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:10 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64080) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElR-0006xP-29 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:07 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 7165D1758; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033525; x=1665569525; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=c3PmDiq2iCDBy9vqHrDB/CwVaNF/C8dzdO+H7jLAqNE=; b=n1nHnOxCex+6eonSiKeDiIFne7i5g3v7vm1yQW7ZiK51hs3RtgqtRI1w kAVAXHbviyXWAcV8Fz3yrGu27cvW8cU8Iw+MD5R2aULjJCQ6gGjw89k1f RRsiPwm88aylf9gQXdP3M9RXFXaZYVWHW14u0bLlt7Csx5nIc6LkGaRZG 4=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 23/30] Hexagon HVX (target/hexagon) helper overrides - vector stores Date: Tue, 12 Oct 2021 05:11:01 -0500 Message-Id: <1634033468-23566-24-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable 
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -39 X-Spam_score: -4.0 X-Spam_bar: ---- X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UPPERCASE_50_75=0.008 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634036044220100003 Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/gen_tcg_hvx.h | 218 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 218 insertions(+) diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h index 2d1d778..cdcc938 100644 --- a/target/hexagon/gen_tcg_hvx.h +++ b/target/hexagon/gen_tcg_hvx.h @@ -682,4 +682,222 @@ static inline void assert_vhist_tmp(DisasContext *ctx) #define fGEN_TCG_V6_vL32b_nt_tmp_npred_ppu(SHORTCODE) \ fGEN_TCG_PRED_VEC_LOAD_npred_ppu +/* Vector stores */ +#define fGEN_TCG_V6_vS32b_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32Ub_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32Ub_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_ai(SHORTCODE) SHORTCODE +#define 
fGEN_TCG_V6_vS32b_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32Ub_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_ppu(SHORTCODE) SHORTCODE + +/* New value vector stores */ +#define fGEN_TCG_NEWVAL_VEC_STORE(GET_EA, INC) \ + do { \ + GET_EA; \ + gen_vreg_store(ctx, insn, pkt, EA, OsN_off, insn->slot, true); \ + INC; \ + } while (0) + +#define fGEN_TCG_NEWVAL_VEC_STORE_pi \ + fGEN_TCG_NEWVAL_VEC_STORE(fEA_REG(RxV), fPM_I(RxV, siV * sizeof(MMVect= or))) + +#define fGEN_TCG_V6_vS32b_new_pi(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_pi +#define fGEN_TCG_V6_vS32b_nt_new_pi(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_pi + +#define fGEN_TCG_NEWVAL_VEC_STORE_ai \ + fGEN_TCG_NEWVAL_VEC_STORE(fEA_RI(RtV, siV * sizeof(MMVector)), \ + do { } while (0)) + +#define fGEN_TCG_V6_vS32b_new_ai(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_ai +#define fGEN_TCG_V6_vS32b_nt_new_ai(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_ai + +#define fGEN_TCG_NEWVAL_VEC_STORE_ppu \ + fGEN_TCG_NEWVAL_VEC_STORE(fEA_REG(RxV), fPM_M(RxV, MuV)) + +#define fGEN_TCG_V6_vS32b_new_ppu(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_ppu +#define fGEN_TCG_V6_vS32b_nt_new_ppu(SHORTCODE) \ + fGEN_TCG_NEWVAL_VEC_STORE_ppu + +/* Predicated vector stores */ +#define fGEN_TCG_PRED_VEC_STORE(GET_EA, PRED, SRCOFF, ALIGN, INC) \ + do { \ + TCGv LSB =3D tcg_temp_new(); \ + TCGLabel *false_label =3D gen_new_label(); \ + TCGLabel *end_label =3D gen_new_label(); \ + GET_EA; \ + PRED; \ + tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \ + tcg_temp_free(LSB); \ + gen_vreg_store(ctx, insn, pkt, EA, SRCOFF, insn->slot, ALIGN); \ + INC; \ + tcg_gen_br(end_label); \ + gen_set_label(false_label); \ + tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \ + 1 << insn->slot); \ + gen_set_label(end_label); \ + } while (0) + +#define fGEN_TCG_PRED_VEC_STORE_pred_pi(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + VsV_off, ALIGN, \ + fPM_I(RxV, siV * sizeof(MMVector))) +#define 
fGEN_TCG_PRED_VEC_STORE_npred_pi(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + VsV_off, ALIGN, \ + fPM_I(RxV, siV * sizeof(MMVector))) +#define fGEN_TCG_PRED_VEC_STORE_new_pred_pi \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + OsN_off, true, \ + fPM_I(RxV, siV * sizeof(MMVector))) +#define fGEN_TCG_PRED_VEC_STORE_new_npred_pi \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + OsN_off, true, \ + fPM_I(RxV, siV * sizeof(MMVector))) + +#define fGEN_TCG_V6_vS32b_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_pi(true) +#define fGEN_TCG_V6_vS32b_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_pi(true) +#define fGEN_TCG_V6_vS32Ub_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_pi(false) +#define fGEN_TCG_V6_vS32Ub_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_pi(false) +#define fGEN_TCG_V6_vS32b_nt_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_pi(true) +#define fGEN_TCG_V6_vS32b_nt_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_pi(true) +#define fGEN_TCG_V6_vS32b_new_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_pi +#define fGEN_TCG_V6_vS32b_new_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_pi +#define fGEN_TCG_V6_vS32b_nt_new_pred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_pi +#define fGEN_TCG_V6_vS32b_nt_new_npred_pi(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_pi + +#define fGEN_TCG_PRED_VEC_STORE_pred_ai(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + VsV_off, ALIGN, \ + do { } while (0)) +#define fGEN_TCG_PRED_VEC_STORE_npred_ai(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + VsV_off, ALIGN, \ + do { } while (0)) +#define fGEN_TCG_PRED_VEC_STORE_new_pred_ai \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + OsN_off, true, \ + do { } while (0)) +#define fGEN_TCG_PRED_VEC_STORE_new_npred_ai \ + 
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_RI(RtV, siV * sizeof(MMVector)), \ + OsN_off, true, \ + do { } while (0)) + +#define fGEN_TCG_V6_vS32b_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ai(true) +#define fGEN_TCG_V6_vS32b_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_ai(true) +#define fGEN_TCG_V6_vS32Ub_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ai(false) +#define fGEN_TCG_V6_vS32Ub_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_ai(false) +#define fGEN_TCG_V6_vS32b_nt_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ai(true) +#define fGEN_TCG_V6_vS32b_nt_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_ai(true) +#define fGEN_TCG_V6_vS32b_new_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_ai +#define fGEN_TCG_V6_vS32b_new_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_ai +#define fGEN_TCG_V6_vS32b_nt_new_pred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_ai +#define fGEN_TCG_V6_vS32b_nt_new_npred_ai(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_ai + +#define fGEN_TCG_PRED_VEC_STORE_pred_ppu(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + VsV_off, ALIGN, \ + fPM_M(RxV, MuV)) +#define fGEN_TCG_PRED_VEC_STORE_npred_ppu(ALIGN) \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + VsV_off, ALIGN, \ + fPM_M(RxV, MuV)) +#define fGEN_TCG_PRED_VEC_STORE_new_pred_ppu \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \ + fEA_REG(RxV), \ + OsN_off, true, \ + fPM_M(RxV, MuV)) +#define fGEN_TCG_PRED_VEC_STORE_new_npred_ppu \ + fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \ + fEA_REG(RxV), \ + OsN_off, true, \ + fPM_M(RxV, MuV)) + +#define fGEN_TCG_V6_vS32b_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ppu(true) +#define fGEN_TCG_V6_vS32b_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_ppu(true) +#define fGEN_TCG_V6_vS32Ub_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ppu(false) +#define fGEN_TCG_V6_vS32Ub_npred_ppu(SHORTCODE) \ + 
fGEN_TCG_PRED_VEC_STORE_npred_ppu(false) +#define fGEN_TCG_V6_vS32b_nt_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_pred_ppu(true) +#define fGEN_TCG_V6_vS32b_nt_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_npred_ppu(true) +#define fGEN_TCG_V6_vS32b_new_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_ppu +#define fGEN_TCG_V6_vS32b_new_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_ppu +#define fGEN_TCG_V6_vS32b_nt_new_pred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_pred_ppu +#define fGEN_TCG_V6_vS32b_nt_new_npred_ppu(SHORTCODE) \ + fGEN_TCG_PRED_VEC_STORE_new_npred_ppu + +/* Masked vector stores */ +#define fGEN_TCG_V6_vS32b_qpred_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_qpred_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_qpred_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_qpred_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_qpred_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_qpred_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nqpred_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_nqpred_pi(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nqpred_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_nqpred_ai(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nqpred_ppu(SHORTCODE) SHORTCODE +#define fGEN_TCG_V6_vS32b_nt_nqpred_ppu(SHORTCODE) SHORTCODE + +/* Store release not modelled in qemu, but need to suppress compiler warni= ngs */ +#define fGEN_TCG_V6_vS32b_srls_pi(SHORTCODE) \ + do { \ + siV =3D siV; \ + } while (0) +#define fGEN_TCG_V6_vS32b_srls_ai(SHORTCODE) \ + do { \ + RtV =3D RtV; \ + siV =3D siV; \ + } while (0) +#define fGEN_TCG_V6_vS32b_srls_ppu(SHORTCODE) \ + do { \ + MuV =3D MuV; \ + } while (0) + #endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634036367453221.0863479697607; Tue, 12 Oct 2021 03:59:27 -0700 (PDT) Received: from localhost ([::1]:52046 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFVF-0005ml-F0 for importer@patchew.org; Tue, 12 Oct 2021 06:59:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50592) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEld-0001AS-9R for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:17 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64084) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElW-0006yI-84 for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:17 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 7480B1761; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033530; x=1665569530; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qEFhoOAD7bZUMMq5g/WVUKF8A3fVxbeFvpHKfXorNJI=; b=vYuAwK/3a7b54XeRQBqAAM5hieWcq6iIzHhHb8GqhIKk9XMRvbMmJxG4 VxITzeHcE7BZ/y1ThPegsou6rpu37wyakxuhAIi3z3ts+S4VKo8uIgxq/ IY45flj991eo6qxXyKa9ZzSGgQ9hjc62jWw66qB69lwQ5K+DnmBAmdWZH M=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 24/30] Hexagon HVX (target/hexagon) 
import semantics Date: Tue, 12 Oct 2021 05:11:02 -0500 Message-Id: <1634033468-23566-25-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634036369419100001 Imported from the Hexagon architecture library imported/allext.idef Top level file for all extensions imported/mmvec/ext.idef HVX instruction definitions Support functions added to target/hexagon/genptr.c Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/genptr.c | 172 +++ target/hexagon/imported/allext.idef | 25 + target/hexagon/imported/allidefs.def | 1 + target/hexagon/imported/mmvec/ext.idef | 2606 ++++++++++++++++++++++++++++= ++++ 4 files changed, 2804 insertions(+) create mode 100644 
target/hexagon/imported/allext.idef create mode 100644 target/hexagon/imported/mmvec/ext.idef diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c index 473438a..4419d30 100644 --- a/target/hexagon/genptr.c +++ b/target/hexagon/genptr.c @@ -19,11 +19,13 @@ #include "cpu.h" #include "internal.h" #include "tcg/tcg-op.h" +#include "tcg/tcg-op-gvec.h" #include "insn.h" #include "opcodes.h" #include "translate.h" #define QEMU_GENERATE /* Used internally by macros.h */ #include "macros.h" +#include "mmvec/macros.h" #undef QEMU_GENERATE #include "gen_tcg.h" #include "gen_tcg_hvx.h" @@ -462,5 +464,175 @@ static TCGv gen_8bitsof(TCGv result, TCGv value) return result; } =20 +static intptr_t vreg_src_off(DisasContext *ctx, int num) +{ + intptr_t offset =3D offsetof(CPUHexagonState, VRegs[num]); + + if (test_bit(num, ctx->vregs_select)) { + offset =3D ctx_future_vreg_off(ctx, num, 1, false); + } + if (test_bit(num, ctx->vregs_updated_tmp)) { + offset =3D ctx_tmp_vreg_off(ctx, num, 1, false); + } + return offset; +} + +static void gen_log_vreg_write(DisasContext *ctx, intptr_t srcoff, int num, + VRegWriteType type, int slot_num, + bool is_predicated) +{ + TCGLabel *label_end =3D NULL; + intptr_t dstoff; + + if (is_predicated) { + TCGv cancelled =3D tcg_temp_local_new(); + label_end =3D gen_new_label(); + + /* Don't do anything if the slot was cancelled */ + tcg_gen_extract_tl(cancelled, hex_slot_cancelled, slot_num, 1); + tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end); + tcg_temp_free(cancelled); + } + + if (type !=3D EXT_TMP) { + dstoff =3D ctx_future_vreg_off(ctx, num, 1, true); + tcg_gen_gvec_mov(MO_64, dstoff, srcoff, + sizeof(MMVector), sizeof(MMVector)); + tcg_gen_ori_tl(hex_VRegs_updated, hex_VRegs_updated, 1 << num); + } else { + dstoff =3D ctx_tmp_vreg_off(ctx, num, 1, false); + tcg_gen_gvec_mov(MO_64, dstoff, srcoff, + sizeof(MMVector), sizeof(MMVector)); + } + + if (is_predicated) { + gen_set_label(label_end); + } +} + +static void 
gen_log_vreg_write_pair(DisasContext *ctx, intptr_t srcoff, in= t num, + VRegWriteType type, int slot_num, + bool is_predicated) +{ + gen_log_vreg_write(ctx, srcoff, num ^ 0, type, slot_num, is_predicated= ); + srcoff +=3D sizeof(MMVector); + gen_log_vreg_write(ctx, srcoff, num ^ 1, type, slot_num, is_predicated= ); +} + +static void gen_log_qreg_write(intptr_t srcoff, int num, int vnew, + int slot_num, bool is_predicated) +{ + TCGLabel *label_end =3D NULL; + intptr_t dstoff; + + if (is_predicated) { + TCGv cancelled =3D tcg_temp_local_new(); + label_end =3D gen_new_label(); + + /* Don't do anything if the slot was cancelled */ + tcg_gen_extract_tl(cancelled, hex_slot_cancelled, slot_num, 1); + tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end); + tcg_temp_free(cancelled); + } + + dstoff =3D offsetof(CPUHexagonState, future_QRegs[num]); + tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMQReg), sizeof(MMQReg)= ); + + if (is_predicated) { + tcg_gen_ori_tl(hex_QRegs_updated, hex_QRegs_updated, 1 << num); + gen_set_label(label_end); + } +} + +static void gen_vreg_load(DisasContext *ctx, intptr_t dstoff, TCGv src, + bool aligned) +{ + TCGv_i64 tmp =3D tcg_temp_new_i64(); + if (aligned) { + tcg_gen_andi_tl(src, src, ~((int32_t)sizeof(MMVector) - 1)); + } + for (int i =3D 0; i < sizeof(MMVector) / 8; i++) { + tcg_gen_qemu_ld64(tmp, src, ctx->mem_idx); + tcg_gen_addi_tl(src, src, 8); + tcg_gen_st_i64(tmp, cpu_env, dstoff + i * 8); + } + tcg_temp_free_i64(tmp); +} + +static void gen_vreg_store(DisasContext *ctx, Insn *insn, Packet *pkt, + TCGv EA, intptr_t srcoff, int slot, bool aligne= d) +{ + intptr_t dstoff =3D offsetof(CPUHexagonState, vstore[slot].data); + intptr_t maskoff =3D offsetof(CPUHexagonState, vstore[slot].mask); + + if (is_gather_store_insn(insn, pkt)) { + TCGv sl =3D tcg_constant_tl(slot); + gen_helper_gather_store(cpu_env, EA, sl); + return; + } + + tcg_gen_movi_tl(hex_vstore_pending[slot], 1); + if (aligned) { + tcg_gen_andi_tl(hex_vstore_addr[slot], 
EA, + ~((int32_t)sizeof(MMVector) - 1)); + } else { + tcg_gen_mov_tl(hex_vstore_addr[slot], EA); + } + tcg_gen_movi_tl(hex_vstore_size[slot], sizeof(MMVector)); + + /* Copy the data to the vstore buffer */ + tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMVector), sizeof(MMVec= tor)); + /* Set the mask to all 1's */ + tcg_gen_gvec_dup_imm(MO_64, maskoff, sizeof(MMQReg), sizeof(MMQReg), ~= 0LL); +} + +static void gen_vreg_masked_store(DisasContext *ctx, TCGv EA, intptr_t src= off, + intptr_t bitsoff, int slot, bool invert) +{ + intptr_t dstoff =3D offsetof(CPUHexagonState, vstore[slot].data); + intptr_t maskoff =3D offsetof(CPUHexagonState, vstore[slot].mask); + + tcg_gen_movi_tl(hex_vstore_pending[slot], 1); + tcg_gen_andi_tl(hex_vstore_addr[slot], EA, + ~((int32_t)sizeof(MMVector) - 1)); + tcg_gen_movi_tl(hex_vstore_size[slot], sizeof(MMVector)); + + /* Copy the data to the vstore buffer */ + tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMVector), sizeof(MMVec= tor)); + /* Copy the mask */ + tcg_gen_gvec_mov(MO_64, maskoff, bitsoff, sizeof(MMQReg), sizeof(MMQRe= g)); + if (invert) { + tcg_gen_gvec_not(MO_64, maskoff, maskoff, + sizeof(MMQReg), sizeof(MMQReg)); + } +} + +static void vec_to_qvec(size_t size, intptr_t dstoff, intptr_t srcoff) +{ + TCGv_i64 tmp =3D tcg_temp_new_i64(); + TCGv_i64 word =3D tcg_temp_new_i64(); + TCGv_i64 bits =3D tcg_temp_new_i64(); + TCGv_i64 mask =3D tcg_temp_new_i64(); + TCGv_i64 zero =3D tcg_constant_i64(0); + TCGv_i64 ones =3D tcg_constant_i64(~0); + + for (int i =3D 0; i < sizeof(MMVector) / 8; i++) { + tcg_gen_ld_i64(tmp, cpu_env, srcoff + i * 8); + tcg_gen_movi_i64(mask, 0); + + for (int j =3D 0; j < 8; j +=3D size) { + tcg_gen_extract_i64(word, tmp, j * 8, size * 8); + tcg_gen_movcond_i64(TCG_COND_NE, bits, word, zero, ones, zero); + tcg_gen_deposit_i64(mask, mask, bits, j, size); + } + + tcg_gen_st8_i64(mask, cpu_env, dstoff + i); + } + tcg_temp_free_i64(tmp); + tcg_temp_free_i64(word); + tcg_temp_free_i64(bits); + 
tcg_temp_free_i64(mask); +} + #include "tcg_funcs_generated.c.inc" #include "tcg_func_table_generated.c.inc" diff --git a/target/hexagon/imported/allext.idef b/target/hexagon/imported/= allext.idef new file mode 100644 index 0000000..9d4b23e --- /dev/null +++ b/target/hexagon/imported/allext.idef @@ -0,0 +1,25 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/* + * Top level file for all instruction set extensions + */ +#define EXTNAME mmvec +#define EXTSTR "mmvec" +#include "mmvec/ext.idef" +#undef EXTNAME +#undef EXTSTR diff --git a/target/hexagon/imported/allidefs.def b/target/hexagon/imported= /allidefs.def index 2aace29..ee253b8 100644 --- a/target/hexagon/imported/allidefs.def +++ b/target/hexagon/imported/allidefs.def @@ -28,3 +28,4 @@ #include "shift.idef" #include "system.idef" #include "subinsns.idef" +#include "allext.idef" diff --git a/target/hexagon/imported/mmvec/ext.idef b/target/hexagon/import= ed/mmvec/ext.idef new file mode 100644 index 0000000..8ca5a60 --- /dev/null +++ b/target/hexagon/imported/mmvec/ext.idef @@ -0,0 +1,2606 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +/*************************************************************************= ***** + * + * HOYA: MULTI MEDIA INSTRUCITONS + * + *************************************************************************= *****/ + +#ifndef EXTINSN +#define EXTINSN Q6INSN +#define __SELF_DEF_EXTINSN 1 +#endif + +#ifndef NO_MMVEC + +#define DO_FOR_EACH_CODE(WIDTH, CODE) \ +{ \ + fHIDE(int i;) \ + fVFOREACH(WIDTH, i) {\ + CODE ;\ + } \ +} + + + + +#define ITERATOR_INSN_ANY_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + + + +#define ITERATOR_INSN2_ANY_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \ +ITERATOR_INSN_ANY_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA_DV), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + + +#define ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,= CODE) \ +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) + + +#define ITERATOR_INSN_SHIFT_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + + + +#define ITERATOR_INSN_SHIFT_SLOT_VV_LATE(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, 
ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_SHIFT_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \ +ITERATOR_INSN_SHIFT_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_PERMUTE_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \ +ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP), + + +#define ITERATOR_INSN2_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,COD= E) \ +ITERATOR_INSN_PERMUTE_SLOT_DEP(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE)= \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_DEP(WIDTH,TAG,SYNTAX,DESCR,C= ODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DE= SCR,CODE) \ +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_MPY_SLOT(WIDTH,TAG, SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, \ +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, \ +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_MPY_SLOT(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \ +ITERATOR_INSN_MPY_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN2_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX,SYNTAX2,DESCR,CODE)= \ +ITERATOR_INSN_MPY_SLOT_LATE(WIDTH,TAG, SYNTAX2,DESCR,CODE) + + +#define 
ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,= CODE) \ +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(WIDTH,TAG,SYNTAX2,DESCR,CODE) + + + + +#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC2(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR= ,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_CVI_VX_V= SRC0_IS_DST), DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_SLOT2_DOUBLE_VEC(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV,A_RESTRICT= _SLOT2ONLY), DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_VHISTLIKE(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT), \ +DESCR, fHIDE(mmvector_t input;) input =3D fTMPVDATA(); DO_FOR_EACH_CODE(WI= DTH, CODE)) + + + + + +/*************************************************************************= ***************** +* +* MMVECTOR MEMORY OPERATIONS - NO NAPALI V1 +* +**************************************************************************= *****************/ + + + +#define ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE= ) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX_DV), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,D= ESCR,CODE) \ +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) + + + +#define ITERATOR_INSN_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE= ) \ +ITERATOR_INSN_SHIFT_SLOT_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) + + +#define ITERATOR_INSN_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, 
SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,CODE) \ +ITERATOR_INSN_ANY_SLOT_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) + + +#define ITERATOR_INSN_MPY_SLOT_NOV1(WIDTH,TAG, SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, \ +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_PERMUTE_SLOT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_PERMUTE_SLOTT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DESCR,C= ODE) \ +ITERATOR_INSN_PERMUTE_SLOT(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_PERMUTE_SLOT_DEPT_NOV1(WIDTH,TAG,SYNTAX,DESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP), + + +#define ITERATOR_INSN2_PERMUTE_SLOT_DEPT_NOV1(WIDTH,TAG,SYNTAX,SYNTAX2,DES= CR,CODE) \ +ITERATOR_INSN_PERMUTE_SLOT_DEP_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,DESCR,= CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_DEPT_NOV1(WIDTH,TAG,SYNTAX,D= ESCR,CODE) \ +EXTINSN(V6_##TAG, SYNTAX, ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS), \ +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE)) + +#define ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX,SYNTA= X2,DESCR,CODE) \ +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC_NOV1(WIDTH,TAG,SYNTAX2,DESCR,CODE) + +#define NARROWING_SHIFT_NOV1(ITERSIZE,TAG,DSTM,DSTTYPE,SRCTYPE,SYNOPTS,SAT= FUNC,RNDFUNC,SHAMTMASK) \ +ITERATOR_INSN_SHIFT_SLOT_NOV1(ITERSIZE,TAG, \ +"Vd32." #DSTTYPE "=3Dvasr(Vu32." #SRCTYPE ",Vv32." 
#SRCTYPE ",Rt8)" #SYNOP= TS, \ +"Vector shift right and shuffle", \ + fHIDE(int )shamt =3D RtV & SHAMTMASK; \ + DSTM(0,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VvV.SRCTYPE[i],shamt) >> shamt))= ; \ + DSTM(1,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuV.SRCTYPE[i],shamt) >> shamt))) + +#define MMVEC_AVGS_NOV1(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32= ."#SRC")", "Vector Average "DESCR, = VdV.DEST[i] =3D fVAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE##rnd, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32= ."#SRC"):rnd", "Vector Average % Round"DESCR, = VdV.DEST[i] =3D fVAVGSRND( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vnavg##TYPE, "Vd3= 2=3Dvnavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvnavg(Vu32."#SRC",Vv3= 2."#SRC")", "Vector Negative Average "DESCR, = VdV.DEST[i] =3D fVNAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) + + #define MMVEC_AVGU_NOV1(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32.= "#SRC")", "Vector Average "DESCR, = VdV.DEST[i] =3D fVAVGU( WIDTH, VuV.SRC[i], VvV.SRC[i])) \ +ITERATOR_INSN2_ANY_SLOT_NOV1(WIDTH,vavg##TYPE##rnd, "Vd3= 2=3Dvavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=3Dvavg(Vu32."#SRC",Vv32.= "#SRC"):rnd", "Vector Average % Round"DESCR, = VdV.DEST[i] =3D fVAVGURND(WIDTH, VuV.SRC[i], VvV.SRC[i])) + + + +/*************************************************************************= ***************** +* +* MMVECTOR MEMORY OPERATIONS +* +**************************************************************************= *****************/ + +#define MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,BEH) \ +EXTINSN(V6_##TAG##_pi, SYNTAXA "(Rx32++#s3)" NT SYNTAXB,ATTRIB,DESCR,= { fEA_REG(RxV); BEH; fPM_I(RxV,VEC_SCALE(siV)); }) \ +EXTINSN(V6_##TAG##_ai, SYNTAXA "(Rt32+#s4)" 
NT SYNTAXB,ATTRIB,DESCR,{= fEA_RI(RtV,VEC_SCALE(siV)); BEH;}) \ +EXTINSN(V6_##TAG##_ppu, SYNTAXA "(Rx32++Mu2)" NT SYNTAXB,ATTRIB,DESCR= ,{ fEA_REG(RxV); BEH; fPM_M(RxV,MuV); }) \ + + +#define MMVEC_COND_EACH_EA_TRUE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAX= P,BEH) \ +EXTINSN(V6_##TAG##_pred_pi, "if (" #SYNTAXP "4) " SYNTAXA "(Rx32++#s3= )" NT SYNTAXB, ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_REG(RxV); BEH= ; fPM_I(RxV,siV*fVECSIZE()); } else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_pred_ai, "if (" #SYNTAXP "4) " SYNTAXA "(Rt32+#s4)= " NT SYNTAXB, ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_RI(RtV,siV*fV= ECSIZE()); BEH;} else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_pred_ppu, "if (" #SYNTAXP "4) " SYNTAXA "(Rx32++Mu2= )" NT SYNTAXB,ATTRIB,DESCR, { if (fLSBOLD(SYNTAXP##V)) { fEA_REG(RxV); BEH= ; fPM_M(RxV,MuV); } else {CANCEL;}}) \ + +#define MMVEC_COND_EACH_EA_FALSE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTA= XP,BEH) \ +EXTINSN(V6_##TAG##_npred_pi, "if (!" #SYNTAXP "4) " SYNTAXA "(Rx32++#s= 3)" NT SYNTAXB,ATTRIB,DESCR,{ if (fLSBOLDNOT(SYNTAXP##V)) { fEA_REG(RxV); B= EH; fPM_I(RxV,siV*fVECSIZE()); } else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_npred_ai, "if (!" #SYNTAXP "4) " SYNTAXA "(Rt32+#s4= )" NT SYNTAXB,ATTRIB,DESCR, { if (fLSBOLDNOT(SYNTAXP##V)) { fEA_RI(RtV,siV*= fVECSIZE()); BEH;} else {CANCEL;}}) \ +EXTINSN(V6_##TAG##_npred_ppu, "if (!" 
#SYNTAXP "4) " SYNTAXA "(Rx32++Mu= 2)" NT SYNTAXB,ATTRIB,DESCR,{ if (fLSBOLDNOT(SYNTAXP##V)) { fEA_REG(RxV); B= EH; fPM_M(RxV,MuV); } else {CANCEL;}}) + +#define MMVEC_COND_EACH_EA(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH= ) \ +MMVEC_COND_EACH_EA_TRUE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH) \ +MMVEC_COND_EACH_EA_FALSE(TAG,DESCR,ATTRIB,NT,SYNTAXA,SYNTAXB,SYNTAXP,BEH) + + +#define VEC_SCALE(X) X*fVECSIZE() + + +#define MMVEC_LD(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,"V= d32=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDC(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG##_cur,DESCR,ATTRI= B,NT,"Vd32.cur=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDT(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG##_tmp,DESCR,ATTRI= B,NT,"Vd32.tmp=3Dvmem","",fLOADMMV(EA,VdV)) +#define MMVEC_LDU(TAG,DESCR,ATTRIB,NT) MMVEC_EACH_EA(TAG,DESCR,ATTRIB,NT,"= Vd32=3Dvmemu","",fLOADMMVU(EA,VdV)) + + +#define MMVEC_STQ(TAG,DESCR,ATTRIB,NT) \ +MMVEC_EACH_EA(TAG##_qpred,DESCR,ATTRIB,NT,"if (Qv4) vmem","=3DVs32",fSTORE= MMVQ(EA,VsV,QvV)) \ +MMVEC_EACH_EA(TAG##_nqpred,DESCR,ATTRIB,NT,"if (!Qv4) vmem","=3DVs32",fSTO= REMMVNQ(EA,VsV,QvV)) + +/**************************************************************** +* MAPPING FOR VMEMs +****************************************************************/ + +#define ATTR_VMEM A_EXTENSION,A_CVI,A_CVI_VM +#define ATTR_VMEMU A_EXTENSION,A_CVI,A_CVI_VM,A_CVI_VP + + +MMVEC_LD(vL32b, "Aligned Vector Load", ATTRIBS(ATTR_VMEM,A_LOAD,A_= CVI_VA),) +MMVEC_LDC(vL32b, "Aligned Vector Load Cur", ATTRIBS(ATTR_VMEM,A_LOAD,A_CV= I_NEW,A_CVI_VA),) +MMVEC_LDT(vL32b, "Aligned Vector Load Tmp", ATTRIBS(ATTR_VMEM,A_LOAD,A_CV= I_TMP),) + +MMVEC_COND_EACH_EA(vL32b,"Conditional Aligned Vector Load",ATTRIBS(ATTR_VM= EM,A_LOAD,A_CVI_VA),,"Vd32=3Dvmem",,Pv,fLOADMMV(EA,VdV);) +MMVEC_COND_EACH_EA(vL32b_cur,"Conditional Aligned Vector Load Cur",ATTRIBS= (ATTR_VMEM,A_LOAD,A_CVI_VA,A_CVI_NEW),,"Vd32.cur=3Dvmem",,Pv,fLOADMMV(EA,Vd= V);) +MMVEC_COND_EACH_EA(vL32b_tmp,"Conditional 
Aligned Vector Load Tmp",ATTRIBS(ATTR_VMEM,A_LOAD,A_CVI_TMP),,"Vd32.tmp=vmem",,Pv,fLOADMMV(EA,VdV);)
+
+MMVEC_EACH_EA(vS32b,"Aligned Vector Store",ATTRIBS(ATTR_VMEM,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),,"vmem","=Vs32",fSTOREMMV(EA,VsV))
+MMVEC_COND_EACH_EA(vS32b,"Aligned Vector Store",ATTRIBS(ATTR_VMEM,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),,"vmem","=Vs32",Pv,fSTOREMMV(EA,VsV))
+
+
+MMVEC_STQ(vS32b, "Aligned Vector Store", ATTRIBS(ATTR_VMEM,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),)
+
+MMVEC_LDU(vL32Ub, "Unaligned Vector Load", ATTRIBS(ATTR_VMEMU,A_LOAD,A_RESTRICT_NOSLOT1),)
+
+MMVEC_EACH_EA(vS32Ub,"Unaligned Vector Store",ATTRIBS(ATTR_VMEMU,A_STORE,A_RESTRICT_NOSLOT1),,"vmemu","=Vs32",fSTOREMMVU(EA,VsV))
+
+MMVEC_COND_EACH_EA(vS32Ub,"Unaligned Vector Store",ATTRIBS(ATTR_VMEMU,A_STORE,A_RESTRICT_NOSLOT1),,"vmemu","=Vs32",Pv,fSTOREMMVU(EA,VsV))
+
+MMVEC_EACH_EA(vS32b_new,"Aligned Vector Store New",ATTRIBS(ATTR_VMEM,A_STORE,A_CVI_NEW,A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY),,"vmem","=Os8.new",fSTOREMMV(EA,fNEWVREG(OsN)))
+
+// V65 store release, zero byte store
+MMVEC_EACH_EA(vS32b_srls,"Aligned Vector Scatter Release",ATTRIBS(ATTR_VMEM,A_STORE,A_CVI_SCATTER_RELEASE,A_CVI_NEW,A_RESTRICT_SLOT0ONLY),,"vmem",":scatter_release",fSTORERELEASE(EA,0))
+
+
+
+MMVEC_COND_EACH_EA(vS32b_new,"Aligned Vector Store New",ATTRIBS(ATTR_VMEM,A_STORE,A_CVI_NEW,A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY),,"vmem","=Os8.new",Pv,fSTOREMMV(EA,fNEWVREG(OsN)))
+
+
+/******************************************************************************************
+*
+* MMVECTOR MEMORY OPERATIONS - NON TEMPORAL
+*
+*******************************************************************************************/
+
+#define ATTR_VMEM_NT A_EXTENSION,A_CVI,A_CVI_VM
+
+MMVEC_EACH_EA(vS32b_nt,"Aligned Vector Store - Non temporal",ATTRIBS(ATTR_VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt","vmem","=Vs32",fSTOREMMV(EA,VsV))
+MMVEC_COND_EACH_EA(vS32b_nt,"Aligned Vector
Store - Non temporal",ATTRIBS(= ATTR_VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt","vmem","=3DVs32",= Pv,fSTOREMMV(EA,VsV)) + +MMVEC_EACH_EA(vS32b_nt_new,"Aligned Vector Store New - Non temporal",ATTRI= BS(ATTR_VMEM_NT,A_STORE,A_CVI_NEW,A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY),":nt"= ,"vmem","=3DOs8.new",fSTOREMMV(EA,fNEWVREG(OsN))) +MMVEC_COND_EACH_EA(vS32b_nt_new,"Aligned Vector Store New - Non temporal",= ATTRIBS(ATTR_VMEM_NT,A_STORE,A_CVI_NEW,A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY),= ":nt","vmem","=3DOs8.new",Pv,fSTOREMMV(EA,fNEWVREG(OsN))) + + +MMVEC_STQ(vS32b_nt, "Aligned Vector Store - Non temporal", ATTRIBS(A= TTR_VMEM_NT,A_STORE,A_RESTRICT_SLOT0ONLY,A_CVI_VA),":nt") + +MMVEC_LD(vL32b_nt, "Aligned Vector Load - Non temporal", ATTRIBS(AT= TR_VMEM_NT,A_LOAD,A_CVI_VA),":nt") +MMVEC_LDC(vL32b_nt, "Aligned Vector Load Cur - Non temporal", ATTRIBS(ATT= R_VMEM_NT,A_LOAD,A_CVI_NEW,A_CVI_VA),":nt") +MMVEC_LDT(vL32b_nt, "Aligned Vector Load Tmp - Non temporal", ATTRIBS(ATT= R_VMEM_NT,A_LOAD,A_CVI_TMP),":nt") + +MMVEC_COND_EACH_EA(vL32b_nt,"Conditional Aligned Vector Load",ATTRIBS(ATTR= _VMEM_NT,A_CVI_VA),,"Vd32=3Dvmem",":nt",Pv,fLOADMMV(EA,VdV);) +MMVEC_COND_EACH_EA(vL32b_nt_cur,"Conditional Aligned Vector Load Cur",ATTR= IBS(ATTR_VMEM_NT,A_CVI_VA,A_CVI_NEW),,"Vd32.cur=3Dvmem",":nt",Pv,fLOADMMV(E= A,VdV);) +MMVEC_COND_EACH_EA(vL32b_nt_tmp,"Conditional Aligned Vector Load Tmp",ATTR= IBS(ATTR_VMEM_NT,A_CVI_TMP),,"Vd32.tmp=3Dvmem",":nt",Pv,fLOADMMV(EA,VdV);) + + +#undef VEC_SCALE + + +/*************************************************** + * Vector Alignment + ************************************************/ + +#define VALIGNB(SHIFT) \ + fHIDE(int i;) \ + for(i =3D 0; i < fVBYTES(); i++) {\ + VdV.ub[i] =3D (i+SHIFT>=3DfVBYTES()) ? 
VuV.ub[i+SHIFT-fVBYTES()] := VvV.ub[i+SHIFT];\ + } + +EXTINSN(V6_valignb, "Vd32=3Dvalign(Vu32,Vv32,Rt8)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP),"Align Two vectors by Rt8 as control", +{ + unsigned shift =3D RtV & (fVBYTES()-1); + VALIGNB(shift) +}) +EXTINSN(V6_vlalignb, "Vd32=3Dvlalign(Vu32,Vv32,Rt8)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP),"Align Two vectors by Rt8 as control", +{ + unsigned shift =3D fVBYTES() - (RtV & (fVBYTES()-1)); + VALIGNB(shift) +}) +EXTINSN(V6_valignbi, "Vd32=3Dvalign(Vu32,Vv32,#u3)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP),"Align Two vectors by #u3 as control", +{ + VALIGNB(uiV) +}) +EXTINSN(V6_vlalignbi,"Vd32=3Dvlalign(Vu32,Vv32,#u3)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP),"Align Two vectors by #u3 as control", +{ + unsigned shift =3D fVBYTES() - uiV; + VALIGNB(shift) +}) + +EXTINSN(V6_vror, "Vd32=3Dvror(Vu32,Rt32)", ATTRIBS(A_EXTENSION,A_CVI,A_CVI= _VP), +"Align Two vectors by Rt32 as control", +{ + fHIDE(int k;) + for (k=3D0;k> (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasl##TYPE, "Vd32=3Dvasl" #TYPE "(Vu32,Rt= 32)","Vd32."#TYPE"=3Dvasl(Vu32."#TYPE",Rt32)", "Vector arithmetic s= hift left " DESC, VdV.TYPE[i] =3D (VuV.TYPE[i] << (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vlsr##TYPE, "Vd32=3Dvlsr" #TYPE "(Vu32,Rt= 32)","Vd32.u"#TYPE"=3Dvlsr(Vu32.u"#TYPE",Rt32)", "Vector logical shif= t right " DESC, VdV.u##TYPE[i] =3D (VuV.u##TYPE[i] >> (RtV & (SIZE-1= )))) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasr##TYPE##v,"Vd32=3Dvasr" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvasr(Vu32."#TYPE",Vv32."#TYPE")", "Vector arithmetic s= hift right " DESC, VdV.TYPE[i] =3D fBIDIR_ASHIFTR(VuV.TYPE[i], fSXTN= ((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ +ITERATOR_INSN2_SHIFT_SLOT(SIZE,vasl##TYPE##v,"Vd32=3Dvasl" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvasl(Vu32."#TYPE",Vv32."#TYPE")", "Vector arithmetic s= hift left " DESC, VdV.TYPE[i] =3D fBIDIR_ASHIFTL(VuV.TYPE[i], fSXT= N((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ 
+ITERATOR_INSN2_SHIFT_SLOT(SIZE,vlsr##TYPE##v,"Vd32=3Dvlsr" #TYPE "(Vu32,Vv= 32)","Vd32."#TYPE"=3Dvlsr(Vu32."#TYPE",Vv32."#TYPE")", "Vector logical shif= t right " DESC, VdV.u##TYPE[i] =3D fBIDIR_LSHIFTR(VuV.u##TYPE[i], fS= XTN((LOGSIZE+1),SIZE,VvV.TYPE[i]),CASTTYPE)) \ + +V_SHIFT(w, "word", 32,5,4_4) +V_SHIFT(h, "halfword", 16,4,2_2) + +ITERATOR_INSN_SHIFT_SLOT(8,vlsrb,"Vd32.ub=3Dvlsr(Vu32.ub,Rt32)","vec log s= hift right bytes", VdV.b[i] =3D VuV.ub[i] >> (RtV & 0x7)) + +ITERATOR_INSN2_SHIFT_SLOT(32,vrotr,"Vd32=3Dvrotr(Vu32,Vv32)","Vd32.uw=3Dvr= otr(Vu32.uw,Vv32.uw)","Vector word rotate right", VdV.uw[i] =3D ((VuV.uw[i]= >> (VvV.uw[i] & 0x1f)) | (VuV.uw[i] << (32 - (VvV.uw[i] & 0x1f))))) + +/********************************************************************* + * MMVECTOR SHIFT AND PERMUTE + * ******************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT_DOUBLE_VEC(32,vasr_into,"Vxx32=3Dvasrinto(Vu32= ,Vv32)","Vxx32.w=3Dvasrinto(Vu32.w,Vv32.w)","ASR vector 1 elements and over= lay dropping bits to MSB of vector 2 elements", + fHIDE(int64_t ) shift =3D (fSE32_64(VuV.w[i]) << 32); + fHIDE(int64_t ) mask =3D (((fSE32_64(VxxV.v[0].w[i])) << 32) | fZE32_= 64(VxxV.v[0].w[i])); + fHIDE(int64_t) lomask =3D (((fSE32_64(1)) << 32) - 1); + fHIDE(int ) count =3D -(0x40 & VvV.w[i]) + (VvV.w[i] & 0x3f); + fHIDE(int64_t ) result =3D (count =3D=3D -0x40) ? 0 : (((count < 0) ? = ((shift << -(count)) | (mask & (lomask << -(count)))) : ((shift >> count) |= (mask & (lomask >> count))))); + VxxV.v[1].w[i] =3D ((result >> 32) & 0xffffffff); + VxxV.v[0].w[i] =3D (result & 0xffffffff)) + +#define NEW_NARROWING_SHIFT 1 + +#if NEW_NARROWING_SHIFT +#define NARROWING_SHIFT(ITERSIZE,TAG,DSTM,DSTTYPE,SRCTYPE,SYNOPTS,SATFUNC,= RNDFUNC,SHAMTMASK) \ +ITERATOR_INSN_SHIFT_SLOT(ITERSIZE,TAG, \ +"Vd32." #DSTTYPE "=3Dvasr(Vu32." #SRCTYPE ",Vv32." 
#SRCTYPE ",Rt8)" #SYNOP= TS, \ +"Vector shift right and shuffle", \ + fHIDE(int )shamt =3D RtV & SHAMTMASK; \ + DSTM(0,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VvV.SRCTYPE[i],shamt) >> shamt))= ; \ + DSTM(1,VdV.SRCTYPE[i],SATFUNC(RNDFUNC(VuV.SRCTYPE[i],shamt) >> shamt))) + + + + + +/* WORD TO HALF*/ + +NARROWING_SHIFT(32,vasrwh,fSETHALF,h,w,,fECHO,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasrwhsat,fSETHALF,h,w,:sat,fVSATH,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasrwhrndsat,fSETHALF,h,w,:rnd:sat,fVSATH,fVROUND,0xF) +NARROWING_SHIFT(32,vasrwuhrndsat,fSETHALF,uh,w,:rnd:sat,fVSATUH,fVROUND,0x= F) +NARROWING_SHIFT(32,vasrwuhsat,fSETHALF,uh,w,:sat,fVSATUH,fVNOROUND,0xF) +NARROWING_SHIFT(32,vasruwuhrndsat,fSETHALF,uh,uw,:rnd:sat,fVSATUH,fVROUND,= 0xF) + +NARROWING_SHIFT_NOV1(32,vasruwuhsat,fSETHALF,uh,uw,:sat,fVSATUH,fVNOROUND,= 0xF) +NARROWING_SHIFT(16,vasrhubsat,fSETBYTE,ub,h,:sat,fVSATUB,fVNOROUND,0x7) +NARROWING_SHIFT(16,vasrhubrndsat,fSETBYTE,ub,h,:rnd:sat,fVSATUB,fVROUND,0x= 7) +NARROWING_SHIFT(16,vasrhbsat,fSETBYTE,b,h,:sat,fVSATB,fVNOROUND,0x7) +NARROWING_SHIFT(16,vasrhbrndsat,fSETBYTE,b,h,:rnd:sat,fVSATB,fVROUND,0x7) + +NARROWING_SHIFT_NOV1(16,vasruhubsat,fSETBYTE,ub,uh,:sat,fVSATUB,fVNOROUND,= 0x7) +NARROWING_SHIFT_NOV1(16,vasruhubrndsat,fSETBYTE,ub,uh,:rnd:sat,fVSATUB,fVR= OUND,0x7) + +#else +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwh,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8)","Vd32.h= =3Dvasr(Vu32.w,Vv32.w,Rt8)", +"Vector arithmetic shift right words, shuffle even halfwords", + fSETHALF(0,VdV.w[i], (VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1,VdV.w[i], (VuV.w[i] >> (RtV & 0xF)))) + + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwhsat,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8):sat",= "Vd32.h=3Dvasr(Vu32.w,Vv32.w,Rt8):sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fSETHALF(0,VdV.w[i], fVSATH(VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1,VdV.w[i], fVSATH(VuV.w[i] >> (RtV & 0xF)))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwhrndsat,"Vd32=3Dvasrwh(Vu32,Vv32,Rt8):rn= d:sat","Vd32.h=3Dvasr(Vu32.w,Vv32.w,Rt8):rnd:sat", 
+"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATH( (VvV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),4= _8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATH( (VuV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),4= _8) ) >> shamt))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwuhrndsat,"Vd32=3Dvasrwuh(Vu32,Vv32,Rt8):= rnd:sat","Vd32.uh=3Dvasr(Vu32.w,Vv32.w,Rt8):rnd:sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATUH( (VvV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),= 4_8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATUH( (VuV.w[i] + fBIDIR_ASHIFTL(1,(shamt-1),= 4_8) ) >> shamt))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasrwuhsat,"Vd32=3Dvasrwuh(Vu32,Vv32,Rt8):sat= ","Vd32.uh=3Dvasr(Vu32.w,Vv32.w,Rt8):sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fSETHALF(0, VdV.uw[i], fVSATUH(VvV.w[i] >> (RtV & 0xF))); + fSETHALF(1, VdV.uw[i], fVSATUH(VuV.w[i] >> (RtV & 0xF)))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vasruwuhrndsat,"Vd32=3Dvasruwuh(Vu32,Vv32,Rt8= ):rnd:sat","Vd32.uh=3Dvasr(Vu32.uw,Vv32.uw,Rt8):rnd:sat", +"Vector arithmetic shift right words, shuffle even halfwords", + fHIDE(int ) shamt =3D RtV & 0xF; + fSETHALF(0,VdV.w[i], fVSATUH( (VvV.uw[i] + fBIDIR_ASHIFTL(1,(shamt-1)= ,4_8) ) >> shamt)); + fSETHALF(1,VdV.w[i], fVSATUH( (VuV.uw[i] + fBIDIR_ASHIFTL(1,(shamt-1)= ,4_8) ) >> shamt))) +#endif + + + +ITERATOR_INSN2_SHIFT_SLOT(32,vroundwh,"Vd32=3Dvroundwh(Vu32,Vv32):sat","Vd= 32.h=3Dvround(Vu32.w,Vv32.w):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETHALF(0, VdV.uw[i], fVSATH((VvV.w[i] + fCONSTLL(0x8000)) >> 16)); + fSETHALF(1, VdV.uw[i], fVSATH((VuV.w[i] + fCONSTLL(0x8000)) >> 16))) + +ITERATOR_INSN2_SHIFT_SLOT(32,vroundwuh,"Vd32=3Dvroundwuh(Vu32,Vv32):sat","= Vd32.uh=3Dvround(Vu32.w,Vv32.w):sat", +"Vector round words to halves, shuffle resultant halfwords", + fSETHALF(0, VdV.uw[i], fVSATUH((VvV.w[i] + fCONSTLL(0x8000)) >> 
16));
+    fSETHALF(1, VdV.uw[i], fVSATUH((VuV.w[i] + fCONSTLL(0x8000)) >> 16)))
+
+ITERATOR_INSN2_SHIFT_SLOT(32,vrounduwuh,"Vd32=vrounduwuh(Vu32,Vv32):sat","Vd32.uh=vround(Vu32.uw,Vv32.uw):sat",
+"Vector round words to halves, shuffle resultant halfwords",
+    fSETHALF(0, VdV.uw[i], fVSATUH((VvV.uw[i] + fCONSTLL(0x8000)) >> 16));
+    fSETHALF(1, VdV.uw[i], fVSATUH((VuV.uw[i] + fCONSTLL(0x8000)) >> 16)))
+
+
+
+
+
+/* HALF TO BYTE*/
+
+ITERATOR_INSN2_SHIFT_SLOT(16,vroundhb,"Vd32=vroundhb(Vu32,Vv32):sat","Vd32.b=vround(Vu32.h,Vv32.h):sat",
+"Vector round halfwords to bytes, shuffle resultant bytes",
+    fSETBYTE(0, VdV.uh[i], fVSATB((VvV.h[i] + 0x80) >> 8));
+    fSETBYTE(1, VdV.uh[i], fVSATB((VuV.h[i] + 0x80) >> 8)))
+
+ITERATOR_INSN2_SHIFT_SLOT(16,vroundhub,"Vd32=vroundhub(Vu32,Vv32):sat","Vd32.ub=vround(Vu32.h,Vv32.h):sat",
+"Vector round halfwords to bytes, shuffle resultant bytes",
+    fSETBYTE(0, VdV.uh[i], fVSATUB((VvV.h[i] + 0x80) >> 8));
+    fSETBYTE(1, VdV.uh[i], fVSATUB((VuV.h[i] + 0x80) >> 8)))
+
+ITERATOR_INSN2_SHIFT_SLOT(16,vrounduhub,"Vd32=vrounduhub(Vu32,Vv32):sat","Vd32.ub=vround(Vu32.uh,Vv32.uh):sat",
+"Vector round halfwords to bytes, shuffle resultant bytes",
+    fSETBYTE(0, VdV.uh[i], fVSATUB((VvV.uh[i] + 0x80) >> 8));
+    fSETBYTE(1, VdV.uh[i], fVSATUB((VuV.uh[i] + 0x80) >> 8)))
+
+
+ITERATOR_INSN2_SHIFT_SLOT(32,vaslw_acc,"Vx32+=vaslw(Vu32,Rt32)","Vx32.w+=vasl(Vu32.w,Rt32)",
+"Vector shift add word",
+    VxV.w[i] += (VuV.w[i] << (RtV & (32-1))))
+
+ITERATOR_INSN2_SHIFT_SLOT(32,vasrw_acc,"Vx32+=vasrw(Vu32,Rt32)","Vx32.w+=vasr(Vu32.w,Rt32)",
+"Vector shift add word",
+    VxV.w[i] += (VuV.w[i] >> (RtV & (32-1))))
+
+ITERATOR_INSN2_SHIFT_SLOT_NOV1(16,vaslh_acc,"Vx32+=vaslh(Vu32,Rt32)","Vx32.h+=vasl(Vu32.h,Rt32)",
+"Vector shift add halfword",
+    VxV.h[i] += (VuV.h[i] << (RtV & (16-1))))
+
+ITERATOR_INSN2_SHIFT_SLOT_NOV1(16,vasrh_acc,"Vx32+=vasrh(Vu32,Rt32)","Vx32.h+=vasr(Vu32.h,Rt32)",
+"Vector shift add halfword",
+    VxV.h[i] += (VuV.h[i] >> (RtV & (16-1))))
+
+/**************************************************************************
+*
+* MMVECTOR ELEMENT-WISE ARITHMETIC
+*
+**************************************************************************/
+
+/**************************************************************************
+* MACROS GO IN MACROS.DEF NOT HERE!!!
+**************************************************************************/
+
+
+#define MMVEC_ABSDIFF(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\
+ITERATOR_INSN2_MPY_SLOT(WIDTH, vabsdiff##TYPE, "Vd32=vabsdiff"TYPE2"(Vu32,Vv32)" ,"Vd32."#DEST"=vabsdiff(Vu32."#SRC",Vv32."#SRC")" , "Vector Absolute of Difference "DESCR, VdV.DEST[i] = (VuV.SRC[i] > VvV.SRC[i]) ? (VuV.SRC[i] - VvV.SRC[i]) : (VvV.SRC[i] - VuV.SRC[i]))
+
+#define MMVEC_ADDU_SAT(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE##sat, "Vd32=vadd"TYPE2"(Vu32,Vv32):sat" , "Vd32."#DEST"=vadd(Vu32."#SRC",Vv32."#SRC"):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] = fVUADDSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##sat_dv, "Vdd32=vadd"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=vadd(Vuu32."#SRC",Vvv32."#SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] = fVUADDSAT(WIDTH, VuuV.v[0].SRC[i],VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] = fVUADDSAT(WIDTH, VuuV.v[1].SRC[i],VvvV.v[1].SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE##sat, "Vd32=vsub"TYPE2"(Vu32,Vv32):sat", "Vd32."#DEST"=vsub(Vu32."#SRC",Vv32."#SRC"):sat", "Vector Sub & Saturate "DESCR, VdV.DEST[i] = fVUSUBSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##sat_dv, "Vdd32=vsub"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=vsub(Vuu32."#SRC",Vvv32."#SRC"):sat", "Double Vector Sub & Saturate "DESCR, VddV.v[0].DEST[i] = fVUSUBSAT(WIDTH, VuuV.v[0].SRC[i],VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] = fVUSUBSAT(WIDTH, VuuV.v[1].SRC[i],VvvV.v[1].SRC[i]))\
+
+#define MMVEC_ADDS_SAT(TYPE,TYPE2,DESCR, WIDTH,DEST,SRC)\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE##sat, "Vd32=vadd"TYPE2"(Vu32,Vv32):sat" , "Vd32."#DEST"=vadd(Vu32."#SRC",Vv32."#SRC"):sat", "Vector Add & Saturate "DESCR, VdV.DEST[i] = fVSADDSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##sat_dv, "Vdd32=vadd"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=vadd(Vuu32."#SRC",Vvv32."#SRC"):sat", "Double Vector Add & Saturate "DESCR, VddV.v[0].DEST[i] = fVSADDSAT(WIDTH, VuuV.v[0].SRC[i], VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] = fVSADDSAT(WIDTH, VuuV.v[1].SRC[i], VvvV.v[1].SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE##sat, "Vd32=vsub"TYPE2"(Vu32,Vv32):sat", "Vd32."#DEST"=vsub(Vu32."#SRC",Vv32."#SRC"):sat", "Vector Sub & Saturate "DESCR, VdV.DEST[i] = fVSSUBSAT(WIDTH, VuV.SRC[i], VvV.SRC[i]))\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##sat_dv, "Vdd32=vsub"TYPE2"(Vuu32,Vvv32):sat", "Vdd32."#DEST"=vsub(Vuu32."#SRC",Vvv32."#SRC"):sat", "Double Vector Sub & Saturate "DESCR, VddV.v[0].DEST[i] = fVSSUBSAT(WIDTH, VuuV.v[0].SRC[i], VvvV.v[0].SRC[i]); VddV.v[1].DEST[i] = fVSSUBSAT(WIDTH, VuuV.v[1].SRC[i], VvvV.v[1].SRC[i]))\
+
+#define MMVEC_AVGU(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\
+ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE, "Vd32=vavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=vavg(Vu32."#SRC",Vv32."#SRC")", "Vector Average "DESCR, VdV.DEST[i] = fVAVGU( WIDTH, VuV.SRC[i], VvV.SRC[i])) \
+ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE##rnd, "Vd32=vavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=vavg(Vu32."#SRC",Vv32."#SRC"):rnd", "Vector Average & Round "DESCR, VdV.DEST[i] = fVAVGURND(WIDTH, VuV.SRC[i], VvV.SRC[i]))
+
+
+
+#define MMVEC_AVGS(TYPE,TYPE2,DESCR, WIDTH, DEST,SRC)\
+ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE, "Vd32=vavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=vavg(Vu32."#SRC",Vv32."#SRC")", "Vector Average "DESCR, VdV.DEST[i] = fVAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i])) \
+ITERATOR_INSN2_ANY_SLOT(WIDTH,vavg##TYPE##rnd, "Vd32=vavg"TYPE2"(Vu32,Vv32):rnd", "Vd32."#DEST"=vavg(Vu32."#SRC",Vv32."#SRC"):rnd", "Vector Average & Round "DESCR, VdV.DEST[i] = fVAVGSRND( WIDTH, VuV.SRC[i], VvV.SRC[i])) \
+ITERATOR_INSN2_ANY_SLOT(WIDTH,vnavg##TYPE, "Vd32=vnavg"TYPE2"(Vu32,Vv32)", "Vd32."#DEST"=vnavg(Vu32."#SRC",Vv32."#SRC")", "Vector Negative Average "DESCR, VdV.DEST[i] = fVNAVGS( WIDTH, VuV.SRC[i], VvV.SRC[i]))
+
+
+
+
+
+
+
+#define MMVEC_ADDWRAP(TYPE,TYPE2, DESCR, WIDTH , DEST,SRC)\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vadd##TYPE, "Vd32=vadd"TYPE2"(Vu32,Vv32)" , "Vd32."#DEST"=vadd(Vu32."#SRC",Vv32."#SRC")", "Vector Add "DESCR, VdV.DEST[i] = VuV.SRC[i] + VvV.SRC[i])\
+ITERATOR_INSN2_ANY_SLOT(WIDTH, vsub##TYPE, "Vd32=vsub"TYPE2"(Vu32,Vv32)" , "Vd32."#DEST"=vsub(Vu32."#SRC",Vv32."#SRC")", "Vector Sub "DESCR, VdV.DEST[i] = VuV.SRC[i] - VvV.SRC[i])\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vadd##TYPE##_dv, "Vdd32=vadd"TYPE2"(Vuu32,Vvv32)" , "Vdd32."#DEST"=vadd(Vuu32."#SRC",Vvv32."#SRC")", "Double Vector Add "DESCR, VddV.v[0].DEST[i] = VuuV.v[0].SRC[i] + VvvV.v[0].SRC[i]; VddV.v[1].DEST[i] = VuuV.v[1].SRC[i] + VvvV.v[1].SRC[i])\
+ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(WIDTH, vsub##TYPE##_dv, "Vdd32=vsub"TYPE2"(Vuu32,Vvv32)" , "Vdd32."#DEST"=vsub(Vuu32."#SRC",Vvv32."#SRC")", "Double Vector Sub "DESCR, VddV.v[0].DEST[i] = VuuV.v[0].SRC[i] - VvvV.v[0].SRC[i]; VddV.v[1].DEST[i] = VuuV.v[1].SRC[i] - VvvV.v[1].SRC[i]) \
+
+
+
+
+
+/* Wrapping Adds */
+MMVEC_ADDWRAP(b, "b", "Byte", 8, b, b)
+MMVEC_ADDWRAP(h, "h", "Halfword", 16, h, h)
+MMVEC_ADDWRAP(w, "w", "Word", 32, w, w)
+
+/* Saturating Adds */
+MMVEC_ADDU_SAT(ub, "ub", "Unsigned Byte", 8, ub, ub)
+MMVEC_ADDU_SAT(uh, "uh", "Unsigned Halfword", 16, uh, uh)
+MMVEC_ADDU_SAT(uw, "uw", "Unsigned word", 32, uw, uw)
+MMVEC_ADDS_SAT(b, "b", "byte", 8, b, b)
+MMVEC_ADDS_SAT(h, "h", "Halfword", 16, h, h)
+MMVEC_ADDS_SAT(w, "w", "Word", 32, w, w)
+
+
+/* Averaging Instructions */
+MMVEC_AVGU(ub,"ub", "Unsigned Byte", 8, ub, ub)
+MMVEC_AVGU(uh,"uh", "Unsigned Halfword", 16, uh, uh)
+MMVEC_AVGU_NOV1(uw,"uw", "Unsigned Word", 32, uw, uw)
+MMVEC_AVGS_NOV1(b, "b", "Byte", 8, b, b)
+MMVEC_AVGS(h, "h", "Halfword", 16, h, h)
+MMVEC_AVGS(w, "w", "Word", 32, w, w)
+
+
+/* Absolute Difference */
+MMVEC_ABSDIFF(ub,"ub", "Unsigned Byte", 8, ub, ub)
+MMVEC_ABSDIFF(uh,"uh", "Unsigned Halfword", 16, uh, uh)
+MMVEC_ABSDIFF(h,"h", "Halfword", 16, uh, h)
+MMVEC_ABSDIFF(w,"w", "Word", 32, uw, w)
+
+ITERATOR_INSN2_ANY_SLOT(8,vnavgub, "Vd32=vnavgub(Vu32,Vv32)", "Vd32.b=vnavg(Vu32.ub,Vv32.ub)",
+"Vector Negative Average Unsigned Byte", VdV.b[i] = fVNAVGU(8, VuV.ub[i], VvV.ub[i]))
+
+ITERATOR_INSN_ANY_SLOT(32,vaddcarrysat,"Vd32.w=vadd(Vu32.w,Vv32.w,Qs4):carry:sat","add w/carry and saturate",
+VdV.w[i] = fVSATW(VuV.w[i]+VvV.w[i]+fGETQBIT(QsV,i*4)))
+
+ITERATOR_INSN_ANY_SLOT(32,vaddcarry,"Vd32.w=vadd(Vu32.w,Vv32.w,Qx4):carry","add w/carry",
+VdV.w[i] = VuV.w[i]+VvV.w[i]+fGETQBIT(QxV,i*4);
+fSETQBITS(QxV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],VvV.w[i],fGETQBIT(QxV,i*4))))
+
+ITERATOR_INSN_ANY_SLOT(32,vsubcarry,"Vd32.w=vsub(Vu32.w,Vv32.w,Qx4):carry","subtract w/carry",
+VdV.w[i] = VuV.w[i]+~VvV.w[i]+fGETQBIT(QxV,i*4);
+fSETQBITS(QxV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],~VvV.w[i],fGETQBIT(QxV,i*4))))
+
+ITERATOR_INSN_ANY_SLOT(32,vaddcarryo,"Vd32.w,Qe4=vadd(Vu32.w,Vv32.w):carry","add w/carry out-only",
+VdV.w[i] = VuV.w[i]+VvV.w[i];
+fSETQBITS(QeV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],VvV.w[i],0)))
+
+ITERATOR_INSN_ANY_SLOT(32,vsubcarryo,"Vd32.w,Qe4=vsub(Vu32.w,Vv32.w):carry","subtract w/carry out-only",
+VdV.w[i] = VuV.w[i]+~VvV.w[i]+1;
+fSETQBITS(QeV,4,0xF,4*i,-fCARRY_FROM_ADD32(VuV.w[i],~VvV.w[i],1)))
+
+
+ITERATOR_INSN_ANY_SLOT(32,vsatdw,"Vd32.w=vsatdw(Vu32.w,Vv32.w)","Saturate from 64-bits (higher 32-bits come from first
vector) to 32-bits",VdV.w[i]= =3D fVSATDW(VuV.w[i],VvV.w[i])) + + +#define MMVEC_ADDSAT_MIX(TAGEND,SATF,WIDTH,DEST,SRC1,SRC2)\ +ITERATOR_INSN_ANY_SLOT(WIDTH, vadd##TAGEND,"Vd32."#DEST"=3Dvadd(Vu32."#SRC= 1",Vv32."#SRC2"):sat", "Vector Add mixed", VdV.DEST[i] =3D SATF(VuV.SRC= 1[i] + VvV.SRC2[i]))\ +ITERATOR_INSN_ANY_SLOT(WIDTH, vsub##TAGEND,"Vd32."#DEST"=3Dvsub(Vu32."#SRC= 1",Vv32."#SRC2"):sat", "Vector Sub mixed", VdV.DEST[i] =3D SATF(VuV.SRC= 1[i] - VvV.SRC2[i]))\ + +MMVEC_ADDSAT_MIX(ububb_sat,fVSATUB,8,ub,ub,b) + +/**************************** +* WIDENING +****************************/ + + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vaddubh,"Vdd32=3Dvaddub(Vu32,Vv32)",= "Vdd32.h=3Dvadd(Vu32.ub,Vv32.ub)", +"Vector addition with widen into two vectors", + VddV.v[0].h[i] =3D fZE8_16(fGETUBYTE(0, VuV.uh[i])) + fZE8_16(fGETUBYT= E(0, VvV.uh[i])); + VddV.v[1].h[i] =3D fZE8_16(fGETUBYTE(1, VuV.uh[i])) + fZE8_16(fGETUBYT= E(1, VvV.uh[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vsububh,"Vdd32=3Dvsubub(Vu32,Vv32)",= "Vdd32.h=3Dvsub(Vu32.ub,Vv32.ub)", +"Vector subtraction with widen into two vectors", + VddV.v[0].h[i] =3D fZE8_16(fGETUBYTE(0, VuV.uh[i])) - fZE8_16(fGETUBYT= E(0, VvV.uh[i])); + VddV.v[1].h[i] =3D fZE8_16(fGETUBYTE(1, VuV.uh[i])) - fZE8_16(fGETUBYT= E(1, VvV.uh[i]))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vaddhw,"Vdd32=3Dvaddh(Vu32,Vv32)","V= dd32.w=3Dvadd(Vu32.h,Vv32.h)", +"Vector addition with widen into two vectors", + VddV.v[0].w[i] =3D fGETHALF(0, VuV.w[i]) + fGETHALF(0, VvV.w[i]); + VddV.v[1].w[i] =3D fGETHALF(1, VuV.w[i]) + fGETHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vsubhw,"Vdd32=3Dvsubh(Vu32,Vv32)","V= dd32.w=3Dvsub(Vu32.h,Vv32.h)", +"Vector subtraction with widen into two vectors", + VddV.v[0].w[i] =3D fGETHALF(0, VuV.w[i]) - fGETHALF(0, VvV.w[i]); + VddV.v[1].w[i] =3D fGETHALF(1, VuV.w[i]) - fGETHALF(1, VvV.w[i])) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vadduhw,"Vdd32=3Dvadduh(Vu32,Vv32)",= 
"Vdd32.w=3Dvadd(Vu32.uh,Vv32.uh)", +"Vector addition with widen into two vectors", + VddV.v[0].w[i] =3D fZE16_32(fGETUHALF(0, VuV.uw[i])) + fZE16_32(fGETUH= ALF(0, VvV.uw[i])); + VddV.v[1].w[i] =3D fZE16_32(fGETUHALF(1, VuV.uw[i])) + fZE16_32(fGETUH= ALF(1, VvV.uw[i]))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vsubuhw,"Vdd32=3Dvsubuh(Vu32,Vv32)",= "Vdd32.w=3Dvsub(Vu32.uh,Vv32.uh)", +"Vector subtraction with widen into two vectors", + VddV.v[0].w[i] =3D fZE16_32(fGETUHALF(0, VuV.uw[i])) - fZE16_32(fGETUH= ALF(0, VvV.uw[i])); + VddV.v[1].w[i] =3D fZE16_32(fGETUHALF(1, VuV.uw[i])) - fZE16_32(fGETUH= ALF(1, VvV.uw[i]))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vaddhw_acc,"Vxx32+=3Dvaddh(Vu32,Vv32= )","Vxx32.w+=3Dvadd(Vu32.h,Vv32.h)", +"Vector addition with widen into two vectors", + VxxV.v[0].w[i] +=3D fGETHALF(0, VuV.w[i]) + fGETHALF(0, VvV.w[i]); + VxxV.v[1].w[i] +=3D fGETHALF(1, VuV.w[i]) + fGETHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vadduhw_acc,"Vxx32+=3Dvadduh(Vu32,Vv= 32)","Vxx32.w+=3Dvadd(Vu32.uh,Vv32.uh)", +"Vector addition with widen into two vectors", + VxxV.v[0].w[i] +=3D fGETUHALF(0, VuV.w[i]) + fGETUHALF(0, VvV.w[i]); + VxxV.v[1].w[i] +=3D fGETUHALF(1, VuV.w[i]) + fGETUHALF(1, VvV.w[i])) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vaddubh_acc,"Vxx32+=3Dvaddub(Vu32,Vv= 32)","Vxx32.h+=3Dvadd(Vu32.ub,Vv32.ub)", +"Vector addition with widen into two vectors", + VxxV.v[0].h[i] +=3D fGETUBYTE(0, VuV.h[i]) + fGETUBYTE(0, VvV.h[i]); + VxxV.v[1].h[i] +=3D fGETUBYTE(1, VuV.h[i]) + fGETUBYTE(1, VvV.h[i])) + + +/**************************** +* Conditional +****************************/ + +#define CONDADDSUB(WIDTH,TAGEND,LHSYN,RHSYN,DESCR,LHBEH,RHBEH) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vadd##TAGEND##q,"if (Qv4."#TAGEND") "LHSYN"+= =3D"RHSYN,"if (Qv4) "LHSYN"+=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV,i,= LHBEH+RHBEH,LHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vsub##TAGEND##q,"if (Qv4."#TAGEND") "LHSYN"-= =3D"RHSYN,"if (Qv4) 
"LHSYN"-=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV,i,= LHBEH-RHBEH,LHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vadd##TAGEND##nq,"if (!Qv4."#TAGEND") "LHSYN= "+=3D"RHSYN,"if (!Qv4) "LHSYN"+=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV= ,i,LHBEH,LHBEH+RHBEH)) \ +ITERATOR_INSN2_ANY_SLOT(WIDTH,vsub##TAGEND##nq,"if (!Qv4."#TAGEND") "LHSYN= "-=3D"RHSYN,"if (!Qv4) "LHSYN"-=3D"RHSYN,DESCR,LHBEH=3DfCONDMASK##WIDTH(QvV= ,i,LHBEH,LHBEH-RHBEH)) \ + +CONDADDSUB(8,b,"Vx32.b","Vu32.b","Conditional add/sub Byte",VxV.ub[i],VuV.= ub[i]) +CONDADDSUB(16,h,"Vx32.h","Vu32.h","Conditional add/sub Half",VxV.h[i],VuV.= h[i]) +CONDADDSUB(32,w,"Vx32.w","Vu32.w","Conditional add/sub Word",VxV.w[i],VuV.= w[i]) + +/***************************************************** + ABSOLUTE VALUES +*****************************************************/ +// V65 +ITERATOR_INSN2_ANY_SLOT_NOV1(8,vabsb, "Vd32=3Dvabsb(Vu32)", "Vd= 32.b=3Dvabs(Vu32.b)", "Vector absolute value of bytes", VdV.b[i] = =3D fABS(VuV.b[i])) +ITERATOR_INSN2_ANY_SLOT_NOV1(8,vabsb_sat, "Vd32=3Dvabsb(Vu32):sat", "Vd= 32.b=3Dvabs(Vu32.b):sat", "Vector absolute value of bytes", VdV.b[i] = =3D fVSATB(fABS(fSE8_16(VuV.b[i])))) + + +ITERATOR_INSN2_ANY_SLOT(16,vabsh, "Vd32=3Dvabsh(Vu32)", "Vd32.h= =3Dvabs(Vu32.h)", "Vector absolute value of halfwords", VdV.h[i] = =3D fABS(VuV.h[i])) +ITERATOR_INSN2_ANY_SLOT(16,vabsh_sat, "Vd32=3Dvabsh(Vu32):sat", "Vd32.h= =3Dvabs(Vu32.h):sat", "Vector absolute value of halfwords", VdV.h[i] = =3D fVSATH(fABS(fSE16_32(VuV.h[i])))) +ITERATOR_INSN2_ANY_SLOT(32,vabsw, "Vd32=3Dvabsw(Vu32)", "Vd32.w= =3Dvabs(Vu32.w)", "Vector absolute value of words", VdV.w[i] = =3D fABS(VuV.w[i])) +ITERATOR_INSN2_ANY_SLOT(32,vabsw_sat, "Vd32=3Dvabsw(Vu32):sat", "Vd32.w= =3Dvabs(Vu32.w):sat", "Vector absolute value of words", VdV.w[i] = =3D fVSATW(fABS(fSE32_64(VuV.w[i])))) + + +/************************************************************************** + * MMVECTOR MULTIPLICATIONS + * 
************************************************************************/
+
+
+/* Byte by Byte */
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybv,"Vdd32=vmpyb(Vu32,Vv32)","Vdd32.h=vmpy(Vu32.b,Vv32.b)",
+"Vector by Vector Byte Multiply",
+    VddV.v[0].h[i] = fMPY8SS(fGETBYTE(0, VuV.h[i]), fGETBYTE(0, VvV.h[i]));
+    VddV.v[1].h[i] = fMPY8SS(fGETBYTE(1, VuV.h[i]), fGETBYTE(1, VvV.h[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybv_acc,"Vxx32+=vmpyb(Vu32,Vv32)","Vxx32.h+=vmpy(Vu32.b,Vv32.b)",
+"Vector by Vector Byte Multiply",
+    VxxV.v[0].h[i] += fMPY8SS(fGETBYTE(0, VuV.h[i]), fGETBYTE(0, VvV.h[i]));
+    VxxV.v[1].h[i] += fMPY8SS(fGETBYTE(1, VuV.h[i]), fGETBYTE(1, VvV.h[i])))
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyubv,"Vdd32=vmpyub(Vu32,Vv32)","Vdd32.uh=vmpy(Vu32.ub,Vv32.ub)",
+"Vector by Vector Unsigned Byte Multiply",
+    VddV.v[0].uh[i] = fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE(0, VvV.uh[i]) );
+    VddV.v[1].uh[i] = fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE(1, VvV.uh[i]) ))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyubv_acc,"Vxx32+=vmpyub(Vu32,Vv32)","Vxx32.uh+=vmpy(Vu32.ub,Vv32.ub)",
+"Vector by Vector Unsigned Byte Multiply",
+    VxxV.v[0].uh[i] += fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE(0, VvV.uh[i]) );
+    VxxV.v[1].uh[i] += fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE(1, VvV.uh[i]) ))
+
+
+
+
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybusv,"Vdd32=vmpybus(Vu32,Vv32)","Vdd32.h=vmpy(Vu32.ub,Vv32.b)",
+"Vector by Vector Unsigned Byte by Signed Byte Multiply",
+    VddV.v[0].h[i] = fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE(0, VvV.h[i]));
+    VddV.v[1].h[i] = fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE(1, VvV.h[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybusv_acc,"Vxx32+=vmpybus(Vu32,Vv32)","Vxx32.h+=vmpy(Vu32.ub,Vv32.b)",
+"Vector by Vector Unsigned Byte by Signed Byte Multiply",
+    VxxV.v[0].h[i] += fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE(0, VvV.h[i]));
+    VxxV.v[1].h[i] += fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE(1, VvV.h[i])))
+
+
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabusv,"Vdd32=vmpabus(Vuu32,Vvv32)","Vdd32.h=vmpa(Vuu32.ub,Vvv32.b)",
+"Vertical Byte Multiply",
+    VddV.v[0].h[i] = fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0, VvvV.v[0].uh[i])) + fMPY8US(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETBYTE(0, VvvV.v[1].uh[i]));
+    VddV.v[1].h[i] = fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(1, VvvV.v[0].uh[i])) + fMPY8US(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(1, VvvV.v[1].uh[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabuuv,"Vdd32=vmpabuu(Vuu32,Vvv32)","Vdd32.h=vmpa(Vuu32.ub,Vvv32.ub)",
+"Vertical Byte Multiply",
+    VddV.v[0].h[i] = fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(0, VvvV.v[0].uh[i])) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(0, VvvV.v[1].uh[i]));
+    VddV.v[1].h[i] = fMPY8UU(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETUBYTE(1, VvvV.v[0].uh[i])) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(1, VvvV.v[1].uh[i])))
+
+
+
+
+
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhv,"Vdd32=vmpyh(Vu32,Vv32)","Vdd32.w=vmpy(Vu32.h,Vv32.h)",
+"Vector by Vector Halfword Multiply",
+    VddV.v[0].w[i] = fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, VvV.w[i]));
+    VddV.v[1].w[i] = fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, VvV.w[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhv_acc,"Vxx32+=vmpyh(Vu32,Vv32)","Vxx32.w+=vmpy(Vu32.h,Vv32.h)",
+"Vector by Vector Halfword Multiply",
+    VxxV.v[0].w[i] += fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, VvV.w[i]));
+    VxxV.v[1].w[i] += fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, VvV.w[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuhv,"Vdd32=vmpyuh(Vu32,Vv32)","Vdd32.uw=vmpy(Vu32.uh,Vv32.uh)",
+"Vector by Vector Unsigned Halfword Multiply",
+    VddV.v[0].uw[i] = fMPY16UU(fGETUHALF(0, VuV.uw[i]), fGETUHALF(0, VvV.uw[i]));
+    VddV.v[1].uw[i] = fMPY16UU(fGETUHALF(1, VuV.uw[i]), fGETUHALF(1, VvV.uw[i])))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuhv_acc,"Vxx32+=vmpyuh(Vu32,Vv32)","Vxx32.uw+=vmpy(Vu32.uh,Vv32.uh)",
+"Vector by Vector Unsigned Halfword Multiply",
+    VxxV.v[0].uw[i] += fMPY16UU(fGETUHALF(0, VuV.uw[i]), fGETUHALF(0, VvV.uw[i]));
+    VxxV.v[1].uw[i] += fMPY16UU(fGETUHALF(1, VuV.uw[i]), fGETUHALF(1, VvV.uw[i])))
+
+
+
+/* Vector by Vector */
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyhvsrs,"Vd32=vmpyh(Vu32,Vv32):<<1:rnd:sat","Vd32.h=vmpy(Vu32.h,Vv32.h):<<1:rnd:sat",
+"Vector halfword multiply with round, shift, and sat16",
+    VdV.h[i] = fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(VuV.h[i],VvV.h[i] )<<1))))))
+
+
+
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhus, "Vdd32=vmpyhus(Vu32,Vv32)","Vdd32.w=vmpy(Vu32.h,Vv32.uh)",
+"Vector by Vector Halfword Multiply",
+    VddV.v[0].w[i] = fMPY16SU(fGETHALF(0, VuV.w[i]), fGETUHALF(0, VvV.uw[i]));
+    VddV.v[1].w[i] = fMPY16SU(fGETHALF(1, VuV.w[i]), fGETUHALF(1, VvV.uw[i])))
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhus_acc, "Vxx32+=vmpyhus(Vu32,Vv32)","Vxx32.w+=vmpy(Vu32.h,Vv32.uh)",
+"Vector by Vector Halfword Multiply",
+    VxxV.v[0].w[i] += fMPY16SU(fGETHALF(0, VuV.w[i]), fGETUHALF(0, VvV.uw[i]));
+    VxxV.v[1].w[i] += fMPY16SU(fGETHALF(1, VuV.w[i]), fGETUHALF(1, VvV.uw[i])))
+
+
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyih,"Vd32=vmpyih(Vu32,Vv32)","Vd32.h=vmpyi(Vu32.h,Vv32.h)",
+"Vector by Vector Halfword Multiply",
+    VdV.h[i] = fMPY16SS(VuV.h[i], VvV.h[i]))
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyih_acc,"Vx32+=vmpyih(Vu32,Vv32)","Vx32.h+=vmpyi(Vu32.h,Vv32.h)",
+"Vector by Vector Halfword Multiply",
+    VxV.h[i] += fMPY16SS(VuV.h[i], VvV.h[i]))
+
+
+
+/* 32x32 high half / frac */
+
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyewuh,"Vd32=vmpyewuh(Vu32,Vv32)","Vd32.w=vmpye(Vu32.w,Vv32.uh)",
+"Vector by Vector Halfword Multiply",
+VdV.w[i] = fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) >> 16)
+
+ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh,"Vd32=3Dvmpyowh(Vu32,Vv32):<= <1:sat","Vd32.w=3Dvmpyo(Vu32.w,Vv32.h):<<1:sat", +"Vector by Vector Halfword Multiply", +VdV.w[i] =3D fVSATW((((fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) >> 14) = + 0) >> 1))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_rnd,"Vd32=3Dvmpyowh(Vu32,Vv3= 2):<<1:rnd:sat","Vd32.w=3Dvmpyo(Vu32.w,Vv32.h):<<1:rnd:sat", +"Vector by Vector Halfword Multiply", +VdV.w[i] =3D fVSATW((((fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) >> 14) = + 1) >> 1))) + +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32,vmpyewuh_64,"Vdd32=3Dvmpye(Vu32.w,Vv3= 2.uh)", +"Word times Halfword Multiply, 64-bit result", + fHIDE(size8s_t prod;) + prod =3D fMPY32SU(VuV.w[i],fGETUHALF(0,VvV.w[i])); + VddV.v[1].w[i] =3D prod >> 16; + VddV.v[0].w[i] =3D prod << 16) + +ITERATOR_INSN_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_64_acc,"Vxx32+=3Dvmpyo(Vu32.w= ,Vv32.h)", +"Word times Halfword Multiply, 64-bit result", + fHIDE(size8s_t prod;) + prod =3D fMPY32SS(VuV.w[i],fGETHALF(1,VvV.w[i])) + fSE32_64(VxxV.v[1].w[= i]); + VxxV.v[1].w[i] =3D prod >> 16; + fSETHALF(0, VxxV.v[0].w[i], VxxV.v[0].w[i] >> 16); + fSETHALF(1, VxxV.v[0].w[i], prod & 0x0000ffff)) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_sacc,"Vx32+=3Dvmpyowh(Vu32,V= v32):<<1:sat:shift","Vx32.w+=3Dvmpyo(Vu32.w,Vv32.h):<<1:sat:shift", +"Vector by Vector Halfword Multiply", +IV1DEAD() VxV.w[i] =3D fVSATW(((((VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF= (1, VvV.w[i]))) >> 14) + 0) >> 1))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyowh_rnd_sacc,"Vx32+=3Dvmpyowh(Vu= 32,Vv32):<<1:rnd:sat:shift","Vx32.w+=3Dvmpyo(Vu32.w,Vv32.h):<<1:rnd:sat:shi= ft", +"Vector by Vector Halfword Multiply", +IV1DEAD() VxV.w[i] =3D fVSATW(((((VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF= (1, VvV.w[i]))) >> 14) + 1) >> 1))) + +/* For 32x32 integer / low half */ + +ITERATOR_INSN_MPY_SLOT(32,vmpyieoh,"Vd32.w=3Dvmpyieo(Vu32.h,Vv32.h)","Odd/= Even multiply for 32x32 low half", + VdV.w[i] =3D 
(fGETHALF(0,VuV.w[i])*fGETHALF(1,VvV.w[i])) << 16) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewuh,"Vd32=3Dvmpyiewuh(Vu32,Vv3= 2)","Vd32.w=3Dvmpyie(Vu32.w,Vv32.uh)", +"Vector by Vector Word by Halfword Multiply", +IV1DEAD() VdV.w[i] =3D fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) ) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiowh,"Vd32=3Dvmpyiowh(Vu32,Vv32)= ","Vd32.w=3Dvmpyio(Vu32.w,Vv32.h)", +"Vector by Vector Word by Halfword Multiply", +IV1DEAD() VdV.w[i] =3D fMPY3216SS(VuV.w[i], fGETHALF(1, VvV.w[i])) ) + +/* Add back these... */ + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewh_acc,"Vx32+=3Dvmpyiewh(Vu32,= Vv32)","Vx32.w+=3Dvmpyie(Vu32.w,Vv32.h)", +"Vector by Vector Word by Halfword Multiply", +VxV.w[i] =3D VxV.w[i] + fMPY3216SS(VuV.w[i], fGETHALF(0, VvV.w[i])) ) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiewuh_acc,"Vx32+=3Dvmpyiewuh(Vu3= 2,Vv32)","Vx32.w+=3Dvmpyie(Vu32.w,Vv32.uh)", +"Vector by Vector Word by Halfword Multiply", +VxV.w[i] =3D VxV.w[i] + fMPY3216SU(VuV.w[i], fGETUHALF(0, VvV.w[i])) ) + + + + + + + +/* Vector by Scalar */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyub,"Vdd32=3Dvmpyub(Vu32,Rt32)","= Vdd32.uh=3Dvmpy(Vu32.ub,Rt32.ub)", +"Vector absolute value of words", + VddV.v[0].uh[i] =3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE((2*i+0= )%4, RtV)); + VddV.v[1].uh[i] =3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE((2*i+1= )%4, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpyub_acc,"Vxx32+=3Dvmpyub(Vu32,Rt3= 2)","Vxx32.uh+=3Dvmpy(Vu32.ub,Rt32.ub)", +"Vector absolute value of words", + VxxV.v[0].uh[i] +=3D fMPY8UU(fGETUBYTE(0, VuV.uh[i]), fGETUBYTE((2*i+0= )%4, RtV)); + VxxV.v[1].uh[i] +=3D fMPY8UU(fGETUBYTE(1, VuV.uh[i]), fGETUBYTE((2*i+1= )%4, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybus,"Vdd32=3Dvmpybus(Vu32,Rt32)"= ,"Vdd32.h=3Dvmpy(Vu32.ub,Rt32.b)", +"Vector absolute value of words", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE((2*i+0)%= 4, RtV)); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), 
fGETBYTE((2*i+1)%= 4, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpybus_acc,"Vxx32+=3Dvmpybus(Vu32,R= t32)","Vxx32.h+=3Dvmpy(Vu32.ub,Rt32.b)", +"Vector absolute value of words", + VxxV.v[0].h[i] +=3D fMPY8US(fGETUBYTE(0, VuV.uh[i]), fGETBYTE((2*i+0)%= 4, RtV)); + VxxV.v[1].h[i] +=3D fMPY8US(fGETUBYTE(1, VuV.uh[i]), fGETBYTE((2*i+1)%= 4, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabus,"Vdd32=3Dvmpabus(Vuu32,Rt32)= ","Vdd32.h=3Dvmpa(Vuu32.ub,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].h[i] =3D fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0, = RtV)) + fMPY16SS(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETBYTE(1, RtV)); + VddV.v[1].h[i] =3D fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(2, = RtV)) + fMPY16SS(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(3, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(16,vmpabus_acc,"Vxx32+=3Dvmpabus(Vuu32,= Rt32)","Vxx32.h+=3Dvmpa(Vuu32.ub,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].h[i] +=3D fMPY8US(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETBYTE(0,= RtV)) + fMPY16SS(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETBYTE(1, RtV)); + VxxV.v[1].h[i] +=3D fMPY8US(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETBYTE(2,= RtV)) + fMPY16SS(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETBYTE(3, RtV))) + +// V65 + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(16,vmpabuu,"Vdd32=3Dvmpabuu(Vuu32,= Rt32)","Vdd32.h=3Dvmpa(Vuu32.ub,Rt32.ub)", +"Vertical Byte Multiply", + VddV.v[0].uh[i] =3D fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(0= , RtV)) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(1, RtV)); + VddV.v[1].uh[i] =3D fMPY8UU(fGETUBYTE(1, VuuV.v[0].uh[i]), fGETUBYTE(2= , RtV)) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(3, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(16,vmpabuu_acc,"Vxx32+=3Dvmpabuu(V= uu32,Rt32)","Vxx32.h+=3Dvmpa(Vuu32.ub,Rt32.ub)", +"Vertical Byte Multiply", + VxxV.v[0].uh[i] +=3D fMPY8UU(fGETUBYTE(0, VuuV.v[0].uh[i]), fGETUBYTE(= 0, RtV)) + fMPY8UU(fGETUBYTE(0, VuuV.v[1].uh[i]), fGETUBYTE(1, RtV)); + VxxV.v[1].uh[i] +=3D fMPY8UU(fGETUBYTE(1, 
VuuV.v[0].uh[i]), fGETUBYTE(= 2, RtV)) + fMPY8UU(fGETUBYTE(1, VuuV.v[1].uh[i]), fGETUBYTE(3, RtV))) + + + + +/* Half by Byte */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpahb,"Vdd32=3Dvmpahb(Vuu32,Rt32)",= "Vdd32.w=3Dvmpa(Vuu32.h,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].w[i] =3D fMPY16SS(fGETHALF(0, VuuV.v[0].w[i]), fSE8_16(fGETB= YTE(0, RtV))) + fMPY16SS(fGETHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1, R= tV))); + VddV.v[1].w[i] =3D fMPY16SS(fGETHALF(1, VuuV.v[0].w[i]), fSE8_16(fGETB= YTE(2, RtV))) + fMPY16SS(fGETHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3, R= tV)))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpahb_acc,"Vxx32+=3Dvmpahb(Vuu32,Rt= 32)","Vxx32.w+=3Dvmpa(Vuu32.h,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].w[i] +=3D fMPY16SS(fGETHALF(0, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(0, RtV))) + fMPY16SS(fGETHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1, = RtV))); + VxxV.v[1].w[i] +=3D fMPY16SS(fGETHALF(1, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(2, RtV))) + fMPY16SS(fGETHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3, = RtV)))) + +/* Half by Byte */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpauhb,"Vdd32=3Dvmpauhb(Vuu32,Rt32)= ","Vdd32.w=3Dvmpa(Vuu32.uh,Rt32.b)", +"Vertical Byte Multiply", + VddV.v[0].w[i] =3D fMPY16US(fGETUHALF(0, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(0, RtV))) + fMPY16US(fGETUHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1,= RtV))); + VddV.v[1].w[i] =3D fMPY16US(fGETUHALF(1, VuuV.v[0].w[i]), fSE8_16(fGET= BYTE(2, RtV))) + fMPY16US(fGETUHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3,= RtV)))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpauhb_acc,"Vxx32+=3Dvmpauhb(Vuu32,= Rt32)","Vxx32.w+=3Dvmpa(Vuu32.uh,Rt32.b)", +"Vertical Byte Multiply", + VxxV.v[0].w[i] +=3D fMPY16US(fGETUHALF(0, VuuV.v[0].w[i]), fSE8_16(fGE= TBYTE(0, RtV))) + fMPY16US(fGETUHALF(0, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(1= , RtV))); + VxxV.v[1].w[i] +=3D fMPY16US(fGETUHALF(1, VuuV.v[0].w[i]), fSE8_16(fGE= TBYTE(2, RtV))) + fMPY16US(fGETUHALF(1, VuuV.v[1].w[i]), fSE8_16(fGETBYTE(3= , RtV)))) + 
+ + + + + + +/* Half by Half */ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyh,"Vdd32=3Dvmpyh(Vu32,Rt32)","Vd= d32.w=3Dvmpy(Vu32.h,Rt32.h)", +"Vector absolute value of words", + VddV.v[0].w[i] =3D fMPY16SS(fGETHALF(0, VuV.w[i]), fGETHALF(0, RtV)); + VddV.v[1].w[i] =3D fMPY16SS(fGETHALF(1, VuV.w[i]), fGETHALF(1, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC_NOV1(32,vmpyh_acc,"Vxx32+=3Dvmpyh(Vu32,= Rt32)","Vxx32.w+=3Dvmpy(Vu32.h,Rt32.h)", +"Vector even halfwords with scalar lower halfword multiply with shift and = sat32", + VxxV.v[0].w[i] =3D fCAST8s(VxxV.v[0].w[i]) + fMPY16SS(fGETHALF(0, VuV= .w[i]), fGETHALF(0, RtV)); + VxxV.v[1].w[i] =3D fCAST8s(VxxV.v[1].w[i]) + fMPY16SS(fGETHALF(1, VuV= .w[i]), fGETHALF(1, RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhsat_acc,"Vxx32+=3Dvmpyh(Vu32,Rt= 32):sat","Vxx32.w+=3Dvmpy(Vu32.h,Rt32.h):sat", +"Vector even halfwords with scalar lower halfword multiply with shift and = sat32", + VxxV.v[0].w[i] =3D fVSATW(fCAST8s(VxxV.v[0].w[i]) + fMPY16SS(fGETHALF= (0, VuV.w[i]), fGETHALF(0, RtV))); + VxxV.v[1].w[i] =3D fVSATW(fCAST8s(VxxV.v[1].w[i]) + fMPY16SS(fGETHALF= (1, VuV.w[i]), fGETHALF(1, RtV)))) + + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhss,"Vd32=3Dvmpyh(Vu32,Rt32):<<1= :sat","Vd32.h=3Dvmpy(Vu32.h,Rt32.h):<<1:sat", +"Vector halfword by halfword multiply, shift by 1, and take upper 16 msb", + fSETHALF(0,VdV.w[i],fVSATH(fGETHALF(1,fVSAT((fMPY16SS(fGETHALF(0= ,VuV.w[i]),fGETHALF(0,RtV))<<1))))); + fSETHALF(1,VdV.w[i],fVSATH(fGETHALF(1,fVSAT((fMPY16SS(fGETHALF(1= ,VuV.w[i]),fGETHALF(1,RtV))<<1))))); +) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyhsrs,"Vd32=3Dvmpyh(Vu32,Rt32):<<= 1:rnd:sat","Vd32.h=3Dvmpy(Vu32.h,Rt32.h):<<1:rnd:sat", +"Vector halfword with scalar halfword multiply with round, shift, and sat1= 6", + fSETHALF(0,VdV.w[i],fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(fGETHA= LF(0,VuV.w[i]),fGETHALF(0,RtV))<<1)))))); + fSETHALF(1,VdV.w[i],fVSATH(fGETHALF(1,fVSAT(fROUND((fMPY16SS(fGETHA= 
LF(1,VuV.w[i]),fGETHALF(1,RtV))<<1)))))); +) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuh,"Vdd32=3Dvmpyuh(Vu32,Rt32)","= Vdd32.uw=3Dvmpy(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VddV.v[0].uw[i] =3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV)); + VddV.v[1].uw[i] =3D fMPY16UU(fGETUHALF(1, VuV.uw[i]),fGETUHALF(1,RtV))) + + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyuh_acc,"Vxx32+=3Dvmpyuh(Vu32,Rt3= 2)","Vxx32.uw+=3Dvmpy(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VxxV.v[0].uw[i] +=3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV)= ); + VxxV.v[1].uw[i] +=3D fMPY16UU(fGETUHALF(1, VuV.uw[i]),fGETUHALF(1,RtV)= )) + + + + +/******************************************** +* HALF BY BYTE +********************************************/ +ITERATOR_INSN2_MPY_SLOT(16,vmpyihb,"Vd32=3Dvmpyihb(Vu32,Rt32)","Vd32.h=3Dv= mpyi(Vu32.h,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VdV.h[i] =3D fMPY16SS(VuV.h[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(16,vmpyihb_acc,"Vx32+=3Dvmpyihb(Vu32,Rt32)","Vx32.= h+=3Dvmpyi(Vu32.h,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VxV.h[i] +=3D fMPY16SS(VuV.h[i], fGETBYTE(i % 4, RtV) )) + + +/******************************************** +* WORD BY BYTE +********************************************/ +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwb,"Vd32=3Dvmpyiwb(Vu32,Rt32)","Vd32.w=3Dv= mpyi(Vu32.w,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwb_acc,"Vx32+=3Dvmpyiwb(Vu32,Rt32)","Vx32.= w+=3Dvmpyi(Vu32.w,Rt32.b)", +"Vector word by byte multiply, keep lower result", +VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETBYTE(i % 4, RtV) )) + +ITERATOR_INSN2_MPY_SLOT(32,vmpyiwub,"Vd32=3Dvmpyiwub(Vu32,Rt32)","Vd32.w= =3Dvmpyi(Vu32.w,Rt32.ub)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETUBYTE(i % 4, RtV) )) + 
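Note on the word-by-byte forms above: element i takes byte (i % 4) of Rt as its scalar operand, and the 32-bit destination keeps only the low word of the product. The following is a minimal sketch of that selection rule for the signed vmpyiwb case. It assumes a plain-array model of a vector; the names rt_byte and vmpyiwb_sketch are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed byte n (0..3) of the scalar register Rt. */
static int8_t rt_byte(uint32_t rt, int n)
{
    assert(n >= 0 && n < 4);
    /* Narrowing to int8_t wraps on mainstream compilers (0xFF becomes -1). */
    return (int8_t)(uint8_t)(rt >> (8 * n));
}

/*
 * Sketch of Vd.w = vmpyi(Vu.w, Rt.b) over nwords elements.
 * Element i is multiplied by byte (i % 4) of Rt; only the low
 * 32 bits of each product are stored.
 */
static void vmpyiwb_sketch(int32_t *vd, const int32_t *vu,
                           uint32_t rt, int nwords)
{
    for (int i = 0; i < nwords; i++) {
        int64_t prod = (int64_t)vu[i] * rt_byte(rt, i % 4);
        vd[i] = (int32_t)(uint32_t)prod;
    }
}
```

With Rt = 0x04FD02FF the four byte operands are -1, 2, -3 and 4, and the pattern repeats from element 4 onward.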
+ITERATOR_INSN2_MPY_SLOT(32,vmpyiwub_acc,"Vx32+=3Dvmpyiwub(Vu32,Rt32)","Vx3= 2.w+=3Dvmpyi(Vu32.w,Rt32.ub)", +"Vector word by byte multiply, keep lower result", +VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETUBYTE(i % 4, RtV) )) + + +/******************************************** +* WORD BY HALF +********************************************/ +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiwh,"Vd32=3Dvmpyiwh(Vu32,Rt32)",= "Vd32.w=3Dvmpyi(Vu32.w,Rt32.h)", +"Vector word by byte multiply, keep lower result", +VdV.w[i] =3D fMPY32SS(VuV.w[i], fGETHALF(i % 2, RtV))) + +ITERATOR_INSN2_MPY_SLOT_DOUBLE_VEC(32,vmpyiwh_acc,"Vx32+=3Dvmpyiwh(Vu32,Rt= 32)","Vx32.w+=3Dvmpyi(Vu32.w,Rt32.h)", +"Vector word by byte multiply, keep lower result", +VxV.w[i] +=3D fMPY32SS(VuV.w[i], fGETHALF(i % 2, RtV))) + + + + + + + + + + + + + + + + + + + +/************************************************************************** + * MMVECTOR LOGICAL OPERATIONS + * ***********************************************************************= */ +ITERATOR_INSN_ANY_SLOT(16,vand,"Vd32=3Dvand(Vu32,Vv32)", "Vector Logical A= nd", VdV.uh[i] =3D VuV.uh[i] & VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vor, "Vd32=3Dvor(Vu32,Vv32)", "Vector Logical O= r", VdV.uh[i] =3D VuV.uh[i] | VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vxor,"Vd32=3Dvxor(Vu32,Vv32)", "Vector Logical X= OR", VdV.uh[i] =3D VuV.uh[i] ^ VvV.h[i]) +ITERATOR_INSN_ANY_SLOT(16,vnot,"Vd32=3Dvnot(Vu32)", "Vector Logical NO= T", VdV.uh[i] =3D ~VuV.uh[i]) + + + + + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandqrt, +"Vd32.ub=3Dvand(Qu4.ub,Rt32.ub)", "Vd32=3Dvand(Qu4,Rt32)", "Insert Predica= te into Vector", + VdV.ub[i] =3D fGETQBIT(QuV,i) ? fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandqrt_acc, +"Vx32.ub|=3Dvand(Qu4.ub,Rt32.ub)", "Vx32|=3Dvand(Qu4,Rt32)", "Insert Pred= icate into Vector", + VxV.ub[i] |=3D (fGETQBIT(QuV,i)) ? 
fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandnqrt, +"Vd32.ub=3Dvand(!Qu4.ub,Rt32.ub)", "Vd32=3Dvand(!Qu4,Rt32)", "Insert Predi= cate into Vector", + VdV.ub[i] =3D !fGETQBIT(QuV,i) ? fGETUBYTE(i % 4, RtV) : 0) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandnqrt_acc, +"Vx32.ub|=3Dvand(!Qu4.ub,Rt32.ub)", "Vx32|=3Dvand(!Qu4,Rt32)", "Insert Pr= edicate into Vector", + VxV.ub[i] |=3D !(fGETQBIT(QuV,i)) ? fGETUBYTE(i % 4, RtV) : 0) + + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandvrt, +"Qd4.ub=3Dvand(Vu32.ub,Rt32.ub)", "Qd4=3Dvand(Vu32,Rt32)", "Insert into Pr= edicate", + fSETQBIT(QdV,i,((VuV.ub[i] & fGETUBYTE(i % 4, RtV)) !=3D 0) ? 1 : 0)) + +ITERATOR_INSN2_MPY_SLOT_LATE(8, vandvrt_acc, +"Qx4.ub|=3Dvand(Vu32.ub,Rt32.ub)", "Qx4|=3Dvand(Vu32,Rt32)", "Insert into = Predicate ", + fSETQBIT(QxV,i,fGETQBIT(QxV,i)|(((VuV.ub[i] & fGETUBYTE(i % 4, RtV)) != =3D 0) ? 1 : 0))) + +ITERATOR_INSN_ANY_SLOT(8,vandvqv,"Vd32=3Dvand(Qv4,Vu32)","Mask off bytes", +VdV.b[i] =3D fGETQBIT(QvV,i) ? VuV.b[i] : 0) +ITERATOR_INSN_ANY_SLOT(8,vandvnqv,"Vd32=3Dvand(!Qv4,Vu32)","Mask off bytes= ", +VdV.b[i] =3D !fGETQBIT(QvV,i) ? VuV.b[i] : 0) + + + /*************************************************** + * Compare Vector with Vector + ***************************************************/ +#define VCMP(DEST, ASRC, ASRCOP, CMP, N, SRC, MASK, WIDTH) \ +{ \ + for(fHIDE(int) i =3D 0; i < fVBYTES(); i +=3D WIDTH) { \ + fSETQBITS(DEST,WIDTH,MASK,i,ASRC ASRCOP ((VuV.SRC[i/WIDTH] CMP VvV.SRC[i= /WIDTH]) ? MASK : 0)); \ + } \ + } + + +#define MMVEC_CMPGT(TYPE,TYPE2,TYPE3,DESCR,N,MASK,WIDTH,SRC) \ +EXTINSN(V6_vgt##TYPE, "Qd4=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")"= , ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" greater than", \ + VCMP(QdV, , , >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_and, "Qx4&=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." 
TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" greater than with predicate-= and", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), &, >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_or, "Qx4|=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" greater than with predicate-= or", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), |, >, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_vgt##TYPE##_xor, "Qx4^=3Dvcmp.gt(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" greater than with predicate-= xor", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), ^, >, N, SRC, MASK, WIDTH)) + +#define MMVEC_CMP(TYPE,TYPE2,TYPE3,DESCR,N,MASK, WIDTH, SRC)\ +MMVEC_CMPGT(TYPE,TYPE2,TYPE3,DESCR,N,MASK,WIDTH,SRC) \ +EXTINSN(V6_veq##TYPE, "Qd4=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")"= , ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" equal to", \ + VCMP(QdV, , , =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_and, "Qx4&=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" equalto with predicate-and",= \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), &, =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_or, "Qx4|=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" equalto with predicate-or", \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), |, =3D=3D, N, SRC, MASK, WIDTH)) \ +EXTINSN(V6_veq##TYPE##_xor, "Qx4^=3Dvcmp.eq(Vu32." TYPE2 ",Vv32." 
TYPE2 ")= ", ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA), DESCR" equalto with predicate-xor",= \ + VCMP(QxV, fGETQBITS(QxV,WIDTH,MASK,i), ^, =3D=3D, N, SRC, MASK, WIDTH)) + + +MMVEC_CMP(w,"w","","Vector Word Compare ", fVELEM(32), 0xF, 4, w) +MMVEC_CMP(h,"h","","Vector Half Compare ", fVELEM(16), 0x3, 2, h) +MMVEC_CMP(b,"b","","Vector Half Compare ", fVELEM(8), 0x1, 1, b) +MMVEC_CMPGT(uw,"uw","","Vector Unsigned Half Compare ", fVELEM(32), 0xF, 4= ,uw) +MMVEC_CMPGT(uh,"uh","","Vector Unsigned Half Compare ", fVELEM(16), 0x3, 2= ,uh) +MMVEC_CMPGT(ub,"ub","","Vector Unsigned Byte Compare ", fVELEM(8), 0x1, 1= ,ub) + +/*************************************************** +* Predicate Operations +***************************************************/ + +EXTINSN(V6_pred_scalar2, "Qd4=3Dvsetq(Rt32)", ATTRIBS(A_EXTENSION,= A_CVI,A_CVI_VP), "Set Vector Predicate ", +{ + fHIDE(int i;) + for(i =3D 0; i < fVBYTES(); i++) fSETQBIT(QdV,i,(i < (RtV & (fVBYTES()= -1))) ? 1 : 0); +}) + +EXTINSN(V6_pred_scalar2v2, "Qd4=3Dvsetq2(Rt32)", ATTRIBS(A_EXTENSI= ON,A_CVI,A_CVI_VP), "Set Vector Predicate ", +{ + fHIDE(int i;) + for(i =3D 0; i < fVBYTES(); i++) fSETQBIT(QdV,i,(i <=3D ((RtV-1) & (fV= BYTES()-1))) ? 1 : 0); +}) + + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, shuffeqw, "Qd4.h=3Dvshuffe(Qs4.w,Qt4.= w)","Shrink Predicate", fSETQBIT(QdV,i, (i & 2) ? fGETQBIT(QsV,i-2) : fGETQ= BIT(QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, shuffeqh, "Qd4.b=3Dvshuffe(Qs4.h,Qt4.= h)","Shrink Predicate", fSETQBIT(QdV,i, (i & 1) ? 
fGETQBIT(QsV,i-1) : fGETQ= BIT(QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_or, "Qd4=3Dor(Qs4,Qt4)","Vector = Predicate Or", fSETQBIT(QdV,i,fGETQBIT(QsV,i) || fGETQBIT(QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_and, "Qd4=3Dand(Qs4,Qt4)","Vecto= r Predicate And", fSETQBIT(QdV,i,fGETQBIT(QsV,i) && fGETQBIT(QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_xor, "Qd4=3Dxor(Qs4,Qt4)","Vecto= r Predicate Xor", fSETQBIT(QdV,i,fGETQBIT(QsV,i) ^ fGETQBIT(QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_or_n, "Qd4=3Dor(Qs4,!Qt4)","Vect= or Predicate Or with not", fSETQBIT(QdV,i,fGETQBIT(QsV,i) || !fGETQBIT(QtV,= i) ) ) +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8, pred_and_n, "Qd4=3Dand(Qs4,!Qt4)","Ve= ctor Predicate And with not", fSETQBIT(QdV,i,fGETQBIT(QsV,i) && !fGETQBIT(= QtV,i) ) ) +ITERATOR_INSN_ANY_SLOT(8, pred_not, "Qd4=3Dnot(Qs4)","Vector Predicate Not= ", fSETQBIT(QdV,i,!fGETQBIT(QsV,i) ) ) + + + +EXTINSN(V6_vcmov, "if (Ps4) Vd32=3DVu32", ATTRIBS(A_EXTENSION,A_CVI,A_CV= I_VA), "Conditional Mov", +{ +if (fLSBOLD(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VdV.ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vncmov, "if (!Ps4) Vd32=3DVu32", ATTRIBS(A_EXTENSION,A_CVI,A_= CVI_VA), "Conditional Mov", +{ +if (fLSBOLDNOT(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VdV.ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vccombine, "if (Ps4) Vdd32=3Dvcombine(Vu32,Vv32)", ATTRIBS(A_E= XTENSION,A_CVI,A_CVI_VA_DV), "Conditional Combine", +{ +if (fLSBOLD(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + +EXTINSN(V6_vnccombine, "if (!Ps4) Vdd32=3Dvcombine(Vu32,Vv32)", ATTRIBS(A= _EXTENSION,A_CVI,A_CVI_VA_DV), "Conditional Combine", +{ +if (fLSBOLDNOT(PsV)) { + fHIDE(int i;) + fVFOREACH(8, i) { + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]; + } + } else {CANCEL;} +}) + + + 
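Note on the two vsetq forms in the predicate section above: they differ only when the low bits of Rt are zero. A minimal sketch of both bit rules follows. It assumes 128-byte vectors (so fVBYTES() is modeled as the constant VEC_BYTES); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 128 /* assumption: 128-byte HVX vector mode */

/* Bit i of Qd4 = vsetq(Rt): set while i < (Rt mod VEC_BYTES). */
static int vsetq_bit(uint32_t rt, int i)
{
    assert(i >= 0 && i < VEC_BYTES);
    return i < (int)(rt & (VEC_BYTES - 1));
}

/* Bit i of Qd4 = vsetq2(Rt): set while i <= ((Rt - 1) mod VEC_BYTES). */
static int vsetq2_bit(uint32_t rt, int i)
{
    assert(i >= 0 && i < VEC_BYTES);
    return i <= (int)((rt - 1u) & (VEC_BYTES - 1));
}

/* Number of predicate bits set by either form, for comparison. */
static int count_bits(int (*bit)(uint32_t, int), uint32_t rt)
{
    int n = 0;
    for (int i = 0; i < VEC_BYTES; i++) {
        n += bit(rt, i);
    }
    return n;
}
```

Under this model, Rt = 5 sets the first five bits in both forms, while Rt = 0 or Rt = 128 gives an empty predicate for vsetq and a full one for vsetq2.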
+ITERATOR_INSN_ANY_SLOT(8,vmux,"Vd32=3Dvmux(Qt4,Vu32,Vv32)", +"Vector Select Element 8-bit", + VdV.ub[i] =3D fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]) + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8,vswap,"Vdd32=3Dvswap(Qt4,Vu32,Vv32)", +"Vector Swap Element 8-bit", + VddV.v[0].ub[i] =3D fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]; + VddV.v[1].ub[i] =3D !fGETQBIT(QtV,i) ? VuV.ub[i] : VvV.ub[i]) + + +/*************************************************************************= ** +* +* MMVECTOR SORTING +* +**************************************************************************= **/ + +#define MMVEC_SORT(TYPE,TYPE2,DESCR,ELEMENTSIZE,SRC)\ +ITERATOR_INSN2_ANY_SLOT(ELEMENTSIZE,vmax##TYPE, "Vd32=3Dvmax" TYPE2 "(Vu32= ,Vv32)", "Vd32."#SRC"=3Dvmax(Vu32."#SRC",Vv32."#SRC")", "Vector " DESCR " m= ax", VdV.SRC[i] =3D (VuV.SRC[i] > VvV.SRC[i]) ? VuV.SRC[i] : VvV.SRC[i]) \ +ITERATOR_INSN2_ANY_SLOT(ELEMENTSIZE,vmin##TYPE, "Vd32=3Dvmin" TYPE2 "(Vu32= ,Vv32)", "Vd32."#SRC"=3Dvmin(Vu32."#SRC",Vv32."#SRC")", "Vector " DESCR " m= in", VdV.SRC[i] =3D (VuV.SRC[i] < VvV.SRC[i]) ? 
VuV.SRC[i] : VvV.SRC[i]) + +MMVEC_SORT(b,"b", "signed byte", 8, b) +MMVEC_SORT(ub,"ub", "unsigned byte", 8, ub) +MMVEC_SORT(uh,"uh", "unsigned halfword",16, uh) +MMVEC_SORT(h, "h", "halfword", 16, h) +MMVEC_SORT(w, "w", "word", 32, w) + + + + + + + + + +/************************************************************* +* SHUFFLES +****************************************************************/ + +ITERATOR_INSN2_ANY_SLOT(16,vsathub,"Vd32=3Dvsathub(Vu32,Vv32)","Vd32.ub=3D= vsat(Vu32.h,Vv32.h)", +"Saturate and pack 32 halfwords to 32 unsigned bytes, and interleave them", + fSETBYTE(0, VdV.uh[i], fVSATUB(VvV.h[i])); + fSETBYTE(1, VdV.uh[i], fVSATUB(VuV.h[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vsatwh,"Vd32=3Dvsatwh(Vu32,Vv32)","Vd32.h=3Dvsa= t(Vu32.w,Vv32.w)", +"Saturate and pack 16 words to 16 halfwords, and interleave them", + fSETHALF(0, VdV.w[i], fVSATH(VvV.w[i])); + fSETHALF(1, VdV.w[i], fVSATH(VuV.w[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vsatuwuh,"Vd32=3Dvsatuwuh(Vu32,Vv32)","Vd32.uh= =3Dvsat(Vu32.uw,Vv32.uw)", +"Saturate and pack 16 words to 16 halfwords, and interleave them", + fSETHALF(0, VdV.w[i], fVSATUH(VvV.uw[i])); + fSETHALF(1, VdV.w[i], fVSATUH(VuV.uw[i]))) + +ITERATOR_INSN2_ANY_SLOT(16,vshuffeb,"Vd32=3Dvshuffeb(Vu32,Vv32)","Vd32.b= =3Dvshuffe(Vu32.b,Vv32.b)", +"Shuffle half words with in a lane", + fSETBYTE(0, VdV.uh[i], fGETUBYTE(0, VvV.uh[i])); + fSETBYTE(1, VdV.uh[i], fGETUBYTE(0, VuV.uh[i]))) + +ITERATOR_INSN2_ANY_SLOT(16,vshuffob,"Vd32=3Dvshuffob(Vu32,Vv32)","Vd32.b= =3Dvshuffo(Vu32.b,Vv32.b)", +"Shuffle half words with in a lane", + fSETBYTE(0, VdV.uh[i], fGETUBYTE(1, VvV.uh[i])); + fSETBYTE(1, VdV.uh[i], fGETUBYTE(1, VuV.uh[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vshufeh,"Vd32=3Dvshuffeh(Vu32,Vv32)","Vd32.h=3D= vshuffe(Vu32.h,Vv32.h)", +"Shuffle half words with in a lane", + fSETHALF(0, VdV.uw[i], fGETUHALF(0, VvV.uw[i])); + fSETHALF(1, VdV.uw[i], fGETUHALF(0, VuV.uw[i]))) + +ITERATOR_INSN2_ANY_SLOT(32,vshufoh,"Vd32=3Dvshuffoh(Vu32,Vv32)","Vd32.h=3D= 
vshuffo(Vu32.h,Vv32.h)", +"Shuffle half words with in a lane", + fSETHALF(0, VdV.uw[i], fGETUHALF(1, VvV.uw[i])); + fSETHALF(1, VdV.uw[i], fGETUHALF(1, VuV.uw[i]))) + + + + +/************************************************************************** +* Double Vector Shuffles +**************************************************************************/ + +EXTINSN(V6_vshuff, "vshuff(Vy32,Vx32,Rt32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VP_VS), +"2x2->2x2 transpose, for multiple data sizes, inplace", +{ + fHIDE(int offset;) + for (offset=3D1; offset2x2 transpose for multiple data sizes", +{ + fHIDE(int offset;) + VddV.v[0] =3D VvV; + VddV.v[1] =3D VuV; + for (offset=3D1; offset>1; offset>0; offset>>=3D1) { + if ( RtV & offset) { + fHIDE(int k;) \ + fVFOREACH(8, k) {\ + if (!( k & offset)) { + fSWAPB(VddV.v[1].ub[k], VddV.v[0].ub[k+offset]); + } + } + } + } + }) + +/*************************************************************************= */ + + + +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(32,vshufoeh,"Vdd32=3Dvshuffoeh(Vu32,Vv3= 2)","Vdd32.h=3Dvshuffoe(Vu32.h,Vv32.h)", +"Vector Shuffle half words", + fSETHALF(0, VddV.v[0].uw[i], fGETUHALF(0, VvV.uw[i])); + fSETHALF(1, VddV.v[0].uw[i], fGETUHALF(0, VuV.uw[i])); + fSETHALF(0, VddV.v[1].uw[i], fGETUHALF(1, VvV.uw[i])); + fSETHALF(1, VddV.v[1].uw[i], fGETUHALF(1, VuV.uw[i]))) + +ITERATOR_INSN2_ANY_SLOT_DOUBLE_VEC(16,vshufoeb,"Vdd32=3Dvshuffoeb(Vu32,Vv3= 2)","Vdd32.b=3Dvshuffoe(Vu32.b,Vv32.b)", +"Vector Shuffle bytes", + fSETBYTE(0, VddV.v[0].uh[i], fGETUBYTE(0, VvV.uh[i])); + fSETBYTE(1, VddV.v[0].uh[i], fGETUBYTE(0, VuV.uh[i])); + fSETBYTE(0, VddV.v[1].uh[i], fGETUBYTE(1, VvV.uh[i])); + fSETBYTE(1, VddV.v[1].uh[i], fGETUBYTE(1, VuV.uh[i]))) + + +/*************************************************************** +* Deal +***************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT(32, vdealh, "Vd32=3Dvdealh(Vu32)", "Vd32.h=3Dv= deal(Vu32.h)", +"Deal Halfwords", + VdV.uh[i ] =3D fGETUHALF(0, 
VuV.uw[i]); + VdV.uh[i+fVELEM(32)] =3D fGETUHALF(1, VuV.uw[i])) + +ITERATOR_INSN2_PERMUTE_SLOT(16, vdealb, "Vd32=3Dvdealb(Vu32)", "Vd32.b=3Dv= deal(Vu32.b)", +"Deal Bytes", + VdV.ub[i ] =3D fGETUBYTE(0, VuV.uh[i]); + VdV.ub[i+fVELEM(16)] =3D fGETUBYTE(1, VuV.uh[i])) + +ITERATOR_INSN2_PERMUTE_SLOT(32, vdealb4w, "Vd32=3Dvdealb4w(Vu32,Vv32)", "= Vd32.b=3Dvdeale(Vu32.b,Vv32.b)", +"Deal Two Vectors Bytes", + VdV.ub[0+i ] =3D fGETUBYTE(0, VvV.uw[i]); + VdV.ub[fVELEM(32)+i ] =3D fGETUBYTE(2, VvV.uw[i]); + VdV.ub[2*fVELEM(32)+i] =3D fGETUBYTE(0, VuV.uw[i]); + VdV.ub[3*fVELEM(32)+i] =3D fGETUBYTE(2, VuV.uw[i])) + +/*************************************************************** +* shuffle +***************************************************************/ + +ITERATOR_INSN2_PERMUTE_SLOT(32, vshuffh, "Vd32=3Dvshuffh(Vu32)", "Vd32.h= =3Dvshuff(Vu32.h)", +"Shuffle Halfwords", + fSETHALF(0, VdV.uw[i], VuV.uh[i]); + fSETHALF(1, VdV.uw[i], VuV.uh[i+fVELEM(32)])) + +ITERATOR_INSN2_PERMUTE_SLOT(16, vshuffb, "Vd32=3Dvshuffb(Vu32)", "Vd32.b= =3Dvshuff(Vu32.b)", +"Shuffle Bytes", + fSETBYTE(0, VdV.uh[i], VuV.ub[i]); + fSETBYTE(1, VdV.uh[i], VuV.ub[i+fVELEM(16)])) + + + + + +/*********************************************************** +* INSERT AND EXTRACT +*********************************************************/ +EXTINSN(V6_extractw, "Rd32=3Dvextract(Vu32,Rs32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VA,A_MEMLIKE,A_RESTRICT_SLOT0ONLY), +"Extract an element from a vector to scalar", +fHIDE(warn("RdN=3D%d VuN=3D%d RsN=3D%d RsV=3D0x%08x widx=3D%d",RdN,VuN,RsN= ,RsV,((RsV & (fVBYTES()-1)) >> 2));) +RdV =3D VuV.uw[ (RsV & (fVBYTES()-1)) >> 2]; +fHIDE(warn("RdV=3D0x%08x",RdV);)) + +EXTINSN(V6_vinsertwr, "Vx32.w=3Dvinsert(Rt32)", +ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX), +"Insert Word Scalar into Vector", +VxV.uw[0] =3D RtV;) + + + + +ITERATOR_INSN_MPY_SLOT_LATE(32,lvsplatw, "Vd32=3Dvsplat(Rt32)", "Replicate= s scalar across words in vector", VdV.uw[i] =3D RtV) +
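Note on vextract above: Rs is treated as a byte offset that wraps at the vector size and is rounded down to a word boundary before indexing. A minimal sketch of that index computation, assuming 128-byte vectors and a plain array of 32 words as the vector model (vextract_sketch is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 128 /* assumption: 128-byte HVX vector mode */
#define VEC_WORDS (VEC_BYTES / 4)

/*
 * Sketch of Rd = vextract(Vu, Rs).
 * The byte offset in Rs is masked to the vector size, then the low
 * two bits are dropped to obtain a word index.
 */
static uint32_t vextract_sketch(const uint32_t vu[VEC_WORDS], uint32_t rs)
{
    /* The mask only works when the vector size is a power of two. */
    assert((VEC_BYTES & (VEC_BYTES - 1)) == 0);
    return vu[(rs & (VEC_BYTES - 1)) >> 2];
}
```

Under this model, offsets 4 through 7 all read word 1, and offset 128 wraps back to word 0.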
+ITERATOR_INSN_MPY_SLOT_LATE(16,lvsplath, "Vd32.h=3Dvsplat(Rt32)", "Replica= tes scalar across halves in vector", VdV.uh[i] =3D RtV) + +ITERATOR_INSN_MPY_SLOT_LATE(8,lvsplatb, "Vd32.b=3Dvsplat(Rt32)", "Replicat= es scalar across bytes in vector", VdV.ub[i] =3D RtV) + + +ITERATOR_INSN_ANY_SLOT(32,vassign,"Vd32=3DVu32","Copy a vector",VdV.w[i]= =3DVuV.w[i]) + + +ITERATOR_INSN_ANY_SLOT_DOUBLE_VEC(8,vcombine,"Vdd32=3Dvcombine(Vu32,Vv32)", +"Vector assign, Any two to Vector Pair", + VddV.v[0].ub[i] =3D VvV.ub[i]; + VddV.v[1].ub[i] =3D VuV.ub[i]) + + + +/////////////////////////////////////////////////////////////////////////// + + +/********************************************************* +* GENERAL PERMUTE NETWORKS +*********************************************************/ + + +EXTINSN(V6_vdelta, "Vd32=3Dvdelta(Vu32,Vv32)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VP), +"Reverse Benes Butterfly network ", +{ + fHIDE(int offset;) + fHIDE(int k;) + fHIDE(mmvector_t tmp;) + tmp =3D VuV; + for (offset=3DfVBYTES(); (offset>>=3D1)>0; ) { + for (k =3D 0; k>3; \ + unsigned char element =3D value & 7; \ + READ_EXT_VREG(regno,tmp,0); \ + tmp.uh[(128/16)*lane+(element)]++; \ + WRITE_EXT_VREG(regno,tmp,EXT_NEW); \ + } \ + } + +#define fHISTQ(INPUTVEC,QVAL) \ + fUARCH_NOTE_PUMP_4X(); \ + fHIDE(int lane;) \ + fHIDE(mmvector_t tmp;) \ + fVFOREACH(128, lane) { \ + for (fHIDE(int )i=3D0; i<128/8; ++i) { \ + unsigned char value =3D INPUTVEC.ub[(128/8)*lane+i]; \ + unsigned char regno =3D value>>3; \ + unsigned char element =3D value & 7; \ + READ_EXT_VREG(regno,tmp,0); \ + if (fGETQBIT(QVAL,128/8*lane+i)) tmp.uh[(128/16)*lane+(element)]++; \ + WRITE_EXT_VREG(regno,tmp,EXT_NEW); \ + } \ + } + + + +EXTINSN(V6_vhist, "vhist",ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT), "vhist i= nstruction",{ fHIDE(mmvector_t inputVec;) inputVec=3DfTMPVDATA(); fHIST(inp= utVec); }) +EXTINSN(V6_vhistq, "vhist(Qv4)",ATTRIBS(A_EXTENSION,A_CVI,A_CVI_4SLOT), "v= hist instruction",{ fHIDE(mmvector_t inputVec;)
inputVec=3DfTMPVDATA(); fHI= STQ(inputVec,QvV); }) + +#undef fHIST +#undef fHISTQ + + +/* **** WEIGHTED HISTOGRAM **** */ + + +#if 1 +#define WHIST(EL,MASK,BSHIFT,COND,SATF) \ + fHIDE(unsigned int) bucket =3D fGETUBYTE(0,input.h[i]); \ + fHIDE(unsigned int) weight =3D fGETUBYTE(1,input.h[i]); \ + fHIDE(unsigned int) vindex =3D (bucket >> 3) & 0x1F; \ + fHIDE(unsigned int) elindex =3D ((i>>BSHIFT) & (~MASK)) | ((bucket>>BSHIF= T) & MASK); \ + fHIDE(mmvector_t tmp;) \ + READ_EXT_VREG(vindex,tmp,0); \ + COND tmp.EL[elindex] =3D SATF(tmp.EL[elindex] + weight); \ + WRITE_EXT_VREG(vindex,tmp,EXT_NEW); \ + fUARCH_NOTE_PUMP_2X(); + +ITERATOR_INSN_VHISTLIKE(16,vwhist256,"vwhist256","vector weighted histogra= m halfword counters", WHIST(uh,7,0,,)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256q,"vwhist256(Qv4)","vector weighted hi= stogram halfword counters", WHIST(uh,7,0,if (fGETQBIT(QvV,2*i)),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256_sat,"vwhist256:sat","vector weighted = histogram halfword counters", WHIST(uh,7,0,,fVSATUH)) +ITERATOR_INSN_VHISTLIKE(16,vwhist256q_sat,"vwhist256(Qv4):sat","vector wei= ghted histogram halfword counters", WHIST(uh,7,0,if (fGETQBIT(QvV,2*i)),fVS= ATUH)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128,"vwhist128","vector weighted histogra= m word counters", WHIST(uw,3,1,,)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128q,"vwhist128(Qv4)","vector weighted hi= stogram word counters", WHIST(uw,3,1,if (fGETQBIT(QvV,2*i)),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128m,"vwhist128(#u1)","vector weighted hi= stogram word counters", WHIST(uw,3,1,if ((bucket & 1) =3D=3D uiV),)) +ITERATOR_INSN_VHISTLIKE(16,vwhist128qm,"vwhist128(Qv4,#u1)","vector weight= ed histogram word counters", WHIST(uw,3,1,if (((bucket & 1) =3D=3D uiV) && = fGETQBIT(QvV,2*i)),)) + + +#endif + + + +/* ****** lookup table instructions ***********= */ + +/* Use low bits from idx to choose next-bigger elements from vector, then = use LSB from idx to choose odd or even element */ + 
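Note on the lookup rule described in the comment above: a minimal sketch of one byte lane of the vlut32 form, written from the semantics given in this file. It assumes 128-byte vectors (fVECLOGSIZE() taken as 7, fVELEM(16) as 64) and little-endian byte order within each halfword of the table; vlut32_lane is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 128              /* assumption: 128-byte HVX vector mode */
#define VEC_LOG_SIZE 7             /* log2(VEC_BYTES) */
#define VEC_HALVES (VEC_BYTES / 2) /* number of 16-bit elements */

/*
 * One byte lane of Vd.b = vlut32(Vu.b, Vv.b, Rt).
 * idx   : the lane's index byte taken from Vu
 * table : the bytes of Vv, viewed as VEC_HALVES little-endian halfwords
 * The top three bits of idx must equal the low three bits of Rt,
 * otherwise the lane produces 0. On a match, idx selects a halfword
 * and one bit of Rt selects its even or odd byte.
 */
static int8_t vlut32_lane(uint8_t idx, const uint8_t table[VEC_BYTES],
                          uint32_t rt)
{
    uint32_t matchval = rt & 0x7;
    uint32_t oddhalf = (rt >> (VEC_LOG_SIZE - 6)) & 0x1;

    if ((uint32_t)(idx & 0xE0) != (matchval << 5)) {
        return 0;
    }
    uint32_t half = idx % VEC_HALVES;
    assert(2 * half + oddhalf < VEC_BYTES);
    return (int8_t)table[2 * half + oddhalf];
}
```

With a table whose byte k holds the value k, Rt = 0 and idx = 5 return 10 (the even byte of halfword 5), while Rt = 2 and idx = 0x45 return 11 (the odd byte of the same halfword).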
+ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvb,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,Rt8)",= "vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV & 0x7; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VdV.b[i] =3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV.= h[idx % fVELEM(16)]) : 0) + + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(8,vlutvvb_oracc,"Vx32.b|=3Dvlut32(Vu= 32.b,Vv32.b,Rt8)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV & 0x7; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VxV.b[i] |=3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV= .h[idx % fVELEM(16)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh,"Vdd32.h=3Dvlut16(Vu32.b,= Vv32.h,Rt8)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D RtV & 0xF; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VddV.v[0].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VddV.v[1].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_oracc,"Vxx32.h|=3Dvlut16(= Vu32.b,Vv32.h,Rt8)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D fGETUBYTE(0,RtV) & 0xF; +oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VxxV.v[0].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VxxV.v[1].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? 
fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvbi,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,#u3)"= ,"vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0x7; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VdV.b[i] =3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV.= h[idx % fVELEM(16)]) : 0) + + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(8,vlutvvb_oracci,"Vx32.b|=3Dvlut32(V= u32.b,Vv32.b,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0x7; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D VuV.ub[i]; +VxV.b[i] |=3D ((idx & 0xE0) =3D=3D (matchval << 5)) ? fGETBYTE(oddhalf,VvV= .h[idx % fVELEM(16)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwhi,"Vdd32.h=3Dvlut16(Vu32.b= ,Vv32.h,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0xF; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VddV.v[0].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VddV.v[1].h[i] =3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddhal= f,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_oracci,"Vxx32.h|=3Dvlut16= (Vu32.b,Vv32.h,#u3)","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int matchval;) fHIDE(int oddhalf;) +matchval =3D uiV & 0xF; +oddhalf =3D (uiV >> (fVECLOGSIZE()-6)) & 0x1; +idx =3D fGETUBYTE(0,VuV.uh[i]); +VxxV.v[0].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0; +idx =3D fGETUBYTE(1,VuV.uh[i]); +VxxV.v[1].h[i] |=3D ((idx & 0xF0) =3D=3D (matchval << 4)) ? 
fGETHALF(oddha= lf,VvV.w[idx % fVELEM(32)]) : 0) + +ITERATOR_INSN_PERMUTE_SLOT(8,vlutvvb_nm,"Vd32.b=3Dvlut32(Vu32.b,Vv32.b,Rt8= ):nomatch","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int oddhalf;) fHIDE(int matchval;) + matchval =3D RtV & 0x7; + oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; + idx =3D VuV.ub[i]; + idx =3D (idx&0x1F) | (matchval<<5); + VdV.b[i] =3D fGETBYTE(oddhalf,VvV.h[idx % fVELEM(16)])) + +ITERATOR_INSN_PERMUTE_SLOT_DOUBLE_VEC(16,vlutvwh_nm,"Vdd32.h=3Dvlut16(Vu32= .b,Vv32.h,Rt8):nomatch","vector-vector table lookup", +fHIDE(unsigned int idx;) fHIDE(int oddhalf;) fHIDE(int matchval;) + matchval =3D RtV & 0xF; + oddhalf =3D (RtV >> (fVECLOGSIZE()-6)) & 0x1; + idx =3D fGETUBYTE(0,VuV.uh[i]); + idx =3D (idx&0x0F) | (matchval<<4); + VddV.v[0].h[i] =3D fGETHALF(oddhalf,VvV.w[idx % fVELEM(32)]); + idx =3D fGETUBYTE(1,VuV.uh[i]); + idx =3D (idx&0x0F) | (matchval<<4); + VddV.v[1].h[i] =3D fGETHALF(oddhalf,VvV.w[idx % fVELEM(32)])) + + + + +/*************************************************************************= ***** +NON LINEAR - V65 + *************************************************************************= *****/ + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpahhsat,"Vx32.h=3Dvmpa(Vx32.h,Vu32.h,R= tt32.h):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( ( fMPY16SS(VxV.h[i],VuV.h[i])<<1) + (fGETHALF(( = (VuV.h[i]>>14)&0x3), RttV )<<15))>>16)) + + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpauhuhsat,"Vx32.h=3Dvmpa(Vx32.h,Vu32.u= h,Rtt32.uh):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( fMPY16SU(VxV.h[i],VuV.uh[i]) + (fGETUHALF(((VuV= .uh[i]>>14)&0x3), RttV )<<15))>>16)) + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vmpsuhuhsat,"Vx32.h=3Dvmps(Vx32.h,Vu32.u= h,Rtt32.uh):sat","piecewise linear approximation", + VxV.h[i]=3D fVSATH( ( fMPY16SU(VxV.h[i],VuV.uh[i]) - (fGETUHALF(((VuV= .uh[i]>>14)&0x3), RttV )<<15))>>16)) + + +ITERATOR_INSN_SLOT2_DOUBLE_VEC(16,vlut4,"Vd32.h=3Dvlut4(Vu32.uh,Rtt32.h)",= "4 entry lookup table", + 
VdV.h[i]=3D fGETHALF( ((VuV.h[i]>>14)&0x3), RttV )) + + + +/*************************************************************************= ***** +V65 + *************************************************************************= *****/ + +ITERATOR_INSN_MPY_SLOT_NOV1(32,vmpyuhe,"Vd32.uw=3Dvmpye(Vu32.uh,Rt32.uh)", +"Vector even halfword unsigned multiply by scalar", + VdV.uw[i] =3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV))) + + +ITERATOR_INSN_MPY_SLOT_NOV1(32,vmpyuhe_acc,"Vx32.uw+=3Dvmpye(Vu32.uh,Rt32.= uh)", +"Vector even halfword unsigned multiply by scalar", + VxV.uw[i] +=3D fMPY16UU(fGETUHALF(0, VuV.uw[i]),fGETUHALF(0,RtV))) + + + + +EXTINSN(V6_vgathermw, "vtmp.w=3Dvgather(Rt32,Mu2,Vv32.w).w", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_MEMLIKE), "Gat= her Words", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_GATHER_WORD(EA, VvV.uw[i], i,MuV); + } + fGATHER_FINISH() +}) +EXTINSN(V6_vgathermh, "vtmp.h=3Dvgather(Rt32,Mu2,Vv32.h).h", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_MEMLIKE), "Gat= her halfwords", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_GATHER_HALFWORD(EA, VvV.uh[i], i,MuV); + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vgathermhw, "vtmp.h=3Dvgather(Rt32,Mu2,Vvv32.w).h", ATTRIBS(A_= EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA_DV,A_CVI_VM,A_CVI_TMP_DST,A_MEMLIKE),= "Gather halfwords", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + 
fVLOG_VTCM_GATHER_HALFWORD_DV(EA, VvvV.v[j].uw[i], (2*i+j),i,j= ,MuV); + } + } + fGATHER_FINISH() +}) + + +EXTINSN(V6_vgathermwq, "if (Qs4) vtmp.w=3Dvgather(Rt32,Mu2,Vv32.w).w", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_MEML= IKE), "Gather Words", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_GATHER_WORDQ(EA, VvV.uw[i], i,QsV,MuV); + } + fGATHER_FINISH() +}) +EXTINSN(V6_vgathermhq, "if (Qs4) vtmp.h=3Dvgather(Rt32,Mu2,Vv32.h).h", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA,A_CVI_VM,A_CVI_TMP_DST,A_MEML= IKE), "Gather halfwords", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_GATHER_HALFWORDQ(EA, VvV.uh[i], i,QsV,MuV); + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vgathermhwq, "if (Qs4) vtmp.h=3Dvgather(Rt32,Mu2,Vvv32.w).h", = ATTRIBS(A_EXTENSION,A_CVI,A_CVI_GATHER,A_CVI_VA_DV,A_CVI_VM,A_CVI_TMP_DST,A= _MEMLIKE), "Gather halfwords", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fGATHER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_GATHER_HALFWORDQ_DV(EA, VvvV.v[j].uw[i], (2*i+j),i,= j,QsV,MuV); + } + } + fGATHER_FINISH() +}) + + + +EXTINSN(V6_vscattermw , "vscatter(Rt32,Mu2,Vv32.w).w=3DVw32", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatter Words", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + fVLOG_VTCM_WORD(EA, VvV.uw[i], 
VwV,i,MuV); + } + fSCATTER_FINISH(0) +}) + + + +EXTINSN(V6_vscattermh , "vscatter(Rt32,Mu2,Vv32.h).h=3DVw32", ATTRIBS(A_EX= TENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatter halfWord= s", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_HALFWORD(EA,VvV.uh[i],VwV,i,MuV); + } + fSCATTER_FINISH(0) +}) + + +EXTINSN(V6_vscattermw_add, "vscatter(Rt32,Mu2,Vv32.w).w+=3DVw32", ATTRIBS= (A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatter Wor= ds-Add", +{ + fHIDE(int i;) + fHIDE(int ALIGNMENT=3D4;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D (RtV+fVALIGN(VvV.uw[i],ALIGNMENT)); + fVLOG_VTCM_WORD_INCREMENT(EA,VvV.uw[i],VwV,i,ALIGNMENT,MuV); + } + fHIDE(fLOG_SCATTER_OP(4);) + fSCATTER_FINISH(1) +}) + +EXTINSN(V6_vscattermh_add, "vscatter(Rt32,Mu2,Vv32.h).h+=3DVw32", ATTRIBS= (A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatter hal= fword-Add", +{ + fHIDE(int i;) + fHIDE(int ALIGNMENT=3D2;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D (RtV+fVALIGN(VvV.uh[i],ALIGNMENT)); + fVLOG_VTCM_HALFWORD_INCREMENT(EA,VvV.uh[i],VwV,i,ALIGNMENT,MuV); + } + fHIDE(fLOG_SCATTER_OP(2);) + fSCATTER_FINISH(1) +}) + + +EXTINSN(V6_vscattermwq, "if (Qs4) vscatter(Rt32,Mu2,Vv32.w).w=3DVw32", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatte= r Words conditional", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 4;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + EA =3D RtV+VvV.uw[i]; + 
fVLOG_VTCM_WORDQ(EA,VvV.uw[i], VwV,i,QsV,MuV); + } + fSCATTER_FINISH(0) +}) + +EXTINSN(V6_vscattermhq, "if (Qs4) vscatter(Rt32,Mu2,Vv32.h).h=3DVw32", AT= TRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA,A_CVI_VM,A_MEMLIKE), "Scatte= r HalfWords conditional", +{ + fHIDE(int i;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(16, i) { + EA =3D RtV+VvV.uh[i]; + fVLOG_VTCM_HALFWORDQ(EA,VvV.uh[i],VwV,i,QsV,MuV); + } + fSCATTER_FINISH(0) +}) + + + + +EXTINSN(V6_vscattermhw , "vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32", ATTRIBS(A_= EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_MEMLIKE), "Scatter Wor= ds", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_HALFWORD_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+j),i,j,MuV); + } + } + fSCATTER_FINISH(0) +}) + + + +EXTINSN(V6_vscattermhwq, "if (Qs4) vscatter(Rt32,Mu2,Vvv32.w).h=3DVw32", = ATTRIBS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_MEMLIKE), "S= catter halfwords conditional", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV+VvvV.v[j].uw[i]; + fVLOG_VTCM_HALFWORDQ_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+j),QsV,i,j= ,MuV); + } + } + fSCATTER_FINISH(0) +}) + +EXTINSN(V6_vscattermhw_add, "vscatter(Rt32,Mu2,Vvv32.w).h+=3DVw32", ATTRI= BS(A_EXTENSION,A_CVI,A_CVI_SCATTER,A_CVI_VA_DV,A_CVI_VM,A_MEMLIKE), "Scatte= r halfwords-add", +{ + fHIDE(int i;) + fHIDE(int j;) + fHIDE(int ALIGNMENT=3D2;) + fHIDE(int element_size =3D 2;) + fHIDE(fSCATTER_INIT( RtV, MuV, element_size);) + fVLASTBYTE(MuV, element_size); + fVALIGN(RtV, 
element_size); + fVFOREACH(32, i) { + for(j =3D 0; j < 2; j++) { + EA =3D RtV + fVALIGN(VvvV.v[j].uw[i],ALIGNMENT);; + fVLOG_VTCM_HALFWORD_INCREMENT_DV(EA,VvvV.v[j].uw[i],VwV,(2*i+= j),i,j,ALIGNMENT,MuV); + } + } + fHIDE(fLOG_SCATTER_OP(2);) + fSCATTER_FINISH(1) +}) + +EXTINSN(V6_vprefixqb,"Vd32.b=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS), "parallel prefix sum of Q into byte", +{ + fHIDE(int i;) + fHIDE(size1u_t acc =3D 0;) + fVFOREACH(8, i) { + acc +=3D fGETQBIT(QvV,i); + VdV.ub[i] =3D acc; + } + } ) +EXTINSN(V6_vprefixqh,"Vd32.h=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS), "parallel prefix sum of Q into halfwords", +{ + fHIDE(int i;) + fHIDE(size2u_t acc =3D 0;) + fVFOREACH(16, i) { + acc +=3D fGETQBIT(QvV,i*2+0); + acc +=3D fGETQBIT(QvV,i*2+1); + VdV.uh[i] =3D acc; + } + } ) +EXTINSN(V6_vprefixqw,"Vd32.w=3Dprefixsum(Qv4)", ATTRIBS(A_EXTENSION,A_CV= I,A_CVI_VS), "parallel prefix sum of Q into words", +{ + fHIDE(int i;) + fHIDE(size4u_t acc =3D 0;) + fVFOREACH(32, i) { + acc +=3D fGETQBIT(QvV,i*4+0); + acc +=3D fGETQBIT(QvV,i*4+1); + acc +=3D fGETQBIT(QvV,i*4+2); + acc +=3D fGETQBIT(QvV,i*4+3); + VdV.uw[i] =3D acc; + } + } ) + + + + + +/*************************************************************************= ***** + DEBUG Vector/Register Printing + *************************************************************************= *****/ + +#define PRINT_VU(TYPE, TYPE2, COUNT)\ + int i; \ + size4u_t vec_len =3D fVBYTES();\ + fprintf(stdout,"V%2d: ",VuN); \ + for (i=3D0;i<vec_len>>COUNT;i++) { \ + fprintf(stdout,TYPE2 " ", VuV.TYPE[i]); \ + }; \ + fprintf(stdout,"\\n"); \ + fflush(stdout);\ + +#undef ATTR_VMEM +#undef ATTR_VMEMU +#undef ATTR_VMEM_NT + +#endif /* NO_MMVEC */ + +#ifdef __SELF_DEF_EXTINSN +#undef EXTINSN +#undef __SELF_DEF_EXTINSN +#endif --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org 
designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com From: Taylor Simpson To: qemu-devel@nongnu.org Subject: 
[PATCH v4 25/30] Hexagon HVX (target/hexagon) instruction decoding Date: Tue, 12 Oct 2021 05:11:03 -0500 Message-Id: <1634033468-23566-26-git-send-email-tsimpson@quicinc.com> In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Add new file to target/hexagon/meson.build Acked-by: Richard Henderson Signed-off-by: Taylor Simpson --- target/hexagon/mmvec/decode_ext_mmvec.h | 24 ++++ target/hexagon/decode.c | 24 +++- target/hexagon/mmvec/decode_ext_mmvec.c | 236 ++++++++++++++++++++++++++++= ++++ target/hexagon/meson.build | 1 + 4 files changed, 283 insertions(+), 2 deletions(-) create mode 100644 target/hexagon/mmvec/decode_ext_mmvec.h create mode 100644 
target/hexagon/mmvec/decode_ext_mmvec.c diff --git a/target/hexagon/mmvec/decode_ext_mmvec.h b/target/hexagon/mmvec= /decode_ext_mmvec.h new file mode 100644 index 0000000..3664b68 --- /dev/null +++ b/target/hexagon/mmvec/decode_ext_mmvec.h @@ -0,0 +1,24 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#ifndef HEXAGON_DECODE_EXT_MMVEC_H +#define HEXAGON_DECODE_EXT_MMVEC_H + +void mmvec_ext_decode_checks(Packet *pkt, bool disas_only); +SlotMask mmvec_ext_decode_find_iclass_slots(int opcode); + +#endif diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c index d424245..653bfd7 100644 --- a/target/hexagon/decode.c +++ b/target/hexagon/decode.c @@ -22,6 +22,7 @@ #include "decode.h" #include "insn.h" #include "printinsn.h" +#include "mmvec/decode_ext_mmvec.h" =20 #define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1)) =20 @@ -566,8 +567,12 @@ static void decode_remove_extenders(Packet *packet) =20 static SlotMask get_valid_slots(const Packet *pkt, unsigned int slot) { - return find_iclass_slots(pkt->insn[slot].opcode, - pkt->insn[slot].iclass); + if (GET_ATTRIB(pkt->insn[slot].opcode, A_EXTENSION)) { + return mmvec_ext_decode_find_iclass_slots(pkt->insn[slot].opcode); + } else { + return find_iclass_slots(pkt->insn[slot].opcode, + pkt->insn[slot].iclass); + } } =20 #define DECODE_NEW_TABLE(TAG, 
SIZE, WHATNOT) /* NOTHING */ @@ -728,6 +733,11 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable = *table, } decode_op(insn, opc, encoding); return 1; + } else if (table->table[i].type =3D=3D DECTREE_EXTSPACE) { + /* + * For now, HVX will be the only coproc + */ + return decode_insns_tablewalk(insn, ext_trees[EXT_IDX_mmvec], enco= ding); } else { return 0; } @@ -874,6 +884,7 @@ int decode_packet(int max_words, const uint32_t *words,= Packet *pkt, int words_read =3D 0; bool end_of_packet =3D false; int new_insns =3D 0; + int i; uint32_t encoding32; =20 /* Initialize */ @@ -901,6 +912,11 @@ int decode_packet(int max_words, const uint32_t *words= , Packet *pkt, return 0; } pkt->encod_pkt_size_in_bytes =3D words_read * 4; + pkt->pkt_has_hvx =3D false; + for (i =3D 0; i < num_insns; i++) { + pkt->pkt_has_hvx |=3D + GET_ATTRIB(pkt->insn[i].opcode, A_CVI); + } =20 /* * Check for :endloop in the parse bits @@ -931,6 +947,10 @@ int decode_packet(int max_words, const uint32_t *words= , Packet *pkt, decode_set_slot_number(pkt); decode_fill_newvalue_regno(pkt); =20 + if (pkt->pkt_has_hvx) { + mmvec_ext_decode_checks(pkt, disas_only); + } + if (!disas_only) { decode_shuffle_for_execution(pkt); decode_split_cmpjump(pkt); diff --git a/target/hexagon/mmvec/decode_ext_mmvec.c b/target/hexagon/mmvec= /decode_ext_mmvec.c new file mode 100644 index 0000000..061a65a --- /dev/null +++ b/target/hexagon/mmvec/decode_ext_mmvec.c @@ -0,0 +1,236 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "decode.h" +#include "opcodes.h" +#include "insn.h" +#include "iclass.h" +#include "mmvec/mmvec.h" +#include "mmvec/decode_ext_mmvec.h" + +static void +check_new_value(Packet *pkt) +{ + /* .new value for a MMVector store */ + int i, j; + const char *reginfo; + const char *destletters; + const char *dststr =3D NULL; + uint16_t def_opcode; + char letter; + int def_regnum; + + for (i =3D 1; i < pkt->num_insns; i++) { + uint16_t use_opcode =3D pkt->insn[i].opcode; + if (GET_ATTRIB(use_opcode, A_DOTNEWVALUE) && + GET_ATTRIB(use_opcode, A_CVI) && + GET_ATTRIB(use_opcode, A_STORE)) { + int use_regidx =3D strchr(opcode_reginfo[use_opcode], 's') - + opcode_reginfo[use_opcode]; + /* + * What's encoded at the N-field is the offset to who's produc= ing + * the value. + * Shift off the LSB which indicates odd/even register. 
+ */ + int def_off =3D ((pkt->insn[i].regno[use_regidx]) >> 1); + int def_oreg =3D pkt->insn[i].regno[use_regidx] & 1; + int def_idx =3D -1; + for (j =3D i - 1; (j >=3D 0) && (def_off >=3D 0); j--) { + if (!GET_ATTRIB(pkt->insn[j].opcode, A_CVI)) { + continue; + } + def_off--; + if (def_off =3D=3D 0) { + def_idx =3D j; + break; + } + } + /* + * Check for a badly encoded N-field which points to an instru= ction + * out-of-range + */ + g_assert(!((def_off !=3D 0) || (def_idx < 0) || + (def_idx > (pkt->num_insns - 1)))); + + /* def_idx is the index of the producer */ + def_opcode =3D pkt->insn[def_idx].opcode; + reginfo =3D opcode_reginfo[def_opcode]; + destletters =3D "dexy"; + for (j =3D 0; (letter =3D destletters[j]) !=3D 0; j++) { + dststr =3D strchr(reginfo, letter); + if (dststr !=3D NULL) { + break; + } + } + if ((dststr =3D=3D NULL) && GET_ATTRIB(def_opcode, A_CVI_GATH= ER)) { + def_regnum =3D 0; + pkt->insn[i].regno[use_regidx] =3D def_oreg; + pkt->insn[i].new_value_producer_slot =3D pkt->insn[def_idx= ].slot; + } else { + if (dststr =3D=3D NULL) { + /* still not there, we have a bad packet */ + g_assert_not_reached(); + } + def_regnum =3D pkt->insn[def_idx].regno[dststr - reginfo]; + /* Now patch up the consumer with the register number */ + pkt->insn[i].regno[use_regidx] =3D def_regnum ^ def_oreg; + /* special case for (Vx,Vy) */ + dststr =3D strchr(reginfo, 'y'); + if (def_oreg && strchr(reginfo, 'x') && dststr) { + def_regnum =3D pkt->insn[def_idx].regno[dststr - regin= fo]; + pkt->insn[i].regno[use_regidx] =3D def_regnum; + } + /* + * We need to remember who produces this value to later + * check if it was dynamically cancelled + */ + pkt->insn[i].new_value_producer_slot =3D pkt->insn[def_idx= ].slot; + } + } + } +} + +/* + * We don't want to reorder slot1/slot0 with respect to each other. 
+ * So in our shuffling, we don't want to move the .cur / .tmp vmem earlier + * Instead, we should move the producing instruction later + * But the producing instruction might feed a .new store! + * So we may need to move that even later. + */ + +static void +decode_mmvec_move_cvi_to_end(Packet *pkt, int max) +{ + int i; + for (i =3D 0; i < max; i++) { + if (GET_ATTRIB(pkt->insn[i].opcode, A_CVI)) { + int last_inst =3D pkt->num_insns - 1; + uint16_t last_opcode =3D pkt->insn[last_inst].opcode; + + /* + * If the last instruction is an endloop, move to the one befo= re it + * Keep endloop as the last thing always + */ + if ((last_opcode =3D=3D J2_endloop0) || + (last_opcode =3D=3D J2_endloop1) || + (last_opcode =3D=3D J2_endloop01)) { + last_inst--; + } + + decode_send_insn_to(pkt, i, last_inst); + max--; + i--; /* Retry this index now that packet has rotated */ + } + } +} + +static void +decode_shuffle_for_execution_vops(Packet *pkt) +{ + /* + * Sort for .new + */ + int i; + for (i =3D 0; i < pkt->num_insns; i++) { + uint16_t opcode =3D pkt->insn[i].opcode; + if (GET_ATTRIB(opcode, A_LOAD) && + (GET_ATTRIB(opcode, A_CVI_NEW) || + GET_ATTRIB(opcode, A_CVI_TMP))) { + /* + * Find prior consuming vector instructions + * Move to end of packet + */ + decode_mmvec_move_cvi_to_end(pkt, i); + break; + } + } + + /* Move HVX new value stores to the end of the packet */ + for (i =3D 0; i < pkt->num_insns - 1; i++) { + uint16_t opcode =3D pkt->insn[i].opcode; + if (GET_ATTRIB(opcode, A_STORE) && + GET_ATTRIB(opcode, A_CVI_NEW) && + !GET_ATTRIB(opcode, A_CVI_SCATTER_RELEASE)) { + int last_inst =3D pkt->num_insns - 1; + uint16_t last_opcode =3D pkt->insn[last_inst].opcode; + + /* + * If the last instruction is an endloop, move to the one befo= re it + * Keep endloop as the last thing always + */ + if ((last_opcode =3D=3D J2_endloop0) || + (last_opcode =3D=3D J2_endloop1) || + (last_opcode =3D=3D J2_endloop01)) { + last_inst--; + } + + decode_send_insn_to(pkt, i, last_inst); + 
break; + } + } +} + +static void +check_for_vhist(Packet *pkt) +{ + pkt->vhist_insn =3D NULL; + for (int i =3D 0; i < pkt->num_insns; i++) { + Insn *insn =3D &pkt->insn[i]; + int opcode =3D insn->opcode; + if (GET_ATTRIB(opcode, A_CVI) && GET_ATTRIB(opcode, A_CVI_4SLOT)) { + pkt->vhist_insn =3D insn; + return; + } + } +} + +/* + * Public Functions + */ + +SlotMask mmvec_ext_decode_find_iclass_slots(int opcode) +{ + if (GET_ATTRIB(opcode, A_CVI_VM)) { + /* HVX memory instruction */ + if (GET_ATTRIB(opcode, A_RESTRICT_SLOT0ONLY)) { + return SLOTS_0; + } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT1ONLY)) { + return SLOTS_1; + } + return SLOTS_01; + } else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT2ONLY)) { + return SLOTS_2; + } else if (GET_ATTRIB(opcode, A_CVI_VX)) { + /* HVX multiply instruction */ + return SLOTS_23; + } else if (GET_ATTRIB(opcode, A_CVI_VS_VX)) { + /* HVX permute/shift instruction */ + return SLOTS_23; + } else { + return SLOTS_0123; + } +} + +void mmvec_ext_decode_checks(Packet *pkt, bool disas_only) +{ + check_new_value(pkt); + if (!disas_only) { + decode_shuffle_for_execution_vops(pkt); + } + check_for_vhist(pkt); +} diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build index a35eb28..b612431 100644 --- a/target/hexagon/meson.build +++ b/target/hexagon/meson.build @@ -175,6 +175,7 @@ hexagon_ss.add(files( 'printinsn.c', 'arch.c', 'fma_emu.c', + 'mmvec/decode_ext_mmvec.c', 'mmvec/system_ext_mmvec.c', )) =20 --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634036718195580.27284472282; Tue, 12 Oct 2021 04:05:18 -0700 (PDT) 
From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 26/30] Hexagon HVX (target/hexagon) import instruction encodings Date: Tue, 12 Oct 2021 05:11:04 -0500 Message-Id: <1634033468-23566-27-git-send-email-tsimpson@quicinc.com> In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: 
<1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Signed-off-by: Taylor Simpson Acked-by: Richard Henderson --- target/hexagon/decode.c | 4 + target/hexagon/imported/allextenc.def | 20 + target/hexagon/imported/encode.def | 1 + target/hexagon/imported/mmvec/encode_ext.def | 794 +++++++++++++++++++++++= ++++ 4 files changed, 819 insertions(+) create mode 100644 target/hexagon/imported/allextenc.def create mode 100644 target/hexagon/imported/mmvec/encode_ext.def diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c index 653bfd7..6f0f27b 100644 --- a/target/hexagon/decode.c +++ b/target/hexagon/decode.c @@ -47,6 +47,7 @@ enum { /* Name Num Table */ DEF_REGMAP(R_16, 16, 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, = 23) DEF_REGMAP(R__8, 8, 0, 2, 4, 6, 16, 18, 20, 22) +DEF_REGMAP(R_8, 
8, 0, 1, 2, 3, 4, 5, 6, 7) =20 #define DECODE_MAPPED_REG(OPNUM, NAME) \ insn->regno[OPNUM] =3D DECODE_REGISTER_##NAME[insn->regno[OPNUM]]; @@ -158,6 +159,9 @@ static void decode_ext_init(void) for (i =3D EXT_IDX_noext; i < EXT_IDX_noext_AFTER; i++) { ext_trees[i] =3D &dectree_table_DECODE_EXT_EXT_noext; } + for (i =3D EXT_IDX_mmvec; i < EXT_IDX_mmvec_AFTER; i++) { + ext_trees[i] =3D &dectree_table_DECODE_EXT_EXT_mmvec; + } } =20 typedef struct { diff --git a/target/hexagon/imported/allextenc.def b/target/hexagon/importe= d/allextenc.def new file mode 100644 index 0000000..39a3e93 --- /dev/null +++ b/target/hexagon/imported/allextenc.def @@ -0,0 +1,20 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + +#define EXTNAME mmvec +#include "mmvec/encode_ext.def" +#undef EXTNAME diff --git a/target/hexagon/imported/encode.def b/target/hexagon/imported/e= ncode.def index b9368d1..e40e7fb 100644 --- a/target/hexagon/imported/encode.def +++ b/target/hexagon/imported/encode.def @@ -71,6 +71,7 @@ =20 #include "encode_pp.def" #include "encode_subinsn.def" +#include "allextenc.def" =20 #ifdef __SELF_DEF_FIELD32 #undef __SELF_DEF_FIELD32 diff --git a/target/hexagon/imported/mmvec/encode_ext.def b/target/hexagon/= imported/mmvec/encode_ext.def new file mode 100644 index 0000000..6fbbe2c --- /dev/null +++ b/target/hexagon/imported/mmvec/encode_ext.def @@ -0,0 +1,794 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + +#define CONCAT(A,B) A##B +#define EXTEXTNAME(X) CONCAT(EXT_,X) +#define DEF_ENC(TAG,STR) DEF_EXT_ENC(TAG,EXTEXTNAME(EXTNAME),STR) + + +#ifndef NO_MMVEC +DEF_ENC(V6_extractw, ICLASS_LD" 001 0 000sssss PP0uuuuu --1ddddd") /* c= oproc insn, returns Rd */ +#endif + + +#ifndef NO_MMVEC + + + +DEF_CLASS32(ICLASS_NCJ" 1--- -------- PP------ --------",COPROC_VMEM) +DEF_CLASS32(ICLASS_NCJ" 1000 0-0ttttt PPi--iii ---ddddd",BaseOffset_VMEM_L= oads) +DEF_CLASS32(ICLASS_NCJ" 1000 1-0ttttt PPivviii ---ddddd",BaseOffset_if_Pv_= VMEM_Loads) +DEF_CLASS32(ICLASS_NCJ" 1000 0-1ttttt PPi--iii --------",BaseOffset_VMEM_S= tores1) +DEF_CLASS32(ICLASS_NCJ" 1000 1-0ttttt PPi--iii 00------",BaseOffset_VMEM_S= tores2) +DEF_CLASS32(ICLASS_NCJ" 1000 1-1ttttt PPivviii --------",BaseOffset_if_Pv_= VMEM_Stores) + +DEF_CLASS32(ICLASS_NCJ" 1001 0-0xxxxx PP---iii ---ddddd",PostImm_VMEM_Load= s) +DEF_CLASS32(ICLASS_NCJ" 1001 1-0xxxxx PP-vviii ---ddddd",PostImm_if_Pv_VME= M_Loads) +DEF_CLASS32(ICLASS_NCJ" 1001 0-1xxxxx PP---iii --------",PostImm_VMEM_Stor= es1) +DEF_CLASS32(ICLASS_NCJ" 1001 1-0xxxxx PP---iii 00------",PostImm_VMEM_Stor= es2) +DEF_CLASS32(ICLASS_NCJ" 1001 1-1xxxxx PP-vviii --------",PostImm_if_Pv_VME= M_Stores) + +DEF_CLASS32(ICLASS_NCJ" 1011 0-0xxxxx PPu----- ---ddddd",PostM_VMEM_Loads) +DEF_CLASS32(ICLASS_NCJ" 1011 1-0xxxxx PPuvv--- ---ddddd",PostM_if_Pv_VMEM_= Loads) +DEF_CLASS32(ICLASS_NCJ" 1011 0-1xxxxx PPu----- --------",PostM_VMEM_Stores= 1) +DEF_CLASS32(ICLASS_NCJ" 1011 1-0xxxxx PPu----- 00------",PostM_VMEM_Stores= 2) +DEF_CLASS32(ICLASS_NCJ" 1011 1-1xxxxx PPuvv--- --------",PostM_if_Pv_VMEM_= Stores) + +DEF_CLASS32(ICLASS_NCJ" 110- 0------- PP------ --------",Z_Load) +DEF_CLASS32(ICLASS_NCJ" 110- 1------- PP------ --------",Z_Load_if_Pv) + +DEF_CLASS32(ICLASS_NCJ" 1111 000ttttt PPu--0-- ---vvvvv",Gather) +DEF_CLASS32(ICLASS_NCJ" 1111 000ttttt PPu--1-- -ssvvvvv",Gather_if_Qs) +DEF_CLASS32(ICLASS_NCJ" 1111 001ttttt PPuvvvvv ---wwwww",Scatter) +DEF_CLASS32(ICLASS_NCJ" 
1111 001ttttt PPuvvvvv -----sss",Scatter_New) +DEF_CLASS32(ICLASS_NCJ" 1111 1--ttttt PPuvvvvv -sswwwww",Scatter_if_Qs) + + +DEF_FIELD32(ICLASS_NCJ" 1--- -!------ PP------ --------",NT,"NonTemporal") + + + +DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 000 --- ----- PP i --iii= ----- ---","[#0] vmem(Rt+#s4)[:nt]") + +#define LDST_ENC(TAG,MAJ3,MID3,RREG,TINY6,MIN3,VREG) DEF_ENC(TAG, ICLASS_N= CJ "1" #MAJ3 #MID3 #RREG "PP" #TINY6 #MIN3 #VREG) + +#define LDST_BO(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_ai, 000,MID3= ,ttttt,i PRED iii,MIN3,VREG) +#define LDST_PI(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_pi, 001,MID3= ,xxxxx,- PRED iii,MIN3,VREG) +#define LDST_PM(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_ppu,011,MID3= ,xxxxx,u PRED ---,MIN3,VREG) + +#define LDST_BASICLD(OP,TAGPRE) \ + OP(TAGPRE, 000,00,000,ddddd) \ + OP(TAGPRE##_nt, 010,00,000,ddddd) \ + OP(TAGPRE##_cur, 000,00,001,ddddd) \ + OP(TAGPRE##_nt_cur, 010,00,001,ddddd) \ + OP(TAGPRE##_tmp, 000,00,010,ddddd) \ + OP(TAGPRE##_nt_tmp, 010,00,010,ddddd) + +#define LDST_BASICST(OP,TAGPRE) \ + OP(TAGPRE, 001,--,000,sssss) \ + OP(TAGPRE##_nt, 011,--,000,sssss) \ + OP(TAGPRE##_new, 001,--,001,-0sss) \ + OP(TAGPRE##_srls, 001,--,001,-1---) \ + OP(TAGPRE##_nt_new, 011,--,001,--sss) \ + + +#define LDST_QPREDST(OP,TAGPRE) \ + OP(TAGPRE##_qpred, 100,vv,000,sssss) \ + OP(TAGPRE##_nt_qpred, 110,vv,000,sssss) \ + OP(TAGPRE##_nqpred, 100,vv,001,sssss) \ + OP(TAGPRE##_nt_nqpred,110,vv,001,sssss) \ + +#define LDST_CONDLD(OP,TAGPRE) \ + OP(TAGPRE##_pred, 100,vv,010,ddddd) \ + OP(TAGPRE##_nt_pred, 110,vv,010,ddddd) \ + OP(TAGPRE##_npred, 100,vv,011,ddddd) \ + OP(TAGPRE##_nt_npred, 110,vv,011,ddddd) \ + OP(TAGPRE##_cur_pred, 100,vv,100,ddddd) \ + OP(TAGPRE##_nt_cur_pred, 110,vv,100,ddddd) \ + OP(TAGPRE##_cur_npred, 100,vv,101,ddddd) \ + OP(TAGPRE##_nt_cur_npred, 110,vv,101,ddddd) \ + OP(TAGPRE##_tmp_pred, 100,vv,110,ddddd) \ + OP(TAGPRE##_nt_tmp_pred, 110,vv,110,ddddd) \ + OP(TAGPRE##_tmp_npred, 100,vv,111,ddddd) \ + 
OP(TAGPRE##_nt_tmp_npred, 110,vv,111,ddddd) \ + +#define LDST_PREDST(OP,TAGPRE,NT,MIN2) \ + OP(TAGPRE##_pred, 1 NT 1,vv,MIN2 0,sssss) \ + OP(TAGPRE##_npred, 1 NT 1,vv,MIN2 1,sssss) + +#define LDST_PREDSTNEW(OP,TAGPRE,NT,MIN2) \ + OP(TAGPRE##_pred, 1 NT 1,vv,MIN2 0,NT 0 sss) \ + OP(TAGPRE##_npred, 1 NT 1,vv,MIN2 1,NT 1 sss) + +// 0.0,vv,0--,sssss: pred st +#define LDST_BASICPREDST(OP,TAGPRE) \ + LDST_PREDST(OP,TAGPRE, 0,00) \ + LDST_PREDST(OP,TAGPRE##_nt, 1,00) \ + LDST_PREDSTNEW(OP,TAGPRE##_new, 0,01) \ + LDST_PREDSTNEW(OP,TAGPRE##_nt_new, 1,01) + + + +LDST_BASICLD(LDST_BO,V6_vL32b) +LDST_CONDLD(LDST_BO,V6_vL32b) +LDST_BASICLD(LDST_PI,V6_vL32b) +LDST_CONDLD(LDST_PI,V6_vL32b) +LDST_BASICLD(LDST_PM,V6_vL32b) +LDST_CONDLD(LDST_PM,V6_vL32b) + +// Loads + +LDST_BO(V6_vL32Ub,000,00,111,ddddd) +//Stores +LDST_BASICST(LDST_BO,V6_vS32b) + + +LDST_BO(V6_vS32Ub,001,--,111,sssss) + + + + +// Byte Enabled Stores +LDST_QPREDST(LDST_BO,V6_vS32b) + +// Scalar Predicated Stores +LDST_BASICPREDST(LDST_BO,V6_vS32b) + + +LDST_PREDST(LDST_BO,V6_vS32Ub,0,11) + + + + +DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 001 --- ----- PP - -----= ddddd ---","[#1] vmem(Rx++#s3)[:nt]") + +// Loads +LDST_PI(V6_vL32Ub,000,00,111,ddddd) + +//Stores +LDST_BASICST(LDST_PI,V6_vS32b) + + + +LDST_PI(V6_vS32Ub,001,--,111,sssss) + + +// Byte Enabled Stores +LDST_QPREDST(LDST_PI,V6_vS32b) + + +// Scalar Predicated Stores +LDST_BASICPREDST(LDST_PI,V6_vS32b) + + +LDST_PREDST(LDST_PI,V6_vS32Ub,0,11) + + + +DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 011 --- ----- PP - ----- ---= -- ---","[#3] vmem(Rx++#M)[:nt]") + +// Loads +LDST_PM(V6_vL32Ub,000,00,111,ddddd) + +//Stores +LDST_BASICST(LDST_PM,V6_vS32b) + + + +LDST_PM(V6_vS32Ub,001,--,111,sssss) + +// Byte Enabled Stores +LDST_QPREDST(LDST_PM,V6_vS32b) + +// Scalar Predicated Stores +LDST_BASICPREDST(LDST_PM,V6_vS32b) + + +LDST_PREDST(LDST_PM,V6_vS32Ub,0,11) + + + +DEF_ENC(V6_vaddcarrysat, ICLASS_CJ" 1 101 100 vvvvv PP 1 uuuuu 0ss dddd= d") // +DEF_ENC(V6_vaddcarryo, ICLASS_CJ" 1 
101 101 vvvvv PP 1 uuuuu 0ee dd= ddd") // +DEF_ENC(V6_vsubcarryo, ICLASS_CJ" 1 101 101 vvvvv PP 1 uuuuu 1ee dd= ddd") // +DEF_ENC(V6_vsatdw, ICLASS_CJ" 1 101 100 vvvvv PP 1 uuuuu 111 dddd= d") // + +DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 111 --- ----- PP - ----- ----= - ---","[#6] vgather,vscatter") +DEF_ENC(V6_vgathermw, ICLASS_NCJ" 1 111 000 ttttt PP u --000 --- v= vvvv") // vtmp.w=3Dvmem(Rt32,Mu2,Vv32.w).w +DEF_ENC(V6_vgathermh, ICLASS_NCJ" 1 111 000 ttttt PP u --001 --- v= vvvv") // vtmp.h=3Dvmem(Rt32,Mu2,Vv32.h).h +DEF_ENC(V6_vgathermhw, ICLASS_NCJ" 1 111 000 ttttt PP u --010 --- = vvvvv") // vtmp.h=3Dvmem(Rt32,Mu2,Vvv32.w).h + + +DEF_ENC(V6_vgathermwq, ICLASS_NCJ" 1 111 000 ttttt PP u --100 -ss = vvvvv") // if (Qs4) vtmp.w=3Dvmem(Rt32,Mu2,Vv32.w).w +DEF_ENC(V6_vgathermhq, ICLASS_NCJ" 1 111 000 ttttt PP u --101 -ss = vvvvv") // if (Qs4) vtmp.h=3Dvmem(Rt32,Mu2,Vv32.h).h +DEF_ENC(V6_vgathermhwq, ICLASS_NCJ" 1 111 000 ttttt PP u --110 -ss vvv= vv") // if (Qs4) vtmp.h=3Dvmem(Rt32,Mu2,Vvv32.w).h + + + +DEF_ENC(V6_vscattermw, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 000 = wwwww") // vmem(Rt32,Mu2,Vv32.w)=3DVw32.w +DEF_ENC(V6_vscattermh, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 001 = wwwww") // vmem(Rt32,Mu2,Vv32.h)=3DVw32.h +DEF_ENC(V6_vscattermhw, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 010 www= ww") // vmem(Rt32,Mu2,Vv32.h)=3DVw32.h + +DEF_ENC(V6_vscattermw_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 100 = wwwww") // vmem(Rt32,Mu2,Vv32.w) +=3D Vw32.w +DEF_ENC(V6_vscattermh_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 101 = wwwww") // vmem(Rt32,Mu2,Vv32.h) +=3D Vw32.h +DEF_ENC(V6_vscattermhw_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 110 www= ww") // vmem(Rt32,Mu2,Vv32.h) +=3D Vw32.h + + +DEF_ENC(V6_vscattermwq, ICLASS_NCJ" 1 111 100 ttttt PP u vvvvv 0ss www= ww") // if (Qs4) vmem(Rt32,Mu2,Vv32.w)=3DVw32.w +DEF_ENC(V6_vscattermhq, ICLASS_NCJ" 1 111 100 ttttt PP u vvvvv 1ss www= ww") // if (Qs4) vmem(Rt32,Mu2,Vv32.h)=3DVw32.h +DEF_ENC(V6_vscattermhwq, ICLASS_NCJ" 1 111 101 
ttttt PP u vvvvv 0ss ww= www") // if (Qs4) vmem(Rt32,Mu2,Vv32.h)=3DVw32.h + + + + + +DEF_CLASS32(ICLASS_CJ" 1--- -------- PP------ --------",COPROC_VX) + + + +/*************************************************************** +* +* Group #0, Uses Q6 Rt8: new in v61 +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 000 --- ----- PP - ----- ----= - ---","[#1] Vd32=3D(Vu32, Vv32, Rt8)") +DEF_ENC(V6_vasrhbsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 00= 0 ddddd") // +DEF_ENC(V6_vasruwuhrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 0= 01 ddddd") // +DEF_ENC(V6_vasrwuhrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 01= 0 ddddd") // +DEF_ENC(V6_vlutvvb_nm, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 0= 11 ddddd") // +DEF_ENC(V6_vlutvwh_nm, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 1= 00 ddddd") // +DEF_ENC(V6_vasruhubrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 1= 11 ddddd") // + +DEF_ENC(V6_vasruwuhsat, ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 100 = ddddd") // +DEF_ENC(V6_vasruhubsat, ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 1= 01 ddddd") // + +/*************************************************************** +* +* Group #1, Uses Q6 Rt32 +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- --= -","[#1] Vd32=3D(Vu32, Rt32)") +DEF_ENC(V6_vtmpyb, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 000 d= dddd") // +DEF_ENC(V6_vtmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 001 ddd= dd") // +DEF_ENC(V6_vdmpyhb, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 010 dddd= d") // +DEF_ENC(V6_vrmpyub, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 011 dddd= d") // +DEF_ENC(V6_vrmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 100 ddd= dd") // +DEF_ENC(V6_vdsaduh, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 101 dddd= d") // +DEF_ENC(V6_vdmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 110 ddd= dd") // +DEF_ENC(V6_vdmpybus_dv, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 111 dddd= d") // + +DEF_ENC(V6_vdmpyhsusat, 
ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 000 dddd= d") // +DEF_ENC(V6_vdmpyhsuisat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 001 ddd= dd") // +DEF_ENC(V6_vdmpyhsat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 010 dd= ddd") // +DEF_ENC(V6_vdmpyhisat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 011 d= dddd") // +DEF_ENC(V6_vdmpyhb_dv, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 100 d= dddd") // +DEF_ENC(V6_vmpybus, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 101 dddd= d") // +DEF_ENC(V6_vmpabus, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 110 dddd= d") // +DEF_ENC(V6_vmpahb, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 111 d= dddd") // + +DEF_ENC(V6_vmpyh, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 000 dd= ddd") // +DEF_ENC(V6_vmpyhss, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 001 dddd= d") // +DEF_ENC(V6_vmpyhsrs, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 010 ddd= dd") // +DEF_ENC(V6_vmpyuh, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 011 d= dddd") // +DEF_ENC(V6_vrmpybusi, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 10i dd= ddd") // +DEF_ENC(V6_vrsadubi, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 11i ddd= dd") // + +DEF_ENC(V6_vmpyihb, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 000 dddd= d") // +DEF_ENC(V6_vror, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 001 ddd= dd") // +DEF_ENC(V6_vmpyuhe, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 010 dddd= d") // +DEF_ENC(V6_vmpabuu, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 011 dddd= d") // +DEF_ENC(V6_vlut4, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 100 ddd= dd") // + + +DEF_ENC(V6_vasrw, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 101 dd= ddd") // +DEF_ENC(V6_vasrh, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 110 dd= ddd") // +DEF_ENC(V6_vaslw, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 111 dd= ddd") // + +DEF_ENC(V6_vaslh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 000 dd= ddd") // +DEF_ENC(V6_vlsrw, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 001 dd= ddd") // +DEF_ENC(V6_vlsrh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 010 dd= ddd") // +DEF_ENC(V6_vlsrb, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 011 ddd= dd") // + +DEF_ENC(V6_vmpauhb, ICLASS_CJ" 1 001 
100 ttttt PP 0 uuuuu 101 d= dddd") // +DEF_ENC(V6_vmpyiwub, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 110 ddd= dd") // +DEF_ENC(V6_vmpyiwh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 111 dddd= d") // + +DEF_ENC(V6_vmpyiwb, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 000 dddd= d") // +DEF_ENC(V6_lvsplatw, ICLASS_CJ" 1 001 101 ttttt PP 0 ----0 001 ddd= dd") // + + + +DEF_ENC(V6_pred_scalar2, ICLASS_CJ" 1 001 101 ttttt PP 0 ----- 010 -01= dd") // +DEF_ENC(V6_vandvrt, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 010 -10d= d") // +DEF_ENC(V6_pred_scalar2v2, ICLASS_CJ" 1 001 101 ttttt PP 0 ----- 010 -= 11dd") // + +DEF_ENC(V6_vtmpyhb, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 100 dddd= d") // +DEF_ENC(V6_vandqrt, ICLASS_CJ" 1 001 101 ttttt PP 0 --0uu 101 dddd= d") // +DEF_ENC(V6_vandnqrt, ICLASS_CJ" 1 001 101 ttttt PP 0 --1uu 101 ddd= dd") // + +DEF_ENC(V6_vrmpyubi, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 11i ddd= dd") // + +DEF_ENC(V6_vmpyub, ICLASS_CJ" 1 001 110 ttttt PP 0 uuuuu 000 d= dddd") // +DEF_ENC(V6_lvsplath, ICLASS_CJ" 1 001 110 ttttt PP 0 ----- 001 ddd= dd") // +DEF_ENC(V6_lvsplatb, ICLASS_CJ" 1 001 110 ttttt PP 0 ----- 010 ddd= dd") // + + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- --= -","[#1] Vx32=3D(Vu32, Rt32)") +DEF_ENC(V6_vtmpyb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 000 x= xxxx") // +DEF_ENC(V6_vtmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 001 xxx= xx") // +DEF_ENC(V6_vtmpyhb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 010 xxxx= x") // +DEF_ENC(V6_vdmpyhb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 011 xxxx= x") // +DEF_ENC(V6_vrmpyub_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 100 xxxx= x") // +DEF_ENC(V6_vrmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 101 xxx= xx") // +DEF_ENC(V6_vdmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 110 xxx= xx") // +DEF_ENC(V6_vdmpybus_dv_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 111 xxxx= x") // + +DEF_ENC(V6_vdmpyhsusat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 000 xxxx= x") // 
+DEF_ENC(V6_vdmpyhsuisat_acc,ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 001 xxxx= x") // +DEF_ENC(V6_vdmpyhisat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 010 x= xxxx") // +DEF_ENC(V6_vdmpyhsat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 011 xx= xxx") // +DEF_ENC(V6_vdmpyhb_dv_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 100 x= xxxx") // +DEF_ENC(V6_vmpybus_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 101 xxxx= x") // +DEF_ENC(V6_vmpabus_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 110 xxxx= x") // +DEF_ENC(V6_vmpahb_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 111 x= xxxx") // + +DEF_ENC(V6_vmpyhsat_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 000 xxx= xx") // +DEF_ENC(V6_vmpyuh_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 001 x= xxxx") // +DEF_ENC(V6_vmpyiwb_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 010 xxxx= x") // +DEF_ENC(V6_vmpyiwh_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 011 xxxx= x") // +DEF_ENC(V6_vrmpybusi_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 10i xx= xxx") // +DEF_ENC(V6_vrsadubi_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 11i xxx= xx") // + +DEF_ENC(V6_vdsaduh_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 000 xxxx= x") // +DEF_ENC(V6_vmpyihb_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 001 xxxx= x") // +DEF_ENC(V6_vaslw_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 010 xx= xxx") // +DEF_ENC(V6_vandqrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 --0uu 011 xxxx= x") // +DEF_ENC(V6_vandnqrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 --1uu 011 xxx= xx") // +DEF_ENC(V6_vandvrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 100 ---x= x") // +DEF_ENC(V6_vasrw_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 101 xx= xxx") // +DEF_ENC(V6_vrmpyubi_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 11i xxx= xx") // + +DEF_ENC(V6_vmpyub_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 000 x= xxxx") // +DEF_ENC(V6_vmpyiwub_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 001 xxxx= x") // +DEF_ENC(V6_vmpauhb_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 010 x= xxxx") // +DEF_ENC(V6_vmpyuhe_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 
uuuuu 011 x= xxxx") +DEF_ENC(V6_vmpahhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 100 xxx= xx") // +DEF_ENC(V6_vmpauhuhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 101 x= xxxx") // +DEF_ENC(V6_vmpsuhuhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 110 x= xxxx") // +DEF_ENC(V6_vasrh_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 111 xx= xxx") // + + + + +DEF_ENC(V6_vinsertwr, ICLASS_CJ" 1 001 101 ttttt PP 1 ----- 001 xxx= xx") + +DEF_ENC(V6_vmpabuu_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 100 x= xxxx") // +DEF_ENC(V6_vaslh_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 101 xxx= xx") // +DEF_ENC(V6_vmpyh_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 110 xxx= xx") // + + + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- --= -","[#1] (Vx32, Vy32, Rt32)") +DEF_ENC(V6_vshuff, ICLASS_CJ" 1 001 111 ttttt PP 1 yyyyy 001 x= xxxx") // +DEF_ENC(V6_vdeal, ICLASS_CJ" 1 001 111 ttttt PP 1 yyyyy 010 xx= xxx") // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 010 --- ----- PP - ----- ----- ---","= [#2] if (Ps) Vd=3DVu") +DEF_ENC(V6_vcmov, ICLASS_CJ" 1 010 000 ----- PP - uuuuu -ss ddddd") +DEF_ENC(V6_vncmov, ICLASS_CJ" 1 010 001 ----- PP - uuuuu -ss ddddd= ") +DEF_ENC(V6_vnccombine, ICLASS_CJ" 1 010 010 vvvvv PP - uuuuu -ss ddddd= ") +DEF_ENC(V6_vccombine, ICLASS_CJ" 1 010 011 vvvvv PP - uuuuu -ss ddddd") + +DEF_ENC(V6_vrotr, ICLASS_CJ" 1 010 100 vvvvv PP 1 uuuuu 111 ddddd") +DEF_ENC(V6_vasr_into, ICLASS_CJ" 1 010 101 vvvvv PP 1 uuuuu 111 xxxxx") + +/*************************************************************** +* +* Group #3, Uses Q6 Rt8 +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 011 --- ----- PP - ----- ----- --= -","[#3] Vd32=3D(Vu32, Vv32, Rt8)") +DEF_ENC(V6_valignb, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 000 dddd= d") // +DEF_ENC(V6_vlalignb, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 001 ddd= dd") // +DEF_ENC(V6_vasrwh, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 010 ddddd= ") // +DEF_ENC(V6_vasrwhsat, ICLASS_CJ" 1 011 vvv vvttt PP 
0 uuuuu 011 dd= ddd") // +DEF_ENC(V6_vasrwhrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 100 ddd= dd") // +DEF_ENC(V6_vasrwuhsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 101 d= dddd") // +DEF_ENC(V6_vasrhubsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 110 d= dddd") // +DEF_ENC(V6_vasrhubrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 111 dd= ddd") // + +DEF_ENC(V6_vasrhbrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 000 ddd= dd") // +DEF_ENC(V6_vlutvvb, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 001 d= dddd") +DEF_ENC(V6_vshuffvdd, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 011 dd= ddd") // +DEF_ENC(V6_vdealvdd, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 100 ddd= dd") // +DEF_ENC(V6_vlutvvb_oracc, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 101 xxx= xx") +DEF_ENC(V6_vlutvwh, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 110 d= dddd") +DEF_ENC(V6_vlutvwh_oracc, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 111 xxx= xx") + + + +/*************************************************************** +* +* Group #4, No Q6 regs +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 --- ----- PP 0 ----- ----- ---","= [#4] Vd32=3D(Vu32, Vv32)") +DEF_ENC(V6_vrmpyubv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 000 ddddd")= // +DEF_ENC(V6_vrmpybv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 001 ddddd") = // +DEF_ENC(V6_vrmpybusv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 010 ddddd"= ) // +DEF_ENC(V6_vdmpyhvsat, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 011 ddddd= ") // +DEF_ENC(V6_vmpybv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vmpyubv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 101 ddddd") = // +DEF_ENC(V6_vmpybusv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 110 ddddd")= // +DEF_ENC(V6_vmpyhv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 111 ddddd= ") // + +DEF_ENC(V6_vmpyuhv, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 000 ddddd") = // +DEF_ENC(V6_vmpyhvsrs, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vmpyhus, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 010 ddddd") = 
// +DEF_ENC(V6_vmpabusv, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vmpyih, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vand, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 101 ddddd")= // +DEF_ENC(V6_vor, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 110 ddddd") = // +DEF_ENC(V6_vxor, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 111 ddddd")= // + +DEF_ENC(V6_vaddw, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vaddubsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vadduhsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 010 ddddd"= ) // +DEF_ENC(V6_vaddhsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vaddwsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 100 ddddd")= // +DEF_ENC(V6_vsubb, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 101 ddddd"= ) // +DEF_ENC(V6_vsubh, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 110 ddddd"= ) // +DEF_ENC(V6_vsubw, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 111 ddddd"= ) // + +DEF_ENC(V6_vsububsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vsubuhsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vsubhsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 010 ddddd")= // +DEF_ENC(V6_vsubwsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vaddb_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 100 ddddd")= // +DEF_ENC(V6_vaddh_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 101 ddddd")= // +DEF_ENC(V6_vaddw_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 110 ddddd")= // +DEF_ENC(V6_vaddubsat_dv,ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 111 ddddd") = // + +DEF_ENC(V6_vadduhsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 000 ddddd") = // +DEF_ENC(V6_vaddhsat_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 001 ddddd") = // +DEF_ENC(V6_vaddwsat_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vsubb_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vsubh_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 100 ddddd")= // 
+DEF_ENC(V6_vsubw_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 101 ddddd")= // +DEF_ENC(V6_vsububsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 110 ddddd") = // +DEF_ENC(V6_vsubuhsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 111 ddddd") = // + +DEF_ENC(V6_vsubhsat_dv, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 000 ddddd= ") // +DEF_ENC(V6_vsubwsat_dv, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 001 ddddd") = // +DEF_ENC(V6_vaddubh, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vadduhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 011 ddddd") = // +DEF_ENC(V6_vaddhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vsububh, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 101 ddddd") = // +DEF_ENC(V6_vsubuhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 110 ddddd= ") // +DEF_ENC(V6_vsubhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 111 ddddd"= ) // + +DEF_ENC(V6_vabsdiffub, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vabsdiffh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vabsdiffuh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 010 ddddd= ") // +DEF_ENC(V6_vabsdiffw, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 011 ddddd"= ) // +DEF_ENC(V6_vavgub, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vavguh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 101 ddddd= ") // +DEF_ENC(V6_vavgh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 110 ddddd")= // +DEF_ENC(V6_vavgw, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 111 ddddd")= // + +DEF_ENC(V6_vnavgub, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 000 ddddd= ") // +DEF_ENC(V6_vnavgh, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 001 ddddd= ") // +DEF_ENC(V6_vnavgw, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 010 ddddd= ") // +DEF_ENC(V6_vavgubrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 011 ddddd"= ) // +DEF_ENC(V6_vavguhrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 100 ddddd"= ) // +DEF_ENC(V6_vavghrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 101 ddddd")= // +DEF_ENC(V6_vavgwrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 110 ddddd") = 
// +DEF_ENC(V6_vmpabuuv, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 111 ddddd") = // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 --- ----- PP 1 ----- ----- --= -","[#4] Vx32=3D(Vu32, Vv32)") +DEF_ENC(V6_vrmpyubv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 000 xx= xxx") // +DEF_ENC(V6_vrmpybv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 001 xx= xxx") // +DEF_ENC(V6_vrmpybusv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 010 xxx= xx") // +DEF_ENC(V6_vdmpyhvsat_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 011 xx= xxx") // +DEF_ENC(V6_vmpybv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 100 x= xxxx") // +DEF_ENC(V6_vmpyubv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 101 xxxx= x") // +DEF_ENC(V6_vmpybusv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 110 xxxx= x") // +DEF_ENC(V6_vmpyhv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 111 xx= xxx") // + +DEF_ENC(V6_vmpyuhv_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 000 x= xxxx") // +DEF_ENC(V6_vmpyhus_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 001 xxxx= x") // +DEF_ENC(V6_vaddhw_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 010 xx= xxx") // +DEF_ENC(V6_vmpyowh_64_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 011 xx= xxx") +DEF_ENC(V6_vmpyih_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 100 x= xxxx") // +DEF_ENC(V6_vmpyiewuh_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 101 xxx= xx") // +DEF_ENC(V6_vmpyowh_sacc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 110 xxxx= x") // +DEF_ENC(V6_vmpyowh_rnd_sacc,ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 111 xxxx= x") // + +DEF_ENC(V6_vmpyiewh_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 000 xx= xxx") // + +DEF_ENC(V6_vadduhw_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 100= xxxxx") // +DEF_ENC(V6_vaddubh_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 101= xxxxx") // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 100 ----- PP 1 ----- ----- ---","= [#4] Qx4=3D(Vu32, Vv32)") +// Grouped by element size (lsbs), operation (next-lsbs) and operation (ne= xt-lsbs) +DEF_ENC(V6_veqb_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 000xx")= // 
+DEF_ENC(V6_veqh_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 001xx")= // +DEF_ENC(V6_veqw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 010xx")= // + +DEF_ENC(V6_vgtb_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 100xx") = // +DEF_ENC(V6_vgth_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 101xx") = // +DEF_ENC(V6_vgtw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 110xx") = // + +DEF_ENC(V6_vgtub_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 000xx")= // +DEF_ENC(V6_vgtuh_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 001xx")= // +DEF_ENC(V6_vgtuw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 010xx")= // + +DEF_ENC(V6_veqb_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 000xx") = // +DEF_ENC(V6_veqh_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 001xx") = // +DEF_ENC(V6_veqw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 010xx") = // + +DEF_ENC(V6_vgtb_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 100xx= ") // +DEF_ENC(V6_vgth_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 101xx= ") // +DEF_ENC(V6_vgtw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 110xx= ") // + +DEF_ENC(V6_vgtub_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 000xx") = // +DEF_ENC(V6_vgtuh_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 001xx") = // +DEF_ENC(V6_vgtuw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 010xx") = // + +DEF_ENC(V6_veqb_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 000xx")= // +DEF_ENC(V6_veqh_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 001xx")= // +DEF_ENC(V6_veqw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 010xx")= // + +DEF_ENC(V6_vgtb_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 100xx") = // +DEF_ENC(V6_vgth_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 101xx") = // +DEF_ENC(V6_vgtw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 110xx") = // + +DEF_ENC(V6_vgtub_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 000xx")= // +DEF_ENC(V6_vgtuh_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 001xx")= // +DEF_ENC(V6_vgtuw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 
010xx")= // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 101 ----- PP 1 ----- ----- ---","= [#4] Qx4,Vd32=3D(Vu32, Vv32)") +DEF_ENC(V6_vaddcarry, ICLASS_CJ" 1 100 101 vvvvv PP 1 uuuuu 0xx ddddd")= // +DEF_ENC(V6_vsubcarry, ICLASS_CJ" 1 100 101 vvvvv PP 1 uuuuu 1xx ddddd")= // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 11- ----- PP 1 ----- ----- --= -","[#4] Vx32|=3D(Vu32, Vv32,#)") +DEF_ENC(V6_vlutvvb_oracci, ICLASS_CJ" 1 100 110 vvvvv PP 1 uuuuu iii xx= xxx") // +DEF_ENC(V6_vlutvwh_oracci, ICLASS_CJ" 1 100 111 vvvvv PP 1 uuuuu iii xx= xxx") // + + + +/*************************************************************** +* +* Group #5, Reserved/Deprecated. Uses Q6 Rx. Stupid FFT. +* +****************************************************************/ + + + + +/*************************************************************** +* +* Group #6, No Q6 regs +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 0 ----- ----- ---","= [#6] Vd32=3DVu32") +DEF_ENC(V6_vabsh, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vabsh_sat, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vabsw, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 010 ddddd"= ) // +DEF_ENC(V6_vabsw_sat, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 011 ddddd"= ) // +DEF_ENC(V6_vnot, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 100 ddddd")= // +DEF_ENC(V6_vdealh, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 110 ddddd= ") // +DEF_ENC(V6_vdealb, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 111 ddddd= ") // + +DEF_ENC(V6_vunpackub, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vunpackuh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vunpackb, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 010 ddddd")= // +DEF_ENC(V6_vunpackh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vabsb, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 100 ddddd"= ) // +DEF_ENC(V6_vabsb_sat, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 101 ddddd"= ) 
// +DEF_ENC(V6_vshuffh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 111 ddddd") = // + +DEF_ENC(V6_vshuffb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 000 ddddd") = // +DEF_ENC(V6_vzb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 001 ddddd") = // +DEF_ENC(V6_vzh, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vsb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 011 ddddd") = // +DEF_ENC(V6_vsh, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 100 ddddd") = // +DEF_ENC(V6_vcl0w, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 101 ddddd"= ) // +DEF_ENC(V6_vpopcounth, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 110 ddddd= ") // +DEF_ENC(V6_vcl0h, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 111 ddddd"= ) // + + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ---11 PP 0 ----- ----- ---","= [#6] Qd4=3DQt4, Qs4") +DEF_ENC(V6_pred_and, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 000dd")= // +DEF_ENC(V6_pred_or, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 001dd") = // +DEF_ENC(V6_pred_not, ICLASS_CJ" 1 110 --0 ---11 PP 0 ---ss 000 010dd")= // +DEF_ENC(V6_pred_xor, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 011dd")= // +DEF_ENC(V6_pred_or_n, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 100dd"= ) // +DEF_ENC(V6_pred_and_n, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 101dd= ") // +DEF_ENC(V6_shuffeqh, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 110dd")= // +DEF_ENC(V6_shuffeqw, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 111dd")= // + +DEF_ENC(V6_vnormamtw, ICLASS_CJ" 1 110 --0 ---11 PP 0 uuuuu 100 ddd= dd") // +DEF_ENC(V6_vnormamth, ICLASS_CJ" 1 110 --0 ---11 PP 0 uuuuu 101 ddd= dd") // + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --1 ----- PP 0 ----- ----- --= -","[#6] Vd32=3DVu32,Vv32") +DEF_ENC(V6_vlutvvbi, ICLASS_CJ" 1 110 001 vvvvv PP 0 uuuuu iii dddd= d") +DEF_ENC(V6_vlutvwhi, ICLASS_CJ" 1 110 011 vvvvv PP 0 uuuuu iii dddd= d") + +DEF_ENC(V6_vaddbsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 000 d= dddd") +DEF_ENC(V6_vsubbsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 001 d= dddd") +DEF_ENC(V6_vadduwsat_dv, ICLASS_CJ" 1 110 101 vvvvv 
PP 0 uuuuu 010 dddd= d") +DEF_ENC(V6_vsubuwsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 011 dddd= d") +DEF_ENC(V6_vaddububb_sat, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 100 ddd= dd") +DEF_ENC(V6_vsubububb_sat, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 101 ddd= dd") +DEF_ENC(V6_vmpyewuh_64, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 110 d= dddd") + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 1 ----- ----- --= -","Vx32=3DVu32") +DEF_ENC(V6_vunpackob, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 000 xx= xxx") // +DEF_ENC(V6_vunpackoh, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 001 xx= xxx") // +//DEF_ENC(V6_vunpackow, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 010 xxxx= x") // + +DEF_ENC(V6_vhist, ICLASS_CJ" 1 110 --0 ---00 PP 1 -000- 100 ---= --") +DEF_ENC(V6_vwhist256, ICLASS_CJ" 1 110 --0 ---00 PP 1 -0010 100 ---= --") +DEF_ENC(V6_vwhist256_sat, ICLASS_CJ" 1 110 --0 ---00 PP 1 -0011 100 ---= --") +DEF_ENC(V6_vwhist128, ICLASS_CJ" 1 110 --0 ---00 PP 1 -010- 100 ---= --") +DEF_ENC(V6_vwhist128m, ICLASS_CJ" 1 110 --0 ---00 PP 1 -011i 100 --= ---") + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 1 ----- ----- --= -","if (Qv4) Vx32=3DVu32") +DEF_ENC(V6_vaddbq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 000 x= xxxx") // +DEF_ENC(V6_vaddhq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 001 x= xxxx") // +DEF_ENC(V6_vaddwq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 010 x= xxxx") // +DEF_ENC(V6_vaddbnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 011 xxxx= x") // +DEF_ENC(V6_vaddhnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 100 xxxx= x") // +DEF_ENC(V6_vaddwnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 101 xxxx= x") // +DEF_ENC(V6_vsubbq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 110 x= xxxx") // +DEF_ENC(V6_vsubhq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 111 x= xxxx") // + +DEF_ENC(V6_vsubwq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 000 x= xxxx") // +DEF_ENC(V6_vsubbnq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 001 xxxx= x") // +DEF_ENC(V6_vsubhnq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 010 xxxx= x") // +DEF_ENC(V6_vsubwnq, 
ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 011 xxxx= x") // + +DEF_ENC(V6_vhistq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --00- 100 --= ---") +DEF_ENC(V6_vwhist256q, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --010 100 --= ---") +DEF_ENC(V6_vwhist256q_sat, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --011 100 --= ---") +DEF_ENC(V6_vwhist128q, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --10- 100 --= ---") +DEF_ENC(V6_vwhist128qm, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --11i 100 -= ----") + + +DEF_ENC(V6_vandvqv, ICLASS_CJ" 1 110 vv0 ---11 PP 1 uuuuu 000 d= dddd") +DEF_ENC(V6_vandvnqv, ICLASS_CJ" 1 110 vv0 ---11 PP 1 uuuuu 001 dddd= d") + + +DEF_ENC(V6_vprefixqb, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --000 010 dddd= d") // +DEF_ENC(V6_vprefixqh, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --001 010 dddd= d") // +DEF_ENC(V6_vprefixqw, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --010 010 dddd= d") // + + + + +DEF_ENC(V6_vassign, ICLASS_CJ" 1 110 --0 ---11 PP 1 uuuuu 111 d= dddd") + +DEF_ENC(V6_valignbi, ICLASS_CJ" 1 110 001 vvvvv PP 1 uuuuu iii ddd= dd") +DEF_ENC(V6_vlalignbi, ICLASS_CJ" 1 110 011 vvvvv PP 1 uuuuu iii dd= ddd") +DEF_ENC(V6_vswap, ICLASS_CJ" 1 110 101 vvvvv PP 1 uuuuu -tt dd= ddd") // +DEF_ENC(V6_vmux, ICLASS_CJ" 1 110 111 vvvvv PP 1 uuuuu -tt ddd= dd") // + + + +/*************************************************************** +* +* Group #7, No Q6 regs +* +****************************************************************/ + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 111 --- ----- PP 0 ----- ----- ---","= [#7] Vd32=3D(Vu32, Vv32)") +DEF_ENC(V6_vaddbsat, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 000 ddddd") = // +DEF_ENC(V6_vminub, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 001 ddddd= ") // +DEF_ENC(V6_vminuh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 010 ddddd= ") // +DEF_ENC(V6_vminh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 011 ddddd"= ) // +DEF_ENC(V6_vminw, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 100 ddddd"= ) // +DEF_ENC(V6_vmaxub, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 101 ddddd= ") // +DEF_ENC(V6_vmaxuh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 110 ddddd= ") 
// +DEF_ENC(V6_vmaxh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 111 ddddd"= ) // + + +DEF_ENC(V6_vaddclbh, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 000 ddddd") = // +DEF_ENC(V6_vaddclbw, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 001 ddddd") = // + +DEF_ENC(V6_vavguw, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 010 ddddd"= ) // +DEF_ENC(V6_vavguwrnd, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 011 ddddd")= // +DEF_ENC(V6_vavgb, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 100 ddddd")= // +DEF_ENC(V6_vavgbrnd, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 101 ddddd") = // +DEF_ENC(V6_vnavgb, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 110 ddddd"= ) // + + +DEF_ENC(V6_vmaxw, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 000 ddddd"= ) // +DEF_ENC(V6_vdelta, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 001 ddddd= ") // +DEF_ENC(V6_vsubbsat, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vrdelta, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 011 ddddd") = // +DEF_ENC(V6_vminb, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 100 ddddd"= ) // +DEF_ENC(V6_vmaxb, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 101 ddddd"= ) // +DEF_ENC(V6_vsatuwuh, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 110 ddddd") = // +DEF_ENC(V6_vdealb4w, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 111 ddddd")= // + + +DEF_ENC(V6_vmpyowh_rnd, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 000 dddd= d") // +DEF_ENC(V6_vshuffeb, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 001 ddddd"= ) // +DEF_ENC(V6_vshuffob, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 010 ddddd"= ) // +DEF_ENC(V6_vshufeh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 011 ddddd")= // +DEF_ENC(V6_vshufoh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 100 ddddd")= // +DEF_ENC(V6_vshufoeh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 101 ddddd"= ) // +DEF_ENC(V6_vshufoeb, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 110 ddddd"= ) // +DEF_ENC(V6_vcombine, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 111 ddddd")= // + +DEF_ENC(V6_vmpyieoh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 000 ddddd")= // +DEF_ENC(V6_vadduwsat, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 001 ddddd"= ) // 
+DEF_ENC(V6_vsathub, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vsatwh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 011 ddddd= ") // +DEF_ENC(V6_vroundwh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 100 ddddd") +DEF_ENC(V6_vroundwuh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 101 ddddd") +DEF_ENC(V6_vroundhb, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 110 ddddd") +DEF_ENC(V6_vroundhub, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 111 ddddd") + +DEF_FIELDROW_DESC32( ICLASS_CJ" 1 111 100 ----- PP - ----- ----- ---","= [#7] Qd4=3D(Vu32, Vv32)") +DEF_ENC(V6_veqb, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 000dd")= // +DEF_ENC(V6_veqh, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 001dd")= // +DEF_ENC(V6_veqw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 010dd")= // + +DEF_ENC(V6_vgtb, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 100dd")= // +DEF_ENC(V6_vgth, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 101dd")= // +DEF_ENC(V6_vgtw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 110dd")= // + +DEF_ENC(V6_vgtub, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 000dd"= ) // +DEF_ENC(V6_vgtuh, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 001dd"= ) // +DEF_ENC(V6_vgtuw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 010dd"= ) // + + +DEF_ENC(V6_vasrwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 000 ddddd= ") // +DEF_ENC(V6_vlsrwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 001 ddddd= ") // +DEF_ENC(V6_vlsrhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 010 ddddd= ") // +DEF_ENC(V6_vasrhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 011 ddddd= ") // +DEF_ENC(V6_vaslwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vaslhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 101 ddddd= ") // +DEF_ENC(V6_vaddb, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 110 ddddd"= ) // +DEF_ENC(V6_vaddh, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 111 ddddd"= ) // + + +DEF_ENC(V6_vmpyiewuh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 000 ddddd") +DEF_ENC(V6_vmpyiowh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 001 ddddd") +DEF_ENC(V6_vpackeb, ICLASS_CJ" 1 111 110 vvvvv 
PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vpackeh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 011 ddddd") = // +DEF_ENC(V6_vsubuwsat, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 100 ddddd"= ) // +DEF_ENC(V6_vpackhub_sat,ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 101 ddddd") = // +DEF_ENC(V6_vpackhb_sat, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 110 ddddd") = // +DEF_ENC(V6_vpackwuh_sat,ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 111 ddddd") = // + +DEF_ENC(V6_vpackwh_sat, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 000 ddddd") = // +DEF_ENC(V6_vpackob, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 001 ddddd") = // +DEF_ENC(V6_vpackoh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 010 ddddd") = // +DEF_ENC(V6_vrounduhub, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 011 ddddd= ") // +DEF_ENC(V6_vrounduwuh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 100 ddddd= ") // +DEF_ENC(V6_vmpyewuh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 101 ddddd") +DEF_ENC(V6_vmpyowh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 111 ddddd= ") + + +#endif /* NO MMVEC */ --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035926194716.9274420908359; Tue, 12 Oct 2021 03:52:06 -0700 (PDT) Received: from localhost ([::1]:39810 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFO9-0005cv-1j for importer@patchew.org; Tue, 12 Oct 2021 06:52:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50572) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElZ-00015Z-EM for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:13 -0400 Received: from 
alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64100) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElW-0007Fq-6b for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:13 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 7C2E617A8; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033530; x=1665569530; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QrgjOSAQ4BdDEt4ME1dUQIpj3DiCdnF0aVXkLPPiGpU=; b=Q8cMnx1mIyDB6QEZ1ucIoHTf1ZTROrGbo2KHm1AAdM6AQeUmvXVK5Bx2 A5vjcQ6rUTQT4KW+5rjQCGaJmq1du9fm8ABFVE4EFG0EcmhNB+x0dlsiG IXj81Tt4mWGWhus3HSwFJ/BpreUUmSKp+rnr7IWIEAp7XHftzi4gHHabQ E=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 27/30] Hexagon HVX (tests/tcg/hexagon) vector_add_int test Date: Tue, 12 Oct 2021 05:11:05 -0500 Message-Id: <1634033468-23566-28-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: 
-40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035927669100001 Signed-off-by: Taylor Simpson Reviewed-by: Richard Henderson --- tests/tcg/hexagon/vector_add_int.c | 61 ++++++++++++++++++++++++++++++++++= ++++ tests/tcg/hexagon/Makefile.target | 3 ++ 2 files changed, 64 insertions(+) create mode 100644 tests/tcg/hexagon/vector_add_int.c diff --git a/tests/tcg/hexagon/vector_add_int.c b/tests/tcg/hexagon/vector_= add_int.c new file mode 100644 index 0000000..d6010ea --- /dev/null +++ b/tests/tcg/hexagon/vector_add_int.c @@ -0,0 +1,61 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see .
+ */ + +#include <stdio.h> + +int gA[401]; +int gB[401]; +int gC[401]; + +void vector_add_int() +{ + int i; + for (i =3D 0; i < 400; i++) { + gA[i] =3D gB[i] + gC[i]; + } +} + +int main() +{ + int error =3D 0; + int i; + for (i =3D 0; i < 400; i++) { + gB[i] =3D i * 2; + gC[i] =3D i * 3; + } + gA[400] =3D 17; + vector_add_int(); + for (i =3D 0; i < 400; i++) { + if (gA[i] !=3D i * 5) { + error++; + printf("ERROR: gB[%d] =3D %d\t", i, gB[i]); + printf("gC[%d] =3D %d\t", i, gC[i]); + printf("gA[%d] =3D %d\n", i, gA[i]); + } + } + if (gA[400] !=3D 17) { + error++; + printf("ERROR: Overran the buffer\n"); + } + if (!error) { + printf("PASS\n"); + return 0; + } else { + printf("FAIL\n"); + return 1; + } +} diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile= .target index c1e1650..b010edc 100644 --- a/tests/tcg/hexagon/Makefile.target +++ b/tests/tcg/hexagon/Makefile.target @@ -38,7 +38,10 @@ HEX_TESTS +=3D circ HEX_TESTS +=3D brev HEX_TESTS +=3D load_unpack HEX_TESTS +=3D load_align +HEX_TESTS +=3D vector_add_int HEX_TESTS +=3D atomics HEX_TESTS +=3D fpstuff =20 TESTS +=3D $(HEX_TESTS) + +vector_add_int: CFLAGS +=3D -mhvx -fvectorize --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634035768265145.62994339752493; Tue, 12 Oct 2021 03:49:28 -0700 (PDT) Received: from localhost ([::1]:35284 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFLb-0002aL-4v for importer@patchew.org; Tue, 12 Oct 2021 06:49:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50604) by lists.gnu.org with esmtps
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maElh-0001F2-OC for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:23 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64080) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maElf-0006xP-6b for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:21 -0400 Received: from unknown (HELO ironmsg03-sd.qualcomm.com) ([10.53.140.143]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg03-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 7EE0B17AA; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033539; x=1665569539; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kgdSjWxtIrvpCXEDbQ/EOXv3gHOQENMV7kkrAw7pNmc=; b=FFEIkJslSCbDnpoh3yTn1OidEvlagZ4Lg6IIaGW+72EQql0LXuTAfiRD qPhMnf2onVfHdp5OT0cw1Sa54MgvoQs44X91sa+WEGrYAslgkGQq3Mulw D8fKtNlO/ut2KL4uo1K44X8ufasDvgfXGrqAVRG/rVNyG1VSZn9Z66CxI s=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 28/30] Hexagon HVX (tests/tcg/hexagon) hvx_misc test Date: Tue, 12 Oct 2021 05:11:06 -0500 Message-Id: <1634033468-23566-29-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634035768561100001 Tests for packet semantics vector loads (aligned and unaligned) vector stores (aligned and unaligned) vector masked stores vector new value store maximum HVX temps in a packet vector operations Signed-off-by: Taylor Simpson Acked-by: Richard Henderson --- tests/tcg/hexagon/hvx_misc.c | 469 ++++++++++++++++++++++++++++++++++= ++++ tests/tcg/hexagon/Makefile.target | 2 + 2 files changed, 471 insertions(+) create mode 100644 tests/tcg/hexagon/hvx_misc.c diff --git a/tests/tcg/hexagon/hvx_misc.c b/tests/tcg/hexagon/hvx_misc.c new file mode 100644 index 0000000..312bb98 --- /dev/null +++ b/tests/tcg/hexagon/hvx_misc.c @@ -0,0 +1,469 @@ +/* + * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include <stdio.h> +#include <stdint.h> +#include <stdbool.h> +#include <string.h> + +int err; + +static void __check(int line, int i, int j, uint64_t result, uint64_t expe= ct) +{ + if (result !=3D expect) { + printf("ERROR at line %d: [%d][%d] 0x%016llx !=3D 0x%016llx\n", + line, i, j, result, expect); + err++; + } +} + +#define check(RES, EXP) __check(__LINE__, RES, EXP) + +#define MAX_VEC_SIZE_BYTES 128 + +typedef union { + uint64_t ud[MAX_VEC_SIZE_BYTES / 8]; + int64_t d[MAX_VEC_SIZE_BYTES / 8]; + uint32_t uw[MAX_VEC_SIZE_BYTES / 4]; + int32_t w[MAX_VEC_SIZE_BYTES / 4]; + uint16_t uh[MAX_VEC_SIZE_BYTES / 2]; + int16_t h[MAX_VEC_SIZE_BYTES / 2]; + uint8_t ub[MAX_VEC_SIZE_BYTES / 1]; + int8_t b[MAX_VEC_SIZE_BYTES / 1]; +} MMVector; + +#define BUFSIZE 16 +#define OUTSIZE 16 +#define MASKMOD 3 + +MMVector buffer0[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES))); +MMVector buffer1[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES))); +MMVector mask[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES))); +MMVector output[OUTSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES))); +MMVector expect[OUTSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES))); + +#define CHECK_OUTPUT_FUNC(FIELD, FIELDSZ) \ +static void check_output_##FIELD(int line, size_t num_vectors) \ +{ \ + for (int i =3D 0; i < num_vectors; i++) { \ + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \ + __check(line, i, j, output[i].FIELD[j], expect[i].FIELD[j]); \ + } \ + } \ +} + +CHECK_OUTPUT_FUNC(d, 8) +CHECK_OUTPUT_FUNC(w, 4) +CHECK_OUTPUT_FUNC(h, 2) +CHECK_OUTPUT_FUNC(b, 1) + +static void init_buffers(void) +{ + int counter0 =3D 0; + int counter1 =3D 17; + for
(int i =3D 0; i < BUFSIZE; i++) { + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES; j++) { + buffer0[i].b[j] =3D counter0++; + buffer1[i].b[j] =3D counter1++; + } + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / 4; j++) { + mask[i].w[j] =3D (i + j % MASKMOD =3D=3D 0) ? 0 : 1; + } + } +} + +static void test_load_tmp(void) +{ + void *p0 =3D buffer0; + void *p1 =3D buffer1; + void *pout =3D output; + + for (int i =3D 0; i < BUFSIZE; i++) { + /* + * Load into v12 as .tmp, then use it in the next packet + * Should get the new value within the same packet and + * the old value in the next packet + */ + asm("v3 =3D vmem(%0 + #0)\n\t" + "r1 =3D #1\n\t" + "v12 =3D vsplat(r1)\n\t" + "{\n\t" + " v12.tmp =3D vmem(%1 + #0)\n\t" + " v4.w =3D vadd(v12.w, v3.w)\n\t" + "}\n\t" + "v4.w =3D vadd(v4.w, v12.w)\n\t" + "vmem(%2 + #0) =3D v4\n\t" + : : "r"(p0), "r"(p1), "r"(pout) + : "r1", "v12", "v3", "v4", "v6", "memory"); + p0 +=3D sizeof(MMVector); + p1 +=3D sizeof(MMVector); + pout +=3D sizeof(MMVector); + + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / 4; j++) { + expect[i].w[j] =3D buffer0[i].w[j] + buffer1[i].w[j] + 1; + } + } + + check_output_w(__LINE__, BUFSIZE); +} + +static void test_load_cur(void) +{ + void *p0 =3D buffer0; + void *pout =3D output; + + for (int i =3D 0; i < BUFSIZE; i++) { + asm("{\n\t" + " v2.cur =3D vmem(%0 + #0)\n\t" + " vmem(%1 + #0) =3D v2\n\t" + "}\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + p0 +=3D sizeof(MMVector); + pout +=3D sizeof(MMVector); + + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / 4; j++) { + expect[i].uw[j] =3D buffer0[i].uw[j]; + } + } + + check_output_w(__LINE__, BUFSIZE); +} + +static void test_load_aligned(void) +{ + /* Aligned loads ignore the low bits of the address */ + void *p0 =3D buffer0; + void *pout =3D output; + const size_t offset =3D 13; + + p0 +=3D offset; /* Create an unaligned address */ + asm("v2 =3D vmem(%0 + #0)\n\t" + "vmem(%1 + #0) =3D v2\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + + expect[0] =3D buffer0[0]; + + 
check_output_w(__LINE__, 1); +} + +static void test_load_unaligned(void) +{ + void *p0 =3D buffer0; + void *pout =3D output; + const size_t offset =3D 12; + + p0 +=3D offset; /* Create an unaligned address */ + asm("v2 =3D vmemu(%0 + #0)\n\t" + "vmem(%1 + #0) =3D v2\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + + memcpy(expect, &buffer0[0].ub[offset], sizeof(MMVector)); + + check_output_w(__LINE__, 1); +} + +static void test_store_aligned(void) +{ + /* Aligned stores ignore the low bits of the address */ + void *p0 =3D buffer0; + void *pout =3D output; + const size_t offset =3D 13; + + pout +=3D offset; /* Create an unaligned address */ + asm("v2 =3D vmem(%0 + #0)\n\t" + "vmem(%1 + #0) =3D v2\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + + expect[0] =3D buffer0[0]; + + check_output_w(__LINE__, 1); +} + +static void test_store_unaligned(void) +{ + void *p0 =3D buffer0; + void *pout =3D output; + const size_t offset =3D 12; + + pout +=3D offset; /* Create an unaligned address */ + asm("v2 =3D vmem(%0 + #0)\n\t" + "vmemu(%1 + #0) =3D v2\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + + memcpy(expect, buffer0, 2 * sizeof(MMVector)); + memcpy(&expect[0].ub[offset], buffer0, sizeof(MMVector)); + + check_output_w(__LINE__, 2); +} + +static void test_masked_store(bool invert) +{ + void *p0 =3D buffer0; + void *pmask =3D mask; + void *pout =3D output; + + memset(expect, 0xff, sizeof(expect)); + memset(output, 0xff, sizeof(expect)); + + for (int i =3D 0; i < BUFSIZE; i++) { + if (invert) { + asm("r4 =3D #0\n\t" + "v4 =3D vsplat(r4)\n\t" + "v5 =3D vmem(%0 + #0)\n\t" + "q0 =3D vcmp.eq(v4.w, v5.w)\n\t" + "v5 =3D vmem(%1)\n\t" + "if (!q0) vmem(%2) =3D v5\n\t" /* Inverted tes= t */ + : : "r"(pmask), "r"(p0), "r"(pout) + : "r4", "v4", "v5", "q0", "memory"); + } else { + asm("r4 =3D #0\n\t" + "v4 =3D vsplat(r4)\n\t" + "v5 =3D vmem(%0 + #0)\n\t" + "q0 =3D vcmp.eq(v4.w, v5.w)\n\t" + "v5 =3D vmem(%1)\n\t" + "if (q0) vmem(%2) =3D v5\n\t" /* Non-inverted = test */ + : : 
"r"(pmask), "r"(p0), "r"(pout) + : "r4", "v4", "v5", "q0", "memory"); + } + p0 +=3D sizeof(MMVector); + pmask +=3D sizeof(MMVector); + pout +=3D sizeof(MMVector); + + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / 4; j++) { + if (invert) { + if (i + j % MASKMOD !=3D 0) { + expect[i].w[j] =3D buffer0[i].w[j]; + } + } else { + if (i + j % MASKMOD =3D=3D 0) { + expect[i].w[j] =3D buffer0[i].w[j]; + } + } + } + } + + check_output_w(__LINE__, BUFSIZE); +} + +static void test_new_value_store(void) +{ + void *p0 =3D buffer0; + void *pout =3D output; + + asm("{\n\t" + " v2 =3D vmem(%0 + #0)\n\t" + " vmem(%1 + #0) =3D v2.new\n\t" + "}\n\t" + : : "r"(p0), "r"(pout) : "v2", "memory"); + + expect[0] =3D buffer0[0]; + + check_output_w(__LINE__, 1); +} + +static void test_max_temps() +{ + void *p0 =3D buffer0; + void *pout =3D output; + + asm("v0 =3D vmem(%0 + #0)\n\t" + "v1 =3D vmem(%0 + #1)\n\t" + "v2 =3D vmem(%0 + #2)\n\t" + "v3 =3D vmem(%0 + #3)\n\t" + "v4 =3D vmem(%0 + #4)\n\t" + "{\n\t" + " v1:0.w =3D vadd(v3:2.w, v1:0.w)\n\t" + " v2.b =3D vshuffe(v3.b, v2.b)\n\t" + " v3.w =3D vadd(v1.w, v4.w)\n\t" + " v4.tmp =3D vmem(%0 + #5)\n\t" + "}\n\t" + "vmem(%1 + #0) =3D v0\n\t" + "vmem(%1 + #1) =3D v1\n\t" + "vmem(%1 + #2) =3D v2\n\t" + "vmem(%1 + #3) =3D v3\n\t" + "vmem(%1 + #4) =3D v4\n\t" + : : "r"(p0), "r"(pout) : "memory"); + + /* The first two vectors come from the vadd-pair instruction */ + for (int i =3D 0; i < MAX_VEC_SIZE_BYTES / 4; i++) { + expect[0].w[i] =3D buffer0[0].w[i] + buffer0[2].w[i]; + expect[1].w[i] =3D buffer0[1].w[i] + buffer0[3].w[i]; + } + /* The third vector comes from the vshuffe instruction */ + for (int i =3D 0; i < MAX_VEC_SIZE_BYTES / 2; i++) { + expect[2].uh[i] =3D (buffer0[2].uh[i] & 0xff) | + (buffer0[3].uh[i] & 0xff) << 8; + } + /* The fourth vector comes from the vadd-single instruction */ + for (int i =3D 0; i < MAX_VEC_SIZE_BYTES / 4; i++) { + expect[3].w[i] =3D buffer0[1].w[i] + buffer0[5].w[i]; + } + /* + * The fifth vector comes from the load 
to v4 + * make sure the .tmp is dropped + */ + expect[4] =3D buffer0[4]; + + check_output_b(__LINE__, 5); +} + +#define VEC_OP1(ASM, EL, IN, OUT) \ + asm("v2 =3D vmem(%0 + #0)\n\t" \ + "v2" #EL " =3D " #ASM "(v2" #EL ")\n\t" \ + "vmem(%1 + #0) =3D v2\n\t" \ + : : "r"(IN), "r"(OUT) : "v2", "memory") + +#define VEC_OP2(ASM, EL, IN0, IN1, OUT) \ + asm("v2 =3D vmem(%0 + #0)\n\t" \ + "v3 =3D vmem(%1 + #0)\n\t" \ + "v2" #EL " =3D " #ASM "(v2" #EL ", v3" #EL ")\n\t" \ + "vmem(%2 + #0) =3D v2\n\t" \ + : : "r"(IN0), "r"(IN1), "r"(OUT) : "v2", "v3", "memory") + +#define TEST_VEC_OP1(NAME, ASM, EL, FIELD, FIELDSZ, OP) \ +static void test_##NAME(void) \ +{ \ + void *pin =3D buffer0; \ + void *pout =3D output; \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + VEC_OP1(ASM, EL, pin, pout); \ + pin +=3D sizeof(MMVector); \ + pout +=3D sizeof(MMVector); \ + } \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \ + expect[i].FIELD[j] =3D OP buffer0[i].FIELD[j]; \ + } \ + } \ + check_output_##FIELD(__LINE__, BUFSIZE); \ +} + +#define TEST_VEC_OP2(NAME, ASM, EL, FIELD, FIELDSZ, OP) \ +static void test_##NAME(void) \ +{ \ + void *p0 =3D buffer0; \ + void *p1 =3D buffer1; \ + void *pout =3D output; \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + VEC_OP2(ASM, EL, p0, p1, pout); \ + p0 +=3D sizeof(MMVector); \ + p1 +=3D sizeof(MMVector); \ + pout +=3D sizeof(MMVector); \ + } \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \ + expect[i].FIELD[j] =3D buffer0[i].FIELD[j] OP buffer1[i].FIELD= [j]; \ + } \ + } \ + check_output_##FIELD(__LINE__, BUFSIZE); \ +} + +#define THRESHOLD 31 + +#define PRED_OP2(ASM, IN0, IN1, OUT, INV) \ + asm("r4 =3D #%3\n\t" \ + "v1.b =3D vsplat(r4)\n\t" \ + "v2 =3D vmem(%0 + #0)\n\t" \ + "q0 =3D vcmp.gt(v2.b, v1.b)\n\t" \ + "v3 =3D vmem(%1 + #0)\n\t" \ + "q1 =3D vcmp.gt(v3.b, v1.b)\n\t" \ + "q2 =3D " #ASM "(q0, " INV "q1)\n\t" \ + "r4 =3D #0xff\n\t" \ + "v1.b =3D 
vsplat(r4)\n\t" \ + "if (q2) vmem(%2 + #0) =3D v1\n\t" \ + : : "r"(IN0), "r"(IN1), "r"(OUT), "i"(THRESHOLD) \ + : "r4", "v1", "v2", "v3", "q0", "q1", "q2", "memory") + +#define TEST_PRED_OP2(NAME, ASM, OP, INV) \ +static void test_##NAME(bool invert) \ +{ \ + void *p0 =3D buffer0; \ + void *p1 =3D buffer1; \ + void *pout =3D output; \ + memset(output, 0, sizeof(expect)); \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + PRED_OP2(ASM, p0, p1, pout, INV); \ + p0 +=3D sizeof(MMVector); \ + p1 +=3D sizeof(MMVector); \ + pout +=3D sizeof(MMVector); \ + } \ + for (int i =3D 0; i < BUFSIZE; i++) { \ + for (int j =3D 0; j < MAX_VEC_SIZE_BYTES; j++) { \ + bool p0 =3D (buffer0[i].b[j] > THRESHOLD); \ + bool p1 =3D (buffer1[i].b[j] > THRESHOLD); \ + if (invert) { \ + expect[i].b[j] =3D (p0 OP !p1) ? 0xff : 0x00; \ + } else { \ + expect[i].b[j] =3D (p0 OP p1) ? 0xff : 0x00; \ + } \ + } \ + } \ + check_output_b(__LINE__, BUFSIZE); \ +} + +TEST_VEC_OP2(vadd_w, vadd, .w, w, 4, +) +TEST_VEC_OP2(vadd_h, vadd, .h, h, 2, +) +TEST_VEC_OP2(vadd_b, vadd, .b, b, 1, +) +TEST_VEC_OP2(vsub_w, vsub, .w, w, 4, -) +TEST_VEC_OP2(vsub_h, vsub, .h, h, 2, -) +TEST_VEC_OP2(vsub_b, vsub, .b, b, 1, -) +TEST_VEC_OP2(vxor, vxor, , d, 8, ^) +TEST_VEC_OP2(vand, vand, , d, 8, &) +TEST_VEC_OP2(vor, vor, , d, 8, |) +TEST_VEC_OP1(vnot, vnot, , d, 8, ~) + +TEST_PRED_OP2(pred_or, or, |, "") +TEST_PRED_OP2(pred_or_n, or, |, "!") +TEST_PRED_OP2(pred_and, and, &, "") +TEST_PRED_OP2(pred_and_n, and, &, "!") +TEST_PRED_OP2(pred_xor, xor, ^, "") + +int main() +{ + init_buffers(); + + test_load_tmp(); + test_load_cur(); + test_load_aligned(); + test_load_unaligned(); + test_store_aligned(); + test_store_unaligned(); + test_masked_store(false); + test_masked_store(true); + test_new_value_store(); + test_max_temps(); + + test_vadd_w(); + test_vadd_h(); + test_vadd_b(); + test_vsub_w(); + test_vsub_h(); + test_vsub_b(); + test_vxor(); + test_vand(); + test_vor(); + test_vnot(); + + test_pred_or(false); + 
test_pred_or_n(true); + test_pred_and(false); + test_pred_and_n(true); + test_pred_xor(false); + + puts(err ? "FAIL" : "PASS"); + return err ? 1 : 0; +} diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile= .target index b010edc..62916a5 100644 --- a/tests/tcg/hexagon/Makefile.target +++ b/tests/tcg/hexagon/Makefile.target @@ -41,7 +41,9 @@ HEX_TESTS +=3D load_align HEX_TESTS +=3D vector_add_int HEX_TESTS +=3D atomics HEX_TESTS +=3D fpstuff +HEX_TESTS +=3D hvx_misc =20 TESTS +=3D $(HEX_TESTS) =20 vector_add_int: CFLAGS +=3D -mhvx -fvectorize +hvx_misc: CFLAGS +=3D -mhvx --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634036510738949.7278470566391; Tue, 12 Oct 2021 04:01:50 -0700 (PDT) Received: from localhost ([::1]:56000 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maFXZ-0008Rc-1v for importer@patchew.org; Tue, 12 Oct 2021 07:01:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50648) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEls-0001Qf-8t for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:32 -0400 Received: from alexa-out-sd-02.qualcomm.com ([199.106.114.39]:64084) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEln-0006yI-Jf for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:12:31 -0400 Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by alexa-out-sd-02.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from 
vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg01-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 817CF17B2; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033547; x=1665569547; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1dI3mEPKZbhIbBwzBT+VvFaHRGdWal23mr6A8edM0zY=; b=QadVxsvELTFOHE3IkSIdBN7DQU0r1iCrDbDX5U9n9dGDsWibd1q+Tuw1 661vuPXuk7HVqaGlKKwpZojyYG9UVqu3vEzgLO19cRbachHKeok1oc0na 4eKibgZDmRTBmlZSBjFiU10l85YAaOqgRgLliLbecxRT2WLLAvvmnHVwB 0=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 29/30] Hexagon HVX (tests/tcg/hexagon) scatter_gather test Date: Tue, 12 Oct 2021 05:11:07 -0500 Message-Id: <1634033468-23566-30-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.39; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-02.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634036513366100001 Signed-off-by: Taylor Simpson Acked-by: Richard Henderson --- tests/tcg/hexagon/scatter_gather.c | 1011 ++++++++++++++++++++++++++++++++= ++++ tests/tcg/hexagon/Makefile.target | 2 + 2 files changed, 1013 insertions(+) create mode 100644 tests/tcg/hexagon/scatter_gather.c diff --git a/tests/tcg/hexagon/scatter_gather.c b/tests/tcg/hexagon/scatter= _gather.c new file mode 100644 index 0000000..b93eb18 --- /dev/null +++ b/tests/tcg/hexagon/scatter_gather.c @@ -0,0 +1,1011 @@ +/* + * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Res= erved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + +/* + * This example tests the HVX scatter/gather instructions + * + * See section 5.13 of the V68 HVX Programmer's Reference + * + * There are 3 main classes of operations + * _16 16-bit elements and 16-bit offsets + * _32 32-bit elements and 32-bit offsets + * _16_32 16-bit elements and 32-bit offsets + * + * There are also masked and accumulate versions + */ + +#include +#include +#include +#include + +typedef long HVX_Vector __attribute__((__vector_size__(128))) + __attribute__((aligned(128))); +typedef long HVX_VectorPair __attribute__((__vector_size__(256))) + __attribute__((aligned(128))); +typedef long HVX_VectorPred __attribute__((__vector_size__(128))) + __attribute__((aligned(128))); + +#define VSCATTER_16(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermh_128B((int)BASE, RGN, OFF, VALS) +#define VSCATTER_16_MASKED(MASK, BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermhq_128B(MASK, (int)BASE, RGN, OFF, VALS) +#define VSCATTER_32(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermw_128B((int)BASE, RGN, OFF, VALS) +#define VSCATTER_32_MASKED(MASK, BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermwq_128B(MASK, (int)BASE, RGN, OFF, VALS) +#define VSCATTER_16_32(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermhw_128B((int)BASE, RGN, OFF, VALS) +#define VSCATTER_16_32_MASKED(MASK, BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermhwq_128B(MASK, (int)BASE, RGN, OFF, VALS) +#define VSCATTER_16_ACC(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermh_add_128B((int)BASE, RGN, OFF, VALS) +#define VSCATTER_32_ACC(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermw_add_128B((int)BASE, RGN, OFF, VALS) +#define VSCATTER_16_32_ACC(BASE, RGN, OFF, VALS) \ + __builtin_HEXAGON_V6_vscattermhw_add_128B((int)BASE, RGN, OFF, VALS) + +#define VGATHER_16(DSTADDR, BASE, RGN, OFF) \ + __builtin_HEXAGON_V6_vgathermh_128B(DSTADDR, (int)BASE, RGN, OFF) +#define VGATHER_16_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ +
__builtin_HEXAGON_V6_vgathermhq_128B(DSTADDR, MASK, (int)BASE, RGN, OF= F) +#define VGATHER_32(DSTADDR, BASE, RGN, OFF) \ + __builtin_HEXAGON_V6_vgathermw_128B(DSTADDR, (int)BASE, RGN, OFF) +#define VGATHER_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ + __builtin_HEXAGON_V6_vgathermwq_128B(DSTADDR, MASK, (int)BASE, RGN, OF= F) +#define VGATHER_16_32(DSTADDR, BASE, RGN, OFF) \ + __builtin_HEXAGON_V6_vgathermhw_128B(DSTADDR, (int)BASE, RGN, OFF) +#define VGATHER_16_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \ + __builtin_HEXAGON_V6_vgathermhwq_128B(DSTADDR, MASK, (int)BASE, RGN, O= FF) + +#define VSHUFF_H(V) \ + __builtin_HEXAGON_V6_vshuffh_128B(V) +#define VSPLAT_H(X) \ + __builtin_HEXAGON_V6_lvsplath_128B(X) +#define VAND_VAL(PRED, VAL) \ + __builtin_HEXAGON_V6_vandvrt_128B(PRED, VAL) +#define VDEAL_H(V) \ + __builtin_HEXAGON_V6_vdealh_128B(V) + +int err; + +/* define the number of rows/cols in a square matrix */ +#define MATRIX_SIZE 64 + +/* define the size of the scatter buffer */ +#define SCATTER_BUFFER_SIZE (MATRIX_SIZE * MATRIX_SIZE) + +/* fake vtcm - put buffers together and force alignment */ +static struct { + unsigned short vscatter16[SCATTER_BUFFER_SIZE]; + unsigned short vgather16[MATRIX_SIZE]; + unsigned int vscatter32[SCATTER_BUFFER_SIZE]; + unsigned int vgather32[MATRIX_SIZE]; + unsigned short vscatter16_32[SCATTER_BUFFER_SIZE]; + unsigned short vgather16_32[MATRIX_SIZE]; +} vtcm __attribute__((aligned(0x10000))); + +/* declare the arrays of reference values */ +unsigned short vscatter16_ref[SCATTER_BUFFER_SIZE]; +unsigned short vgather16_ref[MATRIX_SIZE]; +unsigned int vscatter32_ref[SCATTER_BUFFER_SIZE]; +unsigned int vgather32_ref[MATRIX_SIZE]; +unsigned short vscatter16_32_ref[SCATTER_BUFFER_SIZE]; +unsigned short vgather16_32_ref[MATRIX_SIZE]; + +/* declare the arrays of offsets */ +unsigned short half_offsets[MATRIX_SIZE]; +unsigned int word_offsets[MATRIX_SIZE]; + +/* declare the arrays of values */ +unsigned short half_values[MATRIX_SIZE]; 
+unsigned short half_values_acc[MATRIX_SIZE]; +unsigned short half_values_masked[MATRIX_SIZE]; +unsigned int word_values[MATRIX_SIZE]; +unsigned int word_values_acc[MATRIX_SIZE]; +unsigned int word_values_masked[MATRIX_SIZE]; + +/* declare the arrays of predicates */ +unsigned short half_predicates[MATRIX_SIZE]; +unsigned int word_predicates[MATRIX_SIZE]; + +/* make this big enough for all the intrinsics */ +const size_t region_len =3D sizeof(vtcm); + +/* optionally add sync instructions */ +#define SYNC_VECTOR 1 + +static void sync_scatter(void *addr) +{ +#if SYNC_VECTOR + /* + * Do the scatter release followed by a dummy load to complete the + * synchronization. Normally the dummy load would be deferred as + * long as possible to minimize stalls. + */ + asm volatile("vmem(%0 + #0):scatter_release\n" : : "r"(addr)); + /* use volatile to force the load */ + volatile HVX_Vector vDummy =3D *(HVX_Vector *)addr; vDummy =3D vDummy; +#endif +} + +static void sync_gather(void *addr) +{ +#if SYNC_VECTOR + /* use volatile to force the load */ + volatile HVX_Vector vDummy =3D *(HVX_Vector *)addr; vDummy =3D vDummy; +#endif +} + +/* optionally print the results */ +#define PRINT_DATA 0 + +#define FILL_CHAR '.' 
+ +/* fill vtcm scratch with FILL_CHAR */ +void prefill_vtcm_scratch(void) +{ + memset(&vtcm, FILL_CHAR, sizeof(vtcm)); +} + +/* create byte offsets to be a diagonal of the matrix with 16 bit elements= */ +void create_offsets_values_preds_16(void) +{ + unsigned short half_element =3D 0; + unsigned short half_element_masked =3D 0; + char letter =3D 'A'; + char letter_masked =3D '@'; + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + half_offsets[i] =3D i * (2 * MATRIX_SIZE + 2); + + half_element =3D 0; + half_element_masked =3D 0; + for (int j =3D 0; j < 2; j++) { + half_element |=3D letter << j * 8; + half_element_masked |=3D letter_masked << j * 8; + } + + half_values[i] =3D half_element; + half_values_acc[i] =3D ((i % 10) << 8) + (i % 10); + half_values_masked[i] =3D half_element_masked; + + letter++; + /* reset to 'A' */ + if (letter =3D=3D 'M') { + letter =3D 'A'; + } + + half_predicates[i] =3D (i % 3 =3D=3D 0 || i % 5 =3D=3D 0) ? ~0 : 0; + } +} + +/* create byte offsets to be a diagonal of the matrix with 32 bit elements= */ +void create_offsets_values_preds_32(void) +{ + unsigned int word_element =3D 0; + unsigned int word_element_masked =3D 0; + char letter =3D 'A'; + char letter_masked =3D '&'; + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + word_offsets[i] =3D i * (4 * MATRIX_SIZE + 4); + + word_element =3D 0; + word_element_masked =3D 0; + for (int j =3D 0; j < 4; j++) { + word_element |=3D letter << j * 8; + word_element_masked |=3D letter_masked << j * 8; + } + + word_values[i] =3D word_element; + word_values_acc[i] =3D ((i % 10) << 8) + (i % 10); + word_values_masked[i] =3D word_element_masked; + + letter++; + /* reset to 'A' */ + if (letter =3D=3D 'M') { + letter =3D 'A'; + } + + word_predicates[i] =3D (i % 4 =3D=3D 0 || i % 7 =3D=3D 0) ?
~0 : 0; + } +} + +/* + * create byte offsets to be a diagonal of the matrix with 16 bit elements + * and 32 bit offsets + */ +void create_offsets_values_preds_16_32(void) +{ + unsigned short half_element =3D 0; + unsigned short half_element_masked =3D 0; + char letter =3D 'D'; + char letter_masked =3D '$'; + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + word_offsets[i] =3D i * (2 * MATRIX_SIZE + 2); + + half_element =3D 0; + half_element_masked =3D 0; + for (int j =3D 0; j < 2; j++) { + half_element |=3D letter << j * 8; + half_element_masked |=3D letter_masked << j * 8; + } + + half_values[i] =3D half_element; + half_values_acc[i] =3D ((i % 10) << 8) + (i % 10); + half_values_masked[i] =3D half_element_masked; + + letter++; + /* reset to 'D' */ + if (letter =3D=3D 'P') { + letter =3D 'D'; + } + + half_predicates[i] =3D (i % 2 =3D=3D 0 || i % 13 =3D=3D 0) ? ~0 : = 0; + } +} + +/* scatter the 16 bit elements using intrinsics */ +void vector_scatter_16(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsets =3D *(HVX_Vector *)half_offsets; + HVX_Vector values =3D *(HVX_Vector *)half_values; + + VSCATTER_16(&vtcm.vscatter16, region_len, offsets, values); + + sync_scatter(vtcm.vscatter16); +} + +/* scatter-accumulate the 16 bit elements using intrinsics */ +void vector_scatter_16_acc(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsets =3D *(HVX_Vector *)half_offsets; + HVX_Vector values =3D *(HVX_Vector *)half_values_acc; + + VSCATTER_16_ACC(&vtcm.vscatter16, region_len, offsets, values); + + sync_scatter(vtcm.vscatter16); +} + +/* masked scatter the 16 bit elements using intrinsics */ +void vector_scatter_16_masked(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsets =3D *(HVX_Vector *)half_offsets; + HVX_Vector values =3D *(HVX_Vector *)half_values_masked; + HVX_Vector pred_reg =3D *(HVX_Vector *)half_predicates; + HVX_VectorPred preds =3D VAND_VAL(pred_reg, ~0); + + VSCATTER_16_MASKED(preds,
&vtcm.vscatter16, region_len, offsets, value= s); + + sync_scatter(vtcm.vscatter16); +} + +/* scatter the 32 bit elements using intrinsics */ +void vector_scatter_32(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsetslo =3D *(HVX_Vector *)word_offsets; + HVX_Vector offsetshi =3D *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector valueslo =3D *(HVX_Vector *)word_values; + HVX_Vector valueshi =3D *(HVX_Vector *)&word_values[MATRIX_SIZE / 2]; + + VSCATTER_32(&vtcm.vscatter32, region_len, offsetslo, valueslo); + VSCATTER_32(&vtcm.vscatter32, region_len, offsetshi, valueshi); + + sync_scatter(vtcm.vscatter32); +} + +/* scatter-acc the 32 bit elements using intrinsics */ +void vector_scatter_32_acc(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsetslo =3D *(HVX_Vector *)word_offsets; + HVX_Vector offsetshi =3D *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector valueslo =3D *(HVX_Vector *)word_values_acc; + HVX_Vector valueshi =3D *(HVX_Vector *)&word_values_acc[MATRIX_SIZE / = 2]; + + VSCATTER_32_ACC(&vtcm.vscatter32, region_len, offsetslo, valueslo); + VSCATTER_32_ACC(&vtcm.vscatter32, region_len, offsetshi, valueshi); + + sync_scatter(vtcm.vscatter32); +} + +/* scatter the 32 bit elements using intrinsics */ +void vector_scatter_32_masked(void) +{ + /* copy the offsets and values to vectors */ + HVX_Vector offsetslo =3D *(HVX_Vector *)word_offsets; + HVX_Vector offsetshi =3D *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector valueslo =3D *(HVX_Vector *)word_values_masked; + HVX_Vector valueshi =3D *(HVX_Vector *)&word_values_masked[MATRIX_SIZE= / 2]; + HVX_Vector pred_reglo =3D *(HVX_Vector *)word_predicates; + HVX_Vector pred_reghi =3D *(HVX_Vector *)&word_predicates[MATRIX_SIZE = / 2]; + HVX_VectorPred predslo =3D VAND_VAL(pred_reglo, ~0); + HVX_VectorPred predshi =3D VAND_VAL(pred_reghi, ~0); + + VSCATTER_32_MASKED(predslo, &vtcm.vscatter32, region_len, offsetslo, + valueslo); + 
VSCATTER_32_MASKED(predshi, &vtcm.vscatter32, region_len, offsetshi, + valueshi); + + sync_scatter(vtcm.vscatter32); +} + +/* scatter the 16 bit elements with 32 bit offsets using intrinsics */ +void vector_scatter_16_32(void) +{ + HVX_VectorPair offsets; + HVX_Vector values; + + /* get the word offsets in a vector pair */ + offsets =3D *(HVX_VectorPair *)word_offsets; + + /* these values need to be shuffled for the scatter */ + values =3D *(HVX_Vector *)half_values; + values =3D VSHUFF_H(values); + + VSCATTER_16_32(&vtcm.vscatter16_32, region_len, offsets, values); + + sync_scatter(vtcm.vscatter16_32); +} + +/* scatter-acc the 16 bit elements with 32 bit offsets using intrinsics */ +void vector_scatter_16_32_acc(void) +{ + HVX_VectorPair offsets; + HVX_Vector values; + + /* get the word offsets in a vector pair */ + offsets =3D *(HVX_VectorPair *)word_offsets; + + /* these values need to be shuffled for the scatter */ + values =3D *(HVX_Vector *)half_values_acc; + values =3D VSHUFF_H(values); + + VSCATTER_16_32_ACC(&vtcm.vscatter16_32, region_len, offsets, values); + + sync_scatter(vtcm.vscatter16_32); +} + +/* masked scatter the 16 bit elements with 32 bit offsets using intrinsics= */ +void vector_scatter_16_32_masked(void) +{ + HVX_VectorPair offsets; + HVX_Vector values; + HVX_Vector pred_reg; + + /* get the word offsets in a vector pair */ + offsets =3D *(HVX_VectorPair *)word_offsets; + + /* these values need to be shuffled for the scatter */ + values =3D *(HVX_Vector *)half_values_masked; + values =3D VSHUFF_H(values); + + pred_reg =3D *(HVX_Vector *)half_predicates; + pred_reg =3D VSHUFF_H(pred_reg); + HVX_VectorPred preds =3D VAND_VAL(pred_reg, ~0); + + VSCATTER_16_32_MASKED(preds, &vtcm.vscatter16_32, region_len, offsets, + values); + + sync_scatter(vtcm.vscatter16_32); +} + +/* gather the elements from the scatter16 buffer */ +void vector_gather_16(void) +{ + HVX_Vector *vgather =3D (HVX_Vector *)&vtcm.vgather16; + HVX_Vector offsets =3D *(HVX_Vector
*)half_offsets; + + VGATHER_16(vgather, &vtcm.vscatter16, region_len, offsets); + + sync_gather(vgather); +} + +static unsigned short gather_16_masked_init(void) +{ + char letter =3D '?'; + return letter | (letter << 8); +} + +void vector_gather_16_masked(void) +{ + HVX_Vector *vgather =3D (HVX_Vector *)&vtcm.vgather16; + HVX_Vector offsets =3D *(HVX_Vector *)half_offsets; + HVX_Vector pred_reg =3D *(HVX_Vector *)half_predicates; + HVX_VectorPred preds =3D VAND_VAL(pred_reg, ~0); + + *vgather =3D VSPLAT_H(gather_16_masked_init()); + VGATHER_16_MASKED(vgather, preds, &vtcm.vscatter16, region_len, offset= s); + + sync_gather(vgather); +} + +/* gather the elements from the scatter32 buffer */ +void vector_gather_32(void) +{ + HVX_Vector *vgatherlo =3D (HVX_Vector *)&vtcm.vgather32; + HVX_Vector *vgatherhi =3D + (HVX_Vector *)((int)&vtcm.vgather32 + (MATRIX_SIZE * 2)); + HVX_Vector offsetslo =3D *(HVX_Vector *)word_offsets; + HVX_Vector offsetshi =3D *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + + VGATHER_32(vgatherlo, &vtcm.vscatter32, region_len, offsetslo); + VGATHER_32(vgatherhi, &vtcm.vscatter32, region_len, offsetshi); + + sync_gather(vgatherhi); +} + +static unsigned int gather_32_masked_init(void) +{ + char letter =3D '?'; + return letter | (letter << 8) | (letter << 16) | (letter << 24); +} + +void vector_gather_32_masked(void) +{ + HVX_Vector *vgatherlo =3D (HVX_Vector *)&vtcm.vgather32; + HVX_Vector *vgatherhi =3D + (HVX_Vector *)((int)&vtcm.vgather32 + (MATRIX_SIZE * 2)); + HVX_Vector offsetslo =3D *(HVX_Vector *)word_offsets; + HVX_Vector offsetshi =3D *(HVX_Vector *)&word_offsets[MATRIX_SIZE / 2]; + HVX_Vector pred_reglo =3D *(HVX_Vector *)word_predicates; + HVX_VectorPred predslo =3D VAND_VAL(pred_reglo, ~0); + HVX_Vector pred_reghi =3D *(HVX_Vector *)&word_predicates[MATRIX_SIZE = / 2]; + HVX_VectorPred predshi =3D VAND_VAL(pred_reghi, ~0); + + *vgatherlo =3D VSPLAT_H(gather_32_masked_init()); + *vgatherhi =3D VSPLAT_H(gather_32_masked_init()); + 
VGATHER_32_MASKED(vgatherlo, predslo, &vtcm.vscatter32, region_len, + offsetslo); + VGATHER_32_MASKED(vgatherhi, predshi, &vtcm.vscatter32, region_len, + offsetshi); + + sync_gather(vgatherlo); + sync_gather(vgatherhi); +} + +/* gather the elements from the scatter16_32 buffer */ +void vector_gather_16_32(void) +{ + HVX_Vector *vgather; + HVX_VectorPair offsets; + HVX_Vector values; + + /* get the vtcm address to gather from */ + vgather =3D (HVX_Vector *)&vtcm.vgather16_32; + + /* get the word offsets in a vector pair */ + offsets =3D *(HVX_VectorPair *)word_offsets; + + VGATHER_16_32(vgather, &vtcm.vscatter16_32, region_len, offsets); + + /* deal the elements to get the order back */ + values =3D *(HVX_Vector *)vgather; + values =3D VDEAL_H(values); + + /* write it back to vtcm address */ + *(HVX_Vector *)vgather =3D values; +} + +void vector_gather_16_32_masked(void) +{ + HVX_Vector *vgather; + HVX_VectorPair offsets; + HVX_Vector pred_reg; + HVX_VectorPred preds; + HVX_Vector values; + + /* get the vtcm address to gather from */ + vgather =3D (HVX_Vector *)&vtcm.vgather16_32; + + /* get the word offsets in a vector pair */ + offsets =3D *(HVX_VectorPair *)word_offsets; + pred_reg =3D *(HVX_Vector *)half_predicates; + pred_reg =3D VSHUFF_H(pred_reg); + preds =3D VAND_VAL(pred_reg, ~0); + + *vgather =3D VSPLAT_H(gather_16_masked_init()); + VGATHER_16_32_MASKED(vgather, preds, &vtcm.vscatter16_32, region_len, + offsets); + + /* deal the elements to get the order back */ + values =3D *(HVX_Vector *)vgather; + values =3D VDEAL_H(values); + + /* write it back to vtcm address */ + *(HVX_Vector *)vgather =3D values; +} + +static void check_buffer(const char *name, void *c, void *r, size_t size) +{ + char *check =3D (char *)c; + char *ref =3D (char *)r; + for (int i =3D 0; i < size; i++) { + if (check[i] !=3D ref[i]) { + printf("ERROR %s [%d]: 0x%x (%c) !=3D 0x%x (%c)\n", name, i, + check[i], check[i], ref[i], ref[i]); + err++; + } + } +} + +/* + * These scalar 
functions are the C equivalents of the vector functions th= at + * use HVX + */ + +/* scatter the 16 bit elements using C */ +void scalar_scatter_16(unsigned short *vscatter16) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter16[half_offsets[i] / 2] =3D half_values[i]; + } +} + +void check_scatter_16() +{ + memset(vscatter16_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16(vscatter16_ref); + check_buffer(__func__, vtcm.vscatter16, vscatter16_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +/* scatter the 16 bit elements using C */ +void scalar_scatter_16_acc(unsigned short *vscatter16) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter16[half_offsets[i] / 2] +=3D half_values_acc[i]; + } +} + +void check_scatter_16_acc() +{ + memset(vscatter16_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16(vscatter16_ref); + scalar_scatter_16_acc(vscatter16_ref); + check_buffer(__func__, vtcm.vscatter16, vscatter16_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +/* scatter the 16 bit elements using C */ +void scalar_scatter_16_masked(unsigned short *vscatter16) +{ + for (int i =3D 0; i < MATRIX_SIZE; i++) { + if (half_predicates[i]) { + vscatter16[half_offsets[i] / 2] =3D half_values_masked[i]; + } + } + +} + +void check_scatter_16_masked() +{ + memset(vscatter16_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16(vscatter16_ref); + scalar_scatter_16_acc(vscatter16_ref); + scalar_scatter_16_masked(vscatter16_ref); + check_buffer(__func__, vtcm.vscatter16, vscatter16_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +/* scatter the 32 bit elements using C */ +void scalar_scatter_32(unsigned int *vscatter32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter32[word_offsets[i] / 4] =3D word_values[i]; + } +} + +void check_scatter_32() +{ + memset(vscatter32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); + 
scalar_scatter_32(vscatter32_ref); + check_buffer(__func__, vtcm.vscatter32, vscatter32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); +} + +/* scatter-acc the 32 bit elements using C */ +void scalar_scatter_32_acc(unsigned int *vscatter32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter32[word_offsets[i] / 4] +=3D word_values_acc[i]; + } +} + +void check_scatter_32_acc() +{ + memset(vscatter32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); + scalar_scatter_32(vscatter32_ref); + scalar_scatter_32_acc(vscatter32_ref); + check_buffer(__func__, vtcm.vscatter32, vscatter32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); +} + +/* masked scatter the 32 bit elements using C */ +void scalar_scatter_32_masked(unsigned int *vscatter32) +{ + for (int i =3D 0; i < MATRIX_SIZE; i++) { + if (word_predicates[i]) { + vscatter32[word_offsets[i] / 4] =3D word_values_masked[i]; + } + } +} + +void check_scatter_32_masked() +{ + memset(vscatter32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); + scalar_scatter_32(vscatter32_ref); + scalar_scatter_32_acc(vscatter32_ref); + scalar_scatter_32_masked(vscatter32_ref); + check_buffer(__func__, vtcm.vscatter32, vscatter32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned int)); +} + +/* scatter the 16 bit elements with 32 bit offsets using C */ +void scalar_scatter_16_32(unsigned short *vscatter16_32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter16_32[word_offsets[i] / 2] =3D half_values[i]; + } +} + +void check_scatter_16_32() +{ + memset(vscatter16_32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16_32(vscatter16_32_ref); + check_buffer(__func__, vtcm.vscatter16_32, vscatter16_32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +/* scatter-acc the 16 bit elements with 32 bit offsets using C */ +void scalar_scatter_16_32_acc(unsigned short *vscatter16_32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vscatter16_32[word_offsets[i] / 2] +=3D half_values_acc[i]; + } +} + +void
check_scatter_16_32_acc() +{ + memset(vscatter16_32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16_32(vscatter16_32_ref); + scalar_scatter_16_32_acc(vscatter16_32_ref); + check_buffer(__func__, vtcm.vscatter16_32, vscatter16_32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +void scalar_scatter_16_32_masked(unsigned short *vscatter16_32) +{ + for (int i =3D 0; i < MATRIX_SIZE; i++) { + if (half_predicates[i]) { + vscatter16_32[word_offsets[i] / 2] =3D half_values_masked[i]; + } + } +} + +void check_scatter_16_32_masked() +{ + memset(vscatter16_32_ref, FILL_CHAR, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); + scalar_scatter_16_32(vscatter16_32_ref); + scalar_scatter_16_32_acc(vscatter16_32_ref); + scalar_scatter_16_32_masked(vscatter16_32_ref); + check_buffer(__func__, vtcm.vscatter16_32, vscatter16_32_ref, + SCATTER_BUFFER_SIZE * sizeof(unsigned short)); +} + +/* gather the elements from the scatter buffer using C */ +void scalar_gather_16(unsigned short *vgather16) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vgather16[i] =3D vtcm.vscatter16[half_offsets[i] / 2]; + } +} + +void check_gather_16() +{ + memset(vgather16_ref, 0, MATRIX_SIZE * sizeof(unsigned short)); + scalar_gather_16(vgather16_ref); + check_buffer(__func__, vtcm.vgather16, vgather16_ref, + MATRIX_SIZE * sizeof(unsigned short)); +} + +void scalar_gather_16_masked(unsigned short *vgather16) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + if (half_predicates[i]) { + vgather16[i] =3D vtcm.vscatter16[half_offsets[i] / 2]; + } + } +} + +void check_gather_16_masked() +{ + memset(vgather16_ref, gather_16_masked_init(), + MATRIX_SIZE * sizeof(unsigned short)); + scalar_gather_16_masked(vgather16_ref); + check_buffer(__func__, vtcm.vgather16, vgather16_ref, + MATRIX_SIZE * sizeof(unsigned short)); +} + +/* gather the elements from the scatter buffer using C */ +void scalar_gather_32(unsigned int *vgather32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) 
{ + vgather32[i] =3D vtcm.vscatter32[word_offsets[i] / 4]; + } +} + +void check_gather_32(void) +{ + memset(vgather32_ref, 0, MATRIX_SIZE * sizeof(unsigned int)); + scalar_gather_32(vgather32_ref); + check_buffer(__func__, vtcm.vgather32, vgather32_ref, + MATRIX_SIZE * sizeof(unsigned int)); +} + +void scalar_gather_32_masked(unsigned int *vgather32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + if (word_predicates[i]) { + vgather32[i] =3D vtcm.vscatter32[word_offsets[i] / 4]; + } + } +} + + +void check_gather_32_masked(void) +{ + memset(vgather32_ref, gather_32_masked_init(), + MATRIX_SIZE * sizeof(unsigned int)); + scalar_gather_32_masked(vgather32_ref); + check_buffer(__func__, vtcm.vgather32, + vgather32_ref, MATRIX_SIZE * sizeof(unsigned int)); +} + +/* gather the elements from the scatter buffer using C */ +void scalar_gather_16_32(unsigned short *vgather16_32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + vgather16_32[i] =3D vtcm.vscatter16_32[word_offsets[i] / 2]; + } +} + +void check_gather_16_32(void) +{ + memset(vgather16_32_ref, 0, MATRIX_SIZE * sizeof(unsigned short)); + scalar_gather_16_32(vgather16_32_ref); + check_buffer(__func__, vtcm.vgather16_32, vgather16_32_ref, + MATRIX_SIZE * sizeof(unsigned short)); +} + +void scalar_gather_16_32_masked(unsigned short *vgather16_32) +{ + for (int i =3D 0; i < MATRIX_SIZE; ++i) { + if (half_predicates[i]) { + vgather16_32[i] =3D vtcm.vscatter16_32[word_offsets[i] / 2]; + } + } + +} + +void check_gather_16_32_masked(void) +{ + memset(vgather16_32_ref, gather_16_masked_init(), + MATRIX_SIZE * sizeof(unsigned short)); + scalar_gather_16_32_masked(vgather16_32_ref); + check_buffer(__func__, vtcm.vgather16_32, vgather16_32_ref, + MATRIX_SIZE * sizeof(unsigned short)); +} + +/* print scatter16 buffer */ +void print_scatter16_buffer(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 16 bit scatter buffer"); + + for (int i =3D 0; i < SCATTER_BUFFER_SIZE; i++) { + if ((i % MATRIX_SIZE) =3D=3D 0) { + 
printf("\n"); + } + for (int j =3D 0; j < 2; j++) { + printf("%c", (char)((vtcm.vscatter16[i] >> j * 8) & 0xff)); + } + printf(" "); + } + printf("\n"); + } +} + +/* print the gather 16 buffer */ +void print_gather_result_16(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 16 bit gather result\n"); + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + for (int j =3D 0; j < 2; j++) { + printf("%c", (char)((vtcm.vgather16[i] >> j * 8) & 0xff)); + } + printf(" "); + } + printf("\n"); + } +} + +/* print the scatter32 buffer */ +void print_scatter32_buffer(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 32 bit scatter buffer"); + + for (int i =3D 0; i < SCATTER_BUFFER_SIZE; i++) { + if ((i % MATRIX_SIZE) =3D=3D 0) { + printf("\n"); + } + for (int j =3D 0; j < 4; j++) { + printf("%c", (char)((vtcm.vscatter32[i] >> j * 8) & 0xff)); + } + printf(" "); + } + printf("\n"); + } +} + +/* print the gather 32 buffer */ +void print_gather_result_32(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 32 bit gather result\n"); + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + for (int j =3D 0; j < 4; j++) { + printf("%c", (char)((vtcm.vgather32[i] >> j * 8) & 0xff)); + } + printf(" "); + } + printf("\n"); + } +} + +/* print the scatter16_32 buffer */ +void print_scatter16_32_buffer(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 16_32 bit scatter buffer"); + + for (int i =3D 0; i < SCATTER_BUFFER_SIZE; i++) { + if ((i % MATRIX_SIZE) =3D=3D 0) { + printf("\n"); + } + for (int j =3D 0; j < 2; j++) { + printf("%c", + (unsigned char)((vtcm.vscatter16_32[i] >> j * 8) & 0= xff)); + } + printf(" "); + } + printf("\n"); + } +} + +/* print the gather 16_32 buffer */ +void print_gather_result_16_32(void) +{ + if (PRINT_DATA) { + printf("\n\nPrinting the 16_32 bit gather result\n"); + + for (int i =3D 0; i < MATRIX_SIZE; i++) { + for (int j =3D 0; j < 2; j++) { + printf("%c", + (unsigned char)((vtcm.vgather16_32[i] >> j * 8) & 0= xff)); + } + printf(" "); + } + 
printf("\n"); + } +} + +int main() +{ + prefill_vtcm_scratch(); + + /* 16 bit elements with 16 bit offsets */ + create_offsets_values_preds_16(); + + vector_scatter_16(); + print_scatter16_buffer(); + check_scatter_16(); + + vector_gather_16(); + print_gather_result_16(); + check_gather_16(); + + vector_gather_16_masked(); + print_gather_result_16(); + check_gather_16_masked(); + + vector_scatter_16_acc(); + print_scatter16_buffer(); + check_scatter_16_acc(); + + vector_scatter_16_masked(); + print_scatter16_buffer(); + check_scatter_16_masked(); + + /* 32 bit elements with 32 bit offsets */ + create_offsets_values_preds_32(); + + vector_scatter_32(); + print_scatter32_buffer(); + check_scatter_32(); + + vector_gather_32(); + print_gather_result_32(); + check_gather_32(); + + vector_gather_32_masked(); + print_gather_result_32(); + check_gather_32_masked(); + + vector_scatter_32_acc(); + print_scatter32_buffer(); + check_scatter_32_acc(); + + vector_scatter_32_masked(); + print_scatter32_buffer(); + check_scatter_32_masked(); + + /* 16 bit elements with 32 bit offsets */ + create_offsets_values_preds_16_32(); + + vector_scatter_16_32(); + print_scatter16_32_buffer(); + check_scatter_16_32(); + + vector_gather_16_32(); + print_gather_result_16_32(); + check_gather_16_32(); + + vector_gather_16_32_masked(); + print_gather_result_16_32(); + check_gather_16_32_masked(); + + vector_scatter_16_32_acc(); + print_scatter16_32_buffer(); + check_scatter_16_32_acc(); + + vector_scatter_16_32_masked(); + print_scatter16_32_buffer(); + check_scatter_16_32_masked(); + + puts(err ? 
"FAIL" : "PASS"); + return err; +} diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile= .target index 62916a5..c4ccc99 100644 --- a/tests/tcg/hexagon/Makefile.target +++ b/tests/tcg/hexagon/Makefile.target @@ -39,11 +39,13 @@ HEX_TESTS +=3D brev HEX_TESTS +=3D load_unpack HEX_TESTS +=3D load_align HEX_TESTS +=3D vector_add_int +HEX_TESTS +=3D scatter_gather HEX_TESTS +=3D atomics HEX_TESTS +=3D fpstuff HEX_TESTS +=3D hvx_misc =20 TESTS +=3D $(HEX_TESTS) =20 +scatter_gather: CFLAGS +=3D -mhvx vector_add_int: CFLAGS +=3D -mhvx -fvectorize hvx_misc: CFLAGS +=3D -mhvx --=20 2.7.4 From nobody Wed May 15 23:08:21 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail header.i=@quicinc.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=quicinc.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1634034896052528.2861209409235; Tue, 12 Oct 2021 03:34:56 -0700 (PDT) Received: from localhost ([::1]:45324 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1maF7W-0006nX-Jd for importer@patchew.org; Tue, 12 Oct 2021 06:34:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50432) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1maEl4-0000bF-7K for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:43 -0400 Received: from alexa-out-sd-01.qualcomm.com ([199.106.114.38]:12878) by eggs.gnu.org with esmtps (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1maEky-0006y1-5I for qemu-devel@nongnu.org; Tue, 12 Oct 2021 06:11:41 -0400 Received: from unknown (HELO ironmsg05-sd.qualcomm.com) ([10.53.140.145]) by alexa-out-sd-01.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: from 
vu-tsimpson-aus.qualcomm.com (HELO vu-tsimpson1-aus.qualcomm.com) ([10.222.150.1]) by ironmsg05-sd.qualcomm.com with ESMTP; 12 Oct 2021 03:11:24 -0700 Received: by vu-tsimpson1-aus.qualcomm.com (Postfix, from userid 47164) id 8422317B8; Tue, 12 Oct 2021 05:11:22 -0500 (CDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; i=@quicinc.com; q=dns/txt; s=qcdkim; t=1634033496; x=1665569496; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6JvRJ2q1ifVtnyLb52peUvtsSWpJPHREY5pUVLcrtjM=; b=Hi34LWC6TYiR+NUKnWGDOofC6BqLua8BdUzKUkxRrXXrlyslP7LMf5Y2 /17/RJqxB/trWVYvaeenFaFBpB+a+QZULshXFvLHusXb01idwujXt2Ds7 icXROPEB8y6wsTZiHl18uWXsZyFwvrjFAn4Y6Sp+cSDnT6OYGpr0Vd4Gc E=; X-QCInternal: smtphost From: Taylor Simpson To: qemu-devel@nongnu.org Subject: [PATCH v4 30/30] Hexagon HVX (tests/tcg/hexagon) histogram test Date: Tue, 12 Oct 2021 05:11:08 -0500 Message-Id: <1634033468-23566-31-git-send-email-tsimpson@quicinc.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=199.106.114.38; envelope-from=tsimpson@qualcomm.com; helo=alexa-out-sd-01.qualcomm.com X-Spam_score_int: -40 X-Spam_score: -4.1 X-Spam_bar: ---- X-Spam_report: (-4.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1634034896710100001 Signed-off-by: Taylor Simpson Acked-by: Richard Henderson --- tests/tcg/hexagon/hvx_histogram_input.h | 717 ++++++++++++++++++++++++++++= ++++ tests/tcg/hexagon/hvx_histogram_row.h | 24 ++ tests/tcg/hexagon/hvx_histogram.c | 88 ++++ tests/tcg/hexagon/Makefile.target | 5 + tests/tcg/hexagon/hvx_histogram_row.S | 294 +++++++++++++ 5 files changed, 1128 insertions(+) create mode 100644 tests/tcg/hexagon/hvx_histogram_input.h create mode 100644 tests/tcg/hexagon/hvx_histogram_row.h create mode 100644 tests/tcg/hexagon/hvx_histogram.c create mode 100644 tests/tcg/hexagon/hvx_histogram_row.S diff --git a/tests/tcg/hexagon/hvx_histogram_input.h b/tests/tcg/hexagon/hv= x_histogram_input.h new file mode 100644 index 0000000..2f91092 --- /dev/null +++ b/tests/tcg/hexagon/hvx_histogram_input.h @@ -0,0 +1,717 @@ +/* + * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + + { 0x26, 0x32, 0x2e, 0x2e, 0x2d, 0x2c, 0x2d, 0x2d, + 0x2c, 0x2e, 0x31, 0x33, 0x36, 0x39, 0x3b, 0x3f, + 0x42, 0x46, 0x4a, 0x4c, 0x51, 0x53, 0x53, 0x54, + 0x56, 0x57, 0x58, 0x57, 0x56, 0x52, 0x51, 0x4f, + 0x4c, 0x49, 0x47, 0x42, 0x3e, 0x3b, 0x38, 0x35, + 0x33, 0x30, 0x2e, 0x2c, 0x2b, 0x2a, 0x2a, 0x28, + 0x28, 0x27, 0x27, 0x28, 0x29, 0x2a, 0x2c, 0x2e, + 0x2f, 0x33, 0x36, 0x38, 0x3c, 0x3d, 0x40, 0x42, + 0x43, 0x42, 0x43, 0x44, 0x43, 0x41, 0x40, 0x3b, + 0x3b, 0x3a, 0x38, 0x35, 0x32, 0x2f, 0x2c, 0x29, + 0x27, 0x26, 0x23, 0x21, 0x1e, 0x1c, 0x1a, 0x19, + 0x17, 0x15, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10, + 0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d, + 0x0c, 0x0d, 0x0e, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, + 0x0c, 0x0c, 0x0d, 0x0c, 0x0f, 0x0e, 0x0f, 0x0f, + 0x0f, 0x10, 0x11, 0x12, 0x14, 0x16, 0x17, 0x19, + 0x1c, 0x1d, 0x21, 0x25, 0x27, 0x29, 0x2b, 0x2f, + 0x31, 0x33, 0x36, 0x38, 0x39, 0x3a, 0x3b, 0x3c, + 0x3c, 0x3d, 0x3e, 0x3e, 0x3c, 0x3b, 0x3a, 0x39, + 0x39, 0x3a, 0x3a, 0x3a, 0x3a, 0x3c, 0x3e, 0x43, + 0x47, 0x4a, 0x4d, 0x51, 0x51, 0x54, 0x56, 0x56, + 0x57, 0x56, 0x53, 0x4f, 0x4b, 0x47, 0x43, 0x41, + 0x3e, 0x3c, 0x3a, 0x37, 0x36, 0x33, 0x32, 0x34, + 0x34, 0x34, 0x34, 0x35, 0x36, 0x39, 0x3d, 0x3d, + 0x3f, 0x40, 0x40, 0x40, 0x40, 0x3e, 0x40, 0x40, + 0x42, 0x44, 0x47, 0x48, 0x4b, 0x4e, 0x56, 0x5c, + 0x62, 0x68, 0x6f, 0x73, 0x76, 0x79, 0x7a, 0x7c, + 0x7e, 0x7c, 0x78, 0x72, 0x6e, 0x69, 0x65, 0x60, + 0x5b, 0x56, 0x52, 0x4d, 0x4a, 0x48, 0x47, 0x46, + 0x44, 0x43, 0x42, 0x41, 0x41, 0x41, 0x40, 0x40, + 0x3f, 0x3e, 0x3d, 0x3c, 0x3b, 0x3b, 0x38, 0x37, + 0x36, 0x35, 0x36, 0x35, 0x36, 0x37, 0x38, 0x3c, + 0x3d, 0x3f, 0x42, 0x44, 0x46, 0x48, 0x4b, 0x4c, + 0x4e, 0x4e, 0x4d, 0x4c, 0x4a, 0x48, 0x49, 0x49, + 0x4b, 0x4d, 0x4e, }, + { 0x23, 0x2d, 0x29, 0x29, 0x28, 0x28, 0x29, 0x29, + 0x28, 0x2b, 0x2d, 0x2f, 0x32, 0x34, 0x36, 0x3a, + 0x3d, 0x41, 0x44, 0x47, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x51, 0x51, 0x4f, 0x4c, 0x4b, 0x48, + 0x46, 0x44, 0x40, 0x3d, 0x39, 0x36, 0x34, 0x30, + 0x2f, 0x2d, 
0x2a, 0x29, 0x28, 0x27, 0x26, 0x25, + 0x25, 0x24, 0x24, 0x24, 0x26, 0x28, 0x28, 0x2a, + 0x2b, 0x2e, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3c, + 0x3d, 0x3d, 0x3e, 0x3e, 0x3e, 0x3c, 0x3b, 0x38, + 0x37, 0x35, 0x33, 0x30, 0x2e, 0x2b, 0x27, 0x25, + 0x24, 0x21, 0x20, 0x1d, 0x1b, 0x1a, 0x18, 0x16, + 0x15, 0x14, 0x13, 0x12, 0x10, 0x11, 0x10, 0x0e, + 0x0e, 0x0d, 0x0d, 0x0d, 0x0d, 0x0c, 0x0c, 0x0b, + 0x0b, 0x0b, 0x0c, 0x0b, 0x0b, 0x09, 0x0a, 0x0b, + 0x0b, 0x0a, 0x0a, 0x0c, 0x0c, 0x0c, 0x0d, 0x0e, + 0x0e, 0x0f, 0x0f, 0x11, 0x12, 0x15, 0x15, 0x17, + 0x1a, 0x1c, 0x1f, 0x22, 0x25, 0x26, 0x29, 0x2a, + 0x2d, 0x30, 0x33, 0x34, 0x35, 0x35, 0x37, 0x37, + 0x39, 0x3a, 0x39, 0x38, 0x37, 0x36, 0x36, 0x37, + 0x35, 0x36, 0x35, 0x35, 0x36, 0x37, 0x3a, 0x3e, + 0x40, 0x43, 0x48, 0x49, 0x4b, 0x4c, 0x4d, 0x4e, + 0x4f, 0x4f, 0x4c, 0x48, 0x45, 0x41, 0x3e, 0x3b, + 0x3a, 0x37, 0x36, 0x33, 0x32, 0x31, 0x30, 0x31, + 0x32, 0x31, 0x31, 0x31, 0x31, 0x34, 0x37, 0x38, + 0x3a, 0x3b, 0x3b, 0x3b, 0x3c, 0x3b, 0x3d, 0x3e, + 0x3f, 0x40, 0x43, 0x44, 0x47, 0x4b, 0x4f, 0x56, + 0x5a, 0x60, 0x66, 0x69, 0x6a, 0x6e, 0x71, 0x72, + 0x73, 0x72, 0x6d, 0x69, 0x66, 0x60, 0x5c, 0x59, + 0x54, 0x50, 0x4d, 0x48, 0x46, 0x44, 0x44, 0x43, + 0x42, 0x41, 0x41, 0x40, 0x3f, 0x3f, 0x3e, 0x3d, + 0x3d, 0x3d, 0x3c, 0x3a, 0x39, 0x38, 0x35, 0x35, + 0x34, 0x34, 0x35, 0x34, 0x35, 0x36, 0x39, 0x3c, + 0x3d, 0x3e, 0x41, 0x43, 0x44, 0x46, 0x48, 0x49, + 0x4a, 0x49, 0x48, 0x47, 0x45, 0x43, 0x43, 0x44, + 0x45, 0x47, 0x48, }, + { 0x23, 0x2d, 0x2a, 0x2a, 0x29, 0x29, 0x2a, 0x2a, + 0x29, 0x2c, 0x2d, 0x2f, 0x32, 0x34, 0x36, 0x3a, + 0x3d, 0x40, 0x44, 0x48, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x51, 0x51, 0x4f, 0x4c, 0x4b, 0x48, + 0x46, 0x44, 0x40, 0x3d, 0x39, 0x36, 0x34, 0x30, + 0x2f, 0x2d, 0x2a, 0x29, 0x28, 0x27, 0x26, 0x25, + 0x25, 0x24, 0x24, 0x25, 0x26, 0x28, 0x29, 0x2a, + 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c, + 0x3d, 0x3e, 0x3e, 0x3d, 0x3e, 0x3c, 0x3c, 0x3a, + 0x37, 0x35, 0x33, 0x30, 0x2f, 0x2b, 0x28, 0x26, + 0x24, 0x21, 0x20, 0x1e, 0x1c, 0x1b, 
0x18, 0x17, + 0x16, 0x14, 0x13, 0x12, 0x10, 0x10, 0x0f, 0x0e, + 0x0f, 0x0e, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0c, + 0x0b, 0x0b, 0x0c, 0x0c, 0x0c, 0x0b, 0x0b, 0x0c, + 0x0c, 0x0b, 0x0b, 0x0c, 0x0d, 0x0c, 0x0e, 0x0e, + 0x0e, 0x0f, 0x11, 0x11, 0x13, 0x14, 0x16, 0x18, + 0x1a, 0x1d, 0x1f, 0x22, 0x25, 0x26, 0x29, 0x2b, + 0x2d, 0x31, 0x33, 0x34, 0x36, 0x37, 0x38, 0x38, + 0x39, 0x3a, 0x39, 0x38, 0x37, 0x36, 0x37, 0x37, + 0x35, 0x36, 0x35, 0x36, 0x35, 0x38, 0x3a, 0x3e, + 0x40, 0x41, 0x45, 0x47, 0x49, 0x4a, 0x4c, 0x4d, + 0x4e, 0x4d, 0x4a, 0x47, 0x44, 0x40, 0x3d, 0x3b, + 0x39, 0x37, 0x34, 0x34, 0x32, 0x31, 0x31, 0x33, + 0x32, 0x31, 0x32, 0x33, 0x32, 0x36, 0x38, 0x39, + 0x3b, 0x3c, 0x3c, 0x3c, 0x3d, 0x3d, 0x3e, 0x3e, + 0x41, 0x42, 0x43, 0x45, 0x48, 0x4c, 0x50, 0x56, + 0x5b, 0x5f, 0x62, 0x67, 0x69, 0x6c, 0x6e, 0x6e, + 0x70, 0x6f, 0x6b, 0x67, 0x63, 0x5e, 0x5b, 0x58, + 0x54, 0x51, 0x4e, 0x4a, 0x48, 0x46, 0x46, 0x46, + 0x45, 0x46, 0x44, 0x43, 0x44, 0x43, 0x42, 0x42, + 0x41, 0x40, 0x3f, 0x3e, 0x3c, 0x3b, 0x3a, 0x39, + 0x39, 0x39, 0x38, 0x37, 0x37, 0x3a, 0x3e, 0x40, + 0x42, 0x43, 0x47, 0x47, 0x48, 0x4a, 0x4b, 0x4c, + 0x4c, 0x4b, 0x4a, 0x48, 0x46, 0x44, 0x43, 0x45, + 0x45, 0x46, 0x47, }, + { 0x21, 0x2b, 0x28, 0x28, 0x28, 0x28, 0x29, 0x29, + 0x28, 0x2a, 0x2d, 0x30, 0x32, 0x34, 0x37, 0x3a, + 0x3c, 0x40, 0x44, 0x48, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x48, + 0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x30, + 0x2f, 0x2d, 0x2b, 0x2a, 0x28, 0x27, 0x26, 0x25, + 0x24, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2d, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c, + 0x3d, 0x3e, 0x3e, 0x3e, 0x3e, 0x3d, 0x3c, 0x3a, + 0x37, 0x35, 0x33, 0x30, 0x2f, 0x2b, 0x28, 0x26, + 0x25, 0x21, 0x20, 0x1e, 0x1c, 0x19, 0x19, 0x18, + 0x17, 0x15, 0x15, 0x12, 0x11, 0x11, 0x11, 0x0f, + 0x0e, 0x0e, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0c, + 0x0c, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, + 0x0c, 0x0c, 0x0c, 0x0c, 0x0e, 0x0e, 0x0f, 0x0f, + 0x0f, 0x10, 0x11, 0x13, 0x13, 0x15, 0x16, 0x18, + 0x1a, 
0x1c, 0x1f, 0x22, 0x25, 0x28, 0x29, 0x2d, + 0x2f, 0x32, 0x34, 0x35, 0x36, 0x37, 0x38, 0x38, + 0x39, 0x3a, 0x39, 0x39, 0x37, 0x36, 0x37, 0x36, + 0x35, 0x35, 0x37, 0x35, 0x36, 0x37, 0x3a, 0x3d, + 0x3e, 0x41, 0x43, 0x46, 0x46, 0x47, 0x48, 0x49, + 0x4a, 0x49, 0x47, 0x45, 0x42, 0x3f, 0x3d, 0x3b, + 0x3a, 0x38, 0x36, 0x34, 0x32, 0x32, 0x32, 0x32, + 0x32, 0x31, 0x33, 0x32, 0x34, 0x37, 0x38, 0x38, + 0x3a, 0x3b, 0x3d, 0x3d, 0x3d, 0x3e, 0x3f, 0x41, + 0x42, 0x44, 0x44, 0x46, 0x49, 0x4d, 0x50, 0x54, + 0x58, 0x5c, 0x61, 0x63, 0x65, 0x69, 0x6a, 0x6c, + 0x6d, 0x6c, 0x68, 0x64, 0x61, 0x5c, 0x59, 0x57, + 0x53, 0x51, 0x4f, 0x4c, 0x4a, 0x48, 0x48, 0x49, + 0x49, 0x48, 0x48, 0x48, 0x47, 0x47, 0x46, 0x46, + 0x45, 0x44, 0x42, 0x41, 0x3f, 0x3e, 0x3c, 0x3c, + 0x3c, 0x3d, 0x3c, 0x3c, 0x3c, 0x3e, 0x41, 0x43, + 0x46, 0x48, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4e, + 0x4e, 0x4d, 0x4b, 0x49, 0x47, 0x44, 0x44, 0x45, + 0x45, 0x45, 0x46, }, + { 0x22, 0x2b, 0x27, 0x27, 0x27, 0x27, 0x28, 0x28, + 0x28, 0x2a, 0x2c, 0x2f, 0x30, 0x34, 0x37, 0x3b, + 0x3d, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x47, + 0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x30, + 0x2f, 0x2d, 0x2b, 0x2a, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3a, 0x3b, + 0x3d, 0x3e, 0x3e, 0x3f, 0x3f, 0x3d, 0x3c, 0x3a, + 0x38, 0x36, 0x34, 0x31, 0x2e, 0x2c, 0x29, 0x26, + 0x25, 0x22, 0x20, 0x1e, 0x1c, 0x1a, 0x19, 0x18, + 0x16, 0x15, 0x14, 0x12, 0x10, 0x11, 0x11, 0x0f, + 0x0e, 0x0e, 0x0e, 0x0e, 0x0d, 0x0c, 0x0d, 0x0c, + 0x0c, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, + 0x0c, 0x0c, 0x0c, 0x0d, 0x0d, 0x0e, 0x0f, 0x0f, + 0x0f, 0x10, 0x11, 0x13, 0x13, 0x15, 0x15, 0x18, + 0x19, 0x1d, 0x1f, 0x21, 0x24, 0x27, 0x2a, 0x2c, + 0x30, 0x33, 0x35, 0x36, 0x37, 0x38, 0x39, 0x39, + 0x3a, 0x3a, 0x39, 0x39, 0x37, 0x36, 0x37, 0x36, + 0x36, 0x36, 0x36, 0x36, 0x36, 0x37, 0x39, 0x3a, + 0x3d, 0x3e, 0x41, 0x43, 0x43, 0x45, 0x46, 0x46, + 0x47, 0x46, 0x44, 0x42, 0x40, 
0x3d, 0x3a, 0x39, + 0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x32, 0x32, + 0x32, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, + 0x39, 0x3c, 0x3c, 0x3e, 0x3e, 0x3e, 0x41, 0x43, + 0x44, 0x45, 0x46, 0x48, 0x49, 0x4c, 0x51, 0x54, + 0x56, 0x5a, 0x5f, 0x61, 0x63, 0x65, 0x67, 0x69, + 0x6a, 0x69, 0x67, 0x61, 0x5f, 0x5b, 0x58, 0x56, + 0x54, 0x51, 0x50, 0x4e, 0x4c, 0x4a, 0x4b, 0x4c, + 0x4c, 0x4b, 0x4b, 0x4b, 0x4b, 0x49, 0x4a, 0x49, + 0x49, 0x48, 0x46, 0x44, 0x42, 0x41, 0x40, 0x3f, + 0x3f, 0x40, 0x40, 0x40, 0x40, 0x42, 0x46, 0x49, + 0x4b, 0x4c, 0x4f, 0x4f, 0x50, 0x52, 0x51, 0x51, + 0x50, 0x4f, 0x4c, 0x4a, 0x48, 0x46, 0x45, 0x44, + 0x44, 0x45, 0x46, }, + { 0x21, 0x2a, 0x27, 0x27, 0x27, 0x27, 0x27, 0x27, + 0x27, 0x29, 0x2d, 0x2f, 0x31, 0x34, 0x37, 0x3b, + 0x3e, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x48, + 0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x2f, + 0x2f, 0x2d, 0x2a, 0x2a, 0x27, 0x26, 0x25, 0x24, + 0x22, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2f, 0x31, 0x34, 0x37, 0x39, 0x3a, 0x3c, + 0x3d, 0x3e, 0x3f, 0x40, 0x3f, 0x3d, 0x3d, 0x3a, + 0x38, 0x36, 0x34, 0x31, 0x2e, 0x2c, 0x29, 0x26, + 0x25, 0x22, 0x21, 0x1f, 0x1d, 0x1b, 0x19, 0x18, + 0x16, 0x14, 0x14, 0x13, 0x11, 0x11, 0x11, 0x0f, + 0x0f, 0x0f, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0d, + 0x0d, 0x0d, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0c, + 0x0c, 0x0d, 0x0d, 0x0d, 0x0e, 0x0e, 0x0f, 0x0f, + 0x0f, 0x10, 0x13, 0x13, 0x14, 0x15, 0x17, 0x19, + 0x1a, 0x1d, 0x1f, 0x22, 0x25, 0x27, 0x2a, 0x2e, + 0x31, 0x33, 0x35, 0x38, 0x39, 0x3a, 0x3b, 0x3b, + 0x3c, 0x3c, 0x3b, 0x3a, 0x39, 0x38, 0x38, 0x37, + 0x36, 0x36, 0x37, 0x36, 0x37, 0x38, 0x38, 0x3a, + 0x3b, 0x3e, 0x40, 0x40, 0x41, 0x42, 0x43, 0x42, + 0x43, 0x42, 0x40, 0x40, 0x3f, 0x3c, 0x3b, 0x39, + 0x38, 0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x33, + 0x32, 0x32, 0x34, 0x35, 0x35, 0x36, 0x39, 0x39, + 0x3a, 0x3c, 0x3c, 0x3f, 0x40, 0x41, 0x43, 0x45, + 0x45, 0x47, 0x48, 0x4a, 0x4b, 0x4d, 0x50, 0x53, + 0x56, 0x59, 0x5c, 0x5f, 0x60, 0x65, 0x64, 0x66, + 
0x68, 0x66, 0x64, 0x61, 0x5e, 0x5a, 0x59, 0x56, + 0x54, 0x52, 0x51, 0x50, 0x4e, 0x4c, 0x4d, 0x4f, + 0x4f, 0x4f, 0x50, 0x50, 0x4f, 0x4f, 0x4e, 0x4d, + 0x4c, 0x4b, 0x49, 0x47, 0x45, 0x44, 0x43, 0x43, + 0x42, 0x43, 0x44, 0x44, 0x46, 0x47, 0x49, 0x4d, + 0x4f, 0x51, 0x53, 0x54, 0x53, 0x54, 0x54, 0x53, + 0x53, 0x51, 0x4e, 0x4b, 0x4a, 0x47, 0x45, 0x44, + 0x44, 0x45, 0x46, }, + { 0x20, 0x28, 0x26, 0x26, 0x25, 0x24, 0x27, 0x27, + 0x27, 0x29, 0x2c, 0x2e, 0x31, 0x34, 0x37, 0x3b, + 0x3e, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e, + 0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4a, 0x49, + 0x45, 0x43, 0x3f, 0x3c, 0x3a, 0x36, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c, + 0x3d, 0x3e, 0x3f, 0x40, 0x3e, 0x3d, 0x3d, 0x3a, + 0x38, 0x36, 0x34, 0x31, 0x2f, 0x2c, 0x29, 0x27, + 0x25, 0x21, 0x21, 0x1f, 0x1c, 0x1d, 0x19, 0x18, + 0x16, 0x15, 0x15, 0x13, 0x12, 0x11, 0x11, 0x0f, + 0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d, + 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, + 0x0d, 0x0d, 0x0d, 0x0e, 0x0e, 0x0e, 0x0f, 0x10, + 0x10, 0x10, 0x12, 0x13, 0x15, 0x16, 0x18, 0x1a, + 0x1c, 0x1d, 0x20, 0x22, 0x25, 0x27, 0x2a, 0x2e, + 0x30, 0x34, 0x38, 0x39, 0x3a, 0x3b, 0x3b, 0x3b, + 0x3c, 0x3d, 0x3c, 0x3b, 0x3a, 0x39, 0x38, 0x37, + 0x36, 0x36, 0x38, 0x37, 0x37, 0x37, 0x38, 0x3a, + 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, 0x40, 0x40, 0x40, + 0x42, 0x40, 0x3f, 0x3e, 0x3d, 0x3b, 0x3a, 0x39, + 0x37, 0x36, 0x36, 0x35, 0x34, 0x34, 0x33, 0x33, + 0x33, 0x34, 0x35, 0x35, 0x35, 0x36, 0x38, 0x39, + 0x3a, 0x3b, 0x3d, 0x3f, 0x42, 0x43, 0x45, 0x45, + 0x46, 0x48, 0x49, 0x4b, 0x4b, 0x4d, 0x50, 0x53, + 0x56, 0x57, 0x5a, 0x5c, 0x5e, 0x61, 0x63, 0x65, + 0x66, 0x64, 0x62, 0x5f, 0x5c, 0x59, 0x58, 0x56, + 0x55, 0x54, 0x52, 0x51, 0x50, 0x51, 0x51, 0x52, + 0x52, 0x52, 0x52, 0x52, 0x51, 0x51, 0x51, 0x50, + 0x4f, 0x4e, 0x4c, 0x4a, 0x47, 0x46, 0x45, 0x45, + 0x45, 0x46, 0x46, 0x46, 0x4a, 0x4c, 0x4d, 0x52, + 0x54, 0x56, 0x58, 0x58, 
0x56, 0x57, 0x57, 0x56, + 0x55, 0x53, 0x50, 0x4d, 0x49, 0x45, 0x44, 0x44, + 0x43, 0x44, 0x45, }, + { 0x1f, 0x27, 0x24, 0x23, 0x25, 0x24, 0x25, 0x26, + 0x26, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a, + 0x3d, 0x41, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4e, + 0x50, 0x51, 0x52, 0x50, 0x4f, 0x4b, 0x4a, 0x49, + 0x45, 0x43, 0x3f, 0x3c, 0x3a, 0x36, 0x33, 0x30, + 0x2f, 0x2d, 0x29, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x25, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3c, + 0x3e, 0x3f, 0x3f, 0x40, 0x3e, 0x3d, 0x3c, 0x3a, + 0x38, 0x36, 0x34, 0x31, 0x30, 0x2c, 0x29, 0x28, + 0x25, 0x23, 0x22, 0x1f, 0x1c, 0x1c, 0x18, 0x18, + 0x16, 0x14, 0x14, 0x13, 0x11, 0x11, 0x11, 0x0f, + 0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d, + 0x0c, 0x0c, 0x0b, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, + 0x0d, 0x0e, 0x0e, 0x0f, 0x0d, 0x0f, 0x10, 0x10, + 0x10, 0x11, 0x13, 0x14, 0x15, 0x16, 0x19, 0x1a, + 0x1c, 0x1f, 0x20, 0x23, 0x26, 0x28, 0x2a, 0x2e, + 0x31, 0x35, 0x38, 0x39, 0x3a, 0x3c, 0x3d, 0x3d, + 0x3e, 0x3e, 0x3d, 0x3c, 0x3a, 0x3a, 0x39, 0x39, + 0x38, 0x37, 0x38, 0x38, 0x37, 0x38, 0x39, 0x3a, + 0x3c, 0x3c, 0x3d, 0x3e, 0x3f, 0x3f, 0x40, 0x3f, + 0x41, 0x40, 0x3e, 0x3e, 0x3d, 0x3b, 0x3b, 0x39, + 0x37, 0x37, 0x35, 0x36, 0x34, 0x34, 0x34, 0x35, + 0x35, 0x34, 0x34, 0x35, 0x35, 0x37, 0x38, 0x39, + 0x3a, 0x3c, 0x3f, 0x3f, 0x43, 0x43, 0x45, 0x47, + 0x48, 0x48, 0x4a, 0x4b, 0x4e, 0x4d, 0x51, 0x53, + 0x56, 0x58, 0x59, 0x5b, 0x5d, 0x60, 0x62, 0x63, + 0x64, 0x63, 0x61, 0x5e, 0x5c, 0x5a, 0x57, 0x56, + 0x55, 0x54, 0x53, 0x52, 0x51, 0x51, 0x52, 0x52, + 0x54, 0x54, 0x55, 0x55, 0x55, 0x54, 0x54, 0x53, + 0x52, 0x50, 0x4e, 0x4d, 0x4b, 0x4a, 0x48, 0x48, + 0x48, 0x48, 0x4a, 0x4b, 0x4d, 0x4f, 0x52, 0x55, + 0x58, 0x5a, 0x5b, 0x5b, 0x5b, 0x5b, 0x5a, 0x59, + 0x58, 0x55, 0x51, 0x4e, 0x4a, 0x46, 0x45, 0x44, + 0x44, 0x44, 0x44, }, + { 0x1e, 0x26, 0x23, 0x23, 0x25, 0x24, 0x25, 0x26, + 0x26, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a, + 0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4f, + 0x50, 0x51, 0x52, 0x50, 
0x4f, 0x4b, 0x4a, 0x48, + 0x46, 0x44, 0x3f, 0x3b, 0x39, 0x36, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3d, + 0x3e, 0x3f, 0x41, 0x41, 0x40, 0x3e, 0x3d, 0x3b, + 0x38, 0x37, 0x34, 0x32, 0x30, 0x2c, 0x2a, 0x27, + 0x26, 0x23, 0x22, 0x20, 0x1d, 0x1b, 0x1a, 0x19, + 0x17, 0x15, 0x15, 0x13, 0x12, 0x12, 0x11, 0x0f, + 0x11, 0x0f, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0c, + 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, + 0x0e, 0x0e, 0x0e, 0x0f, 0x10, 0x10, 0x11, 0x11, + 0x11, 0x13, 0x16, 0x15, 0x15, 0x18, 0x1a, 0x1b, + 0x1d, 0x20, 0x22, 0x24, 0x27, 0x29, 0x2c, 0x30, + 0x33, 0x37, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3e, + 0x40, 0x40, 0x40, 0x3f, 0x3e, 0x3d, 0x3c, 0x3a, + 0x3a, 0x3a, 0x3a, 0x3a, 0x3a, 0x3a, 0x3b, 0x3d, + 0x3d, 0x3f, 0x40, 0x40, 0x3f, 0x41, 0x41, 0x41, + 0x41, 0x41, 0x40, 0x40, 0x3f, 0x3e, 0x3c, 0x3b, + 0x3a, 0x39, 0x37, 0x36, 0x36, 0x35, 0x35, 0x36, + 0x36, 0x35, 0x35, 0x36, 0x36, 0x38, 0x39, 0x39, + 0x3b, 0x3c, 0x3e, 0x40, 0x41, 0x43, 0x45, 0x47, + 0x48, 0x48, 0x4b, 0x4c, 0x4d, 0x4f, 0x51, 0x53, + 0x56, 0x56, 0x59, 0x5b, 0x5d, 0x5f, 0x61, 0x62, + 0x63, 0x63, 0x61, 0x5e, 0x5c, 0x5a, 0x59, 0x57, + 0x56, 0x54, 0x54, 0x53, 0x52, 0x53, 0x53, 0x55, + 0x56, 0x56, 0x57, 0x57, 0x57, 0x57, 0x56, 0x56, + 0x55, 0x53, 0x51, 0x4f, 0x4d, 0x4b, 0x49, 0x4b, + 0x4b, 0x4c, 0x4d, 0x4e, 0x51, 0x53, 0x55, 0x58, + 0x5b, 0x5c, 0x60, 0x60, 0x5f, 0x5e, 0x5d, 0x5c, + 0x5a, 0x57, 0x53, 0x4f, 0x4b, 0x46, 0x45, 0x44, + 0x44, 0x44, 0x44, }, + { 0x1d, 0x25, 0x22, 0x22, 0x23, 0x23, 0x24, 0x25, + 0x25, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a, + 0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4f, + 0x50, 0x51, 0x52, 0x50, 0x4f, 0x4b, 0x4a, 0x47, + 0x45, 0x43, 0x3f, 0x3c, 0x38, 0x35, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2b, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3c, 0x3d, + 0x3e, 0x3f, 0x40, 0x41, 0x40, 0x3e, 0x3d, 0x3b, 
+ 0x39, 0x36, 0x34, 0x32, 0x30, 0x2d, 0x2a, 0x26, + 0x26, 0x24, 0x22, 0x1f, 0x1d, 0x1c, 0x1a, 0x19, + 0x18, 0x16, 0x15, 0x14, 0x12, 0x12, 0x12, 0x10, + 0x10, 0x0f, 0x0e, 0x10, 0x0e, 0x0e, 0x0d, 0x0c, + 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0e, 0x0d, 0x0e, + 0x0f, 0x0f, 0x0f, 0x10, 0x11, 0x11, 0x11, 0x12, + 0x13, 0x14, 0x16, 0x16, 0x18, 0x1a, 0x1b, 0x1c, + 0x1e, 0x21, 0x23, 0x25, 0x28, 0x2a, 0x2e, 0x32, + 0x34, 0x38, 0x3a, 0x3c, 0x3d, 0x3f, 0x40, 0x42, + 0x43, 0x43, 0x43, 0x42, 0x40, 0x3e, 0x3e, 0x3c, + 0x3b, 0x3b, 0x3c, 0x3a, 0x3b, 0x3b, 0x3e, 0x3e, + 0x40, 0x3f, 0x41, 0x41, 0x41, 0x42, 0x42, 0x43, + 0x42, 0x41, 0x41, 0x41, 0x40, 0x3e, 0x3d, 0x3c, + 0x3b, 0x3a, 0x39, 0x37, 0x36, 0x35, 0x36, 0x37, + 0x35, 0x36, 0x36, 0x37, 0x38, 0x39, 0x3a, 0x3b, + 0x3b, 0x3d, 0x3e, 0x40, 0x41, 0x41, 0x44, 0x46, + 0x48, 0x48, 0x4a, 0x4c, 0x4d, 0x4f, 0x51, 0x53, + 0x55, 0x57, 0x59, 0x5a, 0x5b, 0x5e, 0x5f, 0x61, + 0x62, 0x61, 0x60, 0x5e, 0x5c, 0x5a, 0x59, 0x58, + 0x56, 0x55, 0x54, 0x53, 0x53, 0x54, 0x54, 0x55, + 0x57, 0x57, 0x58, 0x59, 0x5a, 0x58, 0x59, 0x58, + 0x57, 0x55, 0x53, 0x52, 0x4f, 0x4e, 0x4d, 0x4d, + 0x4d, 0x4f, 0x51, 0x50, 0x54, 0x56, 0x59, 0x5c, + 0x5f, 0x61, 0x64, 0x64, 0x63, 0x61, 0x5e, 0x5e, + 0x5c, 0x59, 0x54, 0x50, 0x4c, 0x46, 0x45, 0x44, + 0x44, 0x44, 0x44, }, + { 0x1c, 0x24, 0x21, 0x21, 0x21, 0x22, 0x23, 0x23, + 0x25, 0x27, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b, + 0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4c, 0x50, 0x4f, + 0x50, 0x51, 0x52, 0x50, 0x4e, 0x4b, 0x4a, 0x49, + 0x45, 0x42, 0x3f, 0x3c, 0x38, 0x35, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2b, 0x2f, 0x32, 0x34, 0x38, 0x39, 0x3c, 0x3d, + 0x3e, 0x3e, 0x40, 0x41, 0x40, 0x3e, 0x3c, 0x3a, + 0x39, 0x37, 0x35, 0x33, 0x30, 0x2d, 0x2b, 0x28, + 0x26, 0x23, 0x23, 0x20, 0x1e, 0x1b, 0x19, 0x19, + 0x17, 0x16, 0x15, 0x14, 0x12, 0x12, 0x11, 0x10, + 0x0f, 0x0e, 0x0e, 0x10, 0x0e, 0x0d, 0x0c, 0x0c, + 0x0c, 0x0d, 0x0d, 0x0d, 0x0d, 0x0e, 0x0d, 0x0e, + 0x0f, 0x0f, 0x0f, 
0x10, 0x11, 0x11, 0x12, 0x14, + 0x14, 0x14, 0x16, 0x18, 0x19, 0x1b, 0x1c, 0x1e, + 0x20, 0x23, 0x26, 0x27, 0x29, 0x2c, 0x2f, 0x33, + 0x36, 0x38, 0x3b, 0x3e, 0x3e, 0x42, 0x43, 0x46, + 0x46, 0x46, 0x46, 0x44, 0x42, 0x41, 0x3f, 0x3e, + 0x3d, 0x3d, 0x3e, 0x3d, 0x3d, 0x3e, 0x3e, 0x40, + 0x40, 0x40, 0x43, 0x43, 0x42, 0x43, 0x45, 0x43, + 0x43, 0x43, 0x42, 0x42, 0x41, 0x40, 0x40, 0x3e, + 0x3c, 0x3a, 0x3a, 0x38, 0x36, 0x36, 0x36, 0x36, + 0x37, 0x37, 0x36, 0x38, 0x38, 0x39, 0x3b, 0x3b, + 0x3e, 0x3e, 0x3e, 0x40, 0x41, 0x43, 0x45, 0x46, + 0x46, 0x49, 0x4c, 0x4c, 0x4d, 0x4f, 0x51, 0x54, + 0x56, 0x57, 0x58, 0x5a, 0x5c, 0x5e, 0x60, 0x60, + 0x61, 0x61, 0x60, 0x5f, 0x5c, 0x5a, 0x59, 0x58, + 0x57, 0x57, 0x55, 0x54, 0x53, 0x55, 0x55, 0x58, + 0x58, 0x59, 0x5a, 0x5a, 0x5a, 0x5b, 0x5b, 0x5b, + 0x5a, 0x59, 0x56, 0x54, 0x53, 0x4e, 0x4e, 0x50, + 0x50, 0x51, 0x52, 0x52, 0x57, 0x59, 0x5d, 0x60, + 0x63, 0x63, 0x66, 0x66, 0x66, 0x64, 0x63, 0x61, + 0x60, 0x5b, 0x55, 0x51, 0x4d, 0x48, 0x45, 0x44, + 0x43, 0x43, 0x43, }, + { 0x1b, 0x23, 0x20, 0x21, 0x22, 0x22, 0x23, 0x24, + 0x26, 0x27, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b, + 0x3d, 0x42, 0x46, 0x49, 0x4a, 0x4c, 0x4f, 0x4f, + 0x50, 0x50, 0x52, 0x50, 0x4e, 0x4b, 0x4b, 0x49, + 0x45, 0x42, 0x3e, 0x3c, 0x38, 0x35, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2c, 0x2f, 0x32, 0x35, 0x38, 0x3a, 0x3c, 0x3d, + 0x3e, 0x3e, 0x40, 0x41, 0x40, 0x3f, 0x3d, 0x3b, + 0x3a, 0x38, 0x36, 0x33, 0x30, 0x2d, 0x2b, 0x29, + 0x27, 0x24, 0x24, 0x21, 0x1e, 0x1c, 0x1b, 0x1a, + 0x18, 0x17, 0x16, 0x15, 0x13, 0x12, 0x10, 0x0f, + 0x10, 0x0f, 0x0e, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d, + 0x0d, 0x0d, 0x0e, 0x0e, 0x0e, 0x0f, 0x0e, 0x0f, + 0x10, 0x11, 0x11, 0x12, 0x13, 0x13, 0x14, 0x15, + 0x15, 0x16, 0x17, 0x1a, 0x1b, 0x1d, 0x1e, 0x20, + 0x21, 0x25, 0x27, 0x29, 0x2b, 0x2d, 0x31, 0x35, + 0x37, 0x39, 0x3c, 0x3f, 0x40, 0x43, 0x46, 0x47, + 0x4a, 0x49, 0x48, 0x46, 0x45, 0x43, 0x42, 0x41, + 0x3f, 0x40, 0x3f, 0x3f, 0x40, 0x3f, 0x41, 
0x43, + 0x43, 0x43, 0x44, 0x45, 0x45, 0x45, 0x45, 0x45, + 0x45, 0x45, 0x44, 0x43, 0x43, 0x42, 0x42, 0x40, + 0x3e, 0x3d, 0x3c, 0x39, 0x38, 0x38, 0x38, 0x38, + 0x38, 0x36, 0x38, 0x39, 0x39, 0x3a, 0x3c, 0x3d, + 0x3e, 0x3e, 0x3f, 0x41, 0x42, 0x42, 0x43, 0x45, + 0x46, 0x49, 0x4b, 0x4d, 0x4f, 0x50, 0x53, 0x54, + 0x57, 0x58, 0x5a, 0x5c, 0x5b, 0x5e, 0x60, 0x61, + 0x60, 0x60, 0x5f, 0x5f, 0x5d, 0x5b, 0x5b, 0x59, + 0x58, 0x57, 0x56, 0x55, 0x55, 0x55, 0x57, 0x59, + 0x5b, 0x5b, 0x5d, 0x5c, 0x5c, 0x5e, 0x5e, 0x5e, + 0x5d, 0x5b, 0x59, 0x56, 0x54, 0x51, 0x51, 0x51, + 0x52, 0x55, 0x56, 0x56, 0x5a, 0x5d, 0x5f, 0x63, + 0x66, 0x68, 0x6b, 0x6b, 0x68, 0x67, 0x66, 0x64, + 0x61, 0x5d, 0x57, 0x52, 0x4f, 0x49, 0x46, 0x45, + 0x43, 0x43, 0x43, }, + { 0x1a, 0x22, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x24, + 0x26, 0x27, 0x2a, 0x2d, 0x31, 0x33, 0x37, 0x3b, + 0x3d, 0x41, 0x46, 0x49, 0x4a, 0x4d, 0x4f, 0x4f, + 0x50, 0x51, 0x52, 0x50, 0x4e, 0x4b, 0x4b, 0x48, + 0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x33, 0x30, + 0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a, + 0x2d, 0x2f, 0x32, 0x35, 0x39, 0x3a, 0x3c, 0x3d, + 0x3e, 0x3f, 0x40, 0x41, 0x40, 0x3f, 0x3e, 0x3c, + 0x3a, 0x38, 0x36, 0x33, 0x31, 0x2d, 0x2c, 0x29, + 0x27, 0x26, 0x24, 0x21, 0x1f, 0x1d, 0x1c, 0x1a, + 0x19, 0x18, 0x16, 0x15, 0x14, 0x13, 0x12, 0x10, + 0x11, 0x10, 0x0f, 0x0f, 0x0f, 0x0e, 0x0e, 0x0e, + 0x0f, 0x0f, 0x0e, 0x0e, 0x0e, 0x0f, 0x0f, 0x10, + 0x11, 0x12, 0x12, 0x13, 0x15, 0x15, 0x16, 0x16, + 0x17, 0x18, 0x1a, 0x1b, 0x1c, 0x1e, 0x1f, 0x21, + 0x22, 0x25, 0x27, 0x2a, 0x2c, 0x2e, 0x33, 0x36, + 0x39, 0x3a, 0x3d, 0x40, 0x41, 0x45, 0x47, 0x4a, + 0x4c, 0x4d, 0x4c, 0x4a, 0x48, 0x45, 0x44, 0x41, + 0x42, 0x42, 0x42, 0x42, 0x42, 0x43, 0x43, 0x44, + 0x45, 0x47, 0x47, 0x48, 0x47, 0x48, 0x47, 0x47, + 0x48, 0x48, 0x46, 0x46, 0x46, 0x43, 0x43, 0x41, + 0x3f, 0x3e, 0x3b, 0x39, 0x38, 0x37, 0x37, 0x37, + 0x38, 0x38, 0x37, 0x39, 0x39, 0x3a, 0x3c, 0x3e, + 0x3e, 0x3f, 0x3f, 0x3f, 0x42, 0x43, 0x43, 0x45, + 0x47, 0x48, 
0x4b, 0x4c, 0x4e, 0x50, 0x51, 0x54, + 0x56, 0x58, 0x5a, 0x5c, 0x5c, 0x5f, 0x5f, 0x5f, + 0x61, 0x60, 0x5f, 0x5f, 0x5e, 0x5b, 0x5c, 0x5b, + 0x59, 0x59, 0x57, 0x56, 0x55, 0x56, 0x57, 0x59, + 0x5a, 0x5b, 0x5c, 0x5c, 0x5d, 0x5e, 0x5e, 0x5d, + 0x5e, 0x5c, 0x5a, 0x57, 0x55, 0x52, 0x51, 0x52, + 0x53, 0x55, 0x57, 0x58, 0x5c, 0x5e, 0x61, 0x65, + 0x69, 0x6b, 0x6c, 0x6b, 0x6a, 0x69, 0x67, 0x64, + 0x61, 0x5d, 0x59, 0x53, 0x4d, 0x48, 0x46, 0x45, + 0x44, 0x44, 0x43, }, + { 0x1a, 0x21, 0x1e, 0x1f, 0x20, 0x21, 0x23, 0x24, + 0x25, 0x28, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b, + 0x3e, 0x41, 0x46, 0x49, 0x4b, 0x4d, 0x4f, 0x4e, + 0x50, 0x51, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48, + 0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x32, 0x30, + 0x2f, 0x2d, 0x29, 0x27, 0x27, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x26, 0x27, 0x29, 0x2a, + 0x2c, 0x2f, 0x32, 0x35, 0x38, 0x3b, 0x3c, 0x3e, + 0x3f, 0x3f, 0x40, 0x41, 0x40, 0x3f, 0x3e, 0x3c, + 0x3a, 0x39, 0x36, 0x34, 0x31, 0x2d, 0x2c, 0x29, + 0x27, 0x26, 0x24, 0x21, 0x1f, 0x1d, 0x1c, 0x1a, + 0x19, 0x17, 0x16, 0x15, 0x14, 0x13, 0x12, 0x10, + 0x11, 0x10, 0x0f, 0x0f, 0x0f, 0x0e, 0x0e, 0x0e, + 0x0e, 0x0e, 0x0e, 0x0e, 0x0e, 0x0f, 0x0f, 0x10, + 0x11, 0x13, 0x14, 0x14, 0x15, 0x16, 0x17, 0x19, + 0x19, 0x1a, 0x1c, 0x1d, 0x1e, 0x20, 0x22, 0x24, + 0x25, 0x27, 0x29, 0x2c, 0x2e, 0x31, 0x35, 0x38, + 0x3a, 0x3d, 0x41, 0x42, 0x45, 0x48, 0x4c, 0x4e, + 0x4f, 0x4f, 0x4f, 0x4d, 0x4b, 0x49, 0x47, 0x47, + 0x46, 0x45, 0x45, 0x45, 0x44, 0x44, 0x46, 0x47, + 0x48, 0x49, 0x4b, 0x4b, 0x4a, 0x4b, 0x4b, 0x4a, + 0x4b, 0x4a, 0x49, 0x49, 0x48, 0x46, 0x46, 0x44, + 0x42, 0x41, 0x3d, 0x3b, 0x3a, 0x38, 0x38, 0x38, + 0x37, 0x37, 0x39, 0x38, 0x3a, 0x3a, 0x3c, 0x3c, + 0x3e, 0x40, 0x40, 0x41, 0x43, 0x43, 0x45, 0x46, + 0x48, 0x49, 0x4b, 0x4e, 0x4f, 0x50, 0x53, 0x55, + 0x57, 0x59, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, 0x60, + 0x60, 0x60, 0x5f, 0x5f, 0x5e, 0x5c, 0x5b, 0x5a, + 0x59, 0x58, 0x57, 0x57, 0x56, 0x56, 0x57, 0x58, + 0x59, 0x5a, 0x5b, 0x5c, 0x5c, 0x5d, 0x5e, 0x5d, + 0x5c, 0x5b, 0x58, 0x57, 0x54, 0x52, 
0x52, 0x53, + 0x54, 0x57, 0x58, 0x58, 0x5b, 0x5e, 0x62, 0x65, + 0x69, 0x6b, 0x6d, 0x6c, 0x6a, 0x69, 0x67, 0x64, + 0x62, 0x5e, 0x59, 0x54, 0x4d, 0x48, 0x47, 0x46, + 0x45, 0x45, 0x44, }, + { 0x1a, 0x21, 0x1e, 0x1f, 0x20, 0x21, 0x23, 0x24, + 0x25, 0x28, 0x2a, 0x2e, 0x31, 0x34, 0x37, 0x3b, + 0x3e, 0x42, 0x47, 0x49, 0x4b, 0x4d, 0x4f, 0x4f, + 0x50, 0x51, 0x51, 0x50, 0x50, 0x4c, 0x4a, 0x47, + 0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x32, 0x31, + 0x2f, 0x2d, 0x29, 0x27, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x25, 0x25, 0x26, 0x27, 0x29, 0x2b, + 0x2c, 0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e, + 0x40, 0x40, 0x41, 0x42, 0x41, 0x3f, 0x3f, 0x3d, + 0x3b, 0x39, 0x36, 0x33, 0x32, 0x2e, 0x2d, 0x2a, + 0x27, 0x26, 0x25, 0x22, 0x1f, 0x1d, 0x1c, 0x1b, + 0x19, 0x17, 0x17, 0x16, 0x15, 0x14, 0x12, 0x11, + 0x11, 0x11, 0x10, 0x10, 0x0f, 0x0f, 0x0f, 0x0f, + 0x0f, 0x0f, 0x10, 0x11, 0x10, 0x11, 0x11, 0x12, + 0x11, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1b, + 0x1c, 0x1c, 0x1e, 0x20, 0x21, 0x22, 0x23, 0x25, + 0x27, 0x2a, 0x2c, 0x2f, 0x31, 0x35, 0x38, 0x3b, + 0x3d, 0x40, 0x44, 0x47, 0x49, 0x4c, 0x4f, 0x51, + 0x53, 0x53, 0x53, 0x51, 0x50, 0x4e, 0x4c, 0x4b, + 0x4a, 0x49, 0x49, 0x49, 0x49, 0x4a, 0x4a, 0x4d, + 0x4e, 0x4e, 0x4f, 0x50, 0x4f, 0x50, 0x51, 0x50, + 0x50, 0x4e, 0x4d, 0x4c, 0x4b, 0x48, 0x48, 0x47, + 0x44, 0x42, 0x3f, 0x3d, 0x3b, 0x3a, 0x39, 0x39, + 0x39, 0x38, 0x39, 0x3b, 0x3a, 0x3c, 0x3e, 0x3d, + 0x40, 0x40, 0x40, 0x42, 0x42, 0x42, 0x45, 0x46, + 0x47, 0x49, 0x4c, 0x4e, 0x50, 0x50, 0x53, 0x56, + 0x58, 0x59, 0x5d, 0x5d, 0x5e, 0x60, 0x61, 0x61, + 0x62, 0x61, 0x60, 0x60, 0x5e, 0x5d, 0x5d, 0x5b, + 0x57, 0x58, 0x56, 0x55, 0x55, 0x56, 0x56, 0x59, + 0x59, 0x58, 0x5a, 0x5a, 0x5a, 0x5c, 0x5c, 0x5c, + 0x5b, 0x5b, 0x58, 0x57, 0x54, 0x53, 0x52, 0x53, + 0x54, 0x57, 0x58, 0x59, 0x5c, 0x5f, 0x63, 0x67, + 0x6b, 0x6d, 0x6e, 0x6e, 0x6b, 0x6a, 0x68, 0x64, + 0x62, 0x5e, 0x58, 0x53, 0x4f, 0x49, 0x47, 0x46, + 0x45, 0x45, 0x44, }, + { 0x19, 0x20, 0x1e, 0x1e, 0x1f, 0x20, 0x22, 0x23, + 0x25, 0x27, 0x2a, 0x2e, 0x31, 0x34, 
0x37, 0x3a, + 0x3e, 0x41, 0x46, 0x49, 0x4a, 0x4d, 0x4f, 0x4e, + 0x50, 0x51, 0x51, 0x4f, 0x4f, 0x4d, 0x49, 0x47, + 0x44, 0x42, 0x3e, 0x3c, 0x39, 0x36, 0x32, 0x31, + 0x2f, 0x2d, 0x29, 0x27, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x25, 0x25, 0x26, 0x28, 0x29, 0x2b, + 0x2c, 0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e, + 0x3f, 0x3f, 0x41, 0x42, 0x41, 0x3f, 0x3f, 0x3d, + 0x3c, 0x39, 0x36, 0x33, 0x32, 0x2e, 0x2d, 0x2a, + 0x27, 0x26, 0x25, 0x22, 0x1f, 0x1e, 0x1d, 0x1b, + 0x1a, 0x17, 0x17, 0x17, 0x14, 0x14, 0x12, 0x11, + 0x11, 0x12, 0x11, 0x11, 0x10, 0x10, 0x10, 0x10, + 0x10, 0x10, 0x11, 0x11, 0x11, 0x12, 0x13, 0x14, + 0x14, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1c, 0x1e, + 0x1e, 0x1f, 0x22, 0x23, 0x23, 0x24, 0x25, 0x27, + 0x2a, 0x2d, 0x2f, 0x31, 0x35, 0x38, 0x3a, 0x3e, + 0x41, 0x44, 0x48, 0x4b, 0x4d, 0x51, 0x53, 0x55, + 0x57, 0x57, 0x56, 0x55, 0x54, 0x52, 0x52, 0x50, + 0x4e, 0x50, 0x4e, 0x4d, 0x4d, 0x4d, 0x4f, 0x51, + 0x51, 0x52, 0x54, 0x55, 0x55, 0x55, 0x57, 0x55, + 0x54, 0x53, 0x52, 0x4e, 0x4d, 0x4b, 0x4a, 0x49, + 0x46, 0x44, 0x41, 0x3f, 0x3d, 0x3b, 0x3a, 0x3a, + 0x39, 0x39, 0x39, 0x39, 0x3a, 0x3b, 0x3d, 0x3e, + 0x3f, 0x40, 0x41, 0x42, 0x44, 0x44, 0x45, 0x47, + 0x49, 0x49, 0x4a, 0x4d, 0x50, 0x51, 0x53, 0x57, + 0x5a, 0x5b, 0x5e, 0x5f, 0x60, 0x61, 0x62, 0x62, + 0x63, 0x62, 0x60, 0x60, 0x5e, 0x5c, 0x5c, 0x59, + 0x58, 0x56, 0x55, 0x55, 0x55, 0x55, 0x55, 0x54, + 0x56, 0x56, 0x57, 0x58, 0x58, 0x59, 0x5a, 0x59, + 0x58, 0x57, 0x56, 0x55, 0x54, 0x52, 0x53, 0x53, + 0x53, 0x56, 0x57, 0x59, 0x5b, 0x5e, 0x62, 0x66, + 0x6a, 0x6c, 0x6d, 0x6e, 0x6b, 0x69, 0x67, 0x64, + 0x61, 0x5d, 0x58, 0x54, 0x50, 0x4a, 0x47, 0x46, + 0x45, 0x45, 0x44, }, + { 0x1a, 0x21, 0x1e, 0x1f, 0x1f, 0x20, 0x22, 0x23, + 0x25, 0x27, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3b, + 0x3d, 0x42, 0x45, 0x49, 0x4a, 0x4d, 0x4e, 0x4e, + 0x51, 0x52, 0x50, 0x4f, 0x4f, 0x4c, 0x49, 0x48, + 0x45, 0x42, 0x3e, 0x3b, 0x39, 0x36, 0x32, 0x32, + 0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x25, 0x28, 0x29, 0x2b, + 0x2d, 
0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e, + 0x3f, 0x3f, 0x41, 0x42, 0x41, 0x3f, 0x3e, 0x3c, + 0x3c, 0x3a, 0x37, 0x33, 0x32, 0x2f, 0x2d, 0x2b, + 0x28, 0x26, 0x25, 0x22, 0x20, 0x1e, 0x1d, 0x1b, + 0x1a, 0x17, 0x17, 0x16, 0x14, 0x14, 0x12, 0x11, + 0x12, 0x11, 0x11, 0x11, 0x11, 0x10, 0x10, 0x10, + 0x10, 0x11, 0x12, 0x12, 0x12, 0x13, 0x14, 0x14, + 0x16, 0x18, 0x19, 0x1a, 0x1b, 0x1d, 0x1e, 0x1f, + 0x21, 0x22, 0x23, 0x25, 0x26, 0x26, 0x28, 0x2a, + 0x2c, 0x2e, 0x32, 0x34, 0x39, 0x39, 0x3d, 0x41, + 0x45, 0x47, 0x4c, 0x4e, 0x51, 0x54, 0x56, 0x58, + 0x5b, 0x5c, 0x5a, 0x59, 0x58, 0x56, 0x55, 0x53, + 0x53, 0x52, 0x52, 0x51, 0x52, 0x52, 0x53, 0x55, + 0x57, 0x58, 0x5a, 0x5a, 0x59, 0x5b, 0x59, 0x59, + 0x58, 0x57, 0x55, 0x53, 0x51, 0x4e, 0x4c, 0x4a, + 0x48, 0x46, 0x43, 0x40, 0x3e, 0x3c, 0x3b, 0x3b, + 0x38, 0x39, 0x38, 0x39, 0x3a, 0x3d, 0x3d, 0x3e, + 0x3f, 0x40, 0x41, 0x43, 0x44, 0x45, 0x46, 0x48, + 0x4a, 0x4b, 0x4d, 0x4e, 0x50, 0x52, 0x54, 0x56, + 0x59, 0x5c, 0x5e, 0x5f, 0x60, 0x62, 0x62, 0x63, + 0x63, 0x63, 0x61, 0x5f, 0x5e, 0x5d, 0x5c, 0x5b, + 0x59, 0x56, 0x56, 0x55, 0x54, 0x53, 0x53, 0x54, + 0x55, 0x54, 0x55, 0x55, 0x55, 0x57, 0x58, 0x57, + 0x57, 0x56, 0x55, 0x54, 0x54, 0x52, 0x52, 0x53, + 0x54, 0x55, 0x57, 0x58, 0x5b, 0x5e, 0x62, 0x65, + 0x69, 0x6b, 0x6d, 0x6e, 0x6a, 0x69, 0x67, 0x63, + 0x61, 0x5d, 0x58, 0x54, 0x4f, 0x4b, 0x48, 0x47, + 0x46, 0x45, 0x45, }, + { 0x1a, 0x21, 0x1e, 0x1f, 0x1f, 0x20, 0x22, 0x23, + 0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b, + 0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4e, 0x4f, + 0x51, 0x52, 0x50, 0x50, 0x4f, 0x4c, 0x4a, 0x48, + 0x45, 0x42, 0x3f, 0x3b, 0x39, 0x36, 0x32, 0x31, + 0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x28, 0x29, 0x2b, + 0x2d, 0x30, 0x33, 0x36, 0x39, 0x3b, 0x3d, 0x3f, + 0x3f, 0x40, 0x42, 0x43, 0x42, 0x40, 0x3e, 0x3c, + 0x3c, 0x3a, 0x37, 0x34, 0x32, 0x2f, 0x2d, 0x2c, + 0x2a, 0x27, 0x26, 0x23, 0x20, 0x1e, 0x1d, 0x1c, + 0x1a, 0x18, 0x18, 0x17, 0x15, 0x16, 0x14, 0x12, + 0x12, 0x12, 0x12, 0x12, 0x12, 
0x11, 0x11, 0x12, + 0x12, 0x12, 0x13, 0x14, 0x14, 0x14, 0x15, 0x16, + 0x17, 0x19, 0x1b, 0x1c, 0x1e, 0x20, 0x20, 0x22, + 0x24, 0x25, 0x26, 0x27, 0x28, 0x2a, 0x2c, 0x2c, + 0x2f, 0x32, 0x35, 0x37, 0x3b, 0x3c, 0x41, 0x45, + 0x48, 0x4c, 0x50, 0x52, 0x54, 0x57, 0x5a, 0x5c, + 0x5f, 0x5f, 0x5f, 0x5d, 0x5c, 0x5b, 0x5a, 0x58, + 0x57, 0x57, 0x57, 0x56, 0x56, 0x57, 0x57, 0x5a, + 0x5c, 0x5e, 0x5f, 0x61, 0x5f, 0x5f, 0x5f, 0x5e, + 0x5d, 0x5c, 0x5a, 0x57, 0x55, 0x52, 0x4f, 0x4e, + 0x4a, 0x47, 0x46, 0x42, 0x41, 0x3e, 0x3d, 0x3c, + 0x3b, 0x3a, 0x39, 0x39, 0x3b, 0x3c, 0x3d, 0x3f, + 0x40, 0x42, 0x42, 0x44, 0x45, 0x46, 0x49, 0x49, + 0x4b, 0x4c, 0x4e, 0x4f, 0x51, 0x54, 0x57, 0x58, + 0x5b, 0x5d, 0x61, 0x61, 0x61, 0x63, 0x65, 0x65, + 0x64, 0x64, 0x62, 0x61, 0x60, 0x5e, 0x5d, 0x5c, + 0x59, 0x58, 0x56, 0x54, 0x53, 0x53, 0x53, 0x54, + 0x54, 0x53, 0x53, 0x54, 0x54, 0x54, 0x55, 0x55, + 0x56, 0x55, 0x54, 0x53, 0x53, 0x52, 0x52, 0x53, + 0x55, 0x56, 0x57, 0x58, 0x5b, 0x5e, 0x62, 0x66, + 0x69, 0x6b, 0x6d, 0x6d, 0x6b, 0x69, 0x67, 0x64, + 0x61, 0x5d, 0x58, 0x55, 0x50, 0x4b, 0x48, 0x47, + 0x46, 0x46, 0x46, }, + { 0x1a, 0x20, 0x1e, 0x1f, 0x1f, 0x21, 0x22, 0x23, + 0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b, + 0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4f, 0x4f, + 0x51, 0x52, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48, + 0x45, 0x42, 0x3f, 0x3b, 0x38, 0x36, 0x32, 0x31, + 0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x27, 0x28, 0x29, 0x2b, + 0x2e, 0x30, 0x33, 0x36, 0x39, 0x3b, 0x3d, 0x3f, + 0x3f, 0x40, 0x41, 0x42, 0x41, 0x40, 0x3e, 0x3c, + 0x3c, 0x3a, 0x37, 0x34, 0x33, 0x30, 0x2e, 0x2b, + 0x29, 0x26, 0x24, 0x24, 0x20, 0x1f, 0x1d, 0x1d, + 0x1a, 0x19, 0x17, 0x16, 0x16, 0x16, 0x16, 0x14, + 0x13, 0x12, 0x13, 0x13, 0x13, 0x12, 0x12, 0x13, + 0x13, 0x14, 0x15, 0x15, 0x14, 0x15, 0x16, 0x18, + 0x19, 0x1b, 0x1c, 0x1e, 0x20, 0x21, 0x22, 0x24, + 0x27, 0x28, 0x29, 0x2a, 0x2c, 0x2c, 0x2d, 0x2f, + 0x32, 0x35, 0x37, 0x3a, 0x3c, 0x3e, 0x44, 0x48, + 0x4c, 0x50, 0x54, 0x56, 0x58, 0x5b, 0x5e, 0x60, + 
0x61, 0x63, 0x62, 0x61, 0x60, 0x5f, 0x5e, 0x5e, + 0x5c, 0x5c, 0x5b, 0x5a, 0x5a, 0x5b, 0x5c, 0x5e, + 0x60, 0x63, 0x64, 0x65, 0x63, 0x62, 0x63, 0x63, + 0x61, 0x60, 0x5e, 0x5b, 0x58, 0x55, 0x51, 0x4f, + 0x4c, 0x4a, 0x47, 0x44, 0x42, 0x41, 0x3e, 0x3c, + 0x3b, 0x3a, 0x3a, 0x3b, 0x3b, 0x3c, 0x3e, 0x3f, + 0x40, 0x42, 0x43, 0x45, 0x46, 0x47, 0x49, 0x4a, + 0x4c, 0x4c, 0x4f, 0x51, 0x52, 0x55, 0x58, 0x5b, + 0x5c, 0x5f, 0x61, 0x62, 0x63, 0x64, 0x64, 0x65, + 0x66, 0x65, 0x63, 0x62, 0x5f, 0x5e, 0x5e, 0x5c, + 0x5b, 0x58, 0x56, 0x55, 0x54, 0x53, 0x52, 0x53, + 0x52, 0x52, 0x52, 0x52, 0x52, 0x53, 0x55, 0x55, + 0x55, 0x53, 0x53, 0x53, 0x52, 0x51, 0x52, 0x52, + 0x55, 0x55, 0x58, 0x58, 0x5b, 0x5d, 0x61, 0x65, + 0x68, 0x6a, 0x6c, 0x6b, 0x69, 0x68, 0x67, 0x64, + 0x61, 0x5e, 0x58, 0x54, 0x4f, 0x4b, 0x49, 0x48, + 0x47, 0x46, 0x45, }, + { 0x19, 0x20, 0x1d, 0x1f, 0x1f, 0x20, 0x23, 0x23, + 0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b, + 0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4f, 0x4f, + 0x51, 0x52, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48, + 0x44, 0x42, 0x3f, 0x3a, 0x38, 0x36, 0x32, 0x30, + 0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24, + 0x23, 0x24, 0x24, 0x25, 0x26, 0x28, 0x29, 0x2b, + 0x2e, 0x30, 0x34, 0x36, 0x39, 0x3b, 0x3d, 0x3f, + 0x3f, 0x40, 0x41, 0x42, 0x41, 0x40, 0x3e, 0x3c, + 0x3c, 0x3a, 0x37, 0x34, 0x33, 0x30, 0x2e, 0x2b, + 0x29, 0x27, 0x25, 0x24, 0x21, 0x1f, 0x1e, 0x1c, + 0x1b, 0x19, 0x17, 0x16, 0x16, 0x16, 0x16, 0x14, + 0x13, 0x12, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13, + 0x13, 0x14, 0x15, 0x14, 0x14, 0x14, 0x17, 0x19, + 0x1a, 0x1c, 0x1e, 0x20, 0x21, 0x23, 0x24, 0x26, + 0x29, 0x29, 0x2b, 0x2c, 0x2d, 0x2e, 0x30, 0x31, + 0x34, 0x38, 0x3b, 0x3c, 0x3f, 0x42, 0x47, 0x4c, + 0x50, 0x54, 0x57, 0x5b, 0x5c, 0x5e, 0x62, 0x63, + 0x66, 0x66, 0x66, 0x65, 0x64, 0x63, 0x61, 0x62, + 0x60, 0x60, 0x5f, 0x5e, 0x5e, 0x5f, 0x60, 0x62, + 0x65, 0x67, 0x69, 0x6a, 0x69, 0x68, 0x69, 0x67, + 0x66, 0x64, 0x62, 0x5f, 0x5c, 0x58, 0x54, 0x51, + 0x4e, 0x4b, 0x49, 0x45, 0x43, 0x41, 0x40, 0x3e, + 0x3c, 0x3a, 0x3b, 0x3b, 
0x3c, 0x3d, 0x3e, 0x3f, + 0x41, 0x42, 0x44, 0x46, 0x46, 0x48, 0x49, 0x4b, + 0x4d, 0x50, 0x51, 0x53, 0x55, 0x57, 0x58, 0x5c, + 0x5f, 0x60, 0x63, 0x64, 0x64, 0x65, 0x66, 0x66, + 0x66, 0x65, 0x65, 0x63, 0x61, 0x5f, 0x5e, 0x5c, + 0x5a, 0x58, 0x56, 0x55, 0x54, 0x53, 0x52, 0x52, + 0x53, 0x52, 0x52, 0x52, 0x52, 0x53, 0x53, 0x53, + 0x54, 0x53, 0x53, 0x52, 0x53, 0x51, 0x53, 0x53, + 0x55, 0x57, 0x58, 0x59, 0x5b, 0x5d, 0x62, 0x64, + 0x68, 0x6a, 0x6c, 0x6b, 0x69, 0x68, 0x67, 0x64, + 0x61, 0x5d, 0x57, 0x54, 0x50, 0x4a, 0x48, 0x47, + 0x46, 0x45, 0x45, }, diff --git a/tests/tcg/hexagon/hvx_histogram_row.h b/tests/tcg/hexagon/hvx_histogram_row.h new file mode 100644 index 0000000..6a4531a --- /dev/null +++ b/tests/tcg/hexagon/hvx_histogram_row.h @@ -0,0 +1,24 @@ +/* + * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#ifndef HVX_HISTOGRAM_ROW_H +#define HVX_HISTOGRAM_ROW_H + +void hvx_histogram_row(uint8_t *src, int stride, int width, int height, + int *hist); + +#endif diff --git a/tests/tcg/hexagon/hvx_histogram.c b/tests/tcg/hexagon/hvx_histogram.c new file mode 100644 index 0000000..43377a9 --- /dev/null +++ b/tests/tcg/hexagon/hvx_histogram.c @@ -0,0 +1,88 @@ +/* + * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include <stdio.h> +#include <stdint.h> +#include <string.h> +#include "hvx_histogram_row.h" + +const int vector_len = 128; +const int width = 275; +const int height = 20; +const int stride = (width + vector_len - 1) & -vector_len; + +int err; + +static uint8_t input[height][stride] __attribute__((aligned(128))) = { +#include "hvx_histogram_input.h" +}; + +static int result[256] __attribute__((aligned(128))); +static int expect[256] __attribute__((aligned(128))); + +static void check(void) +{ + for (int i = 0; i < 256; i++) { + int res = result[i]; + int exp = expect[i]; + if (res != exp) { + printf("ERROR at %3d: 0x%04x != 0x%04x\n", + i, res, exp); + err++; + } + } +} + +static void ref_histogram(uint8_t *src, int stride, int width, int height, + int *hist) +{ + for (int i = 0; i < 256; i++) { + hist[i] = 0; + } + + for (int i = 0; i < height; i++) { + for (int j = 0; j < width; j++) { + hist[src[i * stride + j]]++; + } + } +} + +static void hvx_histogram(uint8_t *src, int stride, int width, int height, + int *hist) +{ + int n = 8192 / width; + + for (int i = 0; i < 256; i++) { + hist[i] = 0; + } + + for (int i = 0; i < height; i += n) { + int k = height - i > n ? 
n : height - i; + hvx_histogram_row(src, stride, width, k, hist); + src += n * stride; + } +} + +int main() +{ + ref_histogram(&input[0][0], stride, width, height, expect); + hvx_histogram(&input[0][0], stride, width, height, result); + check(); + + puts(err ? "FAIL" : "PASS"); + return err ? 1 : 0; +} diff --git a/tests/tcg/hexagon/Makefile.target b/tests/tcg/hexagon/Makefile.target index c4ccc99..00c9a78 100644 --- a/tests/tcg/hexagon/Makefile.target +++ b/tests/tcg/hexagon/Makefile.target @@ -43,9 +43,14 @@ HEX_TESTS += scatter_gather HEX_TESTS += atomics HEX_TESTS += fpstuff HEX_TESTS += hvx_misc +HEX_TESTS += hvx_histogram TESTS += $(HEX_TESTS) scatter_gather: CFLAGS += -mhvx vector_add_int: CFLAGS += -mhvx -fvectorize hvx_misc: CFLAGS += -mhvx +hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant + +hvx_histogram: hvx_histogram.c hvx_histogram_row.S + $(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@ diff --git a/tests/tcg/hexagon/hvx_histogram_row.S b/tests/tcg/hexagon/hvx_histogram_row.S new file mode 100644 index 0000000..5e42c33 --- /dev/null +++ b/tests/tcg/hexagon/hvx_histogram_row.S @@ -0,0 +1,294 @@ +/* + * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . 
+ */ + + +/* + * void hvx_histogram_row(uint8_t *src, => r0 + * int stride, => r1 + * int width, => r2 + * int height, => r3 + * int *hist => r4) + */ + .text + .p2align 2 + .global hvx_histogram_row + .type hvx_histogram_row, @function +hvx_histogram_row: + { r2 = lsr(r2, #7) /* size / VLEN */ + r5 = and(r2, #127) /* size % VLEN */ + v1 = #0 + v0 = #0 + } + /* + * Step 1: Clear the whole vector register file + */ + { v3:2 = v1:0 + v5:4 = v1:0 + p0 = cmp.gt(r2, #0) /* P0 = (width / VLEN > 0) */ + p1 = cmp.eq(r5, #0) /* P1 = (width % VLEN == 0) */ + } + { q0 = vsetq(r5) + v7:6 = v1:0 + } + { v9:8 = v1:0 + v11:10 = v1:0 + } + { v13:12 = v1:0 + v15:14 = v1:0 + } + { v17:16 = v1:0 + v19:18 = v1:0 + } + { v21:20 = v1:0 + v23:22 = v1:0 + } + { v25:24 = v1:0 + v27:26 = v1:0 + } + { v29:28 = v1:0 + v31:30 = v1:0 + r10 = add(r0, r1) /* R10 = &src[1 * stride] */ + loop1(.outerloop, r3) + } + + /* + * Step 2: vhist + */ + .falign +.outerloop: + { if (!p0) jump .loopend + loop0(.innerloop, r2) + } + + .falign +.innerloop: + { v12.tmp = vmem(R0++#1) + vhist + }:endloop0 + + .falign +.loopend: + if (p1) jump .skip /* if (width % VLEN == 0) done with current row */ + { v13.tmp = vmem(r0 + #0) + vhist(q0) + } + + .falign +.skip: + { r0 = r10 /* R0 = &src[(i + 1) * stride] */ + r10 = add(r10, r1) /* R10 = &src[(i + 2) * stride] */ + }:endloop1 + + + /* + * Step 3: Sum up the data + */ + { v0.h = vshuff(v0.h) + r10 = ##0x00010001 + } + v1.h = vshuff(v1.h) + { V2.h = vshuff(v2.h) + v0.w = vdmpy(v0.h, r10.h):sat + } + { v3.h = vshuff(v3.h) + v1.w = vdmpy(v1.h, r10.h):sat + } + { v4.h = vshuff(V4.h) + v2.w = vdmpy(v2.h, r10.h):sat + } + { v5.h = vshuff(v5.h) + v3.w = vdmpy(v3.h, r10.h):sat + } + { v6.h = vshuff(v6.h) + v4.w = vdmpy(v4.h, r10.h):sat + } + { v7.h = vshuff(v7.h) + v5.w = vdmpy(v5.h, r10.h):sat + } + { v8.h = vshuff(V8.h) + v6.w = vdmpy(v6.h, r10.h):sat + } 
+ { v9.h = vshuff(V9.h) + v7.w = vdmpy(v7.h, r10.h):sat + } + { v10.h = vshuff(v10.h) + v8.w = vdmpy(v8.h, r10.h):sat + } + { v11.h = vshuff(v11.h) + v9.w = vdmpy(v9.h, r10.h):sat + } + { v12.h = vshuff(v12.h) + v10.w = vdmpy(v10.h, r10.h):sat + } + { v13.h = vshuff(V13.h) + v11.w = vdmpy(v11.h, r10.h):sat + } + { v14.h = vshuff(v14.h) + v12.w = vdmpy(v12.h, r10.h):sat + } + { v15.h = vshuff(v15.h) + v13.w = vdmpy(v13.h, r10.h):sat + } + { v16.h = vshuff(v16.h) + v14.w = vdmpy(v14.h, r10.h):sat + } + { v17.h = vshuff(v17.h) + v15.w = vdmpy(v15.h, r10.h):sat + } + { v18.h = vshuff(v18.h) + v16.w = vdmpy(v16.h, r10.h):sat + } + { v19.h = vshuff(v19.h) + v17.w = vdmpy(v17.h, r10.h):sat + } + { v20.h = vshuff(v20.h) + v18.W = vdmpy(v18.h, r10.h):sat + } + { v21.h = vshuff(v21.h) + v19.w = vdmpy(v19.h, r10.h):sat + } + { v22.h = vshuff(v22.h) + v20.w = vdmpy(v20.h, r10.h):sat + } + { v23.h = vshuff(v23.h) + v21.w = vdmpy(v21.h, r10.h):sat + } + { v24.h = vshuff(v24.h) + v22.w = vdmpy(v22.h, r10.h):sat + } + { v25.h = vshuff(v25.h) + v23.w = vdmpy(v23.h, r10.h):sat + } + { v26.h = vshuff(v26.h) + v24.w = vdmpy(v24.h, r10.h):sat + } + { v27.h = vshuff(V27.h) + v25.w = vdmpy(v25.h, r10.h):sat + } + { v28.h = vshuff(v28.h) + v26.w = vdmpy(v26.h, r10.h):sat + } + { v29.h = vshuff(v29.h) + v27.w = vdmpy(v27.h, r10.h):sat + } + { v30.h = vshuff(v30.h) + v28.w = vdmpy(v28.h, r10.h):sat + } + { v31.h = vshuff(v31.h) + v29.w = vdmpy(v29.h, r10.h):sat + r28 = #32 + } + { vshuff(v1, v0, r28) + v30.w = vdmpy(v30.h, r10.h):sat + } + { vshuff(v3, v2, r28) + v31.w = vdmpy(v31.h, r10.h):sat + } + { vshuff(v5, v4, r28) + v0.w = vadd(v1.w, v0.w) + v2.w = vadd(v3.w, v2.w) + } + { vshuff(v7, v6, r28) + r7 = #64 + } + { vshuff(v9, v8, r28) + v4.w = vadd(v5.w, v4.w) + v6.w = vadd(v7.w, v6.w) + } + vshuff(v11, v10, r28) + { vshuff(v13, v12, r28) + v8.w = vadd(v9.w, v8.w) + v10.w = 
vadd(v11.w, v10.w) + } + vshuff(v15, v14, r28) + { vshuff(v17, v16, r28) + v12.w = vadd(v13.w, v12.w) + v14.w = vadd(v15.w, v14.w) + } + vshuff(v19, v18, r28) + { vshuff(v21, v20, r28) + v16.w = vadd(v17.w, v16.w) + v18.w = vadd(v19.w, v18.w) + } + vshuff(v23, v22, r28) + { vshuff(v25, v24, r28) + v20.w = vadd(v21.w, v20.w) + v22.w = vadd(v23.w, v22.w) + } + vshuff(v27, v26, r28) + { vshuff(v29, v28, r28) + v24.w = vadd(v25.w, v24.w) + v26.w = vadd(v27.w, v26.w) + } + vshuff(v31, v30, r28) + { v28.w = vadd(v29.w, v28.w) + vshuff(v2, v0, r7) + } + { v30.w = vadd(v31.w, v30.w) + vshuff(v6, v4, r7) + v0.w = vadd(v0.w, v2.w) + } + { vshuff(v10, v8, r7) + v1.tmp = vmem(r4 + #0) /* update hist[0-31] */ + v0.w = vadd(v0.w, v1.w) + vmem(r4++#1) = v0.new + } + { vshuff(v14, v12, r7) + v4.w = vadd(v4.w, v6.w) + v8.w = vadd(v8.w, v10.w) + } + { vshuff(v18, v16, r7) + v1.tmp = vmem(r4 + #0) /* update hist[32-63] */ + v4.w = vadd(v4.w, v1.w) + vmem(r4++#1) = v4.new + } + { vshuff(v22, v20, r7) + v12.w = vadd(v12.w, v14.w) + V16.w = vadd(v16.w, v18.w) + } + { vshuff(v26, v24, r7) + v1.tmp = vmem(r4 + #0) /* update hist[64-95] */ + v8.w = vadd(v8.w, v1.w) + vmem(r4++#1) = v8.new + } + { vshuff(v30, v28, r7) + v1.tmp = vmem(r4 + #0) /* update hist[96-127] */ + v12.w = vadd(v12.w, v1.w) + vmem(r4++#1) = v12.new + } + + { v20.w = vadd(v20.w, v22.w) + v1.tmp = vmem(r4 + #0) /* update hist[128-159] */ + v16.w = vadd(v16.w, v1.w) + vmem(r4++#1) = v16.new + } + { v24.w = vadd(v24.w, v26.w) + v1.tmp = vmem(r4 + #0) /* update hist[160-191] */ + v20.w = vadd(v20.w, v1.w) + vmem(r4++#1) = v20.new + } + { v28.w = vadd(v28.w, v30.w) + v1.tmp = vmem(r4 + #0) /* update hist[192-223] */ + v24.w = vadd(v24.w, v1.w) + vmem(r4++#1) = v24.new + } + { v1.tmp = vmem(r4 + #0) /* update hist[224-255] */ + v28.w = vadd(v28.w, v1.w) + vmem(r4++#1) = v28.new + } + jumpr r31 + .size hvx_histogram_row, .-hvx_histogram_row 
-- 2.7.4
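Note on the sizes used by hvx_histogram.c: the sketch below works through the stride rounding and the row-batching arithmetic from the test driver. It is only an illustration, not part of the patch. The helper names round_up_to_vector() and count_row_calls() are hypothetical; the constants (vector length 128, width 275, height 20, and the 8192 batching limit) are copied from the test, and the patch does not state why 8192 was chosen.

```c
#include <assert.h>

/*
 * Hypothetical helper: round width up to the next multiple of vector_len.
 * Mirrors "(width + vector_len - 1) & -vector_len" from hvx_histogram.c
 * and assumes vector_len is a power of two.
 */
static int round_up_to_vector(int width, int vector_len)
{
    return (width + vector_len - 1) & -vector_len;
}

/*
 * Hypothetical helper: count how many times the driver loop in
 * hvx_histogram() would call hvx_histogram_row() for a given image.
 */
static int count_row_calls(int width, int height)
{
    int n = 8192 / width;   /* rows handled per call, as in the test */
    int calls = 0;

    for (int i = 0; i < height; i += n) {
        calls++;
    }
    return calls;
}
```

With the values in the test, round_up_to_vector(275, 128) gives 384, 8192 / 275 gives 29 rows per call, and count_row_calls(275, 20) gives 1, so the whole 20-row input is handled by a single hvx_histogram_row() call.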