From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: ale@rev.ng, bcain@quicinc.com, tsimpson@quicinc.com, richard.henderson@linaro.org, f4bug@amsat.org
Subject: [PATCH v4 11/30] Hexagon HVX (target/hexagon) helper functions
Date: Tue, 12 Oct 2021 05:10:49 -0500
Message-Id: <1634033468-23566-12-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>
References: <1634033468-23566-1-git-send-email-tsimpson@quicinc.com>

Probe and commit vector stores (masked and scatter/gather)
Log vector register writes
Add the execution counters to the debug log
Histogram instructions

Signed-off-by: Taylor Simpson
Reviewed-by: Richard Henderson
---
 target/hexagon/helper.h    |  16 +++
 target/hexagon/op_helper.c | 282 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 296 insertions(+), 2 deletions(-)

diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 89de2a3..c89aa4e 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -23,6 +23,8 @@ DEF_HELPER_1(debug_start_packet, void, env)
 DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, void, env, int, int)
 DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int)
 DEF_HELPER_2(commit_store, void, env, int)
+DEF_HELPER_3(gather_store, void, env, i32, int)
+DEF_HELPER_1(commit_hvx_stores, void, env)
 DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
 DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
@@ -90,4 +92,18 @@ DEF_HELPER_4(sffms_lib, f32, env, f32, f32, f32)
 DEF_HELPER_3(dfmpyfix, f64, env, f64, f64)
 DEF_HELPER_4(dfmpyhh, f64, env, f64, f64, f64)
 
+/* Histogram instructions */
+DEF_HELPER_1(vhist, void, env)
+DEF_HELPER_1(vhistq, void, env)
+DEF_HELPER_1(vwhist256, void, env)
+DEF_HELPER_1(vwhist256q, void, env)
+DEF_HELPER_1(vwhist256_sat, void, env)
+DEF_HELPER_1(vwhist256q_sat, void, env)
+DEF_HELPER_1(vwhist128, void, env)
+DEF_HELPER_1(vwhist128q, void, env)
+DEF_HELPER_2(vwhist128m, void, env, s32)
+DEF_HELPER_2(vwhist128qm, void, env, s32)
+
 DEF_HELPER_2(probe_pkt_scalar_store_s0, void, env, int)
+DEF_HELPER_2(probe_hvx_stores, void, env, int)
+DEF_HELPER_3(probe_pkt_scalar_hvx_stores, void, env, int, int)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index af32de4..a67a148 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -27,6 +27,8 @@
 #include "arch.h"
 #include "hex_arch_types.h"
 #include "fma_emu.h"
+#include "mmvec/mmvec.h"
+#include "mmvec/macros.h"
 
 #define SF_BIAS        127
 #define SF_MANTBITS    23
@@ -164,6 +166,57 @@ void HELPER(commit_store)(CPUHexagonState *env, int slot_num)
     }
 }
 
+void HELPER(gather_store)(CPUHexagonState *env, uint32_t addr, int slot)
+{
+    mem_gather_store(env, addr, slot);
+}
+
+void HELPER(commit_hvx_stores)(CPUHexagonState *env)
+{
+    uintptr_t ra = GETPC();
+    int i;
+
+    /* Normal (possibly masked) vector store */
+    for (i = 0; i < VSTORES_MAX; i++) {
+        if (env->vstore_pending[i]) {
+            env->vstore_pending[i] = 0;
+            target_ulong va = env->vstore[i].va;
+            int size = env->vstore[i].size;
+            for (int j = 0; j < size; j++) {
+                if (test_bit(j, env->vstore[i].mask)) {
+                    cpu_stb_data_ra(env, va + j, env->vstore[i].data.ub[j], ra);
+                }
+            }
+        }
+    }
+
+    /* Scatter store */
+    if (env->vtcm_pending) {
+        env->vtcm_pending = false;
+        if (env->vtcm_log.op) {
+            /* Need to perform the scatter read/modify/write at commit time */
+            if (env->vtcm_log.op_size == 2) {
+                SCATTER_OP_WRITE_TO_MEM(uint16_t);
+            } else if (env->vtcm_log.op_size == 4) {
+                /* Word Scatter += */
+                SCATTER_OP_WRITE_TO_MEM(uint32_t);
+            } else {
+                g_assert_not_reached();
+            }
+        } else {
+            for (i = 0; i < env->vtcm_log.size; i++) {
+                if (test_bit(i, env->vtcm_log.mask)) {
+                    cpu_stb_data_ra(env, env->vtcm_log.va[i],
+                                    env->vtcm_log.data.ub[i], ra);
+                    clear_bit(i, env->vtcm_log.mask);
+                    env->vtcm_log.data.ub[i] = 0;
+                }
+
+            }
+        }
+    }
+}
+
 static void print_store(CPUHexagonState *env, int slot)
 {
     if (!(env->slot_cancelled & (1 << slot))) {
@@ -242,9 +295,10 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
     HEX_DEBUG_LOG("Next PC = " TARGET_FMT_lx "\n", env->next_PC);
     HEX_DEBUG_LOG("Exec counters: pkt = " TARGET_FMT_lx
                   ", insn = " TARGET_FMT_lx
-                  "\n",
+                  ", hvx = " TARGET_FMT_lx "\n",
                   env->gpr[HEX_REG_QEMU_PKT_CNT],
-                  env->gpr[HEX_REG_QEMU_INSN_CNT]);
+                  env->gpr[HEX_REG_QEMU_INSN_CNT],
+                  env->gpr[HEX_REG_QEMU_HVX_CNT]);
 
 }
 
@@ -393,6 +447,65 @@ void HELPER(probe_pkt_scalar_store_s0)(CPUHexagonState *env, int mmu_idx)
     probe_store(env, 0, mmu_idx);
 }
 
+void HELPER(probe_hvx_stores)(CPUHexagonState *env, int mmu_idx)
+{
+    uintptr_t retaddr = GETPC();
+    int i;
+
+    /* Normal (possibly masked) vector store */
+    for (i = 0; i < VSTORES_MAX; i++) {
+        if (env->vstore_pending[i]) {
+            target_ulong va = env->vstore[i].va;
+            int size = env->vstore[i].size;
+            for (int j = 0; j < size; j++) {
+                if (test_bit(j, env->vstore[i].mask)) {
+                    probe_write(env, va + j, 1, mmu_idx, retaddr);
+                }
+            }
+        }
+    }
+
+    /* Scatter store */
+    if (env->vtcm_pending) {
+        if (env->vtcm_log.op) {
+            /* Need to perform the scatter read/modify/write at commit time */
+            if (env->vtcm_log.op_size == 2) {
+                SCATTER_OP_PROBE_MEM(size2u_t, mmu_idx, retaddr);
+            } else if (env->vtcm_log.op_size == 4) {
+                /* Word Scatter += */
+                SCATTER_OP_PROBE_MEM(size4u_t, mmu_idx, retaddr);
+            } else {
+                g_assert_not_reached();
+            }
+        } else {
+            for (int i = 0; i < env->vtcm_log.size; i++) {
+                if (test_bit(i, env->vtcm_log.mask)) {
+                    probe_write(env, env->vtcm_log.va[i], 1, mmu_idx, retaddr);
+                }
+
+            }
+        }
+    }
+}
+
+void HELPER(probe_pkt_scalar_hvx_stores)(CPUHexagonState *env, int mask,
+                                         int mmu_idx)
+{
+    bool has_st0        = (mask >> 0) & 1;
+    bool has_st1        = (mask >> 1) & 1;
+    bool has_hvx_stores = (mask >> 2) & 1;
+
+    if (has_st0) {
+        probe_store(env, 0, mmu_idx);
+    }
+    if (has_st1) {
+        probe_store(env, 1, mmu_idx);
+    }
+    if (has_hvx_stores) {
+        HELPER(probe_hvx_stores)(env, mmu_idx);
+    }
+}
+
 /*
  * mem_noshuf
  * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
@@ -1181,6 +1294,171 @@ float64 HELPER(dfmpyhh)(CPUHexagonState *env, float64 RxxV,
     return RxxV;
 }
 
+/* Histogram instructions */
+
+void HELPER(vhist)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int lane = 0; lane < 8; lane++) {
+        for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
+            unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
+            unsigned char regno = value >> 3;
+            unsigned char element = value & 7;
+
+            env->VRegs[regno].uh[(sizeof(MMVector) / 16) * lane + element]++;
+        }
+    }
+}
+
+void HELPER(vhistq)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int lane = 0; lane < 8; lane++) {
+        for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
+            unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
+            unsigned char regno = value >> 3;
+            unsigned char element = value & 7;
+
+            if (fGETQBIT(env->qtmp, sizeof(MMVector) / 8 * lane + i)) {
+                env->VRegs[regno].uh[
+                    (sizeof(MMVector) / 16) * lane + element]++;
+            }
+        }
+    }
+}
+
+void HELPER(vwhist256)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        env->VRegs[vindex].uh[elindex] =
+            env->VRegs[vindex].uh[elindex] + weight;
+    }
+}
+
+void HELPER(vwhist256q)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uh[elindex] =
+                env->VRegs[vindex].uh[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist256_sat)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        env->VRegs[vindex].uh[elindex] =
+            fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
+    }
+}
+
+void HELPER(vwhist256q_sat)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uh[elindex] =
+                fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
+        }
+    }
+}
+
+void HELPER(vwhist128)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        env->VRegs[vindex].uw[elindex] =
+            env->VRegs[vindex].uw[elindex] + weight;
+    }
+}
+
+void HELPER(vwhist128q)(CPUHexagonState *env)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if (fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist128m)(CPUHexagonState *env, int32_t uiV)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if ((bucket & 1) == uiV) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
+void HELPER(vwhist128qm)(CPUHexagonState *env, int32_t uiV)
+{
+    MMVector *input = &env->tmp_VRegs[0];
+
+    for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
+        unsigned int bucket = fGETUBYTE(0, input->h[i]);
+        unsigned int weight = fGETUBYTE(1, input->h[i]);
+        unsigned int vindex = (bucket >> 3) & 0x1F;
+        unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
+
+        if (((bucket & 1) == uiV) && fGETQBIT(env->qtmp, 2 * i)) {
+            env->VRegs[vindex].uw[elindex] =
+                env->VRegs[vindex].uw[elindex] + weight;
+        }
+    }
+}
+
 static void cancel_slot(CPUHexagonState *env, uint32_t slot)
 {
     HEX_DEBUG_LOG("Slot %d cancelled\n", slot);
-- 
2.7.4
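
Note on the vwhist256 indexing: the bucket-to-register mapping used by the vwhist256 helpers can be exercised outside QEMU. The following is a minimal standalone sketch, not the code in this patch. `Vec`, `model_vwhist256`, `VEC_BYTES` and `NUM_VREGS` are hypothetical names introduced for illustration, the 128-byte vector length is an assumption, and the `saturate` flag stands in for the separate `_sat` helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 128                 /* assumption: 128-byte vectors */
#define NUM_VREGS 32
#define HALFWORDS (VEC_BYTES / 2)

/* Hypothetical stand-in for one vector register viewed as halfwords. */
typedef struct {
    uint16_t uh[HALFWORDS];
} Vec;

/*
 * Model of the weighted 256-bucket histogram update.
 * Each input halfword carries a bucket number in its low byte and a
 * weight in its high byte.  The top five bits of the bucket select the
 * vector register; the low three bits select one of eight counters
 * inside the 8-halfword group that the input element belongs to.
 */
static void model_vwhist256(Vec regs[NUM_VREGS],
                            const uint16_t input[HALFWORDS],
                            int saturate)
{
    for (int i = 0; i < HALFWORDS; i++) {
        unsigned bucket = input[i] & 0xff;
        unsigned weight = (input[i] >> 8) & 0xff;
        unsigned vindex = (bucket >> 3) & 0x1f;
        unsigned elindex = ((unsigned)i & ~7u) | (bucket & 7);
        uint32_t sum = (uint32_t)regs[vindex].uh[elindex] + weight;

        if (saturate && sum > 0xffff) {
            sum = 0xffff;             /* clamp instead of wrapping */
        }
        regs[vindex].uh[elindex] = (uint16_t)sum;
    }
}
```

In this model an input element at index 9 holding bucket 0x4a and weight 5 updates counter 10 of register 9, since 0x4a >> 3 is 9 and (9 & ~7) | (0x4a & 7) is 10.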