From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org,
 peter.maydell@linaro.org, bcain@quicinc.com, quic_mathbern@quicinc.com,
 stefanha@redhat.com, ale@rev.ng, anjo@rev.ng, quic_mliebel@quicinc.com
Subject: [PULL v2 20/44] Hexagon (target/hexagon) Short-circuit packet register writes
Date: Thu, 18 May 2023 13:03:47 -0700
Message-Id: <20230518200411.271148-21-tsimpson@quicinc.com>
In-Reply-To: <20230518200411.271148-1-tsimpson@quicinc.com>
References: <20230518200411.271148-1-tsimpson@quicinc.com>

In certain cases, we can avoid the overhead of writing to hex_new_value
and write directly to hex_gpr.  We add a need_commit field to DisasContext
indicating whether the end-of-packet commit is needed.  If it is not
needed, get_result_gpr() and get_result_gpr_pair() can return hex_gpr.
We pass ctx->need_commit to helpers when needed.  Finally, we can
early-exit from gen_reg_writes during packet commit.

There are a few instructions whose semantics write to the result before
reading all the inputs.  Therefore, the idef-parser generated code for
them is incompatible with short-circuit.  We tell idef-parser to skip
them.

For debugging purposes, we add a cpu property to turn off short-circuit.
When the short-circuit property is false, we skip the analysis and force
the end-of-packet commit.

Here's a simple example of the TCG generated for
0x004000b4:  0x7800c020 { R0 = #0x1 }

BEFORE:
 ---- 004000b4
 movi_i32 new_r0,$0x1
 mov_i32 r0,new_r0

AFTER:
 ---- 004000b4
 movi_i32 r0,$0x1

This patch reintroduces a use of check_for_attrib, so we remove the
G_GNUC_UNUSED added earlier in this series.

Signed-off-by: Taylor Simpson
Reviewed-by: Richard Henderson
Reviewed-by: Brian Cain
Message-Id: <20230427230012.3800327-12-tsimpson@quicinc.com>
---
 target/hexagon/cpu.h                    |  1 +
 target/hexagon/gen_tcg.h                |  3 +-
 target/hexagon/genptr.h                 |  2 +
 target/hexagon/helper.h                 |  2 +-
 target/hexagon/macros.h                 | 13 ++++-
 target/hexagon/translate.h              |  2 +
 target/hexagon/arch.c                   |  3 +-
 target/hexagon/cpu.c                    |  3 ++
 target/hexagon/genptr.c                 | 30 ++++-------
 target/hexagon/op_helper.c              |  5 +-
 target/hexagon/translate.c              | 67 ++++++++++++++++++++++++-
 target/hexagon/gen_helper_funcs.py      |  2 +
 target/hexagon/gen_helper_protos.py     | 10 +++-
 target/hexagon/gen_idef_parser_funcs.py |  7 +++
 target/hexagon/gen_tcg_funcs.py         |  5 ++
 target/hexagon/hex_common.py            |  3 ++
 16 files changed, 128 insertions(+), 30 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 4d8981d862..631bfdbe9c 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -150,6 +150,7 @@ struct ArchCPU {
 
     bool lldb_compat;
     target_ulong lldb_stack_adjust;
+    bool short_circuit;
 };
 
 #include "cpu_bits.h"
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 099a6cc47f..7e070c35bd 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -592,7 +592,8 @@
 #define fGEN_TCG_A5_ACS(SHORTCODE) \
     do { \
         gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
-        gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
+        gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV, \
+                             tcg_constant_tl(ctx->need_commit)); \
     } while (0)
 
 #define fGEN_TCG_S2_cabacdecbin(SHORTCODE) \
diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
index 75d0fc262d..420867f934 100644
--- a/target/hexagon/genptr.h
+++ b/target/hexagon/genptr.h
@@ -58,4 +58,6 @@ void gen_set_half(int N, TCGv result, TCGv src);
 void gen_set_half_i64(int N, TCGv_i64 result, TCGv src);
 void probe_noshuf_load(TCGv va, int s, int mi);
 
+extern const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS];
+
 #endif
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 73849e3d49..4b750d0351 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -29,7 +29,7 @@ DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
 DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
 DEF_HELPER_2(sfinvsqrta, i64, env, f32)
-DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
+DEF_HELPER_5(vacsh_val, s64, env, s64, s64, s64, i32)
 DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
 DEF_HELPER_FLAGS_2(cabacdecbin_val, TCG_CALL_NO_RWG_SE, s64, s64, s64)
 DEF_HELPER_FLAGS_2(cabacdecbin_pred, TCG_CALL_NO_RWG_SE, s32, s64, s64)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 24c78fe80a..54562cccb0 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -44,8 +44,17 @@
                  reg_field_info[FIELD].offset)
 
 #define SET_USR_FIELD(FIELD, VAL) \
-    fINSERT_BITS(env->new_value[HEX_REG_USR], reg_field_info[FIELD].width, \
-                 reg_field_info[FIELD].offset, (VAL))
+    do { \
+        if (pkt_need_commit) { \
+            fINSERT_BITS(env->new_value[HEX_REG_USR], \
+                         reg_field_info[FIELD].width, \
+                         reg_field_info[FIELD].offset, (VAL)); \
+        } else { \
+            fINSERT_BITS(env->gpr[HEX_REG_USR], \
+                         reg_field_info[FIELD].width, \
+                         reg_field_info[FIELD].offset, (VAL)); \
+        } \
+    } while (0)
 #endif
 
 #ifdef QEMU_GENERATE
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index f72228859f..3f6fd3452c 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -62,10 +62,12 @@ typedef struct DisasContext {
     int qreg_log_idx;
     DECLARE_BITMAP(qregs_read, NUM_QREGS);
     bool pre_commit;
+    bool need_commit;
     TCGCond branch_cond;
     target_ulong branch_dest;
     bool is_tight_loop;
     bool need_pkt_has_store_s1;
+    bool short_circuit;
 } DisasContext;
 
 static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index da79b41c4d..d053d68487 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -224,6 +224,7 @@ void arch_fpop_start(CPUHexagonState *env)
 
 void arch_fpop_end(CPUHexagonState *env)
 {
+    const bool pkt_need_commit = true;
     int flags = get_float_exception_flags(&env->fp_status);
     if (flags != 0) {
         SOFTFLOAT_TEST_FLAG(float_flag_inexact, FPINPF, FPINPE);
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index c78fe25c9f..d4dfc382ab 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -54,6 +54,8 @@ static Property hexagon_lldb_compat_property =
 static Property hexagon_lldb_stack_adjust_property =
     DEFINE_PROP_UNSIGNED("lldb-stack-adjust", HexagonCPU, lldb_stack_adjust,
                          0, qdev_prop_uint32, target_ulong);
+static Property hexagon_short_circuit_property =
+    DEFINE_PROP_BOOL("short-circuit", HexagonCPU, short_circuit, true);
 
 const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
     "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
@@ -330,6 +332,7 @@ static void hexagon_cpu_init(Object *obj)
     cpu_set_cpustate_pointers(cpu);
     qdev_property_add_static(DEVICE(obj), &hexagon_lldb_compat_property);
     qdev_property_add_static(DEVICE(obj), &hexagon_lldb_stack_adjust_property);
+    qdev_property_add_static(DEVICE(obj), &hexagon_short_circuit_property);
 }
 
 #include "hw/core/tcg-cpu-ops.h"
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 3c7e0dafaf..9858d7bc35 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -45,7 +45,7 @@ TCGv gen_read_preg(TCGv pred, uint8_t num)
 
 #define IMMUTABLE (~0)
 
-static const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
+const target_ulong reg_immut_masks[TOTAL_PER_THREAD_REGS] = {
     [HEX_REG_USR] = 0xc13000c0,
     [HEX_REG_PC] = IMMUTABLE,
     [HEX_REG_GP] = 0x3f,
@@ -70,14 +70,18 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
 
 static TCGv get_result_gpr(DisasContext *ctx, int rnum)
 {
-    return hex_new_value[rnum];
+    if (ctx->need_commit) {
+        return hex_new_value[rnum];
+    } else {
+        return hex_gpr[rnum];
+    }
 }
 
 static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
 {
     TCGv_i64 result = tcg_temp_new_i64();
-    tcg_gen_concat_i32_i64(result, hex_new_value[rnum],
-                                   hex_new_value[rnum + 1]);
+    tcg_gen_concat_i32_i64(result, get_result_gpr(ctx, rnum),
+                                   get_result_gpr(ctx, rnum + 1));
     return result;
 }
 
@@ -86,7 +90,7 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
     const target_ulong reg_mask = reg_immut_masks[rnum];
 
     gen_masked_reg_write(val, hex_gpr[rnum], reg_mask);
-    tcg_gen_mov_tl(hex_new_value[rnum], val);
+    tcg_gen_mov_tl(get_result_gpr(ctx, rnum), val);
     if (HEX_DEBUG) {
         /* Do this so HELPER(debug_commit_end) will know */
         tcg_gen_movi_tl(hex_reg_written[rnum], 1);
@@ -95,27 +99,15 @@ void gen_log_reg_write(DisasContext *ctx, int rnum, TCGv val)
 
 static void gen_log_reg_write_pair(DisasContext *ctx, int rnum, TCGv_i64 val)
 {
-    const target_ulong reg_mask_low = reg_immut_masks[rnum];
-    const target_ulong reg_mask_high = reg_immut_masks[rnum + 1];
     TCGv val32 = tcg_temp_new();
 
     /* Low word */
     tcg_gen_extrl_i64_i32(val32, val);
-    gen_masked_reg_write(val32, hex_gpr[rnum], reg_mask_low);
-    tcg_gen_mov_tl(hex_new_value[rnum], val32);
-    if (HEX_DEBUG) {
-        /* Do this so HELPER(debug_commit_end) will know */
-        tcg_gen_movi_tl(hex_reg_written[rnum], 1);
-    }
+    gen_log_reg_write(ctx, rnum, val32);
 
     /* High word */
     tcg_gen_extrh_i64_i32(val32, val);
-    gen_masked_reg_write(val32, hex_gpr[rnum + 1], reg_mask_high);
-    tcg_gen_mov_tl(hex_new_value[rnum + 1], val32);
-    if (HEX_DEBUG) {
-        /* Do this so HELPER(debug_commit_end) will know */
-        tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
-    }
+    gen_log_reg_write(ctx, rnum + 1, val32);
 }
 
 void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 46ccc59106..fc5c30a141 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -220,7 +220,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
                 reg_printed = true;
             }
             HEX_DEBUG_LOG("\tr%d = " TARGET_FMT_ld " (0x" TARGET_FMT_lx ")\n",
-                          i, env->new_value[i], env->new_value[i]);
+                          i, env->gpr[i], env->gpr[i]);
         }
     }
 
@@ -352,7 +352,8 @@ uint64_t HELPER(sfinvsqrta)(CPUHexagonState *env, float32 RsV)
 }
 
 int64_t HELPER(vacsh_val)(CPUHexagonState *env,
-                          int64_t RxxV, int64_t RssV, int64_t RttV)
+                          int64_t RxxV, int64_t RssV, int64_t RttV,
+                          uint32_t pkt_need_commit)
 {
     for (int i = 0; i < 4; i++) {
         int xv = sextract64(RxxV, i * 16, 16);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index e84bd34618..6fa885cf16 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -27,6 +27,7 @@
 #include "insn.h"
 #include "decode.h"
 #include "translate.h"
+#include "genptr.h"
 #include "printinsn.h"
 
 #include "analyze_funcs_generated.c.inc"
@@ -239,7 +240,7 @@ static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
     return nwords;
 }
 
-static G_GNUC_UNUSED bool check_for_attrib(Packet *pkt, int attrib)
+static bool check_for_attrib(Packet *pkt, int attrib)
 {
     for (int i = 0; i < pkt->num_insns; i++) {
         if (GET_ATTRIB(pkt->insn[i].opcode, attrib)) {
@@ -336,6 +337,58 @@ static void mark_implicit_pred_writes(DisasContext *ctx)
     mark_implicit_pred_write(ctx, A_IMPLICIT_WRITES_P3, 3);
 }
 
+static bool pkt_raises_exception(Packet *pkt)
+{
+    if (check_for_attrib(pkt, A_LOAD) ||
+        check_for_attrib(pkt, A_STORE)) {
+        return true;
+    }
+    return false;
+}
+
+static bool need_commit(DisasContext *ctx)
+{
+    Packet *pkt = ctx->pkt;
+
+    /*
+     * If the short-circuit property is set to false, we'll always do the commit
+     */
+    if (!ctx->short_circuit) {
+        return true;
+    }
+
+    if (pkt_raises_exception(pkt)) {
+        return true;
+    }
+
+    /* Registers with immutability flags require new_value */
+    for (int i = 0; i < ctx->reg_log_idx; i++) {
+        int rnum = ctx->reg_log[i];
+        if (reg_immut_masks[rnum]) {
+            return true;
+        }
+    }
+
+    /* Floating point instructions are hard-coded to use new_value */
+    if (check_for_attrib(pkt, A_FPOP)) {
+        return true;
+    }
+
+    if (pkt->num_insns == 1) {
+        return false;
+    }
+
+    /* Check for overlap between register reads and writes */
+    for (int i = 0; i < ctx->reg_log_idx; i++) {
+        int rnum = ctx->reg_log[i];
+        if (test_bit(rnum, ctx->regs_read)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 static void mark_implicit_pred_read(DisasContext *ctx, int attrib, int pnum)
 {
     if (GET_ATTRIB(ctx->insn->opcode, attrib)) {
@@ -365,6 +418,8 @@ static void analyze_packet(DisasContext *ctx)
         mark_implicit_pred_writes(ctx);
         mark_implicit_pred_reads(ctx);
     }
+
+    ctx->need_commit = need_commit(ctx);
 }
 
 static void gen_start_packet(DisasContext *ctx)
@@ -434,7 +489,8 @@ static void gen_start_packet(DisasContext *ctx)
     }
 
     /* Preload the predicated registers into hex_new_value[i] */
-    if (!bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
+    if (ctx->need_commit &&
+        !bitmap_empty(ctx->predicated_regs, TOTAL_PER_THREAD_REGS)) {
         int i = find_first_bit(ctx->predicated_regs, TOTAL_PER_THREAD_REGS);
         while (i < TOTAL_PER_THREAD_REGS) {
             tcg_gen_mov_tl(hex_new_value[i], hex_gpr[i]);
@@ -544,6 +600,11 @@ static void gen_reg_writes(DisasContext *ctx)
 {
     int i;
 
+    /* Early exit if not needed */
+    if (!ctx->need_commit) {
+        return;
+    }
+
     for (i = 0; i < ctx->reg_log_idx; i++) {
         int reg_num = ctx->reg_log[i];
 
@@ -922,6 +983,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
                                           CPUState *cs)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
+    HexagonCPU *hex_cpu = env_archcpu(cs->env_ptr);
     uint32_t hex_flags = dcbase->tb->flags;
 
     ctx->mem_idx = MMU_USER_IDX;
@@ -930,6 +992,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
     ctx->num_hvx_insns = 0;
     ctx->branch_cond = TCG_COND_NEVER;
     ctx->is_tight_loop = FIELD_EX32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP);
+    ctx->short_circuit = hex_cpu->short_circuit;
 }
 
 static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
diff --git a/target/hexagon/gen_helper_funcs.py b/target/hexagon/gen_helper_funcs.py
index c73d792580..e259ea3d03 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -287,6 +287,8 @@ def gen_helper_function(f, tag, tagregs, tagimms):
 
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", uint32_t pkt_has_multi_cof")
+        if (hex_common.need_pkt_need_commit(tag)):
+            f.write(", uint32_t pkt_need_commit")
 
         if hex_common.need_PC(tag):
             if i > 0:
diff --git a/target/hexagon/gen_helper_protos.py b/target/hexagon/gen_helper_protos.py
index 187cd6e04e..c5ecb85294 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -86,6 +86,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
             def_helper_size = len(regs) + len(imms) + numscalarreadwrite + 1
             if hex_common.need_pkt_has_multi_cof(tag):
                 def_helper_size += 1
+            if hex_common.need_pkt_need_commit(tag):
+                def_helper_size += 1
             if hex_common.need_part1(tag):
                 def_helper_size += 1
             if hex_common.need_slot(tag):
@@ -103,6 +105,8 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
             def_helper_size = len(regs) + len(imms) + numscalarreadwrite
             if hex_common.need_pkt_has_multi_cof(tag):
                 def_helper_size += 1
+            if hex_common.need_pkt_need_commit(tag):
+                def_helper_size += 1
             if hex_common.need_part1(tag):
                 def_helper_size += 1
             if hex_common.need_slot(tag):
@@ -156,10 +160,12 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
         for immlett, bits, immshift in imms:
             f.write(", s32")
 
-        ## Add the arguments for the instruction pkt_has_multi_cof, slot and
-        ## part1 (if needed)
+        ## Add the arguments for the instruction pkt_has_multi_cof,
+        ## pkt_needs_commit, PC, next_PC, slot, and part1 (if needed)
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", i32")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write(', i32')
         if hex_common.need_PC(tag):
             f.write(", i32")
         if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/gen_idef_parser_funcs.py b/target/hexagon/gen_idef_parser_funcs.py
index dc9e396b52..ad2e5c04d3 100644
--- a/target/hexagon/gen_idef_parser_funcs.py
+++ b/target/hexagon/gen_idef_parser_funcs.py
@@ -111,6 +111,13 @@ def main():
                 continue
             if ( tag.startswith('R6_release_') ):
                 continue
+            ## Skip instructions that are incompatible with short-circuit
+            ## packet register writes
+            if ( tag == 'S2_insert' or
+                 tag == 'S2_insert_rp' or
+                 tag == 'S2_asr_r_svw_trun' or
+                 tag == 'A2_swiz' ):
+                continue
 
             regs = tagregs[tag]
             imms = tagimms[tag]
diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index d9ccbe63f6..0e45d43685 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -550,6 +550,9 @@ def gen_tcg_func(f, tag, regs, imms):
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write("    TCGv pkt_has_multi_cof = ")
             f.write("tcg_constant_tl(ctx->pkt->pkt_has_multi_cof);\n")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write("    TCGv pkt_need_commit = ")
+            f.write("tcg_constant_tl(ctx->need_commit);\n")
         if hex_common.need_part1(tag):
             f.write("    TCGv part1 = tcg_constant_tl(insn->part1);\n")
         if hex_common.need_slot(tag):
@@ -596,6 +599,8 @@ def gen_tcg_func(f, tag, regs, imms):
 
         if hex_common.need_pkt_has_multi_cof(tag):
             f.write(", pkt_has_multi_cof")
+        if hex_common.need_pkt_need_commit(tag):
+            f.write(", pkt_need_commit")
         if hex_common.need_PC(tag):
             f.write(", PC")
         if hex_common.helper_needs_next_PC(tag):
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index 232c6e2c20..29c0508f66 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -276,6 +276,9 @@ def need_pkt_has_multi_cof(tag):
     return "A_COF" in attribdict[tag]
 
 
+def need_pkt_need_commit(tag):
+    return 'A_IMPLICIT_WRITES_USR' in attribdict[tag]
+
 def need_condexec_reg(tag, regs):
     if "A_CONDEXEC" in attribdict[tag]:
         for regtype, regid, toss, numregs in regs:
-- 
2.25.1
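
The commit message and need_commit() rely on one idea: a packet may write
straight to the register file only when none of its writes can be observed
by another instruction in the same packet. The standalone C sketch below
illustrates that idea and nothing more. It is not QEMU code: ToyState,
ToyMove, toy_need_commit() and toy_exec_packet() are hypothetical names
invented for this illustration, and the model covers only the
single-instruction and read/write-overlap checks, ignoring loads, stores,
immutable-register masks and floating point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 4              /* hypothetical tiny register file */

typedef struct {
    uint32_t gpr[TOY_NUM_REGS];        /* architectural registers */
    uint32_t new_value[TOY_NUM_REGS];  /* staging area used by the commit path */
} ToyState;

typedef struct {
    int dst;                        /* register written */
    int src;                        /* register read */
} ToyMove;                          /* models "r<dst> = r<src>" */

/* Commit is needed when a multi-instruction packet reads a register it writes. */
static bool toy_need_commit(const ToyMove *pkt, int n)
{
    uint32_t read = 0, written = 0;

    if (n <= 1) {
        return false;               /* a lone move reads before it writes */
    }
    for (int i = 0; i < n; i++) {
        read |= 1u << pkt[i].src;
        written |= 1u << pkt[i].dst;
    }
    return (read & written) != 0;
}

/* Execute a packet with parallel semantics: all reads see pre-packet values. */
static void toy_exec_packet(ToyState *s, const ToyMove *pkt, int n)
{
    bool commit = toy_need_commit(pkt, n);
    /* Short-circuit: write gpr directly when no commit is needed. */
    uint32_t *result = commit ? s->new_value : s->gpr;
    uint32_t written = 0;

    for (int i = 0; i < n; i++) {
        assert(pkt[i].dst >= 0 && pkt[i].dst < TOY_NUM_REGS);
        assert(pkt[i].src >= 0 && pkt[i].src < TOY_NUM_REGS);
        result[pkt[i].dst] = s->gpr[pkt[i].src];
        written |= 1u << pkt[i].dst;
    }

    if (!commit) {
        return;                     /* early exit, nothing staged */
    }
    for (int r = 0; r < TOY_NUM_REGS; r++) {
        if (written & (1u << r)) {
            s->gpr[r] = s->new_value[r];
        }
    }
}
```

With gpr = {1, 2, 3, 4}, the packet { r0 = r1; r1 = r0 } takes the commit
path and swaps the two registers, whereas a lone { r2 = r0 } writes gpr
directly and skips the staging copy.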