From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 20/36] tcg: Reorg function calls
Date: Fri, 21 Oct 2022 17:15:33 +1000
Message-Id: <20221021071549.2398137-21-richard.henderson@linaro.org>
In-Reply-To: <20221021071549.2398137-1-richard.henderson@linaro.org>
References: <20221021071549.2398137-1-richard.henderson@linaro.org>

Pre-compute the function call layout for each helper at startup.
Drop TCG_CALL_DUMMY_ARG, as we no longer need to leave gaps
in the op->args[] array.

For tcg_gen_callN, loop over the arguments once. Allocate the TCGOp
for the call early but delay emitting it, collecting arguments first.
This allows the argument processing loop to emit code for extensions
and have them sequenced before the call. Free the temporaries for
extensions immediately.

For tcg_reg_alloc_call, loop over the arguments in reverse order,
which allows stack slots to be filled first naturally.

Signed-off-by: Richard Henderson
---
 include/tcg/tcg.h  |   3 -
 tcg/tcg-internal.h |  17 +-
 tcg/tcg.c          | 591 +++++++++++++++++++++++++++------------------
 3 files changed, 377 insertions(+), 234 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 8bcd60d0ed..8d0626c797 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -425,9 +425,6 @@ typedef TCGv_ptr TCGv_env;
 #define TCG_CALL_NO_RWG_SE (TCG_CALL_NO_RWG | TCG_CALL_NO_SE)
 #define TCG_CALL_NO_WG_SE (TCG_CALL_NO_WG | TCG_CALL_NO_SE)
 
-/* Used to align parameters. See the comment before tcgv_i32_temp. */
-#define TCG_CALL_DUMMY_ARG ((TCGArg)0)
-
 /*
  * Flags for the bswap opcodes.
  * If IZ, the input is zero-extended, otherwise unknown.
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index f574743ff8..097fef2325 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -42,11 +42,24 @@ typedef enum {
     TCG_CALL_ARG_EXTEND_S, /* ... as a sign-extended i64 */
 } TCGCallArgumentKind;
 
+typedef struct TCGCallArgumentLoc {
+    TCGCallArgumentKind kind : 8;
+    unsigned reg_slot : 8;
+    unsigned stk_slot : 8;
+    unsigned reg_n : 2;
+    unsigned arg_idx : 4;
+    unsigned tmp_subindex : 1;
+} TCGCallArgumentLoc;
+
 typedef struct TCGHelperInfo {
     void *func;
     const char *name;
-    unsigned flags;
-    unsigned typemask;
+    unsigned typemask : 32;
+    unsigned flags : 8;
+    unsigned nr_in : 8;
+    unsigned nr_out : 8;
+    TCGCallReturnKind out_kind : 8;
+    TCGCallArgumentLoc in[MAX_OPC_PARAM_IARGS * MAX_OPC_PARAM_PER_ARG];
 } TCGHelperInfo;
 
 extern TCGContext tcg_init_ctx;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index e0f5c6ea7b..713e692621 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -546,7 +546,7 @@ void tcg_pool_reset(TCGContext *s)
 
 #include "exec/helper-proto.h"
 
-static const TCGHelperInfo all_helpers[] = {
+static TCGHelperInfo all_helpers[] = {
 #include "exec/helper-tcg.h"
 };
 static GHashTable *helper_table;
@@ -564,6 +564,162 @@ static ffi_type * const typecode_to_ffi[8] = {
 };
 #endif
 
+typedef struct TCGCumulativeArgs {
+    int arg_idx; /* tcg_gen_callN args[] */
+    int op_arg_idx; /* TCGOp op->args[] */
+    int info_in_idx; /* TCGHelperInfo in[] */
+    int reg_slot; /* tcg_target_call_iarg_regs[] */
+    int stk_slot; /* TCG_TARGET_CALL_STACK_OFFSET + i*word */
+    int max_reg_slot;
+    int max_stk_slot;
+} TCGCumulativeArgs;
+
+static void layout_arg_1(TCGCumulativeArgs *cum, TCGHelperInfo *info,
+                         TCGCallArgumentKind kind)
+{
+    TCGCallArgumentLoc *loc = &info->in[cum->info_in_idx];
+
+    *loc = (TCGCallArgumentLoc){
+        .kind = kind,
+        .arg_idx = cum->arg_idx,
+    };
+
+    if (cum->reg_slot < cum->max_reg_slot) {
+        loc->reg_slot = cum->reg_slot++;
+        loc->reg_n = 1;
+    } else {
+        loc->stk_slot = cum->stk_slot++;
+    }
+
+    cum->info_in_idx++;
+    cum->op_arg_idx++;
+}
+
+static void layout_arg_normal_2(TCGCumulativeArgs *cum, TCGHelperInfo *info)
+{
+    TCGCallArgumentLoc *loc = &info->in[cum->info_in_idx];
+
+    /* Layout the pair using the same arg_idx... */
+    layout_arg_1(cum, info, TCG_CALL_ARG_NORMAL);
+    layout_arg_1(cum, info, TCG_CALL_ARG_NORMAL);
+
+    /* ... then adjust the second loc to the second subindex. */
+    loc[1].tmp_subindex = 1;
+}
+
+static void init_call_layout(TCGHelperInfo *info)
+{
+    unsigned typemask = info->typemask;
+    unsigned typecode;
+    TCGCumulativeArgs cum = {
+        .max_reg_slot = ARRAY_SIZE(tcg_target_call_iarg_regs),
+        .max_stk_slot = TCG_STATIC_CALL_ARGS_SIZE / sizeof(tcg_target_long),
+    };
+
+    /*
+     * Parse and place any function return value.
+     */
+    typecode = typemask & 7;
+    switch (typecode) {
+    case dh_typecode_void:
+        info->nr_out = 0;
+        break;
+    case dh_typecode_i32:
+    case dh_typecode_s32:
+    case dh_typecode_ptr:
+        info->nr_out = 1;
+        info->out_kind = TCG_CALL_RET_NORMAL;
+        break;
+    case dh_typecode_i64:
+    case dh_typecode_s64:
+        info->nr_out = 64 / TCG_TARGET_REG_BITS;
+        info->out_kind = TCG_CALL_RET_NORMAL;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    assert(info->nr_out <= ARRAY_SIZE(tcg_target_call_oarg_regs));
+
+    /*
+     * The final two op->arg[] indexes are used for func & info.
+     * Account for that now to simplify the test below.
+     */
+    cum.op_arg_idx = info->nr_out + 2;
+
+    /*
+     * Parse and place function arguments.
+     */
+    for (typemask >>= 3; typemask; typemask >>= 3, cum.arg_idx++) {
+        TCGCallArgumentKind kind;
+        TCGType type;
+
+        typecode = typemask & 7;
+        switch (typecode) {
+        case dh_typecode_i32:
+        case dh_typecode_s32:
+            type = TCG_TYPE_I32;
+            kind = TCG_TARGET_CALL_ARG_I32;
+            break;
+        case dh_typecode_i64:
+        case dh_typecode_s64:
+            type = TCG_TYPE_I64;
+            kind = TCG_TARGET_CALL_ARG_I64;
+            break;
+        case dh_typecode_ptr:
+            type = TCG_TYPE_PTR;
+            kind = TCG_CALL_ARG_NORMAL;
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
+        switch (kind) {
+        case TCG_CALL_ARG_EVEN:
+            tcg_debug_assert(cum.max_reg_slot % 2 == 0);
+            if (cum.reg_slot < cum.max_reg_slot) {
+                cum.reg_slot += cum.reg_slot & 1;
+            } else {
+                cum.stk_slot += cum.stk_slot & 1;
+            }
+            /* fall through */
+        case TCG_CALL_ARG_NORMAL:
+            switch (type) {
+            case TCG_TYPE_I32:
+                layout_arg_1(&cum, info, TCG_CALL_ARG_NORMAL);
+                break;
+            case TCG_TYPE_I64:
+                if (TCG_TARGET_REG_BITS == 32) {
+                    layout_arg_normal_2(&cum, info);
+                } else {
+                    layout_arg_1(&cum, info, TCG_CALL_ARG_NORMAL);
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            break;
+        case TCG_CALL_ARG_EXTEND:
+            kind = TCG_CALL_ARG_EXTEND_U + (typecode & 1);
+            /* fall through */
+        case TCG_CALL_ARG_EXTEND_U:
+        case TCG_CALL_ARG_EXTEND_S:
+            assert(type == TCG_TYPE_I32);
+            layout_arg_1(&cum, info, kind);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+    info->nr_in = cum.info_in_idx;
+
+    /* Validate that we didn't overrun the input array. */
+    assert(cum.info_in_idx <= ARRAY_SIZE(info->in));
+    /* Validate that we didn't overrun the output array. */
+    assert(cum.op_arg_idx <= MAX_OPC_PARAM);
+    /* Validate the backend has preallocated enough space on stack. */
+    assert(cum.stk_slot <= cum.max_stk_slot);
+}
+
 static int indirect_reg_alloc_order[ARRAY_SIZE(tcg_target_reg_alloc_order)];
 static void process_op_defs(TCGContext *s);
 static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
@@ -603,6 +759,7 @@ static void tcg_context_init(unsigned max_cpus)
     helper_table = g_hash_table_new(NULL, NULL);
 
     for (i = 0; i < ARRAY_SIZE(all_helpers); ++i) {
+        init_call_layout(&all_helpers[i]);
         g_hash_table_insert(helper_table, (gpointer)all_helpers[i].func,
                             (gpointer)&all_helpers[i]);
     }
@@ -1473,18 +1630,15 @@ bool tcg_op_supported(TCGOpcode op)
     }
 }
 
-/* Note: we convert the 64 bit args to 32 bit and do some alignment
-   and endian swap. Maybe it would be better to do the alignment
-   and endian swap in tcg_reg_alloc_call(). */
+static TCGOp *tcg_op_alloc(TCGOpcode opc);
+
 void tcg_gen_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args)
 {
-    int i, real_args, nb_rets, pi;
-    unsigned typemask;
+    TCGOp *op = tcg_op_alloc(INDEX_op_call);
     const TCGHelperInfo *info;
-    TCGOp *op;
+    int i, n, pi = 0;
 
     info = g_hash_table_lookup(helper_table, (gpointer)func);
-    typemask = info->typemask;
 
 #ifdef CONFIG_PLUGIN
     /* detect non-plugin helpers */
@@ -1493,106 +1647,59 @@ void tcg_gen_callN(void *func, TCGTemp *ret, int nargs, TCGTemp **args)
     }
 #endif
 
-    if (TCG_TARGET_CALL_ARG_I32 == TCG_CALL_ARG_EXTEND) {
-        for (i = 0; i < nargs; ++i) {
-            int argtype = extract32(typemask, (i + 1) * 3, 3);
-            bool is_32bit = (argtype & ~1) == dh_typecode_i32;
-            bool is_signed = argtype & 1;
+    TCGOP_CALLO(op) = n = info->nr_out;
+    switch (n) {
+    case 0:
+        tcg_debug_assert(ret == NULL);
+        break;
+    case 1:
+        tcg_debug_assert(ret != NULL);
+        op->args[pi++] = temp_arg(ret);
+        break;
+    case 2:
+        tcg_debug_assert(ret != NULL);
+        tcg_debug_assert(ret->temp_subindex == 0);
+        op->args[pi++] = temp_arg(ret);
+        op->args[pi++] = temp_arg(ret + 1);
+        break;
+    default:
+        g_assert_not_reached();
+    }
 
-            if (is_32bit) {
+    TCGOP_CALLI(op) = n = info->nr_in;
+    for (i = 0; i < n; i++) {
+        const TCGCallArgumentLoc *loc = &info->in[i];
+        TCGTemp *ts = args[loc->arg_idx] + loc->tmp_subindex;
+
+        switch (loc->kind) {
+        case TCG_CALL_ARG_NORMAL:
+            op->args[pi++] = temp_arg(ts);
+            break;
+
+        case TCG_CALL_ARG_EXTEND_U:
+        case TCG_CALL_ARG_EXTEND_S:
+            {
                 TCGv_i64 temp = tcg_temp_new_i64();
-                TCGv_i32 orig = temp_tcgv_i32(args[i]);
-                if (is_signed) {
+                TCGv_i32 orig = temp_tcgv_i32(ts);
+
+                if (loc->kind == TCG_CALL_ARG_EXTEND_S) {
                     tcg_gen_ext_i32_i64(temp, orig);
                 } else {
                     tcg_gen_extu_i32_i64(temp, orig);
                 }
-                args[i] = tcgv_i64_temp(temp);
+                op->args[pi++] = tcgv_i64_arg(temp);
+                tcg_temp_free_i64(temp);
             }
-        }
-    }
-
-    op = tcg_emit_op(INDEX_op_call);
-
-    pi = 0;
-    if (ret != NULL) {
-        if (TCG_TARGET_REG_BITS < 64 && (typemask & 6) == dh_typecode_i64) {
-            op->args[pi++] = temp_arg(ret);
-            op->args[pi++] = temp_arg(ret + 1);
-            nb_rets = 2;
-        } else {
-            op->args[pi++] = temp_arg(ret);
-            nb_rets = 1;
-        }
-    } else {
-        nb_rets = 0;
-    }
-    TCGOP_CALLO(op) = nb_rets;
-
-    real_args = 0;
-    for (i = 0; i < nargs; i++) {
-        int argtype = extract32(typemask, (i + 1) * 3, 3);
-        TCGCallArgumentKind kind;
-        TCGType type;
-
-        switch (argtype) {
-        case dh_typecode_i32:
-        case dh_typecode_s32:
-            type = TCG_TYPE_I32;
-            kind = TCG_TARGET_CALL_ARG_I32;
             break;
-        case dh_typecode_i64:
-        case dh_typecode_s64:
-            type = TCG_TYPE_I64;
-            kind = TCG_TARGET_CALL_ARG_I64;
-            break;
-        case dh_typecode_ptr:
-            type = TCG_TYPE_PTR;
-            kind = TCG_CALL_ARG_NORMAL;
-            break;
-        default:
-            g_assert_not_reached();
-        }
 
-        switch (kind) {
-        case TCG_CALL_ARG_EVEN:
-            if (real_args & 1) {
-                op->args[pi++] = TCG_CALL_DUMMY_ARG;
-                real_args++;
-            }
-            /* fall through */
-        case TCG_CALL_ARG_NORMAL:
-            if (TCG_TARGET_REG_BITS == 32 && type == TCG_TYPE_I64) {
-                op->args[pi++] = temp_arg(args[i]);
-                op->args[pi++] = temp_arg(args[i] + 1);
-                real_args += 2;
-                break;
-            }
-            op->args[pi++] = temp_arg(args[i]);
-            real_args++;
-            break;
         default:
             g_assert_not_reached();
         }
     }
     op->args[pi++] = (uintptr_t)func;
     op->args[pi++] = (uintptr_t)info;
-    TCGOP_CALLI(op) = real_args;
 
-    /* Make sure the fields didn't overflow. */
-    tcg_debug_assert(TCGOP_CALLI(op) == real_args);
-    tcg_debug_assert(pi <= ARRAY_SIZE(op->args));
-
-    if (TCG_TARGET_CALL_ARG_I32 == TCG_CALL_ARG_EXTEND) {
-        for (i = 0; i < nargs; ++i) {
-            int argtype = extract32(typemask, (i + 1) * 3, 3);
-            bool is_32bit = (argtype & ~1) == dh_typecode_i32;
-
-            if (is_32bit) {
-                tcg_temp_free_internal(args[i]);
-            }
-        }
-    }
+    QTAILQ_INSERT_TAIL(&tcg_ctx->ops, op, link);
 }
 
 static void tcg_reg_alloc_start(TCGContext *s)
@@ -1807,10 +1914,7 @@ static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
             }
             for (i = 0; i < nb_iargs; i++) {
                 TCGArg arg = op->args[nb_oargs + i];
-                const char *t = "";
-                if (arg != TCG_CALL_DUMMY_ARG) {
-                    t = tcg_get_arg_str(s, buf, sizeof(buf), arg);
-                }
+                const char *t = tcg_get_arg_str(s, buf, sizeof(buf), arg);
                 col += ne_fprintf(f, ",%s", t);
             }
         } else {
@@ -2576,12 +2680,11 @@ static void liveness_pass_1(TCGContext *s)
         switch (opc) {
         case INDEX_op_call:
             {
-                int call_flags;
-                int nb_call_regs;
+                const TCGHelperInfo *info = tcg_call_info(op);
+                int call_flags = tcg_call_flags(op);
 
                 nb_oargs = TCGOP_CALLO(op);
                 nb_iargs = TCGOP_CALLI(op);
-                call_flags = tcg_call_flags(op);
 
                 /* pure functions can be removed if their result is unused */
                 if (call_flags & TCG_CALL_NO_SIDE_EFFECTS) {
@@ -2621,7 +2724,7 @@ static void liveness_pass_1(TCGContext *s)
                 /* Record arguments that die in this helper. */
                 for (i = nb_oargs; i < nb_iargs + nb_oargs; i++) {
                     ts = arg_temp(op->args[i]);
-                    if (ts && ts->state & TS_DEAD) {
+                    if (ts->state & TS_DEAD) {
                         arg_life |= DEAD_ARG << i;
                     }
                 }
@@ -2629,31 +2732,59 @@ static void liveness_pass_1(TCGContext *s)
                 /* For all live registers, remove call-clobbered prefs. */
                 la_cross_call(s, nb_temps);
 
-                nb_call_regs = ARRAY_SIZE(tcg_target_call_iarg_regs);
+                /*
+                 * Input arguments are live for preceding opcodes.
+                 *
+                 * For those arguments that die, and will be allocated in
+                 * registers, clear the register set for that arg, to be
+                 * filled in below. For args that will be on the stack,
+                 * reset to any available reg. Process arguments in reverse
+                 * order so that if a temp is used more than once, the stack
+                 * reset to max happens before the register reset to 0.
+                 */
+                for (i = nb_iargs - 1; i >= 0; i--) {
+                    const TCGCallArgumentLoc *loc = &info->in[i];
+                    ts = arg_temp(op->args[nb_oargs + i]);
 
-                /* Input arguments are live for preceding opcodes. */
-                for (i = 0; i < nb_iargs; i++) {
-                    ts = arg_temp(op->args[i + nb_oargs]);
-                    if (ts && ts->state & TS_DEAD) {
-                        /* For those arguments that die, and will be allocated
-                         * in registers, clear the register set for that arg,
-                         * to be filled in below. For args that will be on
-                         * the stack, reset to any available reg.
-                         */
-                        *la_temp_pref(ts)
-                            = (i < nb_call_regs ? 0 :
-                               tcg_target_available_regs[ts->type]);
+                    if (ts->state & TS_DEAD) {
+                        switch (loc->kind) {
+                        case TCG_CALL_ARG_NORMAL:
+                        case TCG_CALL_ARG_EXTEND_U:
+                        case TCG_CALL_ARG_EXTEND_S:
+                            if (loc->reg_n) {
+                                *la_temp_pref(ts) = 0;
+                                break;
+                            }
+                            /* fall through */
+                        default:
+                            *la_temp_pref(ts) =
+                                tcg_target_available_regs[ts->type];
+                            break;
+                        }
                         ts->state &= ~TS_DEAD;
                     }
                 }
 
-                /* For each input argument, add its input register to prefs.
-                   If a temp is used once, this produces a single set bit. */
-                for (i = 0; i < MIN(nb_call_regs, nb_iargs); i++) {
-                    ts = arg_temp(op->args[i + nb_oargs]);
-                    if (ts) {
-                        tcg_regset_set_reg(*la_temp_pref(ts),
-                                           tcg_target_call_iarg_regs[i]);
+                /*
+                 * For each input argument, add its input register to prefs.
+                 * If a temp is used once, this produces a single set bit;
+                 * if a temp is used multiple times, this produces a set.
+                 */
+                for (i = 0; i < nb_iargs; i++) {
+                    const TCGCallArgumentLoc *loc = &info->in[i];
+                    ts = arg_temp(op->args[nb_oargs + i]);
+
+                    switch (loc->kind) {
+                    case TCG_CALL_ARG_NORMAL:
+                    case TCG_CALL_ARG_EXTEND_U:
+                    case TCG_CALL_ARG_EXTEND_S:
+                        if (loc->reg_n) {
+                            tcg_regset_set_reg(*la_temp_pref(ts),
+                                tcg_target_call_iarg_regs[loc->reg_slot]);
+                        }
+                        break;
+                    default:
+                        break;
                     }
                 }
             }
@@ -2922,21 +3053,19 @@ static bool liveness_pass_2(TCGContext *s)
             /* Make sure that input arguments are available. */
             for (i = nb_oargs; i < nb_iargs + nb_oargs; i++) {
                 arg_ts = arg_temp(op->args[i]);
-                if (arg_ts) {
-                    dir_ts = arg_ts->state_ptr;
-                    if (dir_ts && arg_ts->state == TS_DEAD) {
-                        TCGOpcode lopc = (arg_ts->type == TCG_TYPE_I32
-                                          ? INDEX_op_ld_i32
-                                          : INDEX_op_ld_i64);
-                        TCGOp *lop = tcg_op_insert_before(s, op, lopc);
+                dir_ts = arg_ts->state_ptr;
+                if (dir_ts && arg_ts->state == TS_DEAD) {
+                    TCGOpcode lopc = (arg_ts->type == TCG_TYPE_I32
+                                      ? INDEX_op_ld_i32
+                                      : INDEX_op_ld_i64);
+                    TCGOp *lop = tcg_op_insert_before(s, op, lopc);
 
-                        lop->args[0] = temp_arg(dir_ts);
-                        lop->args[1] = temp_arg(arg_ts->mem_base);
-                        lop->args[2] = arg_ts->mem_offset;
+                    lop->args[0] = temp_arg(dir_ts);
+                    lop->args[1] = temp_arg(arg_ts->mem_base);
+                    lop->args[2] = arg_ts->mem_offset;
 
-                        /* Loaded, but synced with memory. */
-                        arg_ts->state = TS_MEM;
-                    }
+                    /* Loaded, but synced with memory. */
+                    arg_ts->state = TS_MEM;
                 }
             }
 
@@ -4158,106 +4287,100 @@ static bool tcg_reg_alloc_dup2(TCGContext *s, const TCGOp *op)
     return true;
 }
 
+static void load_arg_normal_1(TCGContext *s, const TCGCallArgumentLoc *loc,
+                              TCGTemp *ts, TCGRegSet *allocated_regs)
+{
+    TCGReg reg;
+
+    /*
+     * If the destination is on the stack, load up the temp and store.
+     * If there are many call-saved registers, the temp might live to
+     * see another use; otherwise it'll be discarded.
+     */
+    if (!loc->reg_n) {
+        temp_load(s, ts, tcg_target_available_regs[ts->type],
+                  *allocated_regs, 0);
+        tcg_out_st(s, ts->type, ts->reg, TCG_REG_CALL_STACK,
+                   TCG_TARGET_CALL_STACK_OFFSET +
+                   loc->stk_slot * sizeof(tcg_target_long));
+        return;
+    }
+
+    reg = tcg_target_call_iarg_regs[loc->reg_slot];
+
+    if (ts->val_type == TEMP_VAL_REG) {
+        if (ts->reg != reg) {
+            tcg_reg_free(s, reg, *allocated_regs);
+            if (!tcg_out_mov(s, ts->type, reg, ts->reg)) {
+                /*
+                 * Cross register class move not supported. Sync the
+                 * temp back to its slot and load from there.
+                 */
+                temp_sync(s, ts, *allocated_regs, 0, 0);
+                tcg_out_ld(s, ts->type, reg,
+                           ts->mem_base->reg, ts->mem_offset);
+            }
+        }
+    } else {
+        TCGRegSet arg_set = 0;
+
+        tcg_reg_free(s, reg, *allocated_regs);
+        tcg_regset_set_reg(arg_set, reg);
+        temp_load(s, ts, arg_set, *allocated_regs, 0);
+    }
+
+    tcg_regset_set_reg(*allocated_regs, reg);
+}
+
 static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
 {
     const int nb_oargs = TCGOP_CALLO(op);
     const int nb_iargs = TCGOP_CALLI(op);
     const TCGLifeData arg_life = op->life;
-    const TCGHelperInfo *info;
-    int flags, nb_regs, i;
-    TCGReg reg;
-    TCGArg arg;
-    TCGTemp *ts;
-    intptr_t stack_offset;
-    size_t call_stack_size;
-    tcg_insn_unit *func_addr;
-    int allocate_args;
-    TCGRegSet allocated_regs;
+    const TCGHelperInfo *info = tcg_call_info(op);
+    TCGRegSet allocated_regs = s->reserved_regs;
+    int i;
 
-    func_addr = tcg_call_func(op);
-    info = tcg_call_info(op);
-    flags = info->flags;
+    /*
+     * Move inputs into place in reverse order,
+     * so that we place stacked arguments first.
+     */
+    for (i = nb_iargs - 1; i >= 0; --i) {
+        const TCGCallArgumentLoc *loc = &info->in[i];
+        TCGTemp *ts = arg_temp(op->args[nb_oargs + i]);
 
-    nb_regs = ARRAY_SIZE(tcg_target_call_iarg_regs);
-    if (nb_regs > nb_iargs) {
-        nb_regs = nb_iargs;
-    }
-
-    /* assign stack slots first */
-    call_stack_size = (nb_iargs - nb_regs) * sizeof(tcg_target_long);
-    call_stack_size = (call_stack_size + TCG_TARGET_STACK_ALIGN - 1) &
-        ~(TCG_TARGET_STACK_ALIGN - 1);
-    allocate_args = (call_stack_size > TCG_STATIC_CALL_ARGS_SIZE);
-    if (allocate_args) {
-        /* XXX: if more than TCG_STATIC_CALL_ARGS_SIZE is needed,
-           preallocate call stack */
-        tcg_abort();
-    }
-
-    stack_offset = TCG_TARGET_CALL_STACK_OFFSET;
-    for (i = nb_regs; i < nb_iargs; i++) {
-        arg = op->args[nb_oargs + i];
-        if (arg != TCG_CALL_DUMMY_ARG) {
-            ts = arg_temp(arg);
-            temp_load(s, ts, tcg_target_available_regs[ts->type],
-                      s->reserved_regs, 0);
-            tcg_out_st(s, ts->type, ts->reg, TCG_REG_CALL_STACK, stack_offset);
-        }
-        stack_offset += sizeof(tcg_target_long);
-    }
-
-    /* assign input registers */
-    allocated_regs = s->reserved_regs;
-    for (i = 0; i < nb_regs; i++) {
-        arg = op->args[nb_oargs + i];
-        if (arg != TCG_CALL_DUMMY_ARG) {
-            ts = arg_temp(arg);
-            reg = tcg_target_call_iarg_regs[i];
-
-            if (ts->val_type == TEMP_VAL_REG) {
-                if (ts->reg != reg) {
-                    tcg_reg_free(s, reg, allocated_regs);
-                    if (!tcg_out_mov(s, ts->type, reg, ts->reg)) {
-                        /*
-                         * Cross register class move not supported. Sync the
-                         * temp back to its slot and load from there.
-                         */
-                        temp_sync(s, ts, allocated_regs, 0, 0);
-                        tcg_out_ld(s, ts->type, reg,
-                                   ts->mem_base->reg, ts->mem_offset);
-                    }
-                }
-            } else {
-                TCGRegSet arg_set = 0;
-
-                tcg_reg_free(s, reg, allocated_regs);
-                tcg_regset_set_reg(arg_set, reg);
-                temp_load(s, ts, arg_set, allocated_regs, 0);
-            }
-
-            tcg_regset_set_reg(allocated_regs, reg);
+        switch (loc->kind) {
+        case TCG_CALL_ARG_NORMAL:
+        case TCG_CALL_ARG_EXTEND_U:
+        case TCG_CALL_ARG_EXTEND_S:
+            load_arg_normal_1(s, loc, ts, &allocated_regs);
+            break;
+        default:
+            g_assert_not_reached();
         }
     }
-
-    /* mark dead temporaries and free the associated registers */
+
+    /* Mark dead temporaries and free the associated registers. */
     for (i = nb_oargs; i < nb_iargs + nb_oargs; i++) {
         if (IS_DEAD_ARG(i)) {
             temp_dead(s, arg_temp(op->args[i]));
         }
     }
-
-    /* clobber call registers */
+
+    /* Clobber call registers. */
     for (i = 0; i < TCG_TARGET_NB_REGS; i++) {
         if (tcg_regset_test_reg(tcg_target_call_clobber_regs, i)) {
             tcg_reg_free(s, i, allocated_regs);
         }
     }
 
-    /* Save globals if they might be written by the helper, sync them if
-       they might be read. */
-    if (flags & TCG_CALL_NO_READ_GLOBALS) {
+    /*
+     * Save globals if they might be written by the helper,
+     * sync them if they might be read.
+     */
+    if (info->flags & TCG_CALL_NO_READ_GLOBALS) {
         /* Nothing to do */
-    } else if (flags & TCG_CALL_NO_WRITE_GLOBALS) {
+    } else if (info->flags & TCG_CALL_NO_WRITE_GLOBALS) {
         sync_globals(s, allocated_regs);
     } else {
         save_globals(s, allocated_regs);
@@ -4268,31 +4391,41 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
         gpointer hash = (gpointer)(uintptr_t)info->typemask;
         ffi_cif *cif = g_hash_table_lookup(ffi_table, hash);
         assert(cif != NULL);
-        tcg_out_call(s, func_addr, cif);
+        tcg_out_call(s, tcg_call_func(op), cif);
     }
 #else
-    tcg_out_call(s, func_addr);
+    tcg_out_call(s, tcg_call_func(op));
 #endif
 
-    /* assign output registers and emit moves if needed */
-    for(i = 0; i < nb_oargs; i++) {
-        arg = op->args[i];
-        ts = arg_temp(arg);
+    /* Assign output registers and emit moves if needed. */
+    switch (info->out_kind) {
+    case TCG_CALL_RET_NORMAL:
+        for (i = 0; i < nb_oargs; i++) {
+            TCGTemp *ts = arg_temp(op->args[i]);
+            TCGReg reg = tcg_target_call_oarg_regs[i];
 
-        /* ENV should not be modified. */
-        tcg_debug_assert(!temp_readonly(ts));
+            /* ENV should not be modified. */
+            tcg_debug_assert(!temp_readonly(ts));
 
-        reg = tcg_target_call_oarg_regs[i];
-        tcg_debug_assert(s->reg_to_temp[reg] == NULL);
-        if (ts->val_type == TEMP_VAL_REG) {
-            s->reg_to_temp[ts->reg] = NULL;
+            tcg_debug_assert(s->reg_to_temp[reg] == NULL);
+            if (ts->val_type == TEMP_VAL_REG) {
+                s->reg_to_temp[ts->reg] = NULL;
+            }
+            ts->val_type = TEMP_VAL_REG;
+            ts->reg = reg;
+            ts->mem_coherent = 0;
+            s->reg_to_temp[reg] = ts;
         }
-        ts->val_type = TEMP_VAL_REG;
-        ts->reg = reg;
-        ts->mem_coherent = 0;
-        s->reg_to_temp[reg] = ts;
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    /* Flush or discard output registers as needed. */
+    for (i = 0; i < nb_oargs; i++) {
+        TCGTemp *ts = arg_temp(op->args[i]);
         if (NEED_SYNC_ARG(i)) {
-            temp_sync(s, ts, allocated_regs, 0, IS_DEAD_ARG(i));
+            temp_sync(s, ts, s->reserved_regs, 0, IS_DEAD_ARG(i));
         } else if (IS_DEAD_ARG(i)) {
             temp_dead(s, ts);
         }
-- 
2.34.1
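
As a rough illustration of the layout pass in this patch (not code from
the patch itself), the sketch below models how init_call_layout places
each argument word: register slots are handed out until they run out,
then stack slots, and a two-word argument is first rounded up to an even
slot when the target asks for TCG_CALL_ARG_EVEN. The ArgLoc and Cum
types and the place_one, align_even and layout_args functions are
hypothetical simplifications made for this example; they are not part of
QEMU's API and leave out extension kinds, return values and subindexes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for TCGCallArgumentLoc. */
typedef struct ArgLoc {
    int arg_idx;   /* which caller argument this word belongs to */
    int slot;      /* register slot or stack slot number */
    bool in_reg;   /* true: register slot, false: stack slot */
} ArgLoc;

/* Hypothetical, simplified stand-in for TCGCumulativeArgs. */
typedef struct Cum {
    int reg_slot;      /* next free register slot */
    int stk_slot;      /* next free stack slot */
    int max_reg_slot;  /* number of argument registers */
    int n;             /* locations written so far */
} Cum;

/* Place one word: take a register slot if any is left, else a stack slot. */
static void place_one(Cum *cum, ArgLoc *out, int arg_idx)
{
    ArgLoc *loc = &out[cum->n++];

    loc->arg_idx = arg_idx;
    if (cum->reg_slot < cum->max_reg_slot) {
        loc->in_reg = true;
        loc->slot = cum->reg_slot++;
    } else {
        loc->in_reg = false;
        loc->slot = cum->stk_slot++;
    }
}

/* Round the next slot up to even, skipping one slot if it is odd. */
static void align_even(Cum *cum)
{
    if (cum->reg_slot < cum->max_reg_slot) {
        cum->reg_slot += cum->reg_slot & 1;
    } else {
        cum->stk_slot += cum->stk_slot & 1;
    }
}

/*
 * words[i] is 1 or 2, the number of host words argument i occupies.
 * The caller must size out[] for the sum of words[].
 * Returns the number of locations written to out[].
 */
static int layout_args(const int *words, int nargs, int max_regs,
                       bool even_pairs, ArgLoc *out)
{
    Cum cum = { .max_reg_slot = max_regs };

    /* An aligned pair must not straddle the register/stack boundary. */
    assert(!even_pairs || max_regs % 2 == 0);

    for (int i = 0; i < nargs; i++) {
        if (even_pairs && words[i] == 2) {
            align_even(&cum);
        }
        for (int w = 0; w < words[i]; w++) {
            place_one(&cum, out, i);
        }
    }
    return cum.n;
}
```

With four argument registers and argument sizes of 1, 2 and 1 words, the
pair is bumped from register slot 1 to slots 2 and 3, so the final
argument lands in stack slot 0. No dummy entry is recorded for the
skipped register, which is the gap that TCG_CALL_DUMMY_ARG used to fill.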