From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: mark.cave-ayland@ilande.co.uk, david@gibson.dropbear.id.au
Date: Tue, 19 Mar 2019 10:21:16 -0700
Message-Id: <20190319172126.7502-8-richard.henderson@linaro.org>
In-Reply-To: <20190319172126.7502-1-richard.henderson@linaro.org>
References: <20190319172126.7502-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH for-4.1 v3 07/17] tcg: Manually expand INDEX_op_dup_vec

This case is similar to INDEX_op_mov_* in that we need to do
different things depending on the current location of the source.

Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.inc.c |   9 ++--
 tcg/i386/tcg-target.inc.c    |   8 ++-
 tcg/tcg.c                    | 102 +++++++++++++++++++++++++++++++++++
 3 files changed, 109 insertions(+), 10 deletions(-)

diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 3c786ee581..17e35f2fb6 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -2099,10 +2099,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */
     case INDEX_op_mov_i64:
-    case INDEX_op_mov_vec:
     case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi. */
     case INDEX_op_movi_i64:
-    case INDEX_op_dupi_vec:
     case INDEX_op_call: /* Always emitted via tcg_out_call. */
     default:
         g_assert_not_reached();
@@ -2199,9 +2197,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_not_vec:
         tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1);
         break;
-    case INDEX_op_dup_vec:
-        tcg_out_dup_vec(s, type, vece, a0, a1);
-        break;
     case INDEX_op_shli_vec:
         tcg_out_insn(s, 3614, SHL, is_q, a0, a1, a2 + (8 << vece));
         break;
@@ -2245,6 +2240,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
             }
         }
         break;
+
+    case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */
+    case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */
+    case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index b8e677e46d..09e2308557 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2594,10 +2594,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */
     case INDEX_op_mov_i64:
-    case INDEX_op_mov_vec:
     case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi. */
     case INDEX_op_movi_i64:
-    case INDEX_op_dupi_vec:
     case INDEX_op_call: /* Always emitted via tcg_out_call. */
     default:
         tcg_abort();
@@ -2786,9 +2784,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_st_vec:
         tcg_out_st(s, type, a0, a1, a2);
         break;
-    case INDEX_op_dup_vec:
-        tcg_out_dup_vec(s, type, vece, a0, a1);
-        break;
 
     case INDEX_op_x86_shufps_vec:
         insn = OPC_SHUFPS;
@@ -2830,6 +2825,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
         tcg_out8(s, a2);
         break;
 
+    case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */
+    case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */
+    case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/tcg.c b/tcg/tcg.c
index ca5f3ed5ce..b11b30bbec 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -3406,6 +3406,105 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op)
     }
 }
 
+static void tcg_reg_alloc_dup(TCGContext *s, const TCGOp *op)
+{
+    const TCGLifeData arg_life = op->life;
+    TCGRegSet dup_out_regs, dup_in_regs;
+    TCGTemp *its, *ots;
+    TCGType itype, vtype;
+    unsigned vece;
+    bool ok;
+
+    ots = arg_temp(op->args[0]);
+    its = arg_temp(op->args[1]);
+
+    /* There should be no fixed vector registers. */
+    tcg_debug_assert(!ots->fixed_reg);
+
+    itype = its->type;
+    vece = TCGOP_VECE(op);
+    vtype = TCGOP_VECL(op) + TCG_TYPE_V64;
+
+    if (its->val_type == TEMP_VAL_CONST) {
+        /* Propagate constant via movi -> dupi. */
+        tcg_target_ulong val = its->val;
+        if (IS_DEAD_ARG(1)) {
+            temp_dead(s, its);
+        }
+        tcg_reg_alloc_do_movi(s, ots, val, arg_life, op->output_pref[0]);
+        return;
+    }
+
+    dup_out_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[0].u.regs;
+    dup_in_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[1].u.regs;
+
+    /* Allocate the output register now. */
+    if (ots->val_type != TEMP_VAL_REG) {
+        TCGRegSet allocated_regs = s->reserved_regs;
+
+        if (!IS_DEAD_ARG(1) && its->val_type == TEMP_VAL_REG) {
+            /* Make sure to not spill the input register. */
+            tcg_regset_set_reg(allocated_regs, its->reg);
+        }
+        ots->reg = tcg_reg_alloc(s, dup_out_regs, allocated_regs,
+                                 op->output_pref[0], ots->indirect_base);
+        ots->val_type = TEMP_VAL_REG;
+        ots->mem_coherent = 0;
+        s->reg_to_temp[ots->reg] = ots;
+    }
+
+    switch (its->val_type) {
+    case TEMP_VAL_REG:
+        /*
+         * The dup constraints must be broad, covering all possible VECE.
+         * However, tcg_out_dup_vec() gets to see the VECE and we allow it
+         * to fail, indicating that extra moves are required for that case.
+         */
+        if (tcg_regset_test_reg(dup_in_regs, its->reg)) {
+            if (tcg_out_dup_vec(s, vtype, vece, ots->reg, its->reg)) {
+                goto done;
+            }
+            /* Try again from memory or a vector input register. */
+        }
+        if (!its->mem_coherent) {
+            /*
+             * The input register is not synced, and so an extra store
+             * would be required to use memory. Attempt an integer-vector
+             * register move first. We do not have a TCGRegSet for this.
+             */
+            if (tcg_out_mov(s, itype, ots->reg, its->reg)) {
+                break;
+            }
+            /* Sync the temp back to its slot and load from there. */
+            temp_sync(s, its, s->reserved_regs, 0, 0);
+        }
+        /* fall through */
+
+    case TEMP_VAL_MEM:
+        /* TODO: dup from memory */
+        tcg_out_ld(s, itype, ots->reg, its->mem_base->reg, its->mem_offset);
+        break;
+
+    default:
+        g_assert_not_reached();
+    }
+
+    /* We now have a vector input register, so dup must succeed. */
+    ok = tcg_out_dup_vec(s, vtype, vece, ots->reg, ots->reg);
+    tcg_debug_assert(ok);
+
+ done:
+    if (IS_DEAD_ARG(1)) {
+        temp_dead(s, its);
+    }
+    if (NEED_SYNC_ARG(0)) {
+        temp_sync(s, ots, s->reserved_regs, 0, 0);
+    }
+    if (IS_DEAD_ARG(0)) {
+        temp_dead(s, ots);
+    }
+}
+
 static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op)
 {
     const TCGLifeData arg_life = op->life;
@@ -3974,6 +4073,9 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
         case INDEX_op_dupi_vec:
             tcg_reg_alloc_movi(s, op);
             break;
+        case INDEX_op_dup_vec:
+            tcg_reg_alloc_dup(s, op);
+            break;
         case INDEX_op_insn_start:
             if (num_insns >= 0) {
                 size_t off = tcg_current_code_size(s);
-- 
2.17.2
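
Reviewer's aid, not part of the patch: the fallback order in tcg_reg_alloc_dup() may be easier to follow as a standalone toy model. This is only a sketch of the decision tree as I read it from the hunk above. SrcLoc, DupSource, DupPlan and plan_dup() are hypothetical names invented for illustration. None of them exist in QEMU, and the boolean fields stand in for the results of tcg_regset_test_reg(), tcg_out_dup_vec() and tcg_out_mov().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TEMP_VAL_* states of the input temp. */
typedef enum { SRC_CONST, SRC_REG, SRC_MEM } SrcLoc;

/* Hypothetical summary of what the backend would report for the input. */
typedef struct {
    SrcLoc loc;
    bool reg_in_dup_set;    /* input register satisfies the dup constraint */
    bool dup_accepts_vece;  /* backend dup succeeds for this element size */
    bool mem_coherent;      /* memory slot already holds the current value */
    bool int_to_vec_mov;    /* backend can move integer reg -> vector reg */
} DupSource;

/* The instruction sequences the allocator can end up emitting. */
typedef enum {
    PLAN_DUPI,         /* constant input: movi -> dupi */
    PLAN_DUP,          /* dup directly from the input register */
    PLAN_MOV_DUP,      /* move into the output register, dup in place */
    PLAN_SYNC_LD_DUP,  /* store input to its slot, load, dup in place */
    PLAN_LD_DUP        /* load from the memory slot, dup in place */
} DupPlan;

static DupPlan plan_dup(const DupSource *src)
{
    switch (src->loc) {
    case SRC_CONST:
        return PLAN_DUPI;
    case SRC_REG:
        /* Direct dup only if the register is allowed and the backend agrees. */
        if (src->reg_in_dup_set && src->dup_accepts_vece) {
            return PLAN_DUP;
        }
        if (!src->mem_coherent) {
            /* Prefer a register move over an extra store to memory. */
            return src->int_to_vec_mov ? PLAN_MOV_DUP : PLAN_SYNC_LD_DUP;
        }
        return PLAN_LD_DUP;
    case SRC_MEM:
        return PLAN_LD_DUP;
    }
    assert(0 && "unreachable");
    return PLAN_LD_DUP;
}
```

In every non-constant path except PLAN_DUP, the value first lands in the output vector register and the final dup uses that register as both source and destination, which is why the patch can assert that the second tcg_out_dup_vec() call succeeds.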