From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 6 Jul 2017 16:20:51 -1000
Message-Id: <20170707022111.21836-8-rth@twiddle.net>
In-Reply-To: <20170707022111.21836-1-rth@twiddle.net>
References: <20170707022111.21836-1-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH v2 07/27] target/sh4: Recognize common gUSA sequences
Cc: bruno@clisp.org, laurent@vivier.eu, aurelien@aurel32.net, glaubitz@debian.org

For many of the sequences produced by gcc or glibc, we can translate
them as host atomic operations, which avoids the need to acquire the
exclusive lock.

Signed-off-by: Richard Henderson
---
 target/sh4/translate.c | 316 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 316 insertions(+)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 653c06c..73b3e02 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -1894,10 +1894,17 @@ static void decode_opc(DisasContext * ctx)
  */
 static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
 {
+    uint16_t insns[5];
+    int ld_adr, ld_dst, ld_mop;
+    int op_dst, op_src, op_opc;
+    int mv_src, mt_dst, st_src, st_mop;
+    TCGv op_arg;
+
     uint32_t pc = ctx->pc;
     uint32_t pc_end = ctx->tb->cs_base;
     int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
     int max_insns = (pc_end - pc) / 2;
+    int i;
 
     if (pc != pc_end + backup || max_insns < 2) {
         /* This is a malformed gUSA region. Don't do anything special,
@@ -1914,6 +1921,315 @@ static int decode_gusa(DisasContext *ctx, CPUSH4State *env, int *pmax_insns)
         return 0;
     }
 
+    /* The state machine below will consume only a few insns.
+       If there are more than that in a region, fail now. */
+    if (max_insns > ARRAY_SIZE(insns)) {
+        goto fail;
+    }
+
+    /* Read all of the insns for the region. */
+    for (i = 0; i < max_insns; ++i) {
+        insns[i] = cpu_lduw_code(env, pc + i * 2);
+    }
+
+    ld_adr = ld_dst = ld_mop = -1;
+    mv_src = -1;
+    op_dst = op_src = op_opc = -1;
+    mt_dst = -1;
+    st_src = st_mop = -1;
+    TCGV_UNUSED(op_arg);
+    i = 0;
+
+#define NEXT_INSN \
+    do { if (i >= max_insns) goto fail; ctx->opcode = insns[i++]; } while (0)
+
+    /*
+     * Expect a load to begin the region.
+     */
+    NEXT_INSN;
+    switch (ctx->opcode & 0xf00f) {
+    case 0x6000: /* mov.b @Rm,Rn */
+        ld_mop = MO_SB;
+        break;
+    case 0x6001: /* mov.w @Rm,Rn */
+        ld_mop = MO_TESW;
+        break;
+    case 0x6002: /* mov.l @Rm,Rn */
+        ld_mop = MO_TESL;
+        break;
+    default:
+        goto fail;
+    }
+    ld_adr = B7_4;
+    ld_dst = B11_8;
+    if (ld_adr == ld_dst) {
+        goto fail;
+    }
+    /* Unless we see a mov, any two-operand operation must use ld_dst. */
+    op_dst = ld_dst;
+
+    /*
+     * Expect an optional register move.
+     */
+    NEXT_INSN;
+    switch (ctx->opcode & 0xf00f) {
+    case 0x6003: /* mov Rm,Rn */
+        /* Here we want to recognize ld_dst being saved for later consumption,
+           or for another input register being copied so that ld_dst need not
+           be clobbered during the operation. */
+        op_dst = B11_8;
+        mv_src = B7_4;
+        if (op_dst == ld_dst) {
+            /* Overwriting the load output. */
+            goto fail;
+        }
+        if (mv_src != ld_dst) {
+            /* Copying a new input; constrain op_src to match the load. */
+            op_src = ld_dst;
+        }
+        break;
+
+    default:
+        /* Put back and re-examine as operation. */
+        --i;
+    }
+
+    /*
+     * Expect the operation.
+     */
+    NEXT_INSN;
+    switch (ctx->opcode & 0xf00f) {
+    case 0x300c: /* add Rm,Rn */
+        op_opc = INDEX_op_add_i32;
+        goto do_reg_op;
+    case 0x2009: /* and Rm,Rn */
+        op_opc = INDEX_op_and_i32;
+        goto do_reg_op;
+    case 0x200a: /* xor Rm,Rn */
+        op_opc = INDEX_op_xor_i32;
+        goto do_reg_op;
+    case 0x200b: /* or Rm,Rn */
+        op_opc = INDEX_op_or_i32;
+    do_reg_op:
+        /* The operation register should be as expected, and the
+           other input cannot depend on the load. */
+        if (op_dst != B11_8) {
+            goto fail;
+        }
+        if (op_src < 0) {
+            /* Unconstrained input. */
+            op_src = B7_4;
+        } else if (op_src == B7_4) {
+            /* Constrained input matched load. All operations are
+               commutative; "swap" them by "moving" the load output
+               to the (implicit) first argument and the move source
+               to the (explicit) second argument. */
+            op_src = mv_src;
+        } else {
+            goto fail;
+        }
+        op_arg = REG(op_src);
+        break;
+
+    case 0x6007: /* not Rm,Rn */
+        if (ld_dst != B7_4 || mv_src >= 0) {
+            goto fail;
+        }
+        op_dst = B11_8;
+        op_opc = INDEX_op_xor_i32;
+        op_arg = tcg_const_i32(-1);
+        break;
+
+    case 0x7000 ... 0x700f: /* add #imm,Rn */
+        if (op_dst != B11_8 || op_src >= 0) {
+            goto fail;
+        }
+        op_opc = INDEX_op_add_i32;
+        op_arg = tcg_const_i32(B7_0s);
+        break;
+
+    case 0x3000: /* cmp/eq Rm,Rn */
+        /* Looking for the middle of a compare-and-swap sequence,
+           beginning with the compare. Operands can be either order,
+           but with only one overlapping the load. */
+        if ((ld_dst == B11_8) + (ld_dst == B7_4) != 1 || mv_src >= 0) {
+            goto fail;
+        }
+        op_opc = INDEX_op_setcond_i32; /* placeholder */
+        op_src = (ld_dst == B11_8 ? B7_4 : B11_8);
+        op_arg = REG(op_src);
+
+        NEXT_INSN;
+        switch (ctx->opcode & 0xff00) {
+        case 0x8b00: /* bf label */
+        case 0x8f00: /* bf/s label */
+            if (pc + (i + 1 + B7_0s) * 2 != pc_end) {
+                goto fail;
+            }
+            if ((ctx->opcode & 0xff00) == 0x8b00) { /* bf label */
+                break;
+            }
+            /* We're looking to unconditionally modify Rn with the
+               result of the comparison, within the delay slot of
+               the branch. This is used by older gcc. */
+            NEXT_INSN;
+            if ((ctx->opcode & 0xf0ff) == 0x0029) { /* movt Rn */
+                mt_dst = B11_8;
+            } else {
+                goto fail;
+            }
+            break;
+
+        default:
+            goto fail;
+        }
+        break;
+
+    case 0x2008: /* tst Rm,Rn */
+        /* Looking for a compare-and-swap against zero. */
+        if (ld_dst != B11_8 || ld_dst != B7_4 || mv_src >= 0) {
+            goto fail;
+        }
+        op_opc = INDEX_op_setcond_i32;
+        op_arg = tcg_const_i32(0);
+
+        NEXT_INSN;
+        if ((ctx->opcode & 0xff00) != 0x8900 /* bt label */
+            || pc + (i + 1 + B7_0s) * 2 != pc_end) {
+            goto fail;
+        }
+        break;
+
+    default:
+        /* Put back and re-examine as store. */
+        --i;
+    }
+
+    /*
+     * Expect the store.
+     */
+    /* The store must be the last insn. */
+    if (i != max_insns - 1) {
+        goto fail;
+    }
+    NEXT_INSN;
+    switch (ctx->opcode & 0xf00f) {
+    case 0x2000: /* mov.b Rm,@Rn */
+        st_mop = MO_UB;
+        break;
+    case 0x2001: /* mov.w Rm,@Rn */
+        st_mop = MO_UW;
+        break;
+    case 0x2002: /* mov.l Rm,@Rn */
+        st_mop = MO_UL;
+        break;
+    default:
+        goto fail;
+    }
+    /* The store must match the load. */
+    if (ld_adr != B11_8 || st_mop != (ld_mop & MO_SIZE)) {
+        goto fail;
+    }
+    st_src = B7_4;
+
+#undef NEXT_INSN
+
+    /*
+     * Emit the operation.
+     */
+    tcg_gen_insn_start(pc, ctx->envflags);
+    switch (op_opc) {
+    case -1:
+        /* No operation found. Look for exchange pattern. */
+        if (st_src == ld_dst || mv_src >= 0) {
+            goto fail;
+        }
+        tcg_gen_atomic_xchg_i32(REG(ld_dst), REG(ld_adr), REG(st_src),
+                                ctx->memidx, ld_mop);
+        break;
+
+    case INDEX_op_add_i32:
+        if (op_dst != st_src) {
+            goto fail;
+        }
+        if (op_dst == ld_dst && st_mop == MO_UL) {
+            tcg_gen_atomic_add_fetch_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+        } else {
+            tcg_gen_atomic_fetch_add_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+            if (op_dst != ld_dst) {
+                /* Note that mop sizes < 4 cannot use add_fetch
+                   because it won't carry into the higher bits. */
+                tcg_gen_add_i32(REG(op_dst), REG(ld_dst), op_arg);
+            }
+        }
+        break;
+
+    case INDEX_op_and_i32:
+        if (op_dst != st_src) {
+            goto fail;
+        }
+        if (op_dst == ld_dst) {
+            tcg_gen_atomic_and_fetch_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+        } else {
+            tcg_gen_atomic_fetch_and_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+            tcg_gen_and_i32(REG(op_dst), REG(ld_dst), op_arg);
+        }
+        break;
+
+    case INDEX_op_or_i32:
+        if (op_dst != st_src) {
+            goto fail;
+        }
+        if (op_dst == ld_dst) {
+            tcg_gen_atomic_or_fetch_i32(REG(ld_dst), REG(ld_adr),
+                                        op_arg, ctx->memidx, ld_mop);
+        } else {
+            tcg_gen_atomic_fetch_or_i32(REG(ld_dst), REG(ld_adr),
+                                        op_arg, ctx->memidx, ld_mop);
+            tcg_gen_or_i32(REG(op_dst), REG(ld_dst), op_arg);
+        }
+        break;
+
+    case INDEX_op_xor_i32:
+        if (op_dst != st_src) {
+            goto fail;
+        }
+        if (op_dst == ld_dst) {
+            tcg_gen_atomic_xor_fetch_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+        } else {
+            tcg_gen_atomic_fetch_xor_i32(REG(ld_dst), REG(ld_adr),
+                                         op_arg, ctx->memidx, ld_mop);
+            tcg_gen_xor_i32(REG(op_dst), REG(ld_dst), op_arg);
+        }
+        break;
+
+    case INDEX_op_setcond_i32:
+        if (st_src == ld_dst) {
+            goto fail;
+        }
+        tcg_gen_atomic_cmpxchg_i32(REG(ld_dst), REG(ld_adr), op_arg,
+                                   REG(st_src), ctx->memidx, ld_mop);
+        tcg_gen_setcond_i32(TCG_COND_EQ, cpu_sr_t, REG(ld_dst), op_arg);
+        if (mt_dst >= 0) {
+            tcg_gen_mov_i32(REG(mt_dst), cpu_sr_t);
+        }
+        break;
+
+    default:
+        g_assert_not_reached();
+    }
+
+    /* The entire region has been translated. */
+    ctx->envflags &= ~GUSA_MASK;
+    ctx->pc = pc_end;
+    return max_insns;
+
+ fail:
     qemu_log_mask(LOG_UNIMP, "Unrecognized gUSA sequence %08x-%08x\n",
                   pc, pc_end);
 
-- 
2.9.4
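
For readers unfamiliar with the opcode masks used in the patch, the following standalone sketch shows how one of the recognized shapes (load, add #imm, store) can be matched from raw 16-bit SH4 encodings. It is an illustration only and not part of the patch: the helper names (field_rn, field_rm, field_imm8s, load_size, store_size, match_fetch_add_imm) are hypothetical stand-ins for the B11_8, B7_4 and B7_0s field macros and the switch statements above, and it covers a single pattern rather than the full state machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, assumed to mirror the B11_8 / B7_4 / B7_0s macros. */
static int field_rn(uint16_t insn) { return (insn >> 8) & 0xf; }
static int field_rm(uint16_t insn) { return (insn >> 4) & 0xf; }

/* Sign-extend the low 8 bits without relying on narrowing conversions. */
static int field_imm8s(uint16_t insn)
{
    return ((int)(insn & 0xff) ^ 0x80) - 0x80;
}

/* Access size in bytes for "mov.{b,w,l} @Rm,Rn", or 0 if not such a load. */
static int load_size(uint16_t insn)
{
    switch (insn & 0xf00f) {
    case 0x6000: return 1;
    case 0x6001: return 2;
    case 0x6002: return 4;
    default:     return 0;
    }
}

/* Access size in bytes for "mov.{b,w,l} Rm,@Rn", or 0 if not such a store. */
static int store_size(uint16_t insn)
{
    switch (insn & 0xf00f) {
    case 0x2000: return 1;
    case 0x2001: return 2;
    case 0x2002: return 4;
    default:     return 0;
    }
}

/*
 * Match exactly: mov.x @Rm,Rn ; add #imm,Rn ; mov.x Rn,@Rm
 * On success, store the sign-extended immediate in *imm.
 */
static bool match_fetch_add_imm(const uint16_t *insns, size_t n, int *imm)
{
    if (n != 3) {
        return false;
    }
    int size = load_size(insns[0]);
    int adr = field_rm(insns[0]);
    int dst = field_rn(insns[0]);

    /* The load must exist and must not overwrite its own address. */
    if (size == 0 || adr == dst) {
        return false;
    }
    /* add #imm,Rn must operate on the loaded register. */
    if ((insns[1] & 0xf000) != 0x7000 || field_rn(insns[1]) != dst) {
        return false;
    }
    /* The store must match the load in size and address, and store Rn. */
    if (store_size(insns[2]) != size
        || field_rn(insns[2]) != adr
        || field_rm(insns[2]) != dst) {
        return false;
    }
    *imm = field_imm8s(insns[1]);
    return true;
}
```

A sequence that passes this check is the kind the patch would turn into a single atomic add; anything else falls through to the existing exclusive-lock path, as the fail label does above.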