From: Philippe Mathieu-Daudé
To: qemu-devel@nongnu.org
Cc: Richard Henderson, Philippe Mathieu-Daudé
Subject: [PATCH v2 20/28] target/mips: Extract Loongson SIMD translation routines
Date: Mon, 23
Nov 2020 21:44:40 +0100 Message-Id: <20201123204448.3260804-21-f4bug@amsat.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201123204448.3260804-1-f4bug@amsat.org> References: <20201123204448.3260804-1-f4bug@amsat.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @gmail.com) LoongSIMD (formerly LoongMMI in Loongson 2E/F) is the 128-bit SIMD extension from the LoongISA. Extract ~500 lines of translation routines to 'vendor-loong-simd_translate.c.inc'. Signed-off-by: Philippe Mathieu-Daud=C3=A9 Reviewed-by: Richard Henderson Message-Id: <20201120210844.2625602-19-f4bug@amsat.org> --- target/mips/translate.c | 483 +---------------- target/mips/vendor-loong-simd_translate.c.inc | 492 ++++++++++++++++++ 2 files changed, 493 insertions(+), 482 deletions(-) create mode 100644 target/mips/vendor-loong-simd_translate.c.inc diff --git a/target/mips/translate.c b/target/mips/translate.c index ca2e79d955a..745bf9a9dd9 100644 --- a/target/mips/translate.c +++ b/target/mips/translate.c @@ -639,103 +639,6 @@ enum { OPC_BC2NEZ =3D (0x0D << 21) | OPC_CP2, }; =20 -#define MASK_LMMI(op) (MASK_OP_MAJOR(op) | (op & (0x1F << 21)) | (op & = 0x1F)) - -enum { - OPC_PADDSH =3D (24 << 21) | (0x00) | OPC_CP2, - OPC_PADDUSH =3D (25 << 21) | (0x00) | OPC_CP2, - OPC_PADDH =3D (26 << 21) | (0x00) | OPC_CP2, - OPC_PADDW =3D (27 << 21) | (0x00) | OPC_CP2, - OPC_PADDSB =3D (28 << 21) | (0x00) | OPC_CP2, - OPC_PADDUSB =3D (29 << 21) | (0x00) | OPC_CP2, - OPC_PADDB =3D (30 << 21) | (0x00) | OPC_CP2, - OPC_PADDD =3D (31 << 21) | (0x00) | OPC_CP2, - - OPC_PSUBSH =3D (24 << 21) | (0x01) | OPC_CP2, - OPC_PSUBUSH =3D (25 << 21) | (0x01) | OPC_CP2, - OPC_PSUBH =3D (26 << 21) | (0x01) | OPC_CP2, - OPC_PSUBW =3D (27 << 21) | (0x01) | OPC_CP2, - OPC_PSUBSB =3D (28 << 21) | (0x01) | OPC_CP2, - OPC_PSUBUSB =3D (29 << 21) | (0x01) | OPC_CP2, - OPC_PSUBB =3D (30 << 21) | (0x01) | OPC_CP2, - OPC_PSUBD =3D (31 << 21) | 
(0x01) | OPC_CP2, - - OPC_PSHUFH =3D (24 << 21) | (0x02) | OPC_CP2, - OPC_PACKSSWH =3D (25 << 21) | (0x02) | OPC_CP2, - OPC_PACKSSHB =3D (26 << 21) | (0x02) | OPC_CP2, - OPC_PACKUSHB =3D (27 << 21) | (0x02) | OPC_CP2, - OPC_XOR_CP2 =3D (28 << 21) | (0x02) | OPC_CP2, - OPC_NOR_CP2 =3D (29 << 21) | (0x02) | OPC_CP2, - OPC_AND_CP2 =3D (30 << 21) | (0x02) | OPC_CP2, - OPC_PANDN =3D (31 << 21) | (0x02) | OPC_CP2, - - OPC_PUNPCKLHW =3D (24 << 21) | (0x03) | OPC_CP2, - OPC_PUNPCKHHW =3D (25 << 21) | (0x03) | OPC_CP2, - OPC_PUNPCKLBH =3D (26 << 21) | (0x03) | OPC_CP2, - OPC_PUNPCKHBH =3D (27 << 21) | (0x03) | OPC_CP2, - OPC_PINSRH_0 =3D (28 << 21) | (0x03) | OPC_CP2, - OPC_PINSRH_1 =3D (29 << 21) | (0x03) | OPC_CP2, - OPC_PINSRH_2 =3D (30 << 21) | (0x03) | OPC_CP2, - OPC_PINSRH_3 =3D (31 << 21) | (0x03) | OPC_CP2, - - OPC_PAVGH =3D (24 << 21) | (0x08) | OPC_CP2, - OPC_PAVGB =3D (25 << 21) | (0x08) | OPC_CP2, - OPC_PMAXSH =3D (26 << 21) | (0x08) | OPC_CP2, - OPC_PMINSH =3D (27 << 21) | (0x08) | OPC_CP2, - OPC_PMAXUB =3D (28 << 21) | (0x08) | OPC_CP2, - OPC_PMINUB =3D (29 << 21) | (0x08) | OPC_CP2, - - OPC_PCMPEQW =3D (24 << 21) | (0x09) | OPC_CP2, - OPC_PCMPGTW =3D (25 << 21) | (0x09) | OPC_CP2, - OPC_PCMPEQH =3D (26 << 21) | (0x09) | OPC_CP2, - OPC_PCMPGTH =3D (27 << 21) | (0x09) | OPC_CP2, - OPC_PCMPEQB =3D (28 << 21) | (0x09) | OPC_CP2, - OPC_PCMPGTB =3D (29 << 21) | (0x09) | OPC_CP2, - - OPC_PSLLW =3D (24 << 21) | (0x0A) | OPC_CP2, - OPC_PSLLH =3D (25 << 21) | (0x0A) | OPC_CP2, - OPC_PMULLH =3D (26 << 21) | (0x0A) | OPC_CP2, - OPC_PMULHH =3D (27 << 21) | (0x0A) | OPC_CP2, - OPC_PMULUW =3D (28 << 21) | (0x0A) | OPC_CP2, - OPC_PMULHUH =3D (29 << 21) | (0x0A) | OPC_CP2, - - OPC_PSRLW =3D (24 << 21) | (0x0B) | OPC_CP2, - OPC_PSRLH =3D (25 << 21) | (0x0B) | OPC_CP2, - OPC_PSRAW =3D (26 << 21) | (0x0B) | OPC_CP2, - OPC_PSRAH =3D (27 << 21) | (0x0B) | OPC_CP2, - OPC_PUNPCKLWD =3D (28 << 21) | (0x0B) | OPC_CP2, - OPC_PUNPCKHWD =3D (29 << 21) | (0x0B) | OPC_CP2, - - OPC_ADDU_CP2 
=3D (24 << 21) | (0x0C) | OPC_CP2, - OPC_OR_CP2 =3D (25 << 21) | (0x0C) | OPC_CP2, - OPC_ADD_CP2 =3D (26 << 21) | (0x0C) | OPC_CP2, - OPC_DADD_CP2 =3D (27 << 21) | (0x0C) | OPC_CP2, - OPC_SEQU_CP2 =3D (28 << 21) | (0x0C) | OPC_CP2, - OPC_SEQ_CP2 =3D (29 << 21) | (0x0C) | OPC_CP2, - - OPC_SUBU_CP2 =3D (24 << 21) | (0x0D) | OPC_CP2, - OPC_PASUBUB =3D (25 << 21) | (0x0D) | OPC_CP2, - OPC_SUB_CP2 =3D (26 << 21) | (0x0D) | OPC_CP2, - OPC_DSUB_CP2 =3D (27 << 21) | (0x0D) | OPC_CP2, - OPC_SLTU_CP2 =3D (28 << 21) | (0x0D) | OPC_CP2, - OPC_SLT_CP2 =3D (29 << 21) | (0x0D) | OPC_CP2, - - OPC_SLL_CP2 =3D (24 << 21) | (0x0E) | OPC_CP2, - OPC_DSLL_CP2 =3D (25 << 21) | (0x0E) | OPC_CP2, - OPC_PEXTRH =3D (26 << 21) | (0x0E) | OPC_CP2, - OPC_PMADDHW =3D (27 << 21) | (0x0E) | OPC_CP2, - OPC_SLEU_CP2 =3D (28 << 21) | (0x0E) | OPC_CP2, - OPC_SLE_CP2 =3D (29 << 21) | (0x0E) | OPC_CP2, - - OPC_SRL_CP2 =3D (24 << 21) | (0x0F) | OPC_CP2, - OPC_DSRL_CP2 =3D (25 << 21) | (0x0F) | OPC_CP2, - OPC_SRA_CP2 =3D (26 << 21) | (0x0F) | OPC_CP2, - OPC_DSRA_CP2 =3D (27 << 21) | (0x0F) | OPC_CP2, - OPC_BIADD =3D (28 << 21) | (0x0F) | OPC_CP2, - OPC_PMOVMSKB =3D (29 << 21) | (0x0F) | OPC_CP2, -}; - - #define MASK_CP3(op) (MASK_OP_MAJOR(op) | (op & 0x3F)) =20 enum { @@ -4768,391 +4671,6 @@ static void gen_loongson_integer(DisasContext *ctx,= uint32_t opc, tcg_temp_free(t1); } =20 -/* Loongson multimedia instructions */ -static void gen_loongson_multimedia(DisasContext *ctx, int rd, int rs, int= rt) -{ - uint32_t opc, shift_max; - TCGv_i64 t0, t1; - TCGCond cond; - - opc =3D MASK_LMMI(ctx->opcode); - switch (opc) { - case OPC_ADD_CP2: - case OPC_SUB_CP2: - case OPC_DADD_CP2: - case OPC_DSUB_CP2: - t0 =3D tcg_temp_local_new_i64(); - t1 =3D tcg_temp_local_new_i64(); - break; - default: - t0 =3D tcg_temp_new_i64(); - t1 =3D tcg_temp_new_i64(); - break; - } - - check_cp1_enabled(ctx); - gen_load_fpr64(ctx, t0, rs); - gen_load_fpr64(ctx, t1, rt); - - switch (opc) { - case OPC_PADDSH: - gen_helper_paddsh(t0, 
t0, t1); - break; - case OPC_PADDUSH: - gen_helper_paddush(t0, t0, t1); - break; - case OPC_PADDH: - gen_helper_paddh(t0, t0, t1); - break; - case OPC_PADDW: - gen_helper_paddw(t0, t0, t1); - break; - case OPC_PADDSB: - gen_helper_paddsb(t0, t0, t1); - break; - case OPC_PADDUSB: - gen_helper_paddusb(t0, t0, t1); - break; - case OPC_PADDB: - gen_helper_paddb(t0, t0, t1); - break; - - case OPC_PSUBSH: - gen_helper_psubsh(t0, t0, t1); - break; - case OPC_PSUBUSH: - gen_helper_psubush(t0, t0, t1); - break; - case OPC_PSUBH: - gen_helper_psubh(t0, t0, t1); - break; - case OPC_PSUBW: - gen_helper_psubw(t0, t0, t1); - break; - case OPC_PSUBSB: - gen_helper_psubsb(t0, t0, t1); - break; - case OPC_PSUBUSB: - gen_helper_psubusb(t0, t0, t1); - break; - case OPC_PSUBB: - gen_helper_psubb(t0, t0, t1); - break; - - case OPC_PSHUFH: - gen_helper_pshufh(t0, t0, t1); - break; - case OPC_PACKSSWH: - gen_helper_packsswh(t0, t0, t1); - break; - case OPC_PACKSSHB: - gen_helper_packsshb(t0, t0, t1); - break; - case OPC_PACKUSHB: - gen_helper_packushb(t0, t0, t1); - break; - - case OPC_PUNPCKLHW: - gen_helper_punpcklhw(t0, t0, t1); - break; - case OPC_PUNPCKHHW: - gen_helper_punpckhhw(t0, t0, t1); - break; - case OPC_PUNPCKLBH: - gen_helper_punpcklbh(t0, t0, t1); - break; - case OPC_PUNPCKHBH: - gen_helper_punpckhbh(t0, t0, t1); - break; - case OPC_PUNPCKLWD: - gen_helper_punpcklwd(t0, t0, t1); - break; - case OPC_PUNPCKHWD: - gen_helper_punpckhwd(t0, t0, t1); - break; - - case OPC_PAVGH: - gen_helper_pavgh(t0, t0, t1); - break; - case OPC_PAVGB: - gen_helper_pavgb(t0, t0, t1); - break; - case OPC_PMAXSH: - gen_helper_pmaxsh(t0, t0, t1); - break; - case OPC_PMINSH: - gen_helper_pminsh(t0, t0, t1); - break; - case OPC_PMAXUB: - gen_helper_pmaxub(t0, t0, t1); - break; - case OPC_PMINUB: - gen_helper_pminub(t0, t0, t1); - break; - - case OPC_PCMPEQW: - gen_helper_pcmpeqw(t0, t0, t1); - break; - case OPC_PCMPGTW: - gen_helper_pcmpgtw(t0, t0, t1); - break; - case OPC_PCMPEQH: - 
gen_helper_pcmpeqh(t0, t0, t1); - break; - case OPC_PCMPGTH: - gen_helper_pcmpgth(t0, t0, t1); - break; - case OPC_PCMPEQB: - gen_helper_pcmpeqb(t0, t0, t1); - break; - case OPC_PCMPGTB: - gen_helper_pcmpgtb(t0, t0, t1); - break; - - case OPC_PSLLW: - gen_helper_psllw(t0, t0, t1); - break; - case OPC_PSLLH: - gen_helper_psllh(t0, t0, t1); - break; - case OPC_PSRLW: - gen_helper_psrlw(t0, t0, t1); - break; - case OPC_PSRLH: - gen_helper_psrlh(t0, t0, t1); - break; - case OPC_PSRAW: - gen_helper_psraw(t0, t0, t1); - break; - case OPC_PSRAH: - gen_helper_psrah(t0, t0, t1); - break; - - case OPC_PMULLH: - gen_helper_pmullh(t0, t0, t1); - break; - case OPC_PMULHH: - gen_helper_pmulhh(t0, t0, t1); - break; - case OPC_PMULHUH: - gen_helper_pmulhuh(t0, t0, t1); - break; - case OPC_PMADDHW: - gen_helper_pmaddhw(t0, t0, t1); - break; - - case OPC_PASUBUB: - gen_helper_pasubub(t0, t0, t1); - break; - case OPC_BIADD: - gen_helper_biadd(t0, t0); - break; - case OPC_PMOVMSKB: - gen_helper_pmovmskb(t0, t0); - break; - - case OPC_PADDD: - tcg_gen_add_i64(t0, t0, t1); - break; - case OPC_PSUBD: - tcg_gen_sub_i64(t0, t0, t1); - break; - case OPC_XOR_CP2: - tcg_gen_xor_i64(t0, t0, t1); - break; - case OPC_NOR_CP2: - tcg_gen_nor_i64(t0, t0, t1); - break; - case OPC_AND_CP2: - tcg_gen_and_i64(t0, t0, t1); - break; - case OPC_OR_CP2: - tcg_gen_or_i64(t0, t0, t1); - break; - - case OPC_PANDN: - tcg_gen_andc_i64(t0, t1, t0); - break; - - case OPC_PINSRH_0: - tcg_gen_deposit_i64(t0, t0, t1, 0, 16); - break; - case OPC_PINSRH_1: - tcg_gen_deposit_i64(t0, t0, t1, 16, 16); - break; - case OPC_PINSRH_2: - tcg_gen_deposit_i64(t0, t0, t1, 32, 16); - break; - case OPC_PINSRH_3: - tcg_gen_deposit_i64(t0, t0, t1, 48, 16); - break; - - case OPC_PEXTRH: - tcg_gen_andi_i64(t1, t1, 3); - tcg_gen_shli_i64(t1, t1, 4); - tcg_gen_shr_i64(t0, t0, t1); - tcg_gen_ext16u_i64(t0, t0); - break; - - case OPC_ADDU_CP2: - tcg_gen_add_i64(t0, t0, t1); - tcg_gen_ext32s_i64(t0, t0); - break; - case OPC_SUBU_CP2: - 
tcg_gen_sub_i64(t0, t0, t1); - tcg_gen_ext32s_i64(t0, t0); - break; - - case OPC_SLL_CP2: - shift_max =3D 32; - goto do_shift; - case OPC_SRL_CP2: - shift_max =3D 32; - goto do_shift; - case OPC_SRA_CP2: - shift_max =3D 32; - goto do_shift; - case OPC_DSLL_CP2: - shift_max =3D 64; - goto do_shift; - case OPC_DSRL_CP2: - shift_max =3D 64; - goto do_shift; - case OPC_DSRA_CP2: - shift_max =3D 64; - goto do_shift; - do_shift: - /* Make sure shift count isn't TCG undefined behaviour. */ - tcg_gen_andi_i64(t1, t1, shift_max - 1); - - switch (opc) { - case OPC_SLL_CP2: - case OPC_DSLL_CP2: - tcg_gen_shl_i64(t0, t0, t1); - break; - case OPC_SRA_CP2: - case OPC_DSRA_CP2: - /* - * Since SRA is UndefinedResult without sign-extended inputs, - * we can treat SRA and DSRA the same. - */ - tcg_gen_sar_i64(t0, t0, t1); - break; - case OPC_SRL_CP2: - /* We want to shift in zeros for SRL; zero-extend first. */ - tcg_gen_ext32u_i64(t0, t0); - /* FALLTHRU */ - case OPC_DSRL_CP2: - tcg_gen_shr_i64(t0, t0, t1); - break; - } - - if (shift_max =3D=3D 32) { - tcg_gen_ext32s_i64(t0, t0); - } - - /* Shifts larger than MAX produce zero. 
*/ - tcg_gen_setcondi_i64(TCG_COND_LTU, t1, t1, shift_max); - tcg_gen_neg_i64(t1, t1); - tcg_gen_and_i64(t0, t0, t1); - break; - - case OPC_ADD_CP2: - case OPC_DADD_CP2: - { - TCGv_i64 t2 =3D tcg_temp_new_i64(); - TCGLabel *lab =3D gen_new_label(); - - tcg_gen_mov_i64(t2, t0); - tcg_gen_add_i64(t0, t1, t2); - if (opc =3D=3D OPC_ADD_CP2) { - tcg_gen_ext32s_i64(t0, t0); - } - tcg_gen_xor_i64(t1, t1, t2); - tcg_gen_xor_i64(t2, t2, t0); - tcg_gen_andc_i64(t1, t2, t1); - tcg_temp_free_i64(t2); - tcg_gen_brcondi_i64(TCG_COND_GE, t1, 0, lab); - generate_exception(ctx, EXCP_OVERFLOW); - gen_set_label(lab); - break; - } - - case OPC_SUB_CP2: - case OPC_DSUB_CP2: - { - TCGv_i64 t2 =3D tcg_temp_new_i64(); - TCGLabel *lab =3D gen_new_label(); - - tcg_gen_mov_i64(t2, t0); - tcg_gen_sub_i64(t0, t1, t2); - if (opc =3D=3D OPC_SUB_CP2) { - tcg_gen_ext32s_i64(t0, t0); - } - tcg_gen_xor_i64(t1, t1, t2); - tcg_gen_xor_i64(t2, t2, t0); - tcg_gen_and_i64(t1, t1, t2); - tcg_temp_free_i64(t2); - tcg_gen_brcondi_i64(TCG_COND_GE, t1, 0, lab); - generate_exception(ctx, EXCP_OVERFLOW); - gen_set_label(lab); - break; - } - - case OPC_PMULUW: - tcg_gen_ext32u_i64(t0, t0); - tcg_gen_ext32u_i64(t1, t1); - tcg_gen_mul_i64(t0, t0, t1); - break; - - case OPC_SEQU_CP2: - case OPC_SEQ_CP2: - cond =3D TCG_COND_EQ; - goto do_cc_cond; - break; - case OPC_SLTU_CP2: - cond =3D TCG_COND_LTU; - goto do_cc_cond; - break; - case OPC_SLT_CP2: - cond =3D TCG_COND_LT; - goto do_cc_cond; - break; - case OPC_SLEU_CP2: - cond =3D TCG_COND_LEU; - goto do_cc_cond; - break; - case OPC_SLE_CP2: - cond =3D TCG_COND_LE; - do_cc_cond: - { - int cc =3D (ctx->opcode >> 8) & 0x7; - TCGv_i64 t64 =3D tcg_temp_new_i64(); - TCGv_i32 t32 =3D tcg_temp_new_i32(); - - tcg_gen_setcond_i64(cond, t64, t0, t1); - tcg_gen_extrl_i64_i32(t32, t64); - tcg_gen_deposit_i32(fpu_fcr31, fpu_fcr31, t32, - get_fp_bit(cc), 1); - - tcg_temp_free_i32(t32); - tcg_temp_free_i64(t64); - } - goto no_rd; - break; - default: - MIPS_INVAL("loongson_cp2"); - 
generate_exception_end(ctx, EXCP_RI); - return; - } - - gen_store_fpr64(ctx, t0, rd); - -no_rd: - tcg_temp_free_i64(t0); - tcg_temp_free_i64(t1); -} - static void gen_loongson_lswc2(DisasContext *ctx, int rt, int rs, int rd) { @@ -12939,6 +12457,7 @@ out: #include "mod-dsp_translate.c.inc" =20 #include "vendor-vr54xx_translate.c.inc" +#include "vendor-loong-simd_translate.c.inc" =20 static void decode_opc_special_r6(CPUMIPSState *env, DisasContext *ctx) { diff --git a/target/mips/vendor-loong-simd_translate.c.inc b/target/mips/ve= ndor-loong-simd_translate.c.inc new file mode 100644 index 00000000000..85b1de6c28a --- /dev/null +++ b/target/mips/vendor-loong-simd_translate.c.inc @@ -0,0 +1,492 @@ +/* + * Loongson SIMD translation routines. + * + * Copyright (c) 2004-2005 Jocelyn Mayer + * Copyright (c) 2006 Marius Groeger (FPU operations) + * Copyright (c) 2006 Thiemo Seufer (MIPS32R2 support) + * Copyright (c) 2009 CodeSourcery (MIPS16 and microMIPS support) + * Copyright (c) 2011 Richard Henderson + * + * SPDX-License-Identifier: LGPL-2.1-or-later + */ + +#define MASK_LMMI(op) (MASK_OP_MAJOR(op) | (op & (0x1F << 21)) | (op & = 0x1F)) + +/* Loongson multimedia instructions */ +enum { + OPC_PADDSH =3D (24 << 21) | (0x00) | OPC_CP2, + OPC_PADDUSH =3D (25 << 21) | (0x00) | OPC_CP2, + OPC_PADDH =3D (26 << 21) | (0x00) | OPC_CP2, + OPC_PADDW =3D (27 << 21) | (0x00) | OPC_CP2, + OPC_PADDSB =3D (28 << 21) | (0x00) | OPC_CP2, + OPC_PADDUSB =3D (29 << 21) | (0x00) | OPC_CP2, + OPC_PADDB =3D (30 << 21) | (0x00) | OPC_CP2, + OPC_PADDD =3D (31 << 21) | (0x00) | OPC_CP2, + + OPC_PSUBSH =3D (24 << 21) | (0x01) | OPC_CP2, + OPC_PSUBUSH =3D (25 << 21) | (0x01) | OPC_CP2, + OPC_PSUBH =3D (26 << 21) | (0x01) | OPC_CP2, + OPC_PSUBW =3D (27 << 21) | (0x01) | OPC_CP2, + OPC_PSUBSB =3D (28 << 21) | (0x01) | OPC_CP2, + OPC_PSUBUSB =3D (29 << 21) | (0x01) | OPC_CP2, + OPC_PSUBB =3D (30 << 21) | (0x01) | OPC_CP2, + OPC_PSUBD =3D (31 << 21) | (0x01) | OPC_CP2, + + OPC_PSHUFH =3D (24 << 21) 
| (0x02) | OPC_CP2, + OPC_PACKSSWH =3D (25 << 21) | (0x02) | OPC_CP2, + OPC_PACKSSHB =3D (26 << 21) | (0x02) | OPC_CP2, + OPC_PACKUSHB =3D (27 << 21) | (0x02) | OPC_CP2, + OPC_XOR_CP2 =3D (28 << 21) | (0x02) | OPC_CP2, + OPC_NOR_CP2 =3D (29 << 21) | (0x02) | OPC_CP2, + OPC_AND_CP2 =3D (30 << 21) | (0x02) | OPC_CP2, + OPC_PANDN =3D (31 << 21) | (0x02) | OPC_CP2, + + OPC_PUNPCKLHW =3D (24 << 21) | (0x03) | OPC_CP2, + OPC_PUNPCKHHW =3D (25 << 21) | (0x03) | OPC_CP2, + OPC_PUNPCKLBH =3D (26 << 21) | (0x03) | OPC_CP2, + OPC_PUNPCKHBH =3D (27 << 21) | (0x03) | OPC_CP2, + OPC_PINSRH_0 =3D (28 << 21) | (0x03) | OPC_CP2, + OPC_PINSRH_1 =3D (29 << 21) | (0x03) | OPC_CP2, + OPC_PINSRH_2 =3D (30 << 21) | (0x03) | OPC_CP2, + OPC_PINSRH_3 =3D (31 << 21) | (0x03) | OPC_CP2, + + OPC_PAVGH =3D (24 << 21) | (0x08) | OPC_CP2, + OPC_PAVGB =3D (25 << 21) | (0x08) | OPC_CP2, + OPC_PMAXSH =3D (26 << 21) | (0x08) | OPC_CP2, + OPC_PMINSH =3D (27 << 21) | (0x08) | OPC_CP2, + OPC_PMAXUB =3D (28 << 21) | (0x08) | OPC_CP2, + OPC_PMINUB =3D (29 << 21) | (0x08) | OPC_CP2, + + OPC_PCMPEQW =3D (24 << 21) | (0x09) | OPC_CP2, + OPC_PCMPGTW =3D (25 << 21) | (0x09) | OPC_CP2, + OPC_PCMPEQH =3D (26 << 21) | (0x09) | OPC_CP2, + OPC_PCMPGTH =3D (27 << 21) | (0x09) | OPC_CP2, + OPC_PCMPEQB =3D (28 << 21) | (0x09) | OPC_CP2, + OPC_PCMPGTB =3D (29 << 21) | (0x09) | OPC_CP2, + + OPC_PSLLW =3D (24 << 21) | (0x0A) | OPC_CP2, + OPC_PSLLH =3D (25 << 21) | (0x0A) | OPC_CP2, + OPC_PMULLH =3D (26 << 21) | (0x0A) | OPC_CP2, + OPC_PMULHH =3D (27 << 21) | (0x0A) | OPC_CP2, + OPC_PMULUW =3D (28 << 21) | (0x0A) | OPC_CP2, + OPC_PMULHUH =3D (29 << 21) | (0x0A) | OPC_CP2, + + OPC_PSRLW =3D (24 << 21) | (0x0B) | OPC_CP2, + OPC_PSRLH =3D (25 << 21) | (0x0B) | OPC_CP2, + OPC_PSRAW =3D (26 << 21) | (0x0B) | OPC_CP2, + OPC_PSRAH =3D (27 << 21) | (0x0B) | OPC_CP2, + OPC_PUNPCKLWD =3D (28 << 21) | (0x0B) | OPC_CP2, + OPC_PUNPCKHWD =3D (29 << 21) | (0x0B) | OPC_CP2, + + OPC_ADDU_CP2 =3D (24 << 21) | (0x0C) | OPC_CP2, + OPC_OR_CP2 
=3D (25 << 21) | (0x0C) | OPC_CP2, + OPC_ADD_CP2 =3D (26 << 21) | (0x0C) | OPC_CP2, + OPC_DADD_CP2 =3D (27 << 21) | (0x0C) | OPC_CP2, + OPC_SEQU_CP2 =3D (28 << 21) | (0x0C) | OPC_CP2, + OPC_SEQ_CP2 =3D (29 << 21) | (0x0C) | OPC_CP2, + + OPC_SUBU_CP2 =3D (24 << 21) | (0x0D) | OPC_CP2, + OPC_PASUBUB =3D (25 << 21) | (0x0D) | OPC_CP2, + OPC_SUB_CP2 =3D (26 << 21) | (0x0D) | OPC_CP2, + OPC_DSUB_CP2 =3D (27 << 21) | (0x0D) | OPC_CP2, + OPC_SLTU_CP2 =3D (28 << 21) | (0x0D) | OPC_CP2, + OPC_SLT_CP2 =3D (29 << 21) | (0x0D) | OPC_CP2, + + OPC_SLL_CP2 =3D (24 << 21) | (0x0E) | OPC_CP2, + OPC_DSLL_CP2 =3D (25 << 21) | (0x0E) | OPC_CP2, + OPC_PEXTRH =3D (26 << 21) | (0x0E) | OPC_CP2, + OPC_PMADDHW =3D (27 << 21) | (0x0E) | OPC_CP2, + OPC_SLEU_CP2 =3D (28 << 21) | (0x0E) | OPC_CP2, + OPC_SLE_CP2 =3D (29 << 21) | (0x0E) | OPC_CP2, + + OPC_SRL_CP2 =3D (24 << 21) | (0x0F) | OPC_CP2, + OPC_DSRL_CP2 =3D (25 << 21) | (0x0F) | OPC_CP2, + OPC_SRA_CP2 =3D (26 << 21) | (0x0F) | OPC_CP2, + OPC_DSRA_CP2 =3D (27 << 21) | (0x0F) | OPC_CP2, + OPC_BIADD =3D (28 << 21) | (0x0F) | OPC_CP2, + OPC_PMOVMSKB =3D (29 << 21) | (0x0F) | OPC_CP2, +}; + +static void gen_loongson_multimedia(DisasContext *ctx, int rd, int rs, int= rt) +{ + uint32_t opc, shift_max; + TCGv_i64 t0, t1; + TCGCond cond; + + opc =3D MASK_LMMI(ctx->opcode); + switch (opc) { + case OPC_ADD_CP2: + case OPC_SUB_CP2: + case OPC_DADD_CP2: + case OPC_DSUB_CP2: + t0 =3D tcg_temp_local_new_i64(); + t1 =3D tcg_temp_local_new_i64(); + break; + default: + t0 =3D tcg_temp_new_i64(); + t1 =3D tcg_temp_new_i64(); + break; + } + + check_cp1_enabled(ctx); + gen_load_fpr64(ctx, t0, rs); + gen_load_fpr64(ctx, t1, rt); + + switch (opc) { + case OPC_PADDSH: + gen_helper_paddsh(t0, t0, t1); + break; + case OPC_PADDUSH: + gen_helper_paddush(t0, t0, t1); + break; + case OPC_PADDH: + gen_helper_paddh(t0, t0, t1); + break; + case OPC_PADDW: + gen_helper_paddw(t0, t0, t1); + break; + case OPC_PADDSB: + gen_helper_paddsb(t0, t0, t1); + break; + case 
OPC_PADDUSB: + gen_helper_paddusb(t0, t0, t1); + break; + case OPC_PADDB: + gen_helper_paddb(t0, t0, t1); + break; + + case OPC_PSUBSH: + gen_helper_psubsh(t0, t0, t1); + break; + case OPC_PSUBUSH: + gen_helper_psubush(t0, t0, t1); + break; + case OPC_PSUBH: + gen_helper_psubh(t0, t0, t1); + break; + case OPC_PSUBW: + gen_helper_psubw(t0, t0, t1); + break; + case OPC_PSUBSB: + gen_helper_psubsb(t0, t0, t1); + break; + case OPC_PSUBUSB: + gen_helper_psubusb(t0, t0, t1); + break; + case OPC_PSUBB: + gen_helper_psubb(t0, t0, t1); + break; + + case OPC_PSHUFH: + gen_helper_pshufh(t0, t0, t1); + break; + case OPC_PACKSSWH: + gen_helper_packsswh(t0, t0, t1); + break; + case OPC_PACKSSHB: + gen_helper_packsshb(t0, t0, t1); + break; + case OPC_PACKUSHB: + gen_helper_packushb(t0, t0, t1); + break; + + case OPC_PUNPCKLHW: + gen_helper_punpcklhw(t0, t0, t1); + break; + case OPC_PUNPCKHHW: + gen_helper_punpckhhw(t0, t0, t1); + break; + case OPC_PUNPCKLBH: + gen_helper_punpcklbh(t0, t0, t1); + break; + case OPC_PUNPCKHBH: + gen_helper_punpckhbh(t0, t0, t1); + break; + case OPC_PUNPCKLWD: + gen_helper_punpcklwd(t0, t0, t1); + break; + case OPC_PUNPCKHWD: + gen_helper_punpckhwd(t0, t0, t1); + break; + + case OPC_PAVGH: + gen_helper_pavgh(t0, t0, t1); + break; + case OPC_PAVGB: + gen_helper_pavgb(t0, t0, t1); + break; + case OPC_PMAXSH: + gen_helper_pmaxsh(t0, t0, t1); + break; + case OPC_PMINSH: + gen_helper_pminsh(t0, t0, t1); + break; + case OPC_PMAXUB: + gen_helper_pmaxub(t0, t0, t1); + break; + case OPC_PMINUB: + gen_helper_pminub(t0, t0, t1); + break; + + case OPC_PCMPEQW: + gen_helper_pcmpeqw(t0, t0, t1); + break; + case OPC_PCMPGTW: + gen_helper_pcmpgtw(t0, t0, t1); + break; + case OPC_PCMPEQH: + gen_helper_pcmpeqh(t0, t0, t1); + break; + case OPC_PCMPGTH: + gen_helper_pcmpgth(t0, t0, t1); + break; + case OPC_PCMPEQB: + gen_helper_pcmpeqb(t0, t0, t1); + break; + case OPC_PCMPGTB: + gen_helper_pcmpgtb(t0, t0, t1); + break; + + case OPC_PSLLW: + gen_helper_psllw(t0, t0, t1); 
+ break; + case OPC_PSLLH: + gen_helper_psllh(t0, t0, t1); + break; + case OPC_PSRLW: + gen_helper_psrlw(t0, t0, t1); + break; + case OPC_PSRLH: + gen_helper_psrlh(t0, t0, t1); + break; + case OPC_PSRAW: + gen_helper_psraw(t0, t0, t1); + break; + case OPC_PSRAH: + gen_helper_psrah(t0, t0, t1); + break; + + case OPC_PMULLH: + gen_helper_pmullh(t0, t0, t1); + break; + case OPC_PMULHH: + gen_helper_pmulhh(t0, t0, t1); + break; + case OPC_PMULHUH: + gen_helper_pmulhuh(t0, t0, t1); + break; + case OPC_PMADDHW: + gen_helper_pmaddhw(t0, t0, t1); + break; + + case OPC_PASUBUB: + gen_helper_pasubub(t0, t0, t1); + break; + case OPC_BIADD: + gen_helper_biadd(t0, t0); + break; + case OPC_PMOVMSKB: + gen_helper_pmovmskb(t0, t0); + break; + + case OPC_PADDD: + tcg_gen_add_i64(t0, t0, t1); + break; + case OPC_PSUBD: + tcg_gen_sub_i64(t0, t0, t1); + break; + case OPC_XOR_CP2: + tcg_gen_xor_i64(t0, t0, t1); + break; + case OPC_NOR_CP2: + tcg_gen_nor_i64(t0, t0, t1); + break; + case OPC_AND_CP2: + tcg_gen_and_i64(t0, t0, t1); + break; + case OPC_OR_CP2: + tcg_gen_or_i64(t0, t0, t1); + break; + + case OPC_PANDN: + tcg_gen_andc_i64(t0, t1, t0); + break; + + case OPC_PINSRH_0: + tcg_gen_deposit_i64(t0, t0, t1, 0, 16); + break; + case OPC_PINSRH_1: + tcg_gen_deposit_i64(t0, t0, t1, 16, 16); + break; + case OPC_PINSRH_2: + tcg_gen_deposit_i64(t0, t0, t1, 32, 16); + break; + case OPC_PINSRH_3: + tcg_gen_deposit_i64(t0, t0, t1, 48, 16); + break; + + case OPC_PEXTRH: + tcg_gen_andi_i64(t1, t1, 3); + tcg_gen_shli_i64(t1, t1, 4); + tcg_gen_shr_i64(t0, t0, t1); + tcg_gen_ext16u_i64(t0, t0); + break; + + case OPC_ADDU_CP2: + tcg_gen_add_i64(t0, t0, t1); + tcg_gen_ext32s_i64(t0, t0); + break; + case OPC_SUBU_CP2: + tcg_gen_sub_i64(t0, t0, t1); + tcg_gen_ext32s_i64(t0, t0); + break; + + case OPC_SLL_CP2: + shift_max =3D 32; + goto do_shift; + case OPC_SRL_CP2: + shift_max =3D 32; + goto do_shift; + case OPC_SRA_CP2: + shift_max =3D 32; + goto do_shift; + case OPC_DSLL_CP2: + shift_max =3D 64; + 
goto do_shift; + case OPC_DSRL_CP2: + shift_max =3D 64; + goto do_shift; + case OPC_DSRA_CP2: + shift_max =3D 64; + goto do_shift; + do_shift: + /* Make sure shift count isn't TCG undefined behaviour. */ + tcg_gen_andi_i64(t1, t1, shift_max - 1); + + switch (opc) { + case OPC_SLL_CP2: + case OPC_DSLL_CP2: + tcg_gen_shl_i64(t0, t0, t1); + break; + case OPC_SRA_CP2: + case OPC_DSRA_CP2: + /* + * Since SRA is UndefinedResult without sign-extended inputs, + * we can treat SRA and DSRA the same. + */ + tcg_gen_sar_i64(t0, t0, t1); + break; + case OPC_SRL_CP2: + /* We want to shift in zeros for SRL; zero-extend first. */ + tcg_gen_ext32u_i64(t0, t0); + /* FALLTHRU */ + case OPC_DSRL_CP2: + tcg_gen_shr_i64(t0, t0, t1); + break; + } + + if (shift_max =3D=3D 32) { + tcg_gen_ext32s_i64(t0, t0); + } + + /* Shifts larger than MAX produce zero. */ + tcg_gen_setcondi_i64(TCG_COND_LTU, t1, t1, shift_max); + tcg_gen_neg_i64(t1, t1); + tcg_gen_and_i64(t0, t0, t1); + break; + + case OPC_ADD_CP2: + case OPC_DADD_CP2: + { + TCGv_i64 t2 =3D tcg_temp_new_i64(); + TCGLabel *lab =3D gen_new_label(); + + tcg_gen_mov_i64(t2, t0); + tcg_gen_add_i64(t0, t1, t2); + if (opc =3D=3D OPC_ADD_CP2) { + tcg_gen_ext32s_i64(t0, t0); + } + tcg_gen_xor_i64(t1, t1, t2); + tcg_gen_xor_i64(t2, t2, t0); + tcg_gen_andc_i64(t1, t2, t1); + tcg_temp_free_i64(t2); + tcg_gen_brcondi_i64(TCG_COND_GE, t1, 0, lab); + generate_exception(ctx, EXCP_OVERFLOW); + gen_set_label(lab); + break; + } + + case OPC_SUB_CP2: + case OPC_DSUB_CP2: + { + TCGv_i64 t2 =3D tcg_temp_new_i64(); + TCGLabel *lab =3D gen_new_label(); + + tcg_gen_mov_i64(t2, t0); + tcg_gen_sub_i64(t0, t1, t2); + if (opc =3D=3D OPC_SUB_CP2) { + tcg_gen_ext32s_i64(t0, t0); + } + tcg_gen_xor_i64(t1, t1, t2); + tcg_gen_xor_i64(t2, t2, t0); + tcg_gen_and_i64(t1, t1, t2); + tcg_temp_free_i64(t2); + tcg_gen_brcondi_i64(TCG_COND_GE, t1, 0, lab); + generate_exception(ctx, EXCP_OVERFLOW); + gen_set_label(lab); + break; + } + + case OPC_PMULUW: + tcg_gen_ext32u_i64(t0, 
t0); + tcg_gen_ext32u_i64(t1, t1); + tcg_gen_mul_i64(t0, t0, t1); + break; + + case OPC_SEQU_CP2: + case OPC_SEQ_CP2: + cond =3D TCG_COND_EQ; + goto do_cc_cond; + break; + case OPC_SLTU_CP2: + cond =3D TCG_COND_LTU; + goto do_cc_cond; + break; + case OPC_SLT_CP2: + cond =3D TCG_COND_LT; + goto do_cc_cond; + break; + case OPC_SLEU_CP2: + cond =3D TCG_COND_LEU; + goto do_cc_cond; + break; + case OPC_SLE_CP2: + cond =3D TCG_COND_LE; + do_cc_cond: + { + int cc =3D (ctx->opcode >> 8) & 0x7; + TCGv_i64 t64 =3D tcg_temp_new_i64(); + TCGv_i32 t32 =3D tcg_temp_new_i32(); + + tcg_gen_setcond_i64(cond, t64, t0, t1); + tcg_gen_extrl_i64_i32(t32, t64); + tcg_gen_deposit_i32(fpu_fcr31, fpu_fcr31, t32, + get_fp_bit(cc), 1); + + tcg_temp_free_i32(t32); + tcg_temp_free_i64(t64); + } + goto no_rd; + break; + default: + MIPS_INVAL("loongson_cp2"); + generate_exception_end(ctx, EXCP_RI); + return; + } + + gen_store_fpr64(ctx, t0, rd); + +no_rd: + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); +} --=20 2.26.2
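
Note on the OPC_ADD_CP2 / OPC_DADD_CP2 case moved by this patch: the xor/xor/andc sequence followed by a TCG_COND_GE branch is a sign-bit overflow test. The plain-C model below is only an illustrative sketch of that test for the 64-bit case; the function name add64_overflows is hypothetical and is not part of QEMU or of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical host-side model of the overflow check emitted for
 * OPC_DADD_CP2 (illustration only, not QEMU code).
 *
 * Signed addition overflows exactly when both operands have the same
 * sign and the result has the opposite sign. Only bit 63 of the
 * intermediate values matters.
 */
static bool add64_overflows(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t sum = ua + ub;          /* wraps modulo 2^64, like tcg_gen_add_i64 */

    uint64_t signs_differ = ua ^ ub; /* bit 63 set: operands have different signs */
    uint64_t sign_flipped = ua ^ sum; /* bit 63 set: result sign differs from a */

    /* Same as the andc step: sign_flipped & ~signs_differ, then test bit 63. */
    return ((sign_flipped & ~signs_differ) >> 63) != 0;
}
```

In the translated code the same condition appears inverted: the branch to the label is taken when the combined value is non-negative (no overflow), and EXCP_OVERFLOW is raised otherwise.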