From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Christoph Müllner
Subject: [PATCH 01/65] riscv: thead: Add th.sxstatus CSR emulation
Date: Fri, 12 Apr 2024 15:36:31 +0800
Message-ID: <20240412073735.76413-2-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

From: Christoph Müllner

The th.sxstatus CSR can be used to identify the custom extensions
available on T-Head CPUs. The CSR is documented here:
  https://github.com/T-head-Semi/thead-extension-spec/pull/46

An important property of this patch is that the th.sxstatus MAEE field
is not set (indicating that XTheadMaee is not available).
XTheadMaee is a memory attribute extension (similar to Svpbmt) which is
implemented in many T-Head CPUs (C906, C910, etc.) and utilizes bits
in PTEs that are marked as reserved. QEMU maintainers prefer not to
implement XTheadMaee, so we need to give kernels a mechanism to identify
whether XTheadMaee is available in a system. This patch introduces
that mechanism in QEMU in a way that's compatible with real HW
(i.e., probing the th.sxstatus.MAEE bit).
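
For illustration only, a guest could interpret a value read from
th.sxstatus roughly as follows. This is a minimal sketch and not part of
the patch: the helper name th_sxstatus_has_maee() is hypothetical, the bit
positions mirror the defines in th_csr.c, and the CSR read itself (CSR
0x5c0, from S-mode or above) is left out so the snippet builds on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions mirroring the TH_SXSTATUS_* defines in th_csr.c. */
#define TH_SXSTATUS_UCME        (UINT64_C(1) << 16)
#define TH_SXSTATUS_MAEE        (UINT64_C(1) << 21)
#define TH_SXSTATUS_THEADISAEE  (UINT64_C(1) << 22)

/* Hypothetical helper: true if the th.sxstatus value reports XTheadMaee. */
static bool th_sxstatus_has_maee(uint64_t sxstatus)
{
    return (sxstatus & TH_SXSTATUS_MAEE) != 0;
}
```

With the value emulated by this patch (UCME | THEADISAEE), the helper
returns false, which is the signal a kernel would use to avoid the
XTheadMaee PTE bits.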

Further context can be found on the list:
https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg00775.html

Signed-off-by: Christoph Müllner
---
 target/riscv/cpu.c       |  1 +
 target/riscv/cpu.h       |  3 ++
 target/riscv/meson.build |  1 +
 target/riscv/th_csr.c    | 78 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+)
 create mode 100644 target/riscv/th_csr.c

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 193dbf7fe8..46a66cdbbb 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -545,6 +545,7 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
 #ifndef CONFIG_USER_ONLY
     set_satp_mode_max_supported(cpu, VM_1_10_SV39);
+    th_register_custom_csrs(cpu);
 #endif
 
     /* inherited from parent obj via riscv_cpu_init() */
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 48e67410e1..8d0b500758 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -825,4 +825,7 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState *cs);
 uint8_t satp_mode_max_from_map(uint32_t map);
 const char *satp_mode_str(uint8_t satp_mode, bool is_32_bit);
 
+/* Implemented in th_csr.c */
+void th_register_custom_csrs(RISCVCPU *cpu);
+
 #endif /* RISCV_CPU_H */
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index a5e0734e7f..a4bd61e52a 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -33,6 +33,7 @@ riscv_system_ss.add(files(
   'monitor.c',
   'machine.c',
   'pmu.c',
+  'th_csr.c',
   'time_helper.c',
   'riscv-qmp-cmds.c',
 ))
diff --git a/target/riscv/th_csr.c b/target/riscv/th_csr.c
new file mode 100644
index 0000000000..66d260cabd
--- /dev/null
+++ b/target/riscv/th_csr.c
@@ -0,0 +1,78 @@
+/*
+ * T-Head-specific CSRs.
+ *
+ * Copyright (c) 2024 VRULL GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "cpu_vendorid.h"
+
+#define CSR_TH_SXSTATUS 0x5c0
+
+/* TH_SXSTATUS bits */
+#define TH_SXSTATUS_UCME BIT(16)
+#define TH_SXSTATUS_MAEE BIT(21)
+#define TH_SXSTATUS_THEADISAEE BIT(22)
+
+typedef struct {
+    int csrno;
+    int (*insertion_test)(RISCVCPU *cpu);
+    riscv_csr_operations csr_ops;
+} riscv_csr;
+
+static RISCVException s_mode_csr(CPURISCVState *env, int csrno)
+{
+    if (env->debugger)
+        return RISCV_EXCP_NONE;
+
+    if (env->priv >= PRV_S)
+        return RISCV_EXCP_NONE;
+
+    return RISCV_EXCP_ILLEGAL_INST;
+}
+
+static int test_thead_mvendorid(RISCVCPU *cpu)
+{
+    if (cpu->cfg.mvendorid != THEAD_VENDOR_ID)
+        return -1;
+    return 0;
+}
+
+static RISCVException read_th_sxstatus(CPURISCVState *env, int csrno,
+                                       target_ulong *val)
+{
+    /* We don't set MAEE here, because QEMU does not implement MAEE. */
+    *val = TH_SXSTATUS_UCME | TH_SXSTATUS_THEADISAEE;
+    return RISCV_EXCP_NONE;
+}
+
+static riscv_csr th_csr_list[] = {
+    {
+        .csrno = CSR_TH_SXSTATUS,
+        .insertion_test = test_thead_mvendorid,
+        .csr_ops = { "th.sxstatus", s_mode_csr, read_th_sxstatus }
+    }
+};
+
+void th_register_custom_csrs(RISCVCPU *cpu)
+{
+    for (size_t i = 0; i < ARRAY_SIZE(th_csr_list); i++) {
+        int csrno = th_csr_list[i].csrno;
+        riscv_csr_operations *csr_ops = &th_csr_list[i].csr_ops;
+        if (!th_csr_list[i].insertion_test(cpu))
+            riscv_set_csr_ops(csrno, csr_ops);
+    }
+}
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 02/65] target/riscv: Reuse th_csr.c to add user-mode csrs
Date: Fri, 12 Apr 2024 15:36:32 +0800
Message-ID: <20240412073735.76413-3-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The previous patch added th_csr.c to implement the th.sxstatus CSR for
XTheadMaee. However, it only supports system-mode vendor CSRs.
This patch changes how th_csr.c is compiled and how
th_register_custom_csrs is called: th_csr.c is now built for both
configurations, and '#if !defined(CONFIG_USER_ONLY)' inside th_csr.c
guards the system-mode-only parts, so that both user-mode and system-mode
vendor CSRs can be supported.

Signed-off-by: Huang Tao
---
 target/riscv/cpu.c       |  2 +-
 target/riscv/meson.build |  2 +-
 target/riscv/th_csr.c    | 21 +++++++++++++--------
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 46a66cdbbb..3f21c976ba 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -545,8 +545,8 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
 #ifndef CONFIG_USER_ONLY
     set_satp_mode_max_supported(cpu, VM_1_10_SV39);
-    th_register_custom_csrs(cpu);
 #endif
+    th_register_custom_csrs(cpu);
 
     /* inherited from parent obj via riscv_cpu_init() */
     cpu->cfg.pmp = true;
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index a4bd61e52a..b01a6cfb23 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -12,6 +12,7 @@ riscv_ss.add(files(
   'cpu.c',
   'cpu_helper.c',
   'csr.c',
+  'th_csr.c',
   'fpu_helper.c',
   'gdbstub.c',
   'op_helper.c',
@@ -33,7 +34,6 @@ riscv_system_ss.add(files(
   'monitor.c',
   'machine.c',
   'pmu.c',
-  'th_csr.c',
   'time_helper.c',
   'riscv-qmp-cmds.c',
 ))
diff --git a/target/riscv/th_csr.c b/target/riscv/th_csr.c
index 66d260cabd..dc087b1ffa 100644
--- a/target/riscv/th_csr.c
+++ b/target/riscv/th_csr.c
@@ -33,6 +33,15 @@ typedef struct {
     riscv_csr_operations csr_ops;
 } riscv_csr;
 
+static int test_thead_mvendorid(RISCVCPU *cpu)
+{
+    if (cpu->cfg.mvendorid != THEAD_VENDOR_ID) {
+        return -1;
+    }
+    return 0;
+}
+
+#if !defined(CONFIG_USER_ONLY)
 static RISCVException s_mode_csr(CPURISCVState *env, int csrno)
 {
     if (env->debugger)
@@ -44,13 +53,6 @@ static RISCVException s_mode_csr(CPURISCVState *env, int csrno)
     return RISCV_EXCP_ILLEGAL_INST;
 }
 
-static int test_thead_mvendorid(RISCVCPU *cpu)
-{
-    if (cpu->cfg.mvendorid != THEAD_VENDOR_ID)
-        return -1;
-    return 0;
-}
-
 static RISCVException read_th_sxstatus(CPURISCVState *env, int csrno,
                                        target_ulong *val)
 {
@@ -58,13 +60,16 @@ static RISCVException read_th_sxstatus(CPURISCVState *env, int csrno,
     *val = TH_SXSTATUS_UCME | TH_SXSTATUS_THEADISAEE;
     return RISCV_EXCP_NONE;
 }
+#endif
 
 static riscv_csr th_csr_list[] = {
+#if !defined(CONFIG_USER_ONLY)
     {
         .csrno = CSR_TH_SXSTATUS,
         .insertion_test = test_thead_mvendorid,
         .csr_ops = { "th.sxstatus", s_mode_csr, read_th_sxstatus }
-    }
+    },
+#endif
 };
 
 void th_register_custom_csrs(RISCVCPU *cpu)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 03/65] target/riscv: Add properties for XTheadVector extension
Date: Fri, 12 Apr 2024 15:36:33 +0800
Message-ID: <20240412073735.76413-4-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

Add the ext_xtheadvector property.
As a first step, this patch adds ext_xtheadvector to RISCVCPUConfig for
XTheadVector.
In rv64_thead_c906_cpu_init, ext_xtheadvector is set to false so that the
extension does not affect other extensions while it is not fully
implemented.

Signed-off-by: Huang Tao
---
 target/riscv/cpu.c         |  3 +++
 target/riscv/cpu_cfg.h     |  2 ++
 target/riscv/cpu_helper.c  |  2 +-
 target/riscv/tcg/tcg-cpu.c | 33 +++++++++++++++++++++++++++++++++
 4 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 3f21c976ba..05652e8c87 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -201,6 +201,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
     ISA_EXT_DATA_ENTRY(xtheadmemidx, PRIV_VERSION_1_11_0, ext_xtheadmemidx),
     ISA_EXT_DATA_ENTRY(xtheadmempair, PRIV_VERSION_1_11_0, ext_xtheadmempair),
     ISA_EXT_DATA_ENTRY(xtheadsync, PRIV_VERSION_1_11_0, ext_xtheadsync),
+    ISA_EXT_DATA_ENTRY(xtheadvector, PRIV_VERSION_1_11_0, ext_xtheadvector),
     ISA_EXT_DATA_ENTRY(xventanacondops, PRIV_VERSION_1_12_0, ext_XVentanaCondOps),
 
     DEFINE_PROP_END_OF_LIST(),
@@ -541,6 +542,7 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.ext_xtheadmemidx = true;
     cpu->cfg.ext_xtheadmempair = true;
     cpu->cfg.ext_xtheadsync = true;
+    cpu->cfg.ext_xtheadvector = false;
 
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
 #ifndef CONFIG_USER_ONLY
@@ -1567,6 +1569,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
     MULTI_EXT_CFG_BOOL("xtheadmemidx", ext_xtheadmemidx, false),
     MULTI_EXT_CFG_BOOL("xtheadmempair", ext_xtheadmempair, false),
     MULTI_EXT_CFG_BOOL("xtheadsync", ext_xtheadsync, false),
+    MULTI_EXT_CFG_BOOL("xtheadvector", ext_xtheadvector, false),
     MULTI_EXT_CFG_BOOL("xventanacondops", ext_XVentanaCondOps, false),
 
     DEFINE_PROP_END_OF_LIST(),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index cb750154bd..da85e94e04 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -149,6 +149,7 @@ struct RISCVCPUConfig {
     bool ext_xtheadmemidx;
     bool ext_xtheadmempair;
     bool ext_xtheadsync;
+    bool ext_xtheadvector;
     bool ext_XVentanaCondOps;
 
     uint32_t pmu_mask;
@@ -205,6 +206,7 @@ MATERIALISE_EXT_PREDICATE(xtheadmac)
 MATERIALISE_EXT_PREDICATE(xtheadmemidx)
 MATERIALISE_EXT_PREDICATE(xtheadmempair)
 MATERIALISE_EXT_PREDICATE(xtheadsync)
+MATERIALISE_EXT_PREDICATE(xtheadvector)
 MATERIALISE_EXT_PREDICATE(XVentanaCondOps)
 
 #endif
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index fc090d729a..5882b65321 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -72,7 +72,7 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr *pc,
     *pc = env->xl == MXL_RV32 ? env->pc & UINT32_MAX : env->pc;
     *cs_base = 0;
 
-    if (cpu->cfg.ext_zve32f) {
+    if (cpu->cfg.ext_zve32f || cpu->cfg.ext_xtheadvector) {
         /*
          * If env->vl equals to VLMAX, we can use generic vector operation
          * expanders (GVEC) to accerlate the vector operations.
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 483774e4f8..f7a105b30e 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -281,6 +281,25 @@ static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg,
     }
 }
 
+static void th_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg,
+                              Error **errp)
+{
+    uint32_t vlen = cfg->vlenb << 3;
+
+    if (vlen < 32) {
+        error_setg(errp,
+                   "In XTheadVector extension, VLEN must be "
+                   "greater than or equal to 32");
+    }
+
+    if (vlen < cfg->elen) {
+        error_setg(errp,
+                   "In XTheadVector extension, VLEN must be "
+                   "greater than or equal to ELEN");
+        return;
+    }
+}
+
 static void riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU *cpu)
 {
     CPURISCVState *env = &cpu->env;
@@ -485,6 +504,20 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
         return;
     }
 
+    if (cpu->cfg.ext_xtheadvector && riscv_has_ext(env, RVV)) {
+        error_setg(errp, "XTheadVector extension is incompatible with "
+                         "RVV extension");
+        return;
+    }
+
+    if (cpu->cfg.ext_xtheadvector) {
+        th_cpu_validate_v(env, &cpu->cfg, &local_err);
+        if (local_err != NULL) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+
     if (riscv_has_ext(env, RVV)) {
         riscv_cpu_validate_v(env, &cpu->cfg, &local_err);
         if (local_err != NULL) {
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 04/65] target/riscv: Override some csr ops for XTheadVector
Date: Fri, 12 Apr 2024 15:36:34 +0800
Message-ID: <20240412073735.76413-5-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

Some CSR operations behave differently when XTheadVector is enabled.
This patch overrides the standard RISC-V implementation of these CSRs
with a vendor implementation.
Additionally, it uses the decorator pattern to make the behavioral
differences between XTheadVector and the RISC-V standard explicit.
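
As a rough illustration of the decorator idea (not the code from this
patch), a vendor read op can call the standard op and then layer the
vendor-specific bits on top. The sketch below assumes that fcsr carries
vxsat and vxrm at the TH_FSR_VXSAT_SHIFT and TH_FSR_VXRM_SHIFT positions
added to cpu_bits.h; DemoEnv, demo_read_fcsr() and demo_th_read_fcsr() are
hypothetical stand-ins for CPURISCVState and the real CSR callbacks.

```c
#include <stdint.h>

/* Shift values as added to cpu_bits.h by this patch. */
#define TH_FSR_VXRM_SHIFT   9
#define TH_FSR_VXSAT_SHIFT  8

/* Hypothetical, simplified stand-in for CPURISCVState. */
typedef struct {
    uint64_t fcsr_std;  /* value the standard fcsr read would return */
    uint64_t vxrm;      /* fixed-point rounding mode, 2 bits */
    uint64_t vxsat;     /* fixed-point saturation flag, 1 bit */
} DemoEnv;

/* Stand-in for the standard read op. */
static int demo_read_fcsr(DemoEnv *env, uint64_t *val)
{
    *val = env->fcsr_std;
    return 0;
}

/* Vendor "decorator": reuse the standard op, then add the vendor fields. */
static int demo_th_read_fcsr(DemoEnv *env, uint64_t *val)
{
    int ret = demo_read_fcsr(env, val);
    if (ret != 0) {
        return ret;
    }
    *val |= (env->vxrm & 0x3) << TH_FSR_VXRM_SHIFT;
    *val |= (env->vxsat & 0x1) << TH_FSR_VXSAT_SHIFT;
    return 0;
}
```

Keeping the standard op untouched and wrapping it means only the delta
against the RISC-V standard lives in the vendor file.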

Signed-off-by: Huang Tao
---
 target/riscv/cpu.h      |  36 +++++++++
 target/riscv/cpu_bits.h |  18 +++++
 target/riscv/csr.c      |  42 +++++-----
 target/riscv/th_csr.c   | 169 +++++++++++++++++++++++++++++++++++++++-
 4 files changed, 243 insertions(+), 22 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 8d0b500758..6558e652df 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -813,6 +813,42 @@ void riscv_add_satp_mode_properties(Object *obj);
 bool riscv_cpu_accelerator_compatible(RISCVCPU *cpu);
 
 /* CSR function table */
+RISCVException fs(CPURISCVState *env, int csrno);
+RISCVException vs(CPURISCVState *env, int csrno);
+RISCVException any(CPURISCVState *env, int csrno);
+RISCVException smode(CPURISCVState *env, int csrno);
+RISCVException read_fcsr(CPURISCVState *env, int csrno,
+                         target_ulong *val);
+RISCVException write_fcsr(CPURISCVState *env, int csrno,
+                          target_ulong val);
+RISCVException read_vtype(CPURISCVState *env, int csrno,
+                          target_ulong *val);
+RISCVException read_vl(CPURISCVState *env, int csrno,
+                       target_ulong *val);
+RISCVException read_vlenb(CPURISCVState *env, int csrno,
+                          target_ulong *val);
+RISCVException read_vxrm(CPURISCVState *env, int csrno,
+                         target_ulong *val);
+RISCVException write_vxrm(CPURISCVState *env, int csrno,
+                          target_ulong val);
+RISCVException read_vxsat(CPURISCVState *env, int csrno,
+                          target_ulong *val);
+RISCVException write_vxsat(CPURISCVState *env, int csrno,
+                           target_ulong val);
+RISCVException read_vstart(CPURISCVState *env, int csrno,
+                           target_ulong *val);
+RISCVException write_vstart(CPURISCVState *env, int csrno,
+                            target_ulong val);
+RISCVException read_mstatus(CPURISCVState *env, int csrno,
+                            target_ulong *val);
+RISCVException write_mstatus(CPURISCVState *env, int csrno,
+                             target_ulong val);
+RISCVException write_sstatus(CPURISCVState *env, int csrno,
+                             target_ulong val);
+RISCVException read_sstatus(CPURISCVState *env, int csrno,
+                            target_ulong *val);
+RISCVException read_vcsr(CPURISCVState *env, int csrno, target_ulong *val);
+RISCVException write_vcsr(CPURISCVState *env, int csrno, target_ulong val);
 extern riscv_csr_operations csr_ops[CSR_TABLE_SIZE];
 
 extern const bool valid_vm_1_10_32[], valid_vm_1_10_64[];
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index fc2068ee4d..5a5e0ed444 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -904,4 +904,22 @@ typedef enum RISCVException {
 #define MCONTEXT64 0x0000000000001FFFULL
 #define MCONTEXT32_HCONTEXT 0x0000007F
 #define MCONTEXT64_HCONTEXT 0x0000000000003FFFULL
+
+/* Xuantie custom CSRs */
+#define TH_MSTATUS_VS 0x01800000
+
+#define TH_FSR_VXRM_SHIFT 9
+#define TH_FSR_VXRM (0x3 << TH_FSR_VXRM_SHIFT)
+
+#define TH_FSR_VXSAT_SHIFT 8
+#define TH_FSR_VXSAT (0x1 << TH_FSR_VXSAT_SHIFT)
+
+#define TH_VTYPE_LMUL_SHIFT 0
+#define TH_VTYPE_LMUL (0x3 << TH_VTYPE_LMUL_SHIFT)
+
+#define TH_VTYPE_SEW_SHIFT 2
+#define TH_VTYPE_SEW (0x7 << TH_VTYPE_SEW_SHIFT)
+
+#define TH_VTYPE_CLEAR_SHIFT 5
+#define TH_VTYPE_CLEAR (0x7 << TH_VTYPE_CLEAR_SHIFT)
 #endif
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 726096444f..797929d5b9 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -76,7 +76,7 @@ RISCVException smstateen_acc_ok(CPURISCVState *env, int index, uint64_t bit)
 }
 #endif
 
-static RISCVException fs(CPURISCVState *env, int csrno)
+RISCVException fs(CPURISCVState *env, int csrno)
 {
 #if !defined(CONFIG_USER_ONLY)
     if (!env->debugger && !riscv_cpu_fp_enabled(env) &&
@@ -91,7 +91,7 @@ static RISCVException fs(CPURISCVState *env, int csrno)
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException vs(CPURISCVState *env, int csrno)
+RISCVException vs(CPURISCVState *env, int csrno)
 {
     if (riscv_cpu_cfg(env)->ext_zve32f) {
 #if !defined(CONFIG_USER_ONLY)
@@ -227,7 +227,7 @@ static RISCVException sscofpmf(CPURISCVState *env, int csrno)
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException any(CPURISCVState *env, int csrno)
+RISCVException any(CPURISCVState *env, int csrno)
 {
     return RISCV_EXCP_NONE;
 }
@@ -260,7 +260,7 @@ static RISCVException aia_any32(CPURISCVState *env, int csrno)
     return any32(env, csrno);
 }
 
-static RISCVException smode(CPURISCVState *env, int csrno)
+RISCVException smode(CPURISCVState *env, int csrno)
 {
     if (riscv_has_ext(env, RVS)) {
         return RISCV_EXCP_NONE;
@@ -635,7 +635,7 @@ static RISCVException write_frm(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException read_fcsr(CPURISCVState *env, int csrno,
+RISCVException read_fcsr(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
     *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
@@ -643,7 +643,7 @@ static RISCVException read_fcsr(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException write_fcsr(CPURISCVState *env, int csrno,
+RISCVException write_fcsr(CPURISCVState *env, int csrno,
                                  target_ulong val)
 {
 #if !defined(CONFIG_USER_ONLY)
@@ -656,7 +656,7 @@ static RISCVException write_fcsr(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException read_vtype(CPURISCVState *env, int csrno,
+RISCVException read_vtype(CPURISCVState *env, int csrno,
                                  target_ulong *val)
 {
     uint64_t vill;
@@ -674,28 +674,28 @@ static RISCVException read_vtype(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException read_vl(CPURISCVState *env, int csrno,
+RISCVException read_vl(CPURISCVState *env, int csrno,
                               target_ulong *val)
 {
     *val = env->vl;
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException read_vlenb(CPURISCVState *env, int csrno,
+RISCVException read_vlenb(CPURISCVState *env, int csrno,
                                  target_ulong *val)
 {
     *val = riscv_cpu_cfg(env)->vlenb;
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException read_vxrm(CPURISCVState *env, int csrno,
+RISCVException read_vxrm(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
     *val = env->vxrm;
     return RISCV_EXCP_NONE;
 }
 
-static RISCVException write_vxrm(CPURISCVState *env, int csrno,
+RISCVException
write_vxrm(CPURISCVState *env, int csrno, target_ulong val) { #if !defined(CONFIG_USER_ONLY) @@ -705,14 +705,14 @@ static RISCVException write_vxrm(CPURISCVState *env, = int csrno, return RISCV_EXCP_NONE; } =20 -static RISCVException read_vxsat(CPURISCVState *env, int csrno, +RISCVException read_vxsat(CPURISCVState *env, int csrno, target_ulong *val) { *val =3D env->vxsat; return RISCV_EXCP_NONE; } =20 -static RISCVException write_vxsat(CPURISCVState *env, int csrno, +RISCVException write_vxsat(CPURISCVState *env, int csrno, target_ulong val) { #if !defined(CONFIG_USER_ONLY) @@ -722,14 +722,14 @@ static RISCVException write_vxsat(CPURISCVState *env,= int csrno, return RISCV_EXCP_NONE; } =20 -static RISCVException read_vstart(CPURISCVState *env, int csrno, +RISCVException read_vstart(CPURISCVState *env, int csrno, target_ulong *val) { *val =3D env->vstart; return RISCV_EXCP_NONE; } =20 -static RISCVException write_vstart(CPURISCVState *env, int csrno, +RISCVException write_vstart(CPURISCVState *env, int csrno, target_ulong val) { #if !defined(CONFIG_USER_ONLY) @@ -743,14 +743,14 @@ static RISCVException write_vstart(CPURISCVState *env= , int csrno, return RISCV_EXCP_NONE; } =20 -static RISCVException read_vcsr(CPURISCVState *env, int csrno, +RISCVException read_vcsr(CPURISCVState *env, int csrno, target_ulong *val) { *val =3D (env->vxrm << VCSR_VXRM_SHIFT) | (env->vxsat << VCSR_VXSAT_SH= IFT); return RISCV_EXCP_NONE; } =20 -static RISCVException write_vcsr(CPURISCVState *env, int csrno, +RISCVException write_vcsr(CPURISCVState *env, int csrno, target_ulong val) { #if !defined(CONFIG_USER_ONLY) @@ -1286,7 +1286,7 @@ static uint64_t add_status_sd(RISCVMXL xl, uint64_t s= tatus) return status; } =20 -static RISCVException read_mstatus(CPURISCVState *env, int csrno, +RISCVException read_mstatus(CPURISCVState *env, int csrno, target_ulong *val) { *val =3D add_status_sd(riscv_cpu_mxl(env), env->mstatus); @@ -1351,7 +1351,7 @@ static target_ulong legalize_mpp(CPURISCVState 
*env, = target_ulong old_mpp, return val; } =20 -static RISCVException write_mstatus(CPURISCVState *env, int csrno, +RISCVException write_mstatus(CPURISCVState *env, int csrno, target_ulong val) { uint64_t mstatus =3D env->mstatus; @@ -2639,7 +2639,7 @@ static RISCVException read_sstatus_i128(CPURISCVState= *env, int csrno, return RISCV_EXCP_NONE; } =20 -static RISCVException read_sstatus(CPURISCVState *env, int csrno, +RISCVException read_sstatus(CPURISCVState *env, int csrno, target_ulong *val) { target_ulong mask =3D (sstatus_v1_10_mask); @@ -2651,7 +2651,7 @@ static RISCVException read_sstatus(CPURISCVState *env= , int csrno, return RISCV_EXCP_NONE; } =20 -static RISCVException write_sstatus(CPURISCVState *env, int csrno, +RISCVException write_sstatus(CPURISCVState *env, int csrno, target_ulong val) { target_ulong mask =3D (sstatus_v1_10_mask); diff --git a/target/riscv/th_csr.c b/target/riscv/th_csr.c index dc087b1ffa..4fc488fc10 100644 --- a/target/riscv/th_csr.c +++ b/target/riscv/th_csr.c @@ -41,7 +41,124 @@ static int test_thead_mvendorid(RISCVCPU *cpu) return 0; } =20 +/* + * In XTheadVector, vcsr is inaccessible + * and we just check ext_xtheadvector instead of ext_zve32f + */ +static RISCVException th_vs(CPURISCVState *env, int csrno) +{ + RISCVCPU *cpu =3D env_archcpu(env); + if (cpu->cfg.ext_xtheadvector) { + if (csrno =3D=3D CSR_VCSR) { + return RISCV_EXCP_ILLEGAL_INST; + } + return RISCV_EXCP_NONE; + } + return vs(env, csrno); +} + +static RISCVException +th_read_fcsr(CPURISCVState *env, int csrno, target_ulong *val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + RISCVException ret =3D read_fcsr(env, csrno, val); + if (cpu->cfg.ext_xtheadvector) { + *val =3D set_field(*val, TH_FSR_VXRM, env->vxrm); + *val =3D set_field(*val, TH_FSR_VXSAT, env->vxsat); + } + return ret; +} + +static RISCVException +th_write_fcsr(CPURISCVState *env, int csrno, target_ulong val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + if (cpu->cfg.ext_xtheadvector) { + env->vxrm =3D 
get_field(val, TH_FSR_VXRM); + env->vxsat =3D get_field(val, TH_FSR_VXSAT); + } + return write_fcsr(env, csrno, val); +} + +/* + * We use the RVV1.0 format for env->vtype + * When reading vtype, we need to change the format. + * In RVV1.0: + * vtype[7] -> vma + * vtype[6] -> vta + * vtype[5:3] -> vsew + * vtype[2:0] -> vlmul + * In XTheadVector: + * vtype[6:5] -> vediv + * vtype[4:2] -> vsew + * vtype[1:0] -> vlmul + * Although vlmul size is different between RVV1.0 and XTheadVector, + * the lower 2 bits have the same meaning. + * vma, vta and vediv are useless in XTheadVector, So we need to clear + * vtype[7:5] for XTheadVector + */ +static RISCVException +th_read_vtype(CPURISCVState *env, int csrno, target_ulong *val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + RISCVException ret =3D read_vtype(env, csrno, val); + if (cpu->cfg.ext_xtheadvector) { + *val =3D set_field(*val, TH_VTYPE_LMUL, + FIELD_EX64(*val, VTYPE, VLMUL)); + *val =3D set_field(*val, TH_VTYPE_SEW, + FIELD_EX64(*val, VTYPE, VSEW)); + *val =3D set_field(*val, TH_VTYPE_CLEAR, 0); + } + return ret; +} + #if !defined(CONFIG_USER_ONLY) +static RISCVException +th_read_mstatus(CPURISCVState *env, int csrno, target_ulong *val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + RISCVException ret =3D read_mstatus(env, csrno, val); + if (cpu->cfg.ext_xtheadvector) { + *val =3D set_field(*val, TH_MSTATUS_VS, + get_field(*val, MSTATUS_VS)); + } + return ret; +} + +static RISCVException +th_write_mstatus(CPURISCVState *env, int csrno, target_ulong val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + if (cpu->cfg.ext_xtheadvector) { + val =3D set_field(val, MSTATUS_VS, + get_field(val, TH_MSTATUS_VS)); + } + return write_mstatus(env, csrno, val); +} + +static RISCVException +th_read_sstatus(CPURISCVState *env, int csrno, target_ulong *val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + RISCVException ret =3D read_sstatus(env, csrno, val); + if (cpu->cfg.ext_xtheadvector) { + *val =3D set_field(*val, TH_MSTATUS_VS, + 
get_field(*val, MSTATUS_VS)); + } + return ret; +} + +static RISCVException +th_write_sstatus(CPURISCVState *env, int csrno, target_ulong val) +{ + RISCVCPU *cpu =3D env_archcpu(env); + if (cpu->cfg.ext_xtheadvector) { + val =3D set_field(val, MSTATUS_VS, + get_field(val, TH_MSTATUS_VS)); + } + return write_sstatus(env, csrno, val); +} + static RISCVException s_mode_csr(CPURISCVState *env, int csrno) { if (env->debugger) @@ -63,13 +180,63 @@ static RISCVException read_th_sxstatus(CPURISCVState *= env, int csrno, #endif =20 static riscv_csr th_csr_list[] =3D { + { + .csrno =3D CSR_FCSR, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "fcsr", fs, th_read_fcsr, th_write_fcsr } + }, + { + .csrno =3D CSR_VSTART, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vstart", th_vs, read_vstart, write_vstart } + }, + { + .csrno =3D CSR_VXSAT, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vxsat", th_vs, read_vxsat, write_vxsat } + }, + { + .csrno =3D CSR_VXRM, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vxrm", th_vs, read_vxrm, write_vxrm } + }, + { + .csrno =3D CSR_VCSR, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vcsr", th_vs, read_vcsr, write_vcsr } + }, + { + .csrno =3D CSR_VL, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vl", th_vs, read_vl} + }, + { + .csrno =3D CSR_VTYPE, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vtype", th_vs, th_read_vtype} + }, + { + .csrno =3D CSR_VLENB, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "vlenb", th_vs, read_vlenb} + }, #if !defined(CONFIG_USER_ONLY) + { + .csrno =3D CSR_MSTATUS, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "mstatus", any, th_read_mstatus, th_write_mstatus} + }, + { + .csrno =3D CSR_SSTATUS, + .insertion_test =3D test_thead_mvendorid, + .csr_ops =3D { "sstatus", smode, th_read_sstatus, th_write_sstatus} + }, { .csrno =3D CSR_TH_SXSTATUS, .insertion_test =3D test_thead_mvendorid, 
.csr_ops =3D { "th.sxstatus", s_mode_csr, read_th_sxstatus } }, -#endif +#endif /* !CONFIG_USER_ONLY */ }; =20 void th_register_custom_csrs(RISCVCPU *cpu) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908155; cv=none; d=zohomail.com; s=zohoarc; b=DJgfwk/23gIop+BadWzYXvB+RFesZoENq2p13yqnltoiPVbu6/UTJWffoh2aAbdkB0EFeQOilli0pwZyAAmnSyHheBI55Douz2sssOjkWwuD6ne01LTmHzLdladnYmJyQUYwiy31ryU/+lUMO5XMh+ItegDnmnhqRPSA5JqDA2A= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712908155; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=TPNW4bmOn9CGG7L/DlfyVOtIHAJR8iGvrg4tz/S9gjQ=; b=Aaxt+ozbfPYVW8+xOCmdNfZWgn5DRuGek9j3jTofMiXAMnjE4R33Y3jsZ7X7GyKnrWkVTChlAtpr/7Q7EeHvEmhFnzRC0RKc0RsAaU4GaZR4k3nGZTY48B9BnYczV47MabzoT6WrLdZ8JvFaasN/lAjFZaSpOftCNZ1Xu10A09s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712908155943263.4576263778622; Fri, 12 Apr 2024 00:49:15 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBeK-0004OW-S3; Fri, 12 Apr 2024 03:48:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBeH-0004Nq-MO; Fri, 12 Apr 2024 03:48:37 -0400 Received: from out30-110.freemail.mail.aliyun.com ([115.124.30.110]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBeF-0001Je-It; Fri, 12 Apr 2024 03:48:37 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nbehz_1712908107) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 15:48:28 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712908109; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=TPNW4bmOn9CGG7L/DlfyVOtIHAJR8iGvrg4tz/S9gjQ=; b=qQqTP8myMhEjkFfzxjMl5hqHKTrI4W60uh44/aYcvD5yRub1hpVQadkVcjI1YnswOdOqzaBb7WtN0PgDYr22mYN4gGs0gPkGOHrby3u20DZDXmTBahOr6s1nWQXh6/Vsrxuw8JK0niddujdbUj8fPwfZcm8cdGafxeP7TE7j1iI= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R421e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045168; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nbehz_1712908107; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 05/65] target/riscv: Add mlen in DisasContext Date: Fri, 12 Apr 2024 15:36:35 +0800 Message-ID: <20240412073735.76413-6-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.110; 
envelope-from=eric.huang@linux.alibaba.com; helo=out30-110.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712908157135100001 Content-Type: text/plain; charset="utf-8"

The mask register layout of XTheadVector is different from that of RVV1.0.
In RVV1.0, the mask bit for element i is located at bit[i] of the mask
register. In XTheadVector, the mask bit for element i is located at
bit[MLEN*i] of the mask register (MLEN = SEW/LMUL).
So we add mlen to DisasContext to indicate the mask bit position and to
avoid recomputing MLEN.
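
As an illustration of the layout difference, the sketch below computes
MLEN in the same way as the ctx->mlen assignment in this patch and
locates the mask bit of an element. The helper names (demo_mlen,
demo_mask_bit) and the representation of the mask register as an array
of 64-bit words are assumptions made for the example; sew and lmul are
assumed to be the log2-encoded values (SEW = 8 << sew, LMUL = 1 << lmul).

```c
#include <assert.h>
#include <stdint.h>

/* MLEN = SEW / LMUL, with SEW = 8 << sew and LMUL = 1 << lmul. */
static unsigned demo_mlen(unsigned sew, unsigned lmul)
{
    assert(sew + 3 >= lmul);        /* SEW >= LMUL, so MLEN >= 1 */
    return 1u << (sew + 3 - lmul);
}

/*
 * Read the mask bit of element i from a mask register stored as an
 * array of 64-bit words (word 0 holds bits 0..63).
 * Pass mlen = 1 to get the RVV1.0 layout.
 */
static int demo_mask_bit(const uint64_t *v0, unsigned mlen, unsigned i)
{
    unsigned pos = mlen * i;
    return (int)((v0[pos / 64] >> (pos % 64)) & 1);
}
```

With SEW=32 and LMUL=2 (sew = 2, lmul = 1) this gives MLEN = 16, so
element 3 is masked by bit 48 rather than bit 3.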

Signed-off-by: Huang Tao
---
 target/riscv/translate.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 7eb8c9cd31..a22fdb59df 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -106,6 +106,7 @@ typedef struct DisasContext {
     bool cfg_vta_all_1s;
     bool vstart_eq_zero;
     bool vl_eq_vlmax;
+    uint16_t mlen;
     CPUState *cs;
     TCGv zero;
     /* PointerMasking extension */
@@ -1207,6 +1208,9 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->zero = tcg_constant_tl(0);
     ctx->virt_inst_excp = false;
     ctx->decoders = cpu->decoders;
+    if (cpu->cfg.ext_xtheadvector) {
+        ctx->mlen = 1 << (ctx->sew + 3 - ctx->lmul);
+    }
 }
 
 static void riscv_tr_tb_start(DisasContextBase *db, CPUState *cpu)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908260; cv=none; d=zohomail.com; s=zohoarc; b=QerpnC37I2hBqsPzGWpBXWQTbiAu2M+CuUHCr//HrCOBY5gFsBUjQGZ4r1CyY65uD7DFfawcNAlaQKi6nph7Gi9+nlbxQTM3SUmg3F8JnXVwypENUch6jMfZJzqai07rfEaQrEW9YhRBUbqv+B34SiWLemBuLY5M6UX3AGZCeaI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712908260; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=vxUo2nmvOWUv6Z6G4IdxVKdLu32j+OWL1akdN/FKH8k=; b=aYnPf62tAUsSa2f+MhYU2joIl5BZG9+hKk1D3E10SmixQs/QdPEPn9MgdaoaDwYCBRCgY4qyHRXfOo3fwxd9/uUPIdhv4/sOTbWwEoweBRqhyR+hEtpveT2OnaZyBHVx5Y9avmu9qI+yBvXq/NAkqXC0Alhc4TTNvHl89QEcgDg= ARC-Authentication-Results: i=1; 
mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 171290826022569.30777939177187; Fri, 12 Apr 2024 00:51:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBgL-00058W-2G; Fri, 12 Apr 2024 03:50:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBgI-00058B-VW; Fri, 12 Apr 2024 03:50:43 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBgD-0001oO-UO; Fri, 12 Apr 2024 03:50:42 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NaySc_1712908228) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 15:50:29 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712908231; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=vxUo2nmvOWUv6Z6G4IdxVKdLu32j+OWL1akdN/FKH8k=; b=URTwriJR4sIUQNwZ+yV2Ltm0CHC6rIqKuAw3CSjxBsjkHMG1TRsT4DtHFv9rJfI04KXBF/GJpO8tCrr2SxJGx2OCgb7W2Mcc7PBRmfPzcLo7O/2A6YD/fz4fq0LMAh5cHx+I8euOy8Hisk8NCiViRc9woz2hFdtW3mnRRwGPc9Y= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R111e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046060; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NaySc_1712908228; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: 
[PATCH 06/65] target/riscv: Implement insns decode rules for XTheadVector Date: Fri, 12 Apr 2024 15:36:36 +0800 Message-ID: <20240412073735.76413-7-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.133; envelope-from=eric.huang@linux.alibaba.com; helo=out30-133.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712908261504100003 Content-Type: text/plain; charset="utf-8"

In this patch, we implement the decode rules for the XTheadVector
instructions in xtheadvector.decode.
To avoid compile failures, we add trans_ functions in
trans_xtheadvector.c.inc as placeholders.
We also add decode_xtheadvector to decoder_table to support dynamic
building of decoders.
There is no performance impact on standard decoding, because
decode_xtheadvector is not added to the decode function array when
ext_xtheadvector is false.

Signed-off-by: Huang Tao
---
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 384 +++++++++++++++++
 target/riscv/meson.build                      |   1 +
 target/riscv/translate.c                      |   3 +
 target/riscv/xtheadvector.decode              | 390 ++++++++++++++++++
 4 files changed, 778 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_xtheadvector.c.inc
 create mode 100644 target/riscv/xtheadvector.decode

diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
new file mode 100644
index 0000000000..2dd77d74ab
--- /dev/null
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -0,0 +1,384 @@
+/*
+ * RISC-V translation routines for the XTheadVector Extension.
+ *
+ * Copyright (c) 2024 Alibaba Group. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see .
+ */ +#include "tcg/tcg-op-gvec.h" +#include "tcg/tcg-gvec-desc.h" +#include "internals.h" + +static bool require_xtheadvector(DisasContext *s) +{ + return s->cfg_ptr->ext_xtheadvector && + s->mstatus_vs !=3D EXT_STATUS_DISABLED; +} + +#define TH_TRANS_STUB(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ +{ \ + return require_xtheadvector(s); \ +} + +TH_TRANS_STUB(th_vsetvli) +TH_TRANS_STUB(th_vsetvl) +TH_TRANS_STUB(th_vlb_v) +TH_TRANS_STUB(th_vlh_v) +TH_TRANS_STUB(th_vlw_v) +TH_TRANS_STUB(th_vle_v) +TH_TRANS_STUB(th_vlbu_v) +TH_TRANS_STUB(th_vlhu_v) +TH_TRANS_STUB(th_vlwu_v) +TH_TRANS_STUB(th_vlbff_v) +TH_TRANS_STUB(th_vlhff_v) +TH_TRANS_STUB(th_vlwff_v) +TH_TRANS_STUB(th_vleff_v) +TH_TRANS_STUB(th_vlbuff_v) +TH_TRANS_STUB(th_vlhuff_v) +TH_TRANS_STUB(th_vlwuff_v) +TH_TRANS_STUB(th_vsb_v) +TH_TRANS_STUB(th_vsh_v) +TH_TRANS_STUB(th_vsw_v) +TH_TRANS_STUB(th_vse_v) +TH_TRANS_STUB(th_vlsb_v) +TH_TRANS_STUB(th_vlsh_v) +TH_TRANS_STUB(th_vlsw_v) +TH_TRANS_STUB(th_vlse_v) +TH_TRANS_STUB(th_vlsbu_v) +TH_TRANS_STUB(th_vlshu_v) +TH_TRANS_STUB(th_vlswu_v) +TH_TRANS_STUB(th_vssb_v) +TH_TRANS_STUB(th_vssh_v) +TH_TRANS_STUB(th_vssw_v) +TH_TRANS_STUB(th_vsse_v) +TH_TRANS_STUB(th_vlxb_v) +TH_TRANS_STUB(th_vlxh_v) +TH_TRANS_STUB(th_vlxw_v) +TH_TRANS_STUB(th_vlxe_v) +TH_TRANS_STUB(th_vlxbu_v) +TH_TRANS_STUB(th_vlxhu_v) +TH_TRANS_STUB(th_vlxwu_v) +TH_TRANS_STUB(th_vsxb_v) +TH_TRANS_STUB(th_vsxh_v) +TH_TRANS_STUB(th_vsxw_v) +TH_TRANS_STUB(th_vsxe_v) +TH_TRANS_STUB(th_vamoswapw_v) +TH_TRANS_STUB(th_vamoaddw_v) +TH_TRANS_STUB(th_vamoxorw_v) +TH_TRANS_STUB(th_vamoandw_v) +TH_TRANS_STUB(th_vamoorw_v) +TH_TRANS_STUB(th_vamominw_v) +TH_TRANS_STUB(th_vamomaxw_v) +TH_TRANS_STUB(th_vamominuw_v) +TH_TRANS_STUB(th_vamomaxuw_v) +TH_TRANS_STUB(th_vamoswapd_v) +TH_TRANS_STUB(th_vamoaddd_v) +TH_TRANS_STUB(th_vamoxord_v) +TH_TRANS_STUB(th_vamoandd_v) +TH_TRANS_STUB(th_vamoord_v) +TH_TRANS_STUB(th_vamomind_v) +TH_TRANS_STUB(th_vamomaxd_v) +TH_TRANS_STUB(th_vamominud_v) 
+TH_TRANS_STUB(th_vamomaxud_v) +TH_TRANS_STUB(th_vadd_vv) +TH_TRANS_STUB(th_vadd_vx) +TH_TRANS_STUB(th_vadd_vi) +TH_TRANS_STUB(th_vsub_vv) +TH_TRANS_STUB(th_vsub_vx) +TH_TRANS_STUB(th_vrsub_vx) +TH_TRANS_STUB(th_vrsub_vi) +TH_TRANS_STUB(th_vwaddu_vv) +TH_TRANS_STUB(th_vwaddu_vx) +TH_TRANS_STUB(th_vwadd_vv) +TH_TRANS_STUB(th_vwadd_vx) +TH_TRANS_STUB(th_vwsubu_vv) +TH_TRANS_STUB(th_vwsubu_vx) +TH_TRANS_STUB(th_vwsub_vv) +TH_TRANS_STUB(th_vwsub_vx) +TH_TRANS_STUB(th_vwaddu_wv) +TH_TRANS_STUB(th_vwaddu_wx) +TH_TRANS_STUB(th_vwadd_wv) +TH_TRANS_STUB(th_vwadd_wx) +TH_TRANS_STUB(th_vwsubu_wv) +TH_TRANS_STUB(th_vwsubu_wx) +TH_TRANS_STUB(th_vwsub_wv) +TH_TRANS_STUB(th_vwsub_wx) +TH_TRANS_STUB(th_vadc_vvm) +TH_TRANS_STUB(th_vadc_vxm) +TH_TRANS_STUB(th_vadc_vim) +TH_TRANS_STUB(th_vmadc_vvm) +TH_TRANS_STUB(th_vmadc_vxm) +TH_TRANS_STUB(th_vmadc_vim) +TH_TRANS_STUB(th_vsbc_vvm) +TH_TRANS_STUB(th_vsbc_vxm) +TH_TRANS_STUB(th_vmsbc_vvm) +TH_TRANS_STUB(th_vmsbc_vxm) +TH_TRANS_STUB(th_vand_vv) +TH_TRANS_STUB(th_vand_vx) +TH_TRANS_STUB(th_vand_vi) +TH_TRANS_STUB(th_vor_vv) +TH_TRANS_STUB(th_vor_vx) +TH_TRANS_STUB(th_vor_vi) +TH_TRANS_STUB(th_vxor_vv) +TH_TRANS_STUB(th_vxor_vx) +TH_TRANS_STUB(th_vxor_vi) +TH_TRANS_STUB(th_vsll_vv) +TH_TRANS_STUB(th_vsll_vx) +TH_TRANS_STUB(th_vsll_vi) +TH_TRANS_STUB(th_vsrl_vv) +TH_TRANS_STUB(th_vsrl_vx) +TH_TRANS_STUB(th_vsrl_vi) +TH_TRANS_STUB(th_vsra_vv) +TH_TRANS_STUB(th_vsra_vx) +TH_TRANS_STUB(th_vsra_vi) +TH_TRANS_STUB(th_vnsrl_vv) +TH_TRANS_STUB(th_vnsrl_vx) +TH_TRANS_STUB(th_vnsrl_vi) +TH_TRANS_STUB(th_vnsra_vv) +TH_TRANS_STUB(th_vnsra_vx) +TH_TRANS_STUB(th_vnsra_vi) +TH_TRANS_STUB(th_vmseq_vv) +TH_TRANS_STUB(th_vmseq_vx) +TH_TRANS_STUB(th_vmseq_vi) +TH_TRANS_STUB(th_vmsne_vv) +TH_TRANS_STUB(th_vmsne_vx) +TH_TRANS_STUB(th_vmsne_vi) +TH_TRANS_STUB(th_vmsltu_vv) +TH_TRANS_STUB(th_vmsltu_vx) +TH_TRANS_STUB(th_vmslt_vv) +TH_TRANS_STUB(th_vmslt_vx) +TH_TRANS_STUB(th_vmsleu_vv) +TH_TRANS_STUB(th_vmsleu_vx) +TH_TRANS_STUB(th_vmsleu_vi) 
+TH_TRANS_STUB(th_vmsle_vv) +TH_TRANS_STUB(th_vmsle_vx) +TH_TRANS_STUB(th_vmsle_vi) +TH_TRANS_STUB(th_vmsgtu_vx) +TH_TRANS_STUB(th_vmsgtu_vi) +TH_TRANS_STUB(th_vmsgt_vx) +TH_TRANS_STUB(th_vmsgt_vi) +TH_TRANS_STUB(th_vminu_vv) +TH_TRANS_STUB(th_vminu_vx) +TH_TRANS_STUB(th_vmin_vv) +TH_TRANS_STUB(th_vmin_vx) +TH_TRANS_STUB(th_vmaxu_vv) +TH_TRANS_STUB(th_vmaxu_vx) +TH_TRANS_STUB(th_vmax_vv) +TH_TRANS_STUB(th_vmax_vx) +TH_TRANS_STUB(th_vmul_vv) +TH_TRANS_STUB(th_vmul_vx) +TH_TRANS_STUB(th_vmulh_vv) +TH_TRANS_STUB(th_vmulh_vx) +TH_TRANS_STUB(th_vmulhu_vv) +TH_TRANS_STUB(th_vmulhu_vx) +TH_TRANS_STUB(th_vmulhsu_vv) +TH_TRANS_STUB(th_vmulhsu_vx) +TH_TRANS_STUB(th_vdivu_vv) +TH_TRANS_STUB(th_vdivu_vx) +TH_TRANS_STUB(th_vdiv_vv) +TH_TRANS_STUB(th_vdiv_vx) +TH_TRANS_STUB(th_vremu_vv) +TH_TRANS_STUB(th_vremu_vx) +TH_TRANS_STUB(th_vrem_vv) +TH_TRANS_STUB(th_vrem_vx) +TH_TRANS_STUB(th_vwmulu_vv) +TH_TRANS_STUB(th_vwmulu_vx) +TH_TRANS_STUB(th_vwmulsu_vv) +TH_TRANS_STUB(th_vwmulsu_vx) +TH_TRANS_STUB(th_vwmul_vv) +TH_TRANS_STUB(th_vwmul_vx) +TH_TRANS_STUB(th_vmacc_vv) +TH_TRANS_STUB(th_vmacc_vx) +TH_TRANS_STUB(th_vnmsac_vv) +TH_TRANS_STUB(th_vnmsac_vx) +TH_TRANS_STUB(th_vmadd_vv) +TH_TRANS_STUB(th_vmadd_vx) +TH_TRANS_STUB(th_vnmsub_vv) +TH_TRANS_STUB(th_vnmsub_vx) +TH_TRANS_STUB(th_vwmaccu_vv) +TH_TRANS_STUB(th_vwmaccu_vx) +TH_TRANS_STUB(th_vwmacc_vv) +TH_TRANS_STUB(th_vwmacc_vx) +TH_TRANS_STUB(th_vwmaccsu_vv) +TH_TRANS_STUB(th_vwmaccsu_vx) +TH_TRANS_STUB(th_vwmaccus_vx) +TH_TRANS_STUB(th_vmv_v_v) +TH_TRANS_STUB(th_vmv_v_x) +TH_TRANS_STUB(th_vmv_v_i) +TH_TRANS_STUB(th_vmerge_vvm) +TH_TRANS_STUB(th_vmerge_vxm) +TH_TRANS_STUB(th_vmerge_vim) +TH_TRANS_STUB(th_vsaddu_vv) +TH_TRANS_STUB(th_vsaddu_vx) +TH_TRANS_STUB(th_vsaddu_vi) +TH_TRANS_STUB(th_vsadd_vv) +TH_TRANS_STUB(th_vsadd_vx) +TH_TRANS_STUB(th_vsadd_vi) +TH_TRANS_STUB(th_vssubu_vv) +TH_TRANS_STUB(th_vssubu_vx) +TH_TRANS_STUB(th_vssub_vv) +TH_TRANS_STUB(th_vssub_vx) +TH_TRANS_STUB(th_vaadd_vv) +TH_TRANS_STUB(th_vaadd_vx) 
+TH_TRANS_STUB(th_vaadd_vi) +TH_TRANS_STUB(th_vasub_vv) +TH_TRANS_STUB(th_vasub_vx) +TH_TRANS_STUB(th_vsmul_vv) +TH_TRANS_STUB(th_vsmul_vx) +TH_TRANS_STUB(th_vwsmaccu_vv) +TH_TRANS_STUB(th_vwsmaccu_vx) +TH_TRANS_STUB(th_vwsmacc_vv) +TH_TRANS_STUB(th_vwsmacc_vx) +TH_TRANS_STUB(th_vwsmaccsu_vv) +TH_TRANS_STUB(th_vwsmaccsu_vx) +TH_TRANS_STUB(th_vwsmaccus_vx) +TH_TRANS_STUB(th_vssrl_vv) +TH_TRANS_STUB(th_vssrl_vx) +TH_TRANS_STUB(th_vssrl_vi) +TH_TRANS_STUB(th_vssra_vv) +TH_TRANS_STUB(th_vssra_vx) +TH_TRANS_STUB(th_vssra_vi) +TH_TRANS_STUB(th_vnclipu_vv) +TH_TRANS_STUB(th_vnclipu_vx) +TH_TRANS_STUB(th_vnclipu_vi) +TH_TRANS_STUB(th_vnclip_vv) +TH_TRANS_STUB(th_vnclip_vx) +TH_TRANS_STUB(th_vnclip_vi) +TH_TRANS_STUB(th_vfadd_vv) +TH_TRANS_STUB(th_vfadd_vf) +TH_TRANS_STUB(th_vfsub_vv) +TH_TRANS_STUB(th_vfsub_vf) +TH_TRANS_STUB(th_vfrsub_vf) +TH_TRANS_STUB(th_vfwadd_vv) +TH_TRANS_STUB(th_vfwadd_vf) +TH_TRANS_STUB(th_vfwadd_wv) +TH_TRANS_STUB(th_vfwadd_wf) +TH_TRANS_STUB(th_vfwsub_vv) +TH_TRANS_STUB(th_vfwsub_vf) +TH_TRANS_STUB(th_vfwsub_wv) +TH_TRANS_STUB(th_vfwsub_wf) +TH_TRANS_STUB(th_vfmul_vv) +TH_TRANS_STUB(th_vfmul_vf) +TH_TRANS_STUB(th_vfdiv_vv) +TH_TRANS_STUB(th_vfdiv_vf) +TH_TRANS_STUB(th_vfrdiv_vf) +TH_TRANS_STUB(th_vfwmul_vv) +TH_TRANS_STUB(th_vfwmul_vf) +TH_TRANS_STUB(th_vfmacc_vv) +TH_TRANS_STUB(th_vfnmacc_vv) +TH_TRANS_STUB(th_vfnmacc_vf) +TH_TRANS_STUB(th_vfmacc_vf) +TH_TRANS_STUB(th_vfmsac_vv) +TH_TRANS_STUB(th_vfmsac_vf) +TH_TRANS_STUB(th_vfnmsac_vv) +TH_TRANS_STUB(th_vfnmsac_vf) +TH_TRANS_STUB(th_vfmadd_vv) +TH_TRANS_STUB(th_vfmadd_vf) +TH_TRANS_STUB(th_vfnmadd_vv) +TH_TRANS_STUB(th_vfnmadd_vf) +TH_TRANS_STUB(th_vfmsub_vv) +TH_TRANS_STUB(th_vfmsub_vf) +TH_TRANS_STUB(th_vfnmsub_vv) +TH_TRANS_STUB(th_vfnmsub_vf) +TH_TRANS_STUB(th_vfwmacc_vv) +TH_TRANS_STUB(th_vfwmacc_vf) +TH_TRANS_STUB(th_vfwnmacc_vv) +TH_TRANS_STUB(th_vfwnmacc_vf) +TH_TRANS_STUB(th_vfwmsac_vv) +TH_TRANS_STUB(th_vfwmsac_vf) +TH_TRANS_STUB(th_vfwnmsac_vv) +TH_TRANS_STUB(th_vfwnmsac_vf) 
+TH_TRANS_STUB(th_vfsqrt_v) +TH_TRANS_STUB(th_vfmin_vv) +TH_TRANS_STUB(th_vfmin_vf) +TH_TRANS_STUB(th_vfmax_vv) +TH_TRANS_STUB(th_vfmax_vf) +TH_TRANS_STUB(th_vfsgnj_vv) +TH_TRANS_STUB(th_vfsgnj_vf) +TH_TRANS_STUB(th_vfsgnjn_vv) +TH_TRANS_STUB(th_vfsgnjn_vf) +TH_TRANS_STUB(th_vfsgnjx_vv) +TH_TRANS_STUB(th_vfsgnjx_vf) +TH_TRANS_STUB(th_vmfeq_vv) +TH_TRANS_STUB(th_vmfeq_vf) +TH_TRANS_STUB(th_vmfne_vv) +TH_TRANS_STUB(th_vmfne_vf) +TH_TRANS_STUB(th_vmflt_vv) +TH_TRANS_STUB(th_vmflt_vf) +TH_TRANS_STUB(th_vmfle_vv) +TH_TRANS_STUB(th_vmfle_vf) +TH_TRANS_STUB(th_vmfgt_vf) +TH_TRANS_STUB(th_vmfge_vf) +TH_TRANS_STUB(th_vmford_vv) +TH_TRANS_STUB(th_vmford_vf) +TH_TRANS_STUB(th_vfclass_v) +TH_TRANS_STUB(th_vfmerge_vfm) +TH_TRANS_STUB(th_vfmv_v_f) +TH_TRANS_STUB(th_vfcvt_xu_f_v) +TH_TRANS_STUB(th_vfcvt_x_f_v) +TH_TRANS_STUB(th_vfcvt_f_xu_v) +TH_TRANS_STUB(th_vfcvt_f_x_v) +TH_TRANS_STUB(th_vfwcvt_xu_f_v) +TH_TRANS_STUB(th_vfwcvt_x_f_v) +TH_TRANS_STUB(th_vfwcvt_f_xu_v) +TH_TRANS_STUB(th_vfwcvt_f_x_v) +TH_TRANS_STUB(th_vfwcvt_f_f_v) +TH_TRANS_STUB(th_vfncvt_xu_f_v) +TH_TRANS_STUB(th_vfncvt_x_f_v) +TH_TRANS_STUB(th_vfncvt_f_xu_v) +TH_TRANS_STUB(th_vfncvt_f_x_v) +TH_TRANS_STUB(th_vfncvt_f_f_v) +TH_TRANS_STUB(th_vredsum_vs) +TH_TRANS_STUB(th_vredand_vs) +TH_TRANS_STUB(th_vredor_vs) +TH_TRANS_STUB(th_vredxor_vs) +TH_TRANS_STUB(th_vredminu_vs) +TH_TRANS_STUB(th_vredmin_vs) +TH_TRANS_STUB(th_vredmaxu_vs) +TH_TRANS_STUB(th_vredmax_vs) +TH_TRANS_STUB(th_vwredsumu_vs) +TH_TRANS_STUB(th_vwredsum_vs) +TH_TRANS_STUB(th_vfredsum_vs) +TH_TRANS_STUB(th_vfredmin_vs) +TH_TRANS_STUB(th_vfredmax_vs) +TH_TRANS_STUB(th_vfwredsum_vs) +TH_TRANS_STUB(th_vmand_mm) +TH_TRANS_STUB(th_vmnand_mm) +TH_TRANS_STUB(th_vmandnot_mm) +TH_TRANS_STUB(th_vmxor_mm) +TH_TRANS_STUB(th_vmor_mm) +TH_TRANS_STUB(th_vmnor_mm) +TH_TRANS_STUB(th_vmornot_mm) +TH_TRANS_STUB(th_vmxnor_mm) +TH_TRANS_STUB(th_vmpopc_m) +TH_TRANS_STUB(th_vmfirst_m) +TH_TRANS_STUB(th_vmsbf_m) +TH_TRANS_STUB(th_vmsif_m) +TH_TRANS_STUB(th_vmsof_m) 
+TH_TRANS_STUB(th_viota_m) +TH_TRANS_STUB(th_vid_v) +TH_TRANS_STUB(th_vext_x_v) +TH_TRANS_STUB(th_vmv_s_x) +TH_TRANS_STUB(th_vfmv_f_s) +TH_TRANS_STUB(th_vfmv_s_f) +TH_TRANS_STUB(th_vslideup_vx) +TH_TRANS_STUB(th_vslideup_vi) +TH_TRANS_STUB(th_vslide1up_vx) +TH_TRANS_STUB(th_vslidedown_vx) +TH_TRANS_STUB(th_vslidedown_vi) +TH_TRANS_STUB(th_vslide1down_vx) +TH_TRANS_STUB(th_vrgather_vv) +TH_TRANS_STUB(th_vrgather_vx) +TH_TRANS_STUB(th_vrgather_vi) +TH_TRANS_STUB(th_vcompress_vm) diff --git a/target/riscv/meson.build b/target/riscv/meson.build index b01a6cfb23..1207ba84ed 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -3,6 +3,7 @@ gen =3D [ decodetree.process('insn16.decode', extra_args: ['--static-decode=3Ddeco= de_insn16', '--insnwidth=3D16']), decodetree.process('insn32.decode', extra_args: '--static-decode=3Ddecod= e_insn32'), decodetree.process('xthead.decode', extra_args: '--static-decode=3Ddecod= e_xthead'), + decodetree.process('xtheadvector.decode', extra_args: '--static-decode= =3Ddecode_xtheadvector'), decodetree.process('XVentanaCondOps.decode', extra_args: '--static-decod= e=3Ddecode_XVentanaCodeOps'), ] =20 diff --git a/target/riscv/translate.c b/target/riscv/translate.c index a22fdb59df..ddc6dcb45f 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -1113,6 +1113,8 @@ static uint32_t opcode_at(DisasContextBase *dcbase, t= arget_ulong pc) #include "insn_trans/trans_rvbf16.c.inc" #include "decode-xthead.c.inc" #include "insn_trans/trans_xthead.c.inc" +#include "decode-xtheadvector.c.inc" +#include "insn_trans/trans_xtheadvector.c.inc" #include "insn_trans/trans_xventanacondops.c.inc" =20 /* Include the auto-generated decoder for 16 bit insn */ @@ -1131,6 +1133,7 @@ static inline int insn_len(uint16_t first_word) } =20 const RISCVDecoder decoder_table[] =3D { + { has_xtheadvector_p, decode_xtheadvector }, { always_true_p, decode_insn32 }, { has_xthead_p, decode_xthead}, { has_XVentanaCondOps_p, 
decode_XVentanaCodeOps}, diff --git a/target/riscv/xtheadvector.decode b/target/riscv/xtheadvector.d= ecode new file mode 100644 index 0000000000..47cbf9e24a --- /dev/null +++ b/target/riscv/xtheadvector.decode @@ -0,0 +1,390 @@ +%nf 29:3 !function=3Dex_plus_1 +%rs2 20:5 +%rs1 15:5 +%rd 7:5 +# +# +&r !extern rd rs1 rs2 +&rmrr !extern vm rd rs1 rs2 +&rmr !extern vm rd rs2 +&r2nfvm !extern vm rd rs1 nf +&rnfvm !extern vm rd rs1 rs2 nf +&rwdvm vm wd rd rs1 rs2 +# +@r ....... ..... ..... ... ..... ....... &r %rs2 %r= s1 %rd +@r2 ....... ..... ..... ... ..... ....... %rs1 %rd +@r2_nfvm ... ... vm:1 ..... ..... ... ..... ....... &r2nfvm %nf %rs1 %rd +@r_nfvm ... ... vm:1 ..... ..... ... ..... ....... &rnfvm %nf %rs2 %rs1 %= rd +@r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... &rwdvm %rs2 %rs1 %rd +@r_vm ...... vm:1 ..... ..... ... ..... ....... &rmrr %rs2 %rs1 %rd +@r_vm_1 ...... . ..... ..... ... ..... ....... &rmrr vm=3D1 %rs2 %rs1 = %rd +@r_vm_0 ...... . ..... ..... ... ..... ....... &rmrr vm=3D0 %rs2 %rs1 = %rd +@r2_vm ...... vm:1 ..... ..... ... ..... ....... &rmr %rs2 %rd +@r1_vm ...... vm:1 ..... ..... ... ..... ....... %rd +@r2rd ....... ..... ..... ... ..... ....... %rs2 %rd +@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd +# +# *** RV32V Extension *** + +th_vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm +th_vsetvl 1000000 ..... ..... 111 ..... 1010111 @r + +# *** Vector loads and stores are encoded within LOADFP/STORE-FP *** +th_vlb_v ... 100 . 00000 ..... 000 ..... 0000111 @r2_nfvm +th_vlh_v ... 100 . 00000 ..... 101 ..... 0000111 @r2_nfvm +th_vlw_v ... 100 . 00000 ..... 110 ..... 0000111 @r2_nfvm +th_vle_v ... 000 . 00000 ..... 111 ..... 0000111 @r2_nfvm +th_vlbu_v ... 000 . 00000 ..... 000 ..... 0000111 @r2_nfvm +th_vlhu_v ... 000 . 00000 ..... 101 ..... 0000111 @r2_nfvm +th_vlwu_v ... 000 . 00000 ..... 110 ..... 0000111 @r2_nfvm +th_vlbff_v ... 100 . 10000 ..... 000 ..... 0000111 @r2_nfvm +th_vlhff_v ... 100 . 10000 ..... 101 ..... 
0000111 @r2_nfvm +th_vlwff_v ... 100 . 10000 ..... 110 ..... 0000111 @r2_nfvm +th_vleff_v ... 000 . 10000 ..... 111 ..... 0000111 @r2_nfvm +th_vlbuff_v ... 000 . 10000 ..... 000 ..... 0000111 @r2_nfvm +th_vlhuff_v ... 000 . 10000 ..... 101 ..... 0000111 @r2_nfvm +th_vlwuff_v ... 000 . 10000 ..... 110 ..... 0000111 @r2_nfvm +th_vsb_v ... 000 . 00000 ..... 000 ..... 0100111 @r2_nfvm +th_vsh_v ... 000 . 00000 ..... 101 ..... 0100111 @r2_nfvm +th_vsw_v ... 000 . 00000 ..... 110 ..... 0100111 @r2_nfvm +th_vse_v ... 000 . 00000 ..... 111 ..... 0100111 @r2_nfvm + +th_vlsb_v ... 110 . ..... ..... 000 ..... 0000111 @r_nfvm +th_vlsh_v ... 110 . ..... ..... 101 ..... 0000111 @r_nfvm +th_vlsw_v ... 110 . ..... ..... 110 ..... 0000111 @r_nfvm +th_vlse_v ... 010 . ..... ..... 111 ..... 0000111 @r_nfvm +th_vlsbu_v ... 010 . ..... ..... 000 ..... 0000111 @r_nfvm +th_vlshu_v ... 010 . ..... ..... 101 ..... 0000111 @r_nfvm +th_vlswu_v ... 010 . ..... ..... 110 ..... 0000111 @r_nfvm +th_vssb_v ... 010 . ..... ..... 000 ..... 0100111 @r_nfvm +th_vssh_v ... 010 . ..... ..... 101 ..... 0100111 @r_nfvm +th_vssw_v ... 010 . ..... ..... 110 ..... 0100111 @r_nfvm +th_vsse_v ... 010 . ..... ..... 111 ..... 0100111 @r_nfvm + +th_vlxb_v ... 111 . ..... ..... 000 ..... 0000111 @r_nfvm +th_vlxh_v ... 111 . ..... ..... 101 ..... 0000111 @r_nfvm +th_vlxw_v ... 111 . ..... ..... 110 ..... 0000111 @r_nfvm +th_vlxe_v ... 011 . ..... ..... 111 ..... 0000111 @r_nfvm +th_vlxbu_v ... 011 . ..... ..... 000 ..... 0000111 @r_nfvm +th_vlxhu_v ... 011 . ..... ..... 101 ..... 0000111 @r_nfvm +th_vlxwu_v ... 011 . ..... ..... 110 ..... 0000111 @r_nfvm +# Vector ordered-indexed and unordered-indexed store insns. +th_vsxb_v ... -11 . ..... ..... 000 ..... 0100111 @r_nfvm +th_vsxh_v ... -11 . ..... ..... 101 ..... 0100111 @r_nfvm +th_vsxw_v ... -11 . ..... ..... 110 ..... 0100111 @r_nfvm +th_vsxe_v ... -11 . ..... ..... 111 ..... 
0100111 @r_nfvm + +#*** Vector AMO operations are encoded under the standard AMO major opcode= *** +th_vamoswapw_v 00001 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamoaddw_v 00000 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamoxorw_v 00100 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamoandw_v 01100 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamoorw_v 01000 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamominw_v 10000 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamomaxw_v 10100 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamominuw_v 11000 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamomaxuw_v 11100 . . ..... ..... 110 ..... 0101111 @r_wdvm +th_vamoswapd_v 00001 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamoaddd_v 00000 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamoxord_v 00100 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamoandd_v 01100 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamoord_v 01000 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamomind_v 10000 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamomaxd_v 10100 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamominud_v 11000 . . ..... ..... 111 ..... 0101111 @r_wdvm +th_vamomaxud_v 11100 . . ..... ..... 111 ..... 0101111 @r_wdvm + +# *** new major opcode OP-V *** +th_vadd_vv 000000 . ..... ..... 000 ..... 1010111 @r_vm +th_vadd_vx 000000 . ..... ..... 100 ..... 1010111 @r_vm +th_vadd_vi 000000 . ..... ..... 011 ..... 1010111 @r_vm +th_vsub_vv 000010 . ..... ..... 000 ..... 1010111 @r_vm +th_vsub_vx 000010 . ..... ..... 100 ..... 1010111 @r_vm +th_vrsub_vx 000011 . ..... ..... 100 ..... 1010111 @r_vm +th_vrsub_vi 000011 . ..... ..... 011 ..... 1010111 @r_vm +th_vwaddu_vv 110000 . ..... ..... 010 ..... 1010111 @r_vm +th_vwaddu_vx 110000 . ..... ..... 110 ..... 1010111 @r_vm +th_vwadd_vv 110001 . ..... ..... 010 ..... 1010111 @r_vm +th_vwadd_vx 110001 . ..... ..... 110 ..... 1010111 @r_vm +th_vwsubu_vv 110010 . ..... ..... 010 ..... 1010111 @r_vm +th_vwsubu_vx 110010 . ..... 
..... 110 ..... 1010111 @r_vm +th_vwsub_vv 110011 . ..... ..... 010 ..... 1010111 @r_vm +th_vwsub_vx 110011 . ..... ..... 110 ..... 1010111 @r_vm +th_vwaddu_wv 110100 . ..... ..... 010 ..... 1010111 @r_vm +th_vwaddu_wx 110100 . ..... ..... 110 ..... 1010111 @r_vm +th_vwadd_wv 110101 . ..... ..... 010 ..... 1010111 @r_vm +th_vwadd_wx 110101 . ..... ..... 110 ..... 1010111 @r_vm +th_vwsubu_wv 110110 . ..... ..... 010 ..... 1010111 @r_vm +th_vwsubu_wx 110110 . ..... ..... 110 ..... 1010111 @r_vm +th_vwsub_wv 110111 . ..... ..... 010 ..... 1010111 @r_vm +th_vwsub_wx 110111 . ..... ..... 110 ..... 1010111 @r_vm +th_vadc_vvm 010000 1 ..... ..... 000 ..... 1010111 @r_vm_1 +th_vadc_vxm 010000 1 ..... ..... 100 ..... 1010111 @r_vm_1 +th_vadc_vim 010000 1 ..... ..... 011 ..... 1010111 @r_vm_1 +th_vmadc_vvm 010001 1 ..... ..... 000 ..... 1010111 @r_vm_1 +th_vmadc_vxm 010001 1 ..... ..... 100 ..... 1010111 @r_vm_1 +th_vmadc_vim 010001 1 ..... ..... 011 ..... 1010111 @r_vm_1 +th_vsbc_vvm 010010 1 ..... ..... 000 ..... 1010111 @r_vm_1 +th_vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r_vm_1 +th_vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r_vm_1 +th_vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r_vm_1 +th_vand_vv 001001 . ..... ..... 000 ..... 1010111 @r_vm +th_vand_vx 001001 . ..... ..... 100 ..... 1010111 @r_vm +th_vand_vi 001001 . ..... ..... 011 ..... 1010111 @r_vm +th_vor_vv 001010 . ..... ..... 000 ..... 1010111 @r_vm +th_vor_vx 001010 . ..... ..... 100 ..... 1010111 @r_vm +th_vor_vi 001010 . ..... ..... 011 ..... 1010111 @r_vm +th_vxor_vv 001011 . ..... ..... 000 ..... 1010111 @r_vm +th_vxor_vx 001011 . ..... ..... 100 ..... 1010111 @r_vm +th_vxor_vi 001011 . ..... ..... 011 ..... 1010111 @r_vm +th_vsll_vv 100101 . ..... ..... 000 ..... 1010111 @r_vm +th_vsll_vx 100101 . ..... ..... 100 ..... 1010111 @r_vm +th_vsll_vi 100101 . ..... ..... 011 ..... 1010111 @r_vm +th_vsrl_vv 101000 . ..... ..... 000 ..... 1010111 @r_vm +th_vsrl_vx 101000 . ..... ..... 100 ..... 
1010111 @r_vm +th_vsrl_vi 101000 . ..... ..... 011 ..... 1010111 @r_vm +th_vsra_vv 101001 . ..... ..... 000 ..... 1010111 @r_vm +th_vsra_vx 101001 . ..... ..... 100 ..... 1010111 @r_vm +th_vsra_vi 101001 . ..... ..... 011 ..... 1010111 @r_vm +th_vnsrl_vv 101100 . ..... ..... 000 ..... 1010111 @r_vm +th_vnsrl_vx 101100 . ..... ..... 100 ..... 1010111 @r_vm +th_vnsrl_vi 101100 . ..... ..... 011 ..... 1010111 @r_vm +th_vnsra_vv 101101 . ..... ..... 000 ..... 1010111 @r_vm +th_vnsra_vx 101101 . ..... ..... 100 ..... 1010111 @r_vm +th_vnsra_vi 101101 . ..... ..... 011 ..... 1010111 @r_vm +th_vmseq_vv 011000 . ..... ..... 000 ..... 1010111 @r_vm +th_vmseq_vx 011000 . ..... ..... 100 ..... 1010111 @r_vm +th_vmseq_vi 011000 . ..... ..... 011 ..... 1010111 @r_vm +th_vmsne_vv 011001 . ..... ..... 000 ..... 1010111 @r_vm +th_vmsne_vx 011001 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsne_vi 011001 . ..... ..... 011 ..... 1010111 @r_vm +th_vmsltu_vv 011010 . ..... ..... 000 ..... 1010111 @r_vm +th_vmsltu_vx 011010 . ..... ..... 100 ..... 1010111 @r_vm +th_vmslt_vv 011011 . ..... ..... 000 ..... 1010111 @r_vm +th_vmslt_vx 011011 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsleu_vv 011100 . ..... ..... 000 ..... 1010111 @r_vm +th_vmsleu_vx 011100 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsleu_vi 011100 . ..... ..... 011 ..... 1010111 @r_vm +th_vmsle_vv 011101 . ..... ..... 000 ..... 1010111 @r_vm +th_vmsle_vx 011101 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsle_vi 011101 . ..... ..... 011 ..... 1010111 @r_vm +th_vmsgtu_vx 011110 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsgtu_vi 011110 . ..... ..... 011 ..... 1010111 @r_vm +th_vmsgt_vx 011111 . ..... ..... 100 ..... 1010111 @r_vm +th_vmsgt_vi 011111 . ..... ..... 011 ..... 1010111 @r_vm +th_vminu_vv 000100 . ..... ..... 000 ..... 1010111 @r_vm +th_vminu_vx 000100 . ..... ..... 100 ..... 1010111 @r_vm +th_vmin_vv 000101 . ..... ..... 000 ..... 1010111 @r_vm +th_vmin_vx 000101 . ..... ..... 100 ..... 
1010111 @r_vm +th_vmaxu_vv 000110 . ..... ..... 000 ..... 1010111 @r_vm +th_vmaxu_vx 000110 . ..... ..... 100 ..... 1010111 @r_vm +th_vmax_vv 000111 . ..... ..... 000 ..... 1010111 @r_vm +th_vmax_vx 000111 . ..... ..... 100 ..... 1010111 @r_vm +th_vmul_vv 100101 . ..... ..... 010 ..... 1010111 @r_vm +th_vmul_vx 100101 . ..... ..... 110 ..... 1010111 @r_vm +th_vmulh_vv 100111 . ..... ..... 010 ..... 1010111 @r_vm +th_vmulh_vx 100111 . ..... ..... 110 ..... 1010111 @r_vm +th_vmulhu_vv 100100 . ..... ..... 010 ..... 1010111 @r_vm +th_vmulhu_vx 100100 . ..... ..... 110 ..... 1010111 @r_vm +th_vmulhsu_vv 100110 . ..... ..... 010 ..... 1010111 @r_vm +th_vmulhsu_vx 100110 . ..... ..... 110 ..... 1010111 @r_vm +th_vdivu_vv 100000 . ..... ..... 010 ..... 1010111 @r_vm +th_vdivu_vx 100000 . ..... ..... 110 ..... 1010111 @r_vm +th_vdiv_vv 100001 . ..... ..... 010 ..... 1010111 @r_vm +th_vdiv_vx 100001 . ..... ..... 110 ..... 1010111 @r_vm +th_vremu_vv 100010 . ..... ..... 010 ..... 1010111 @r_vm +th_vremu_vx 100010 . ..... ..... 110 ..... 1010111 @r_vm +th_vrem_vv 100011 . ..... ..... 010 ..... 1010111 @r_vm +th_vrem_vx 100011 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmulu_vv 111000 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmulu_vx 111000 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmulsu_vv 111010 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmulsu_vx 111010 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmul_vv 111011 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmul_vx 111011 . ..... ..... 110 ..... 1010111 @r_vm +th_vmacc_vv 101101 . ..... ..... 010 ..... 1010111 @r_vm +th_vmacc_vx 101101 . ..... ..... 110 ..... 1010111 @r_vm +th_vnmsac_vv 101111 . ..... ..... 010 ..... 1010111 @r_vm +th_vnmsac_vx 101111 . ..... ..... 110 ..... 1010111 @r_vm +th_vmadd_vv 101001 . ..... ..... 010 ..... 1010111 @r_vm +th_vmadd_vx 101001 . ..... ..... 110 ..... 1010111 @r_vm +th_vnmsub_vv 101011 . ..... ..... 010 ..... 1010111 @r_vm +th_vnmsub_vx 101011 . ..... ..... 110 ..... 
1010111 @r_vm +th_vwmaccu_vv 111100 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmaccu_vx 111100 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmacc_vv 111101 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmacc_vx 111101 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmaccsu_vv 111110 . ..... ..... 010 ..... 1010111 @r_vm +th_vwmaccsu_vx 111110 . ..... ..... 110 ..... 1010111 @r_vm +th_vwmaccus_vx 111111 . ..... ..... 110 ..... 1010111 @r_vm +th_vmv_v_v 010111 1 00000 ..... 000 ..... 1010111 @r2 +th_vmv_v_x 010111 1 00000 ..... 100 ..... 1010111 @r2 +th_vmv_v_i 010111 1 00000 ..... 011 ..... 1010111 @r2 +th_vmerge_vvm 010111 0 ..... ..... 000 ..... 1010111 @r_vm_0 +th_vmerge_vxm 010111 0 ..... ..... 100 ..... 1010111 @r_vm_0 +th_vmerge_vim 010111 0 ..... ..... 011 ..... 1010111 @r_vm_0 +th_vsaddu_vv 100000 . ..... ..... 000 ..... 1010111 @r_vm +th_vsaddu_vx 100000 . ..... ..... 100 ..... 1010111 @r_vm +th_vsaddu_vi 100000 . ..... ..... 011 ..... 1010111 @r_vm +th_vsadd_vv 100001 . ..... ..... 000 ..... 1010111 @r_vm +th_vsadd_vx 100001 . ..... ..... 100 ..... 1010111 @r_vm +th_vsadd_vi 100001 . ..... ..... 011 ..... 1010111 @r_vm +th_vssubu_vv 100010 . ..... ..... 000 ..... 1010111 @r_vm +th_vssubu_vx 100010 . ..... ..... 100 ..... 1010111 @r_vm +th_vssub_vv 100011 . ..... ..... 000 ..... 1010111 @r_vm +th_vssub_vx 100011 . ..... ..... 100 ..... 1010111 @r_vm +th_vaadd_vv 100100 . ..... ..... 000 ..... 1010111 @r_vm +th_vaadd_vx 100100 . ..... ..... 100 ..... 1010111 @r_vm +th_vaadd_vi 100100 . ..... ..... 011 ..... 1010111 @r_vm +th_vasub_vv 100110 . ..... ..... 000 ..... 1010111 @r_vm +th_vasub_vx 100110 . ..... ..... 100 ..... 1010111 @r_vm +th_vsmul_vv 100111 . ..... ..... 000 ..... 1010111 @r_vm +th_vsmul_vx 100111 . ..... ..... 100 ..... 1010111 @r_vm +th_vwsmaccu_vv 111100 . ..... ..... 000 ..... 1010111 @r_vm +th_vwsmaccu_vx 111100 . ..... ..... 100 ..... 1010111 @r_vm +th_vwsmacc_vv 111101 . ..... ..... 000 ..... 1010111 @r_vm +th_vwsmacc_vx 111101 . ..... ..... 
100 ..... 1010111 @r_vm +th_vwsmaccsu_vv 111110 . ..... ..... 000 ..... 1010111 @r_vm +th_vwsmaccsu_vx 111110 . ..... ..... 100 ..... 1010111 @r_vm +th_vwsmaccus_vx 111111 . ..... ..... 100 ..... 1010111 @r_vm +th_vssrl_vv 101010 . ..... ..... 000 ..... 1010111 @r_vm +th_vssrl_vx 101010 . ..... ..... 100 ..... 1010111 @r_vm +th_vssrl_vi 101010 . ..... ..... 011 ..... 1010111 @r_vm +th_vssra_vv 101011 . ..... ..... 000 ..... 1010111 @r_vm +th_vssra_vx 101011 . ..... ..... 100 ..... 1010111 @r_vm +th_vssra_vi 101011 . ..... ..... 011 ..... 1010111 @r_vm +th_vnclipu_vv 101110 . ..... ..... 000 ..... 1010111 @r_vm +th_vnclipu_vx 101110 . ..... ..... 100 ..... 1010111 @r_vm +th_vnclipu_vi 101110 . ..... ..... 011 ..... 1010111 @r_vm +th_vnclip_vv 101111 . ..... ..... 000 ..... 1010111 @r_vm +th_vnclip_vx 101111 . ..... ..... 100 ..... 1010111 @r_vm +th_vnclip_vi 101111 . ..... ..... 011 ..... 1010111 @r_vm +th_vfadd_vv 000000 . ..... ..... 001 ..... 1010111 @r_vm +th_vfadd_vf 000000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfsub_vv 000010 . ..... ..... 001 ..... 1010111 @r_vm +th_vfsub_vf 000010 . ..... ..... 101 ..... 1010111 @r_vm +th_vfrsub_vf 100111 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwadd_vv 110000 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwadd_vf 110000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwadd_wv 110100 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwadd_wf 110100 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwsub_vv 110010 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwsub_vf 110010 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwsub_wv 110110 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwsub_wf 110110 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmul_vv 100100 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmul_vf 100100 . ..... ..... 101 ..... 1010111 @r_vm +th_vfdiv_vv 100000 . ..... ..... 001 ..... 1010111 @r_vm +th_vfdiv_vf 100000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfrdiv_vf 100001 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwmul_vv 111000 . ..... 
..... 001 ..... 1010111 @r_vm +th_vfwmul_vf 111000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmacc_vv 101100 . ..... ..... 001 ..... 1010111 @r_vm +th_vfnmacc_vv 101101 . ..... ..... 001 ..... 1010111 @r_vm +th_vfnmacc_vf 101101 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmacc_vf 101100 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmsac_vv 101110 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmsac_vf 101110 . ..... ..... 101 ..... 1010111 @r_vm +th_vfnmsac_vv 101111 . ..... ..... 001 ..... 1010111 @r_vm +th_vfnmsac_vf 101111 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmadd_vv 101000 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmadd_vf 101000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfnmadd_vv 101001 . ..... ..... 001 ..... 1010111 @r_vm +th_vfnmadd_vf 101001 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmsub_vv 101010 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmsub_vf 101010 . ..... ..... 101 ..... 1010111 @r_vm +th_vfnmsub_vv 101011 . ..... ..... 001 ..... 1010111 @r_vm +th_vfnmsub_vf 101011 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwmacc_vv 111100 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwmacc_vf 111100 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwnmacc_vv 111101 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwnmacc_vf 111101 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwmsac_vv 111110 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwmsac_vf 111110 . ..... ..... 101 ..... 1010111 @r_vm +th_vfwnmsac_vv 111111 . ..... ..... 001 ..... 1010111 @r_vm +th_vfwnmsac_vf 111111 . ..... ..... 101 ..... 1010111 @r_vm +th_vfsqrt_v 100011 . ..... 00000 001 ..... 1010111 @r2_vm +th_vfmin_vv 000100 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmin_vf 000100 . ..... ..... 101 ..... 1010111 @r_vm +th_vfmax_vv 000110 . ..... ..... 001 ..... 1010111 @r_vm +th_vfmax_vf 000110 . ..... ..... 101 ..... 1010111 @r_vm +th_vfsgnj_vv 001000 . ..... ..... 001 ..... 1010111 @r_vm +th_vfsgnj_vf 001000 . ..... ..... 101 ..... 1010111 @r_vm +th_vfsgnjn_vv 001001 . ..... ..... 001 ..... 
1010111 @r_vm +th_vfsgnjn_vf 001001 . ..... ..... 101 ..... 1010111 @r_vm +th_vfsgnjx_vv 001010 . ..... ..... 001 ..... 1010111 @r_vm +th_vfsgnjx_vf 001010 . ..... ..... 101 ..... 1010111 @r_vm +th_vmfeq_vv 011000 . ..... ..... 001 ..... 1010111 @r_vm +th_vmfeq_vf 011000 . ..... ..... 101 ..... 1010111 @r_vm +th_vmfne_vv 011100 . ..... ..... 001 ..... 1010111 @r_vm +th_vmfne_vf 011100 . ..... ..... 101 ..... 1010111 @r_vm +th_vmflt_vv 011011 . ..... ..... 001 ..... 1010111 @r_vm +th_vmflt_vf 011011 . ..... ..... 101 ..... 1010111 @r_vm +th_vmfle_vv 011001 . ..... ..... 001 ..... 1010111 @r_vm +th_vmfle_vf 011001 . ..... ..... 101 ..... 1010111 @r_vm +th_vmfgt_vf 011101 . ..... ..... 101 ..... 1010111 @r_vm +th_vmfge_vf 011111 . ..... ..... 101 ..... 1010111 @r_vm +th_vmford_vv 011010 . ..... ..... 001 ..... 1010111 @r_vm +th_vmford_vf 011010 . ..... ..... 101 ..... 1010111 @r_vm +th_vfclass_v 100011 . ..... 10000 001 ..... 1010111 @r2_vm +th_vfmerge_vfm 010111 0 ..... ..... 101 ..... 1010111 @r_vm_0 +th_vfmv_v_f 010111 1 00000 ..... 101 ..... 1010111 @r2 +th_vfcvt_xu_f_v 100010 . ..... 00000 001 ..... 1010111 @r2_vm +th_vfcvt_x_f_v 100010 . ..... 00001 001 ..... 1010111 @r2_vm +th_vfcvt_f_xu_v 100010 . ..... 00010 001 ..... 1010111 @r2_vm +th_vfcvt_f_x_v 100010 . ..... 00011 001 ..... 1010111 @r2_vm +th_vfwcvt_xu_f_v 100010 . ..... 01000 001 ..... 1010111 @r2_vm +th_vfwcvt_x_f_v 100010 . ..... 01001 001 ..... 1010111 @r2_vm +th_vfwcvt_f_xu_v 100010 . ..... 01010 001 ..... 1010111 @r2_vm +th_vfwcvt_f_x_v 100010 . ..... 01011 001 ..... 1010111 @r2_vm +th_vfwcvt_f_f_v 100010 . ..... 01100 001 ..... 1010111 @r2_vm +th_vfncvt_xu_f_v 100010 . ..... 10000 001 ..... 1010111 @r2_vm +th_vfncvt_x_f_v 100010 . ..... 10001 001 ..... 1010111 @r2_vm +th_vfncvt_f_xu_v 100010 . ..... 10010 001 ..... 1010111 @r2_vm +th_vfncvt_f_x_v 100010 . ..... 10011 001 ..... 1010111 @r2_vm +th_vfncvt_f_f_v 100010 . ..... 10100 001 ..... 1010111 @r2_vm +th_vredsum_vs 000000 . ..... ..... 
010 ..... 1010111 @r_vm +th_vredand_vs 000001 . ..... ..... 010 ..... 1010111 @r_vm +th_vredor_vs 000010 . ..... ..... 010 ..... 1010111 @r_vm +th_vredxor_vs 000011 . ..... ..... 010 ..... 1010111 @r_vm +th_vredminu_vs 000100 . ..... ..... 010 ..... 1010111 @r_vm +th_vredmin_vs 000101 . ..... ..... 010 ..... 1010111 @r_vm +th_vredmaxu_vs 000110 . ..... ..... 010 ..... 1010111 @r_vm +th_vredmax_vs 000111 . ..... ..... 010 ..... 1010111 @r_vm +th_vwredsumu_vs 110000 . ..... ..... 000 ..... 1010111 @r_vm +th_vwredsum_vs 110001 . ..... ..... 000 ..... 1010111 @r_vm +# Vector ordered and unordered reduction sum +th_vfredsum_vs 0000-1 . ..... ..... 001 ..... 1010111 @r_vm +th_vfredmin_vs 000101 . ..... ..... 001 ..... 1010111 @r_vm +th_vfredmax_vs 000111 . ..... ..... 001 ..... 1010111 @r_vm +# Vector widening ordered and unordered float reduction sum +th_vfwredsum_vs 1100-1 . ..... ..... 001 ..... 1010111 @r_vm +th_vmand_mm 011001 - ..... ..... 010 ..... 1010111 @r +th_vmnand_mm 011101 - ..... ..... 010 ..... 1010111 @r +th_vmandnot_mm 011000 - ..... ..... 010 ..... 1010111 @r +th_vmxor_mm 011011 - ..... ..... 010 ..... 1010111 @r +th_vmor_mm 011010 - ..... ..... 010 ..... 1010111 @r +th_vmnor_mm 011110 - ..... ..... 010 ..... 1010111 @r +th_vmornot_mm 011100 - ..... ..... 010 ..... 1010111 @r +th_vmxnor_mm 011111 - ..... ..... 010 ..... 1010111 @r +th_vmpopc_m 010100 . ..... ----- 010 ..... 1010111 @r2_vm +th_vmfirst_m 010101 . ..... ----- 010 ..... 1010111 @r2_vm +th_vmsbf_m 010110 . ..... 00001 010 ..... 1010111 @r2_vm +th_vmsif_m 010110 . ..... 00011 010 ..... 1010111 @r2_vm +th_vmsof_m 010110 . ..... 00010 010 ..... 1010111 @r2_vm +th_viota_m 010110 . ..... 10000 010 ..... 1010111 @r2_vm +th_vid_v 010110 . 00000 10001 010 ..... 1010111 @r1_vm +th_vext_x_v 001100 1 ..... ..... 010 ..... 1010111 @r +th_vmv_s_x 001101 1 00000 ..... 110 ..... 1010111 @r2 +th_vfmv_f_s 001100 1 ..... 00000 001 ..... 1010111 @r2rd +th_vfmv_s_f 001101 1 00000 ..... 101 ..... 
1010111 @r2 +th_vslideup_vx 001110 . ..... ..... 100 ..... 1010111 @r_vm +th_vslideup_vi 001110 . ..... ..... 011 ..... 1010111 @r_vm +th_vslide1up_vx 001110 . ..... ..... 110 ..... 1010111 @r_vm +th_vslidedown_vx 001111 . ..... ..... 100 ..... 1010111 @r_vm +th_vslidedown_vi 001111 . ..... ..... 011 ..... 1010111 @r_vm +th_vslide1down_vx 001111 . ..... ..... 110 ..... 1010111 @r_vm +th_vrgather_vv 001100 . ..... ..... 000 ..... 1010111 @r_vm +th_vrgather_vx 001100 . ..... ..... 100 ..... 1010111 @r_vm +th_vrgather_vi 001100 . ..... ..... 011 ..... 1010111 @r_vm +th_vcompress_vm 010111 - ..... ..... 010 ..... 1010111 @r --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908411; cv=none; d=zohomail.com; s=zohoarc; b=Bf7qWKk+OFqJkL9g47IPhPMW1JtOw0Tx8nt6P9nfA8FOL/bKCBMw0u3b7jxRPh5e1cUd8eIhEZwA8SKbjUBlIAEzxFACTJiJXlIYM7Z7ICZUmm8S9w7htzhxz9BntYY8+jldYOLm5Mur3+uXEp+N3gIZUqRlMhkSWaIb6A3kXZA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712908411; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=5tfc8PGqT+1VwghlIxyvrrL4zcInwz7NCC0F/LFhNH0=; b=HpO+HQKKQiEuc47KAdCw8D4dPZ2cATyHucL57BNolt/oA3hKzMdjU1bH1vJRWHoOFF3tQb6Mi/rzDL5m9W9PoOmvhKVPo1DPs14FbWOzdoMBQHyLvSDtC9m0CeV0TUzaeDuROpaS4EpWdgMHe/F6dSATcihTxjZ5ZmIrriY6TJg= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; 
dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712908411367157.60488887879478; Fri, 12 Apr 2024 00:53:31 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBiP-0005wj-Hw; Fri, 12 Apr 2024 03:52:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBiN-0005wU-72; Fri, 12 Apr 2024 03:52:51 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBiK-0002Hv-Vv; Fri, 12 Apr 2024 03:52:50 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nacq2_1712908361) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 15:52:42 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712908363; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=5tfc8PGqT+1VwghlIxyvrrL4zcInwz7NCC0F/LFhNH0=; b=Clj2m8f83bpjeP/zBjk36RWQ6sAtzVhrJXCQh8+Ajs3tCI8lrVW33XqQN57C+CdB6j8ep2OVJf74KYoNuSDJNIIQJtNwtJfegxXc82NEvoMkzZKSjOG9qm+1aFA/wY2b/yqXHdbWhAoFN+4G4sFtGGGelZbAdAL93s5EUeG9xik= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R181e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045192; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nacq2_1712908361; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 07/65] target/riscv: implement th.vsetvl{i} for XTheadVector Date: Fri, 12 Apr 2024 15:36:37 +0800 Message-ID: <20240412073735.76413-8-eric.huang@linux.alibaba.com> X-Mailer: 
git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712908411712100001 Content-Type: text/plain; charset="utf-8" In this patch, we implement the th.vsetvl{i} instructions. The th_vsetvl function handles the differences between RVV1.0 and XTheadVector. th.vsetvl{i} differs from vsetvl{i} in the following points: 1. th.vsetvl{i} does not have the option to maintain the existing vl. 2. XTheadVector has a different vtype encoding from RVV1.0.
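To illustrate point 2, here is a minimal sketch of the vtype recoding on plain integers rather than TCG values. The helper name th_vtype_to_rvv is hypothetical and is not part of this patch. It assumes the layouts used by this series: XTheadVector keeps vlmul in vtype[1:0], vsew in vtype[4:2] and the reserved vediv field in vtype[6:5], while RVV1.0 keeps vlmul in vtype[2:0] and vsew in vtype[5:3].

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: recode an XTheadVector
 * vtype value into the RVV1.0 vtype layout.
 */
static uint32_t th_vtype_to_rvv(uint32_t th_vtype)
{
    /* Any bit set at position 5 or above makes the value illegal. */
    if ((th_vtype >> 5) != 0) {
        /* Force a value whose bits [7:5] are all ones. */
        th_vtype = 0xff;
    }

    uint32_t vlmul = th_vtype & 0x3;           /* [1:0] -> [2:0]  */
    uint32_t vsew  = (th_vtype & 0x1c) << 1;   /* [4:2] -> [5:3]  */
    /* Skip bits [7:6] (vma, vta) and park the illegal marker above them. */
    uint32_t rsvd  = (th_vtype & 0xe0) << 3;   /* [7:5] -> [10:8] */

    return vlmul | vsew | rsvd;
}
```

For example, th_vtype_to_rvv(0x09) (vsew field 2, vlmul field 1) gives 0x11, and any input with a bit set at position 5 or above, such as 0x20, gives 0x73b. The bits placed in [10:8] are what this patch relies on to make the RVV1.0 side treat the value as illegal.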
Signed-off-by: Huang Tao --- .../riscv/insn_trans/trans_xtheadvector.c.inc | 93 ++++++++++++++++++- 1 file changed, 91 insertions(+), 2 deletions(-) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 2dd77d74ab..0461b53893 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -25,14 +25,103 @@ static bool require_xtheadvector(DisasContext *s) s->mstatus_vs !=3D EXT_STATUS_DISABLED; } =20 +/* + * XTheadVector has a different vtype encoding from RVV1.0. + * We recode the value in RVV1.0 vtype format to reuse the RVV1.0 function= s. + * In RVV1.0: + * vtype[7] -> vma + * vtype[6] -> vta + * vtype[5:3] -> vsew + * vtype[2:0] -> vlmul + * In XTheadVector: + * vtype[6:5] -> vediv (reserved) + * vtype[4:2] -> vsew + * vtype[1:0] -> vlmul + * + * Also, th_vsetvl does not have the feature of keeping the existing vl when + * (rd =3D=3D 0 && rs1 =3D=3D 0) + */ +static bool th_vsetvl(DisasContext *s, int rd, int rs1, TCGv s2) +{ + TCGv temp =3D tcg_temp_new(); + TCGv dst_s2 =3D tcg_temp_new(); + /* illegal value check */ + TCGLabel *legal =3D gen_new_label(); + tcg_gen_shri_tl(temp, s2, 5); + tcg_gen_brcondi_tl(TCG_COND_EQ, temp, 0, legal); + /* + * if illegal, set unsupported value + * s2[7:5] =3D=3D 0b111, which is reserved field in XTheadVector + */ + tcg_gen_movi_tl(s2, 0xff); + gen_set_label(legal); + /* get vlmul, s2[1:0] -> dst_s2[2:0] */ + tcg_gen_andi_tl(dst_s2, s2, 0x3); + /* get vsew, s2[4:2] -> dst_s2[5:3] */ + tcg_gen_andi_tl(temp, s2, 0x1c); + tcg_gen_shli_tl(temp, temp, 1); + tcg_gen_or_tl(dst_s2, dst_s2, temp); + /* + * get reserved field when illegal, s2[7:5] -> dst_s2[10:8] + * avoid dst_s2[7:6], because dst_s2[7:6] are vma and vta. + * + * Make dst_s2 an illegal value for RVV1.0, which leads to the illegal + * operation processing flow.
+ */ + tcg_gen_andi_tl(temp, s2, 0xe0); + tcg_gen_shli_tl(temp, temp, 3); + tcg_gen_or_tl(dst_s2, dst_s2, temp); + + /* + * We can't reuse do_vsetvl for we don't have ext_zve32f + * The code below is almost copied from rvv do_vsetvl + * delete zve32f check and the situation when rd =3D rs1 =3D 0 + */ + TCGv s1, dst; + dst =3D dest_gpr(s, rd); + if (rs1 =3D=3D 0) { + /* As the mask is at least one bit, RV_VLEN_MAX is >=3D VLMAX */ + s1 =3D tcg_constant_tl(RV_VLEN_MAX); + } else { + s1 =3D get_gpr(s, rs1, EXT_ZERO); + } + + gen_helper_vsetvl(dst, tcg_env, s1, dst_s2); + gen_set_gpr(s, rd, dst); + mark_vs_dirty(s); + + gen_update_pc(s, s->cur_insn_len); + lookup_and_goto_ptr(s); + s->base.is_jmp =3D DISAS_NORETURN; + return true; +} + +static bool trans_th_vsetvl(DisasContext *s, arg_th_vsetvl *a) +{ + if (!require_xtheadvector(s)) { + return false; + } + + TCGv s2 =3D get_gpr(s, a->rs2, EXT_ZERO); + return th_vsetvl(s, a->rd, a->rs1, s2); +} + +static bool trans_th_vsetvli(DisasContext *s, arg_th_vsetvli *a) +{ + if (!require_xtheadvector(s)) { + return false; + } + + TCGv s2 =3D tcg_constant_tl(a->zimm); + return th_vsetvl(s, a->rd, a->rs1, s2); +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vsetvli) -TH_TRANS_STUB(th_vsetvl) TH_TRANS_STUB(th_vlb_v) TH_TRANS_STUB(th_vlh_v) TH_TRANS_STUB(th_vlw_v) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908531; cv=none; d=zohomail.com; s=zohoarc; 
b=K3Asd726t6TIo/RQmxwSJ7Frl2d/o27IIj2SQpgzYmZtER+/dq8P9ZihDzsAb/FQ9FjgfQxfIsEAN+pFxJXDY09Q6dyFezTn3aCzHae3IR/PhwsejQFLdMWbuWkYfQU6zpm3nVQcDSh0/5iZjILnRWUznKaZ8q9+d/X0B+YJJDs= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712908531; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=NgV908wNRjhZHvhVtL9Oyga1qipXmf9NM0MKKM0NfUk=; b=eHpLwl7BGUfhWr+ZYPNYjPcvnKdXw6nslN6hGaVgJjHKOamSpw6dFKopyb13Yvrq0XnfKoDGp0YCwgYFiFHf38tvP2I+NCk5mpGY18iRe7ffi0vKC9yeE/xBZ+1OGhAo+YUe+yd7SqmKcjzlObK7p6/gRii3KSyAW9ngBnEfss4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 171290853148516.299084985645663; Fri, 12 Apr 2024 00:55:31 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBkf-0007WR-5g; Fri, 12 Apr 2024 03:55:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBkP-0007UA-1t; Fri, 12 Apr 2024 03:54:57 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBkI-0002Ty-N9; Fri, 12 Apr 2024 03:54:56 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nb-5f_1712908482) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 15:54:43 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; 
s=default; t=1712908483; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=NgV908wNRjhZHvhVtL9Oyga1qipXmf9NM0MKKM0NfUk=; b=uwfd3hQPkyUqZQWTuRYnzRKszqlzYa0mRgwXXD8xHTisOkFmN7eMHFcbCR/micYHC87AcYChfFxOb8uGfnh0wFEZ0biYDeblrKoejwyVefmh+dUL87CdSxKNLMrk6qSXtEy2fEekvjfjGmPl95HN08pFhwWlh1Y2cXYLsR/tDdg= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R471e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045168; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nb-5f_1712908482; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 08/65] target/riscv: Add strided load instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:38 +0800 Message-ID: <20240412073735.76413-9-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712908532198100001 Content-Type: text/plain; charset="utf-8"

In this patch, we add the strided load instructions to show how we implement the XTheadVector load/store instructions. We use independent functions to keep the implementation decoupled.

XTheadVector strided load instructions differ from RVV1.0 in the following points:
1. Different mask reg layout. For the mask bit of element i, XTheadVector locates it in bit[mlen*i] (mlen = SEW/LMUL), while RVV1.0 locates it in bit[i].
2. Different vector reg element width. XTheadVector has 7 instructions: th.vls{b,h,w}.v, th.vls{b,h,w}u.v and th.vlse.v. "b/h/w" stands for byte/halfword/word, and "u" stands for unsigned. The th.vls{b,h,w}.v insns load 8/16/32-bit memory data with a stride and sign-extend it to a SEW-bit vector element if SEW > 8/16/32. The th.vls{b,h,w}u.v insns load 8/16/32-bit memory data with a stride and zero-extend it to a SEW-bit vector element if SEW > 8/16/32. The th.vlse.v insn loads SEW-bit memory data with a stride into a SEW-bit vector element. RVV1.0 has 4 instructions, vlse{8,16,32,64}.v, which load 8/16/32/64-bit memory data into 8/16/32/64-bit vector elements. So XTheadVector has more instructions in order to handle zero/sign extension.
3. Different tail/masked element policy. XTheadVector keeps the values of masked elements and clears the tail elements, while RVV1.0 uses vta and vma to select the policy: keep the value or overwrite it with 1s.
4. Different check policy.
XTheadVector does not have fractional lmul, so we can use simpler check function Signed-off-by: Huang Tao --- target/riscv/helper.h | 24 ++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 160 ++++++++++- target/riscv/internals.h | 12 + target/riscv/meson.build | 1 + target/riscv/vector_helper.c | 5 - target/riscv/vector_internals.h | 6 + target/riscv/xtheadvector_helper.c | 271 ++++++++++++++++++ 7 files changed, 467 insertions(+), 12 deletions(-) create mode 100644 target/riscv/xtheadvector_helper.c diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 8a63523851..8decfc20cc 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1280,3 +1280,27 @@ DEF_HELPER_4(vgmul_vv, void, ptr, ptr, env, i32) DEF_HELPER_5(vsm4k_vi, void, ptr, ptr, i32, env, i32) DEF_HELPER_4(vsm4r_vv, void, ptr, ptr, env, i32) DEF_HELPER_4(vsm4r_vs, void, ptr, ptr, env, i32) + + /* XTheadVector functions */ +DEF_HELPER_6(th_vlsb_v_b, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsb_v_h, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsb_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsb_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsh_v_h, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsh_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsh_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsw_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsw_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlse_v_b, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlse_v_h, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlse_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlse_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsbu_v_b, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsbu_v_h, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsbu_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlsbu_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlshu_v_h, void, ptr, ptr, 
tl, tl, env, i32) +DEF_HELPER_6(th_vlshu_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlshu_v_d, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlswu_v_w, void, ptr, ptr, tl, tl, env, i32) +DEF_HELPER_6(th_vlswu_v_d, void, ptr, ptr, tl, tl, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 0461b53893..72481fdd5f 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -116,6 +116,159 @@ static bool trans_th_vsetvli(DisasContext *s, arg_th_= vsetvli *a) return th_vsetvl(s, a->rd, a->rs1, s2); } =20 +/* check functions */ + +/* + * There are two rules check here. + * 1. Vector register numbers are multiples of LMUL. + * 2. For all widening instructions, the destination LMUL value must also = be + * a supported LMUL value. + * + * This function is the combination of require_align and vext_wide_check_c= ommon, + * except: + * 1) In require_align, if LMUL < 0, i.e. fractional LMUL, any vector regi= ster + * is allowed, we do not need to check this situation. + * 2) In vext_wide_check_common, RVV check all the constraints of widen + * instruction, including SEW < 64, 2*SEW lmul : 1 << s->lmul; + + return !((s->lmul =3D=3D 0x3 && widen) || (reg % legal)); +} + +/* + * There are two rules check here. + * 1. The destination vector register group for a masked vector instructio= n can + * only overlap the source mask register (v0) when LMUL=3D1. + * 2. In widen instructions and some other insturctions, like vslideup.vx, + * there is no need to check whether LMUL=3D1. + * + * This function is almost the copy of require_vm, except: + * 1) In RVV1.0, destination vector register group cannot overlap source m= ask + * register even when LMUL=3D1. So we have to add a check of "(s->lmul = =3D=3D 0)". + * 2) When the instruction forbid the mask overlap in all situation, we add + * a arg of "force" to flag the situation. 
+ */
+static bool th_check_overlap_mask(DisasContext *s, uint32_t vd, bool vm,
+                                  bool force)
+{
+    return (vm != 0 || vd != 0) || (!force && (s->lmul == 0));
+}
+
+/*
+ * The LMUL setting must be such that LMUL * NFIELDS <= 8.
+ *
+ * This function is almost a copy of require_nf, except that
+ * XTheadVector does not have fractional LMUL, so we do not need
+ * max(lmul, 0).
+ * RVV uses "size = nf << MAX(lmul, 0)" so that one segment is loaded
+ * into at least one vector register.
+ */
+static bool th_check_nf(DisasContext *s, uint32_t vd, uint32_t nf)
+{
+    int size = nf << s->lmul;
+    return size <= 8 && vd + size <= 32;
+}
+
+/*
+ * common translation macro
+ *
+ * GEN_TH_TRANS is similar to GEN_VEXT_TRANS;
+ * only the function args change
+ */
+#define GEN_TH_TRANS(NAME, SEQ, ARGTYPE, OP, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_##ARGTYPE *a)\
+{ \
+    if (CHECK(s, a)) { \
+        return OP(s, a, SEQ); \
+    } \
+    return false; \
+}
+
+/*
+ * stride load and store
+ */
+
+#define gen_helper_ldst_stride_th gen_helper_ldst_stride
+
+/*
+ * This function is almost a copy of ld_stride_op, except:
+ * 1) XTheadVector has more insns to handle zero/sign extension.
+ * 2) XTheadVector uses a different data encoding: MLEN is added,
+ *    VTA and VMA are removed.
+ *
+ * MLEN = SEW/LMUL, used to locate the mask bit.
+ */ +static bool ld_stride_op_th(DisasContext *s, arg_rnfvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_ldst_stride_th *fn; + static gen_helper_ldst_stride_th * const fns[7][4] =3D { + { gen_helper_th_vlsb_v_b, gen_helper_th_vlsb_v_h, + gen_helper_th_vlsb_v_w, gen_helper_th_vlsb_v_d }, + { NULL, gen_helper_th_vlsh_v_h, + gen_helper_th_vlsh_v_w, gen_helper_th_vlsh_v_d }, + { NULL, NULL, + gen_helper_th_vlsw_v_w, gen_helper_th_vlsw_v_d }, + { gen_helper_th_vlse_v_b, gen_helper_th_vlse_v_h, + gen_helper_th_vlse_v_w, gen_helper_th_vlse_v_d }, + { gen_helper_th_vlsbu_v_b, gen_helper_th_vlsbu_v_h, + gen_helper_th_vlsbu_v_w, gen_helper_th_vlsbu_v_d }, + { NULL, gen_helper_th_vlshu_v_h, + gen_helper_th_vlshu_v_w, gen_helper_th_vlshu_v_d }, + { NULL, NULL, + gen_helper_th_vlswu_v_w, gen_helper_th_vlswu_v_d }, + }; + + fn =3D fns[seq][s->sew]; + if (fn =3D=3D NULL) { + return false; + } + /* Need extra mlen to find the mask bit */ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, NF, a->nf); + return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s); +} + +/* + * check function + * 1) check overlap mask, XTheadVector can overlap mask reg v0 when + * lmul =3D=3D 1, while RVV1.0 can not. + * 2) check reg, XTheadVector Vector register numbers are multiples + * of integral LMUL, while RVV1.0 has fractional LMUL, which allows + * any vector register. + * 3) check nf, the LMUL setting must be such that LMUL * NFIELDS <=3D 8, + * which is the same as RVV1.0. But we do not need to check fractional + * LMUL. 
+ */ +static bool ld_stride_check_th(DisasContext *s, arg_rnfvm* a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_nf(s, a->rd, a->nf)); +} + +GEN_TH_TRANS(th_vlsb_v, 0, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlsh_v, 1, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlsw_v, 2, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlse_v, 3, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlsbu_v, 4, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlshu_v, 5, rnfvm, ld_stride_op_th, ld_stride_check_th) +GEN_TH_TRANS(th_vlswu_v, 6, rnfvm, ld_stride_op_th, ld_stride_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ @@ -140,13 +293,6 @@ TH_TRANS_STUB(th_vsb_v) TH_TRANS_STUB(th_vsh_v) TH_TRANS_STUB(th_vsw_v) TH_TRANS_STUB(th_vse_v) -TH_TRANS_STUB(th_vlsb_v) -TH_TRANS_STUB(th_vlsh_v) -TH_TRANS_STUB(th_vlsw_v) -TH_TRANS_STUB(th_vlse_v) -TH_TRANS_STUB(th_vlsbu_v) -TH_TRANS_STUB(th_vlshu_v) -TH_TRANS_STUB(th_vlswu_v) TH_TRANS_STUB(th_vssb_v) TH_TRANS_STUB(th_vssh_v) TH_TRANS_STUB(th_vssw_v) diff --git a/target/riscv/internals.h b/target/riscv/internals.h index 8239ae83cc..07e95cd07b 100644 --- a/target/riscv/internals.h +++ b/target/riscv/internals.h @@ -65,6 +65,18 @@ FIELD(VDATA, VMA, 6, 1) FIELD(VDATA, NF, 7, 4) FIELD(VDATA, WD, 7, 1) =20 +/* + * XTheadVector need mlen in addition, and does not need + * VTA and VMA. So, we redesign the encoding of desc. + * + * MLEN =3D SEW/LMUL. to indicate the mask bit. 
+ */ +FIELD(VDATA_TH, MLEN, 0, 8) +FIELD(VDATA_TH, VM, 8, 1) +FIELD(VDATA_TH, LMUL, 9, 2) +FIELD(VDATA_TH, NF, 11, 4) +FIELD(VDATA_TH, WD, 11, 1) + /* float point classify helpers */ target_ulong fclass_h(uint64_t frs1); target_ulong fclass_s(uint64_t frs1); diff --git a/target/riscv/meson.build b/target/riscv/meson.build index 1207ba84ed..a86a836a82 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -18,6 +18,7 @@ riscv_ss.add(files( 'gdbstub.c', 'op_helper.c', 'vector_helper.c', + 'xtheadvector_helper.c', 'vector_internals.c', 'bitmanip_helper.c', 'translate.c', diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index fa139040f8..31660226dc 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -102,11 +102,6 @@ static inline uint32_t vext_max_elems(uint32_t desc, u= int32_t log2_esz) return scale < 0 ? vlenb >> -scale : vlenb << scale; } =20 -static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) -{ - return (addr & ~env->cur_pmmask) | env->cur_pmbase; -} - /* * This function checks watchpoint before real load operation. 
* diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 9e1e15b575..59c93f86bf 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -233,4 +233,10 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,= \ #define WOP_UUU_H uint32_t, uint16_t, uint16_t, uint32_t, uint32_t #define WOP_UUU_W uint64_t, uint32_t, uint32_t, uint64_t, uint64_t =20 +/* share functions */ +static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) +{ + return (addr & ~env->cur_pmmask) | env->cur_pmbase; +} + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c new file mode 100644 index 0000000000..7bfd85901e --- /dev/null +++ b/target/riscv/xtheadvector_helper.c @@ -0,0 +1,271 @@ +/* + * RISC-V XTheadVector Extension Helpers for QEMU. + * + * Copyright (c) 2024 Alibaba Group. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License f= or + * more details. + * + * You should have received a copy of the GNU General Public License along= with + * this program. If not, see . 
+ */ + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/bitops.h" +#include "cpu.h" +#include "exec/memop.h" +#include "exec/exec-all.h" +#include "exec/cpu_ldst.h" +#include "exec/helper-proto.h" +#include "fpu/softfloat.h" +#include "tcg/tcg-gvec-desc.h" +#include "internals.h" +#include "vector_internals.h" +#include + +/* Different desc encoding need different parse functions */ +static inline uint32_t th_nf(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA_TH, NF); +} + +static inline uint32_t th_mlen(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA_TH, MLEN); +} + +static inline uint32_t th_vm(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA_TH, VM); +} + +static inline uint32_t th_lmul(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA_TH, LMUL); +} + +/* + * Get vector group length in bytes. Its range is [64, 2048]. + * + * As simd_desc support at most 256, the max vlen is 512 bits. + * So vlen in bytes is encoded as maxsz. + * + * XtheadVector diff from RVV1.0 is that TH not have fractional lmul and e= mul. + */ +static inline uint32_t th_maxsz(uint32_t desc) +{ + return simd_maxsz(desc) << th_lmul(desc); +} + +/* XTheadVector need to clear the tail elements */ +#if HOST_BIG_ENDIAN +static void th_clear(void *tail, uint32_t cnt, uint32_t tot) +{ + /* + * Split the remaining range to two parts. + * The first part is in the last uint64_t unit. + * The second part start from the next uint64_t unit. 
+ */ + int part1 =3D 0, part2 =3D tot - cnt; + if (cnt % 8) { + part1 =3D 8 - (cnt % 8); + part2 =3D tot - cnt - part1; + memset(QEMU_ALIGN_PTR_DOWN(tail, 8), 0, part1); + memset(QEMU_ALIGN_PTR_UP(tail, 8), 0, part2); + } else { + memset(tail, 0, part2); + } +} +#else +static void th_clear(void *tail, uint32_t cnt, uint32_t tot) +{ + memset(tail, 0, tot - cnt); +} +#endif + +static void clearb_th(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot) +{ + int8_t *cur =3D ((int8_t *)vd + H1(idx)); + th_clear(cur, cnt, tot); +} + +static void clearh_th(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot) +{ + int16_t *cur =3D ((int16_t *)vd + H2(idx)); + th_clear(cur, cnt, tot); +} + +static void clearl_th(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot) +{ + int32_t *cur =3D ((int32_t *)vd + H4(idx)); + th_clear(cur, cnt, tot); +} + +static void clearq_th(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot) +{ + int64_t *cur =3D (int64_t *)vd + idx; + th_clear(cur, cnt, tot); +} + +/* + * XTheadVector has different mask layout. + * + * The mask bits for element i are located in + * bits [MLEN*i+(MLEN-1) : MLEN*i] of the mask register. + * + * MLEN =3D SEW/LMUL + */ +static inline int th_elem_mask(void *v0, int mlen, int index) +{ + int idx =3D (index * mlen) / 64; + int pos =3D (index * mlen) % 64; + return (((uint64_t *)v0)[idx] >> pos) & 1; +} + +/* elements operations for load and store */ +typedef void th_ldst_elem_fn(CPURISCVState *env, abi_ptr addr, + uint32_t idx, void *vd, uintptr_t retaddr); + +/* + * GEN_TH_LD_ELEM is almost the copy of to GEN_VEXT_LD_ELEM, except that + * we add "MTYPE data" to deal with zero/sign-extended. + * + * For XTheadVector, mem access width is determined by the instruction, + * while the reg element size equals SEW, therefore mem access width may + * not equal reg element size. For example, ldb_d means load 8-bit data + * and sign-extended to 64-bit vector element. + * As for RVV1.0, mem access width always equals reg element size. 
+ * + * So we need to deal with zero/sign-extended in addition. + */ + +#define GEN_TH_LD_ELEM(NAME, MTYPE, ETYPE, H, LDSUF) \ +static void NAME(CPURISCVState *env, abi_ptr addr, \ + uint32_t idx, void *vd, uintptr_t retaddr)\ +{ \ + MTYPE data; \ + ETYPE *cur =3D ((ETYPE *)vd + H(idx)); \ + data =3D cpu_##LDSUF##_data_ra(env, addr, retaddr); \ + *cur =3D data; \ +} \ + +GEN_TH_LD_ELEM(ldb_b, int8_t, int8_t, H1, ldsb) +GEN_TH_LD_ELEM(ldb_h, int8_t, int16_t, H2, ldsb) +GEN_TH_LD_ELEM(ldb_w, int8_t, int32_t, H4, ldsb) +GEN_TH_LD_ELEM(ldb_d, int8_t, int64_t, H8, ldsb) +GEN_TH_LD_ELEM(ldh_h, int16_t, int16_t, H2, ldsw) +GEN_TH_LD_ELEM(ldh_w, int16_t, int32_t, H4, ldsw) +GEN_TH_LD_ELEM(ldh_d, int16_t, int64_t, H8, ldsw) +GEN_TH_LD_ELEM(ldw_w, int32_t, int32_t, H4, ldl) +GEN_TH_LD_ELEM(ldw_d, int32_t, int64_t, H8, ldl) +GEN_TH_LD_ELEM(lde_b, int8_t, int8_t, H1, ldsb) +GEN_TH_LD_ELEM(lde_h, int16_t, int16_t, H2, ldsw) +GEN_TH_LD_ELEM(lde_w, int32_t, int32_t, H4, ldl) +GEN_TH_LD_ELEM(lde_d, int64_t, int64_t, H8, ldq) +GEN_TH_LD_ELEM(ldbu_b, uint8_t, uint8_t, H1, ldub) +GEN_TH_LD_ELEM(ldbu_h, uint8_t, uint16_t, H2, ldub) +GEN_TH_LD_ELEM(ldbu_w, uint8_t, uint32_t, H4, ldub) +GEN_TH_LD_ELEM(ldbu_d, uint8_t, uint64_t, H8, ldub) +GEN_TH_LD_ELEM(ldhu_h, uint16_t, uint16_t, H2, lduw) +GEN_TH_LD_ELEM(ldhu_w, uint16_t, uint32_t, H4, lduw) +GEN_TH_LD_ELEM(ldhu_d, uint16_t, uint64_t, H8, lduw) +GEN_TH_LD_ELEM(ldwu_w, uint32_t, uint32_t, H4, ldl) +GEN_TH_LD_ELEM(ldwu_d, uint32_t, uint64_t, H8, ldl) + +/* + * stride: access vector element from strided memory + */ +typedef void clear_fn(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot); + +/* + * This function is almost the copy of vext_ldst_stride, except: + * 1) XTheadVector has different mask layout, using th_elem_mask + * to get [MLEN*i] bit + * 2) XTheadVector using different data encoding, using th_ functions + * to parse. + * 3) XTheadVector keep the masked elements value, while RVV1.0 policy is + * determined by vma. 
+ * 4) XTheadVector clear the tail elements, while RVV1.0 policy is to rath= er + * set all bits 1s or keep it, determined by vta. + */ +static void +th_ldst_stride(void *vd, void *v0, target_ulong base, + target_ulong stride, CPURISCVState *env, + uint32_t desc, uint32_t vm, + th_ldst_elem_fn *ldst_elem, clear_fn *clear_elem, + uint32_t esz, uint32_t msz, uintptr_t ra) +{ + uint32_t i, k; + uint32_t nf =3D th_nf(desc); + uint32_t mlen =3D th_mlen(desc); + uint32_t vlmax =3D th_maxsz(desc) / esz; + + VSTART_CHECK_EARLY_EXIT(env); + + /* do real access */ + for (i =3D env->vstart; i < env->vl; env->vstart =3D ++i) { + k =3D 0; + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + while (k < nf) { + target_ulong addr =3D base + stride * i + k * msz; + ldst_elem(env, adjust_addr(env, addr), i + k * vlmax, vd, ra); + k++; + } + } + env->vstart =3D 0; + /* + * clear tail elements + * clear_elem is NULL when store + */ + if (clear_elem) { + for (k =3D 0; k < nf; k++) { + clear_elem(vd, env->vl + k * vlmax, env->vl * esz, vlmax * esz= ); + } + } +} + +/* + * GEN_TH_LD_STRIDE is similar to GEN_VEXT_LD_STRIDE + * just change the function args + */ +#define GEN_TH_LD_STRIDE(NAME, MTYPE, ETYPE, LOAD_FN, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void * v0, target_ulong base, \ + target_ulong stride, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t vm =3D th_vm(desc); \ + th_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \ + CLEAR_FN, sizeof(ETYPE), sizeof(MTYPE), GETPC()); \ +} + +GEN_TH_LD_STRIDE(th_vlsb_v_b, int8_t, int8_t, ldb_b, clearb_th) +GEN_TH_LD_STRIDE(th_vlsb_v_h, int8_t, int16_t, ldb_h, clearh_th) +GEN_TH_LD_STRIDE(th_vlsb_v_w, int8_t, int32_t, ldb_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlsb_v_d, int8_t, int64_t, ldb_d, clearq_th) +GEN_TH_LD_STRIDE(th_vlsh_v_h, int16_t, int16_t, ldh_h, clearh_th) +GEN_TH_LD_STRIDE(th_vlsh_v_w, int16_t, int32_t, ldh_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlsh_v_d, int16_t, int64_t, ldh_d, clearq_th) 
+GEN_TH_LD_STRIDE(th_vlsw_v_w, int32_t, int32_t, ldw_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlsw_v_d, int32_t, int64_t, ldw_d, clearq_th) +GEN_TH_LD_STRIDE(th_vlse_v_b, int8_t, int8_t, lde_b, clearb_th) +GEN_TH_LD_STRIDE(th_vlse_v_h, int16_t, int16_t, lde_h, clearh_th) +GEN_TH_LD_STRIDE(th_vlse_v_w, int32_t, int32_t, lde_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlse_v_d, int64_t, int64_t, lde_d, clearq_th) +GEN_TH_LD_STRIDE(th_vlsbu_v_b, uint8_t, uint8_t, ldbu_b, clearb_th) +GEN_TH_LD_STRIDE(th_vlsbu_v_h, uint8_t, uint16_t, ldbu_h, clearh_th) +GEN_TH_LD_STRIDE(th_vlsbu_v_w, uint8_t, uint32_t, ldbu_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlsbu_v_d, uint8_t, uint64_t, ldbu_d, clearq_th) +GEN_TH_LD_STRIDE(th_vlshu_v_h, uint16_t, uint16_t, ldhu_h, clearh_th) +GEN_TH_LD_STRIDE(th_vlshu_v_w, uint16_t, uint32_t, ldhu_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlshu_v_d, uint16_t, uint64_t, ldhu_d, clearq_th) +GEN_TH_LD_STRIDE(th_vlswu_v_w, uint32_t, uint32_t, ldwu_w, clearl_th) +GEN_TH_LD_STRIDE(th_vlswu_v_d, uint32_t, uint64_t, ldwu_d, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908654; cv=none; d=zohomail.com; s=zohoarc; b=QYFbK2O6yYkF6OpUzm35emFPIQOR/k2BvDG6pJ+H8NvU32ib+FP4buQCEdiMlpgPb48xnTYWCvmKe9/3rnnmhasid2a7gPTEsP79Hevb4ccf/6z86cRTTZuZFZpyIWMYWq5A62sW+/Mb7eKnXfy67nlpW0oTjrRfD4mky33MAB0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712908654; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; 
bh=DekwD9q+3PbDnA5bdrcq6XzBLYGk6h6UtXapc5/9268=; b=VjVyahPlM5KRdyAX3jmPxK8Cq32TrI7IjH/zkvYP6Z3tUf6FbPuTLgo1H1wsQF+W2aTMmvnLW2u1PWvt41R/FmjekgEoic388Symz6+vxf4rBnTH/90Y8VAok6dYSrTQSCaY3gcMmfAf/PuBhEaZRDtpw5Srm+KBLsjj7NFJlNI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712908654344495.38479072493647; Fri, 12 Apr 2024 00:57:34 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBmV-0000CF-06; Fri, 12 Apr 2024 03:57:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBmM-0000AV-Rm; Fri, 12 Apr 2024 03:56:59 -0400 Received: from out30-111.freemail.mail.aliyun.com ([115.124.30.111]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBmF-0003I0-UN; Fri, 12 Apr 2024 03:56:56 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NUO.B_1712908603) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 15:56:44 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712908605; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=DekwD9q+3PbDnA5bdrcq6XzBLYGk6h6UtXapc5/9268=; b=BBxOkZsVl6QsC3S0ClHzeaAV34UxnXfS6AZRz0pFVUjs7QteCyERotB5JbTlxcBwM/VIxW6AgAALvEPC1Xs0K9uJDN2Kp4ycql2tp56+BAfMyV+QJoUi3+mRa/eEqnmzK341I0XHiXa8PLOGadOyD0td1z4k0oGoYCube6lXH3g= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R111e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045170; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; 
TI=SMTPD_---0W4NUO.B_1712908603; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 09/65] target/riscv: Add strided store instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:39 +0800 Message-ID: <20240412073735.76413-10-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.111; envelope-from=eric.huang@linux.alibaba.com; helo=out30-111.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712908656639100003 Content-Type: text/plain; charset="utf-8"

XTheadVector strided store instructions differ from RVV1.0 in the following points:
1. Different mask reg layout. The difference is the same as for the strided load instructions.
2. Different vector reg element width. XTheadVector has 4 instructions, th.vss{b,h,w,e}.v. They store SEW-bit reg data to an 8/16/32/SEW-bit memory location. RVV1.0 has 4 instructions, vsse{8,16,32,64}.v. They store 8/16/32/64-bit reg data to an 8/16/32/64-bit memory location.
3. Different tail/masked element policy. The difference is the same as for the strided load instructions.
4. Different check policy. XTheadVector does not have fractional lmul and emul, so we can use a simpler check function.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 13 +++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 56 +++++++++++++++++--
 target/riscv/xtheadvector_helper.c            | 50 +++++++++++++++++
 3 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 8decfc20cc..bfd6bd9b13 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1304,3 +1304,16 @@ DEF_HELPER_6(th_vlshu_v_w, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vlshu_v_d, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vlswu_v_w, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vlswu_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssb_v_b, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssb_v_h, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssb_v_w, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssb_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssh_v_h, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssh_v_w, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssh_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssw_v_w, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vssw_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vsse_v_b, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vsse_v_h, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vsse_v_w, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_6(th_vsse_v_d, void, ptr, ptr, tl, tl, env, i32)
diff --git
a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 72481fdd5f..48004bf0d6 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -269,6 +269,58 @@ GEN_TH_TRANS(th_vlsbu_v, 4, rnfvm, ld_stride_op_th, ld= _stride_check_th) GEN_TH_TRANS(th_vlshu_v, 5, rnfvm, ld_stride_op_th, ld_stride_check_th) GEN_TH_TRANS(th_vlswu_v, 6, rnfvm, ld_stride_op_th, ld_stride_check_th) =20 +/* + * This function is almost the copy of st_stride_op, except: + * 1) XTheadVector using different data encoding, add MLEN, + * delete VTA and VMA. + * 2) XTheadVector has more situations. vss{8,16,32,64}.v decide the + * reg and mem width both equal 8/16/32/64. As for th.vss{b,h,w}.v, the + * reg width equals SEW, and the mem width equals 8/16/32. The reg and + * mem width of th.vsse.v both equal SEW. Therefore, we need to add more + * helper functions depending on SEW. + */ +static bool st_stride_op_th(DisasContext *s, arg_rnfvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_ldst_stride_th *fn; + static gen_helper_ldst_stride_th * const fns[4][4] =3D { + /* masked stride store */ + { gen_helper_th_vssb_v_b, gen_helper_th_vssb_v_h, + gen_helper_th_vssb_v_w, gen_helper_th_vssb_v_d }, + { NULL, gen_helper_th_vssh_v_h, + gen_helper_th_vssh_v_w, gen_helper_th_vssh_v_d }, + { NULL, NULL, + gen_helper_th_vssw_v_w, gen_helper_th_vssw_v_d }, + { gen_helper_th_vsse_v_b, gen_helper_th_vsse_v_h, + gen_helper_th_vsse_v_w, gen_helper_th_vsse_v_d } + }; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, NF, a->nf); + fn =3D fns[seq][s->sew]; + if (fn =3D=3D NULL) { + return false; + } + + return ldst_stride_trans(a->rd, a->rs1, a->rs2, data, fn, s); +} + +/* store does not need to check overlap */ +static bool 
st_stride_check_th(DisasContext *s, arg_rnfvm* a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_nf(s, a->rd, a->nf)); +} + +GEN_TH_TRANS(th_vssb_v, 0, rnfvm, st_stride_op_th, st_stride_check_th) +GEN_TH_TRANS(th_vssh_v, 1, rnfvm, st_stride_op_th, st_stride_check_th) +GEN_TH_TRANS(th_vssw_v, 2, rnfvm, st_stride_op_th, st_stride_check_th) +GEN_TH_TRANS(th_vsse_v, 3, rnfvm, st_stride_op_th, st_stride_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ @@ -293,10 +345,6 @@ TH_TRANS_STUB(th_vsb_v) TH_TRANS_STUB(th_vsh_v) TH_TRANS_STUB(th_vsw_v) TH_TRANS_STUB(th_vse_v) -TH_TRANS_STUB(th_vssb_v) -TH_TRANS_STUB(th_vssh_v) -TH_TRANS_STUB(th_vssw_v) -TH_TRANS_STUB(th_vsse_v) TH_TRANS_STUB(th_vlxb_v) TH_TRANS_STUB(th_vlxh_v) TH_TRANS_STUB(th_vlxw_v) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 7bfd85901e..17de312f0a 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -179,6 +179,28 @@ GEN_TH_LD_ELEM(ldhu_d, uint16_t, uint64_t, H8, lduw) GEN_TH_LD_ELEM(ldwu_w, uint32_t, uint32_t, H4, ldl) GEN_TH_LD_ELEM(ldwu_d, uint32_t, uint64_t, H8, ldl) =20 +#define GEN_TH_ST_ELEM(NAME, ETYPE, H, STSUF) \ +static void NAME(CPURISCVState *env, abi_ptr addr, \ + uint32_t idx, void *vd, uintptr_t retaddr)\ +{ \ + ETYPE data =3D *((ETYPE *)vd + H(idx)); \ + cpu_##STSUF##_data_ra(env, addr, data, retaddr); \ +} + +GEN_TH_ST_ELEM(stb_b, int8_t, H1, stb) +GEN_TH_ST_ELEM(stb_h, int16_t, H2, stb) +GEN_TH_ST_ELEM(stb_w, int32_t, H4, stb) +GEN_TH_ST_ELEM(stb_d, int64_t, H8, stb) +GEN_TH_ST_ELEM(sth_h, int16_t, H2, stw) +GEN_TH_ST_ELEM(sth_w, int32_t, H4, stw) +GEN_TH_ST_ELEM(sth_d, int64_t, H8, stw) +GEN_TH_ST_ELEM(stw_w, int32_t, H4, stl) +GEN_TH_ST_ELEM(stw_d, int64_t, H8, stl) +GEN_TH_ST_ELEM(ste_b, int8_t, H1, stb) +GEN_TH_ST_ELEM(ste_h, int16_t, H2, stw) +GEN_TH_ST_ELEM(ste_w, int32_t, H4, 
stl) +GEN_TH_ST_ELEM(ste_d, int64_t, H8, stq) + /* * stride: access vector element from strided memory */ @@ -269,3 +291,31 @@ GEN_TH_LD_STRIDE(th_vlshu_v_w, uint16_t, uint32_t, ldh= u_w, clearl_th) GEN_TH_LD_STRIDE(th_vlshu_v_d, uint16_t, uint64_t, ldhu_d, clearq_th) GEN_TH_LD_STRIDE(th_vlswu_v_w, uint32_t, uint32_t, ldwu_w, clearl_th) GEN_TH_LD_STRIDE(th_vlswu_v_d, uint32_t, uint64_t, ldwu_d, clearq_th) + +/* + * GEN_TH_ST_STRIDE is similar to GEN_VEXT_ST_STRIDE + * just change the function name and args + */ +#define GEN_TH_ST_STRIDE(NAME, MTYPE, ETYPE, STORE_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong base, \ + target_ulong stride, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t vm =3D th_vm(desc); \ + th_ldst_stride(vd, v0, base, stride, env, desc, vm, STORE_FN, \ + NULL, sizeof(ETYPE), sizeof(MTYPE), GETPC()); \ +} + +GEN_TH_ST_STRIDE(th_vssb_v_b, int8_t, int8_t, stb_b) +GEN_TH_ST_STRIDE(th_vssb_v_h, int8_t, int16_t, stb_h) +GEN_TH_ST_STRIDE(th_vssb_v_w, int8_t, int32_t, stb_w) +GEN_TH_ST_STRIDE(th_vssb_v_d, int8_t, int64_t, stb_d) +GEN_TH_ST_STRIDE(th_vssh_v_h, int16_t, int16_t, sth_h) +GEN_TH_ST_STRIDE(th_vssh_v_w, int16_t, int32_t, sth_w) +GEN_TH_ST_STRIDE(th_vssh_v_d, int16_t, int64_t, sth_d) +GEN_TH_ST_STRIDE(th_vssw_v_w, int32_t, int32_t, stw_w) +GEN_TH_ST_STRIDE(th_vssw_v_d, int32_t, int64_t, stw_d) +GEN_TH_ST_STRIDE(th_vsse_v_b, int8_t, int8_t, ste_b) +GEN_TH_ST_STRIDE(th_vsse_v_h, int16_t, int16_t, ste_h) +GEN_TH_ST_STRIDE(th_vsse_v_w, int32_t, int32_t, ste_w) +GEN_TH_ST_STRIDE(th_vsse_v_d, int64_t, int64_t, ste_d) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712908782; cv=none; d=zohomail.com; 
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 10/65] target/riscv: Add unit-stride load instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:36:40 +0800
Message-ID: <20240412073735.76413-11-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

XTheadVector unit-stride load instructions differ from RVV1.0 in the
following points:
1. Different mask reg layout.
2. Different vector reg element width.
3. Different tail/masked element processing policy.
4. Different check function.

The details of the differences are the same as for the strided load
instructions, as unit-stride is a special case of strided operations.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 44 ++++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 84 ++++++++++++++++--
 target/riscv/xtheadvector_helper.c            | 86 +++++++++++++++++++
 3 files changed, 207 insertions(+), 7 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index bfd6bd9b13..f2fa8425b3 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1317,3 +1317,47 @@ DEF_HELPER_6(th_vsse_v_b, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vsse_v_h, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vsse_v_w, void, ptr, ptr, tl, tl, env, i32)
 DEF_HELPER_6(th_vsse_v_d, void, ptr, ptr, tl, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlb_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlh_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlw_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlw_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlw_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlw_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vle_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlbu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlhu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlwu_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlwu_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlwu_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vlwu_v_d_mask, void, ptr, ptr, tl, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 48004bf0d6..eb910acf40 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -321,19 +321,89 @@ GEN_TH_TRANS(th_vssh_v, 1, rnfvm, st_stride_op_th, st_stride_check_th)
 GEN_TH_TRANS(th_vssw_v, 2, rnfvm, st_stride_op_th, st_stride_check_th)
 GEN_TH_TRANS(th_vsse_v, 3, rnfvm, st_stride_op_th, st_stride_check_th)
 
+/*
+ * unit stride load and store
+ */
+
+#define gen_helper_ldst_us_th gen_helper_ldst_us
+
+/*
+ * This function is almost a copy of ld_us_op, except:
+ * 1) different data encoding
+ * 2) XTheadVector has more insns to handle zero/sign extension.
+ */
+static bool ld_us_op_th(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
+{
+    uint32_t data = 0;
+    gen_helper_ldst_us_th *fn;
+    static gen_helper_ldst_us_th * const fns[2][7][4] = {
+        /* masked unit stride load */
+        { { gen_helper_th_vlb_v_b_mask, gen_helper_th_vlb_v_h_mask,
+            gen_helper_th_vlb_v_w_mask, gen_helper_th_vlb_v_d_mask },
+          { NULL, gen_helper_th_vlh_v_h_mask,
+            gen_helper_th_vlh_v_w_mask, gen_helper_th_vlh_v_d_mask },
+          { NULL, NULL,
+            gen_helper_th_vlw_v_w_mask, gen_helper_th_vlw_v_d_mask },
+          { gen_helper_th_vle_v_b_mask, gen_helper_th_vle_v_h_mask,
+            gen_helper_th_vle_v_w_mask, gen_helper_th_vle_v_d_mask },
+          { gen_helper_th_vlbu_v_b_mask, gen_helper_th_vlbu_v_h_mask,
+            gen_helper_th_vlbu_v_w_mask, gen_helper_th_vlbu_v_d_mask },
+          { NULL, gen_helper_th_vlhu_v_h_mask,
+            gen_helper_th_vlhu_v_w_mask, gen_helper_th_vlhu_v_d_mask },
+          { NULL, NULL,
+            gen_helper_th_vlwu_v_w_mask, gen_helper_th_vlwu_v_d_mask } },
+        /* unmasked unit stride load */
+        { { gen_helper_th_vlb_v_b, gen_helper_th_vlb_v_h,
+            gen_helper_th_vlb_v_w, gen_helper_th_vlb_v_d },
+          { NULL, gen_helper_th_vlh_v_h,
+            gen_helper_th_vlh_v_w, gen_helper_th_vlh_v_d },
+          { NULL, NULL,
+            gen_helper_th_vlw_v_w, gen_helper_th_vlw_v_d },
+          { gen_helper_th_vle_v_b, gen_helper_th_vle_v_h,
+            gen_helper_th_vle_v_w, gen_helper_th_vle_v_d },
+          { gen_helper_th_vlbu_v_b, gen_helper_th_vlbu_v_h,
+            gen_helper_th_vlbu_v_w, gen_helper_th_vlbu_v_d },
+          { NULL, gen_helper_th_vlhu_v_h,
+            gen_helper_th_vlhu_v_w, gen_helper_th_vlhu_v_d },
+          { NULL, NULL,
+            gen_helper_th_vlwu_v_w, gen_helper_th_vlwu_v_d } }
+    };
+
+    fn = fns[a->vm][seq][s->sew];
+    if (fn == NULL) {
+        return false;
+    }
+
+    data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);
+    data = FIELD_DP32(data, VDATA_TH, VM, a->vm);
+    data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);
+    data = FIELD_DP32(data, VDATA_TH, NF, a->nf);
+    return ldst_us_trans(a->rd, a->rs1, data, fn, s, false);
+}
+
+static bool ld_us_check_th(DisasContext *s, arg_r2nfvm* a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, false) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_nf(s, a->rd, a->nf));
+}
+
+GEN_TH_TRANS(th_vlb_v, 0, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vlh_v, 1, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vlw_v, 2, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vle_v, 3, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vlbu_v, 4, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vlhu_v, 5, r2nfvm, ld_us_op_th, ld_us_check_th)
+GEN_TH_TRANS(th_vlwu_v, 6, r2nfvm, ld_us_op_th, ld_us_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vlb_v)
-TH_TRANS_STUB(th_vlh_v)
-TH_TRANS_STUB(th_vlw_v)
-TH_TRANS_STUB(th_vle_v)
-TH_TRANS_STUB(th_vlbu_v)
-TH_TRANS_STUB(th_vlhu_v)
-TH_TRANS_STUB(th_vlwu_v)
 TH_TRANS_STUB(th_vlbff_v)
 TH_TRANS_STUB(th_vlhff_v)
 TH_TRANS_STUB(th_vlwff_v)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 17de312f0a..7566ab8e31 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -319,3 +319,89 @@ GEN_TH_ST_STRIDE(th_vsse_v_b, int8_t, int8_t, ste_b)
 GEN_TH_ST_STRIDE(th_vsse_v_h, int16_t, int16_t, ste_h)
 GEN_TH_ST_STRIDE(th_vsse_v_w, int32_t, int32_t, ste_w)
 GEN_TH_ST_STRIDE(th_vsse_v_d, int64_t, int64_t, ste_d)
+
+/*
+ * unit-stride: access elements stored contiguously in memory
+ */
+
+/*
+ * unmasked unit-stride load and store operation
+ *
+ * This function is almost a copy of vext_ldst_us, except:
+ * 1) different mask layout
+ * 2) different data encoding
+ * 3) different tail element processing policy
+ */
+static void
+th_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
+           th_ldst_elem_fn *ldst_elem, clear_fn *clear_elem,
+           uint32_t esz, uint32_t msz, uintptr_t ra)
+{
+    uint32_t i, k;
+    uint32_t nf = th_nf(desc);
+    uint32_t vlmax = th_maxsz(desc) / esz;
+
+    VSTART_CHECK_EARLY_EXIT(env);
+
+    /* load bytes from guest memory */
+    for (i = env->vstart; i < env->vl; env->vstart = ++i) {
+        k = 0;
+        while (k < nf) {
+            target_ulong addr = base + (i * nf + k) * msz;
+            ldst_elem(env, adjust_addr(env, addr), i + k * vlmax, vd, ra);
+            k++;
+        }
+    }
+    env->vstart = 0;
+    /* clear tail elements */
+    if (clear_elem) {
+        for (k = 0; k < nf; k++) {
+            clear_elem(vd, env->vl + k * vlmax, env->vl * esz, vlmax * esz);
+        }
+    }
+}
+
+/*
+ * masked unit-stride load and store operation will be a special case of stride,
+ * stride = NF * sizeof (MTYPE)
+ *
+ * similar to GEN_VEXT_LD_US, with the function changed
+ */
+#define GEN_TH_LD_US(NAME, MTYPE, ETYPE, LOAD_FN, CLEAR_FN) \
+void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \
+                         CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t stride = th_nf(desc) * sizeof(MTYPE); \
+    th_ldst_stride(vd, v0, base, stride, env, desc, false, LOAD_FN, \
+                   CLEAR_FN, sizeof(ETYPE), sizeof(MTYPE), GETPC()); \
+} \
+ \
+void HELPER(NAME)(void *vd, void *v0, target_ulong base, \
+                  CPURISCVState *env, uint32_t desc) \
+{ \
+    th_ldst_us(vd, base, env, desc, LOAD_FN, CLEAR_FN, \
+               sizeof(ETYPE), sizeof(MTYPE), GETPC()); \
+}
+
+GEN_TH_LD_US(th_vlb_v_b, int8_t, int8_t, ldb_b, clearb_th)
+GEN_TH_LD_US(th_vlb_v_h, int8_t, int16_t, ldb_h, clearh_th)
+GEN_TH_LD_US(th_vlb_v_w, int8_t, int32_t, ldb_w, clearl_th)
+GEN_TH_LD_US(th_vlb_v_d, int8_t, int64_t, ldb_d, clearq_th)
+GEN_TH_LD_US(th_vlh_v_h, int16_t, int16_t, ldh_h, clearh_th)
+GEN_TH_LD_US(th_vlh_v_w, int16_t, int32_t, ldh_w, clearl_th)
+GEN_TH_LD_US(th_vlh_v_d, int16_t, int64_t, ldh_d, clearq_th)
+GEN_TH_LD_US(th_vlw_v_w, int32_t, int32_t, ldw_w, clearl_th)
+GEN_TH_LD_US(th_vlw_v_d, int32_t, int64_t, ldw_d, clearq_th)
+GEN_TH_LD_US(th_vle_v_b, int8_t, int8_t, lde_b, clearb_th)
+GEN_TH_LD_US(th_vle_v_h, int16_t, int16_t, lde_h, clearh_th)
+GEN_TH_LD_US(th_vle_v_w, int32_t, int32_t, lde_w, clearl_th)
+GEN_TH_LD_US(th_vle_v_d, int64_t, int64_t, lde_d, clearq_th)
+GEN_TH_LD_US(th_vlbu_v_b, uint8_t, uint8_t, ldbu_b, clearb_th)
+GEN_TH_LD_US(th_vlbu_v_h, uint8_t, uint16_t, ldbu_h, clearh_th)
+GEN_TH_LD_US(th_vlbu_v_w, uint8_t, uint32_t, ldbu_w, clearl_th)
+GEN_TH_LD_US(th_vlbu_v_d, uint8_t, uint64_t, ldbu_d, clearq_th)
+GEN_TH_LD_US(th_vlhu_v_h, uint16_t, uint16_t, ldhu_h, clearh_th)
+GEN_TH_LD_US(th_vlhu_v_w, uint16_t, uint32_t, ldhu_w, clearl_th)
+GEN_TH_LD_US(th_vlhu_v_d, uint16_t, uint64_t, ldhu_d, clearq_th)
+GEN_TH_LD_US(th_vlwu_v_w, uint32_t, uint32_t, ldwu_w, clearl_th)
+GEN_TH_LD_US(th_vlwu_v_d, uint32_t, uint64_t, ldwu_d, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 11/65] target/riscv: Add unit-stride store instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:36:41 +0800
Message-ID: <20240412073735.76413-12-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

XTheadVector unit-stride store instructions differ from RVV1.0 in the
following points:
1. Different mask reg layout.
2. Different vector reg element width.
3. Different tail/masked element processing policy.
4. Different check policy.

The details of the differences are the same as for the strided store
instructions, as unit-stride is a special case of strided operations.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 26 ++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 59 +++++++++++++++++--
 target/riscv/xtheadvector_helper.c            | 31 ++++++++++
 3 files changed, 112 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f2fa8425b3..eb31784e18 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1361,3 +1361,29 @@ DEF_HELPER_5(th_vlwu_v_w, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vlwu_v_w_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vlwu_v_d, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vlwu_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsb_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsh_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsw_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsw_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsw_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vsw_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_b, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_b_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_h, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_h_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_w, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_w_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_d, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(th_vse_v_d_mask, void, ptr, ptr, tl, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index eb910acf40..9b88ea2fa4 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -398,6 +398,61 @@ GEN_TH_TRANS(th_vlbu_v, 4, r2nfvm, ld_us_op_th, ld_us_check_th)
 GEN_TH_TRANS(th_vlhu_v, 5, r2nfvm, ld_us_op_th, ld_us_check_th)
 GEN_TH_TRANS(th_vlwu_v, 6, r2nfvm, ld_us_op_th, ld_us_check_th)
 
+/*
+ * This function is almost a copy of st_us_op, except:
+ * 1) different data encoding.
+ * 2) XTheadVector has more cases, depending on SEW.
+ */
+static bool st_us_op_th(DisasContext *s, arg_r2nfvm *a, uint8_t seq)
+{
+    uint32_t data = 0;
+    gen_helper_ldst_us_th *fn;
+    static gen_helper_ldst_us_th * const fns[2][4][4] = {
+        /* masked unit stride store */
+        { { gen_helper_th_vsb_v_b_mask, gen_helper_th_vsb_v_h_mask,
+            gen_helper_th_vsb_v_w_mask, gen_helper_th_vsb_v_d_mask },
+          { NULL, gen_helper_th_vsh_v_h_mask,
+            gen_helper_th_vsh_v_w_mask, gen_helper_th_vsh_v_d_mask },
+          { NULL, NULL,
+            gen_helper_th_vsw_v_w_mask, gen_helper_th_vsw_v_d_mask },
+          { gen_helper_th_vse_v_b_mask, gen_helper_th_vse_v_h_mask,
+            gen_helper_th_vse_v_w_mask, gen_helper_th_vse_v_d_mask } },
+        /* unmasked unit stride store */
+        { { gen_helper_th_vsb_v_b, gen_helper_th_vsb_v_h,
+            gen_helper_th_vsb_v_w, gen_helper_th_vsb_v_d },
+          { NULL, gen_helper_th_vsh_v_h,
+            gen_helper_th_vsh_v_w, gen_helper_th_vsh_v_d },
+          { NULL, NULL,
+            gen_helper_th_vsw_v_w, gen_helper_th_vsw_v_d },
+          { gen_helper_th_vse_v_b, gen_helper_th_vse_v_h,
+            gen_helper_th_vse_v_w, gen_helper_th_vse_v_d } }
+    };
+
+    fn = fns[a->vm][seq][s->sew];
+    if (fn == NULL) {
+        return false;
+    }
+
+    data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);
+    data = FIELD_DP32(data, VDATA_TH, VM, a->vm);
+    data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);
+    data = FIELD_DP32(data, VDATA_TH, NF, a->nf);
+    return ldst_us_trans(a->rd, a->rs1, data, fn, s, true);
+}
+
+static bool st_us_check_th(DisasContext *s, arg_r2nfvm* a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_nf(s, a->rd, a->nf));
+}
+
+GEN_TH_TRANS(th_vsb_v, 0, r2nfvm, st_us_op_th, st_us_check_th)
+GEN_TH_TRANS(th_vsh_v, 1, r2nfvm, st_us_op_th, st_us_check_th)
+GEN_TH_TRANS(th_vsw_v, 2, r2nfvm, st_us_op_th, st_us_check_th)
+GEN_TH_TRANS(th_vse_v, 3, r2nfvm, st_us_op_th, st_us_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
@@ -411,10 +466,6 @@ TH_TRANS_STUB(th_vleff_v)
 TH_TRANS_STUB(th_vlbuff_v)
 TH_TRANS_STUB(th_vlhuff_v)
 TH_TRANS_STUB(th_vlwuff_v)
-TH_TRANS_STUB(th_vsb_v)
-TH_TRANS_STUB(th_vsh_v)
-TH_TRANS_STUB(th_vsw_v)
-TH_TRANS_STUB(th_vse_v)
 TH_TRANS_STUB(th_vlxb_v)
 TH_TRANS_STUB(th_vlxh_v)
 TH_TRANS_STUB(th_vlxw_v)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 7566ab8e31..5c05cd5aa6 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -405,3 +405,34 @@ GEN_TH_LD_US(th_vlhu_v_w, uint16_t, uint32_t, ldhu_w, clearl_th)
 GEN_TH_LD_US(th_vlhu_v_d, uint16_t, uint64_t, ldhu_d, clearq_th)
 GEN_TH_LD_US(th_vlwu_v_w, uint32_t, uint32_t, ldwu_w, clearl_th)
 GEN_TH_LD_US(th_vlwu_v_d, uint32_t, uint64_t, ldwu_d, clearq_th)
+
+/* similar to GEN_VEXT_ST_US, with the function changed */
+#define GEN_TH_ST_US(NAME, MTYPE, ETYPE, STORE_FN) \
+void HELPER(NAME##_mask)(void *vd, void *v0, target_ulong base, \
+                         CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t stride = th_nf(desc) * sizeof(MTYPE); \
+    th_ldst_stride(vd, v0, base, stride, env, desc, false, STORE_FN, \
+                   NULL, sizeof(ETYPE), sizeof(MTYPE), GETPC()); \
+} \
+ \
+void HELPER(NAME)(void *vd, void *v0, target_ulong base, \
+                  CPURISCVState *env, uint32_t desc) \
+{ \
+    th_ldst_us(vd, base, env, desc, STORE_FN, NULL, \
+               sizeof(ETYPE), sizeof(MTYPE), GETPC()); \
+}
+
+GEN_TH_ST_US(th_vsb_v_b, int8_t, int8_t , stb_b)
+GEN_TH_ST_US(th_vsb_v_h, int8_t, int16_t, stb_h)
+GEN_TH_ST_US(th_vsb_v_w, int8_t, int32_t, stb_w)
+GEN_TH_ST_US(th_vsb_v_d, int8_t, int64_t, stb_d)
+GEN_TH_ST_US(th_vsh_v_h, int16_t, int16_t, sth_h)
+GEN_TH_ST_US(th_vsh_v_w, int16_t, int32_t, sth_w)
+GEN_TH_ST_US(th_vsh_v_d, int16_t, int64_t, sth_d)
+GEN_TH_ST_US(th_vsw_v_w, int32_t, int32_t, stw_w)
+GEN_TH_ST_US(th_vsw_v_d, int32_t, int64_t, stw_d)
+GEN_TH_ST_US(th_vse_v_b, int8_t, int8_t , ste_b)
+GEN_TH_ST_US(th_vse_v_h, int16_t, int16_t, ste_h)
+GEN_TH_ST_US(th_vse_v_w, int32_t, int32_t, ste_w)
+GEN_TH_ST_US(th_vse_v_d, int64_t, int64_t, ste_d)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
out30-112.freemail.mail.aliyun.com ([115.124.30.112]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBs8-0004LT-Ro; Fri, 12 Apr 2024 04:03:01 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NagQK_1712908968) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:02:49 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712908969; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=8sS1mGwiVRIv8dgw//97y/0QfMFkwxXjpg1XX23X9IA=; b=eyWua9LsQ1jbXoW97xMCleHuccnexcSL6Jyn72d2ZNCvW0WhpIbQeneLc+ZaEx52NxyBWyT4Zlc68vqQxuqdWQy7LxLgPRhw98ba3A7xyoKmGZvYDyxDVyeqPxnsFfajwcl1E4JMY0Zqa3U1zbIteQdT++1nw+afX6FCR/iMqu4= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R151e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NagQK_1712908968; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 12/65] target/riscv: Add indexed load instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:42 +0800 Message-ID: <20240412073735.76413-13-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.112; envelope-from=eric.huang@linux.alibaba.com; helo=out30-112.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 
X-Spam_bar: -----------------
X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @linux.alibaba.com)
X-ZM-MESSAGEID: 1712909003539100001
Content-Type: text/plain; charset="utf-8"

XTheadVector indexed load instructions differ from RVV1.0 in the
following points:
1. Different mask reg layout.
2. Different access width.
   XTheadVector has 7 instructions: th.vlx{b,h,w}.v, th.vlx{b,h,w}u.v
   and th.vlxe.v. Their index element width and reg data element width
   are both SEW bits. b, h, w and e differ only in the memory access
   width: {b,h,w,e} stand for 8/16/32/SEW bits. Because the memory
   access can be narrower than SEW, the loaded value has to be zero- or
   sign-extended: "u" stands for zero-extended, the lack of "u" stands
   for sign-extended.
   RVV1.0 has the instructions vlxei{8,16,32,64}.v, where {8,16,32,64}
   is the width of the index element, while the memory and data reg
   element widths are both SEW bits.
3. Different tail/masked elements process policy.
4. Different check policy.
   In RVV1.0, in some situations, the destination vector register group
   can overlap the source vector register group of a different element
   width. XTheadVector does not allow this.
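For illustration only (not part of this patch): a standalone sketch of
the register-group overlap rule from point 4. The name groups_disjoint
and the register numbers in the note below are assumptions made for
this example; the patch itself implements the rule in
th_check_overlap_group().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for th_check_overlap_group(): returns true when
 * the register ranges [rd, rd + dlen) and [rs, rs + slen) are disjoint.
 */
static bool groups_disjoint(int rd, int dlen, int rs, int slen)
{
    /* Both groups must cover at least one register. */
    assert(dlen > 0 && slen > 0);
    /* Disjoint if one range starts at or after the end of the other. */
    return (rd >= rs + slen) || (rs >= rd + dlen);
}
```

With nf = 2 and LMUL = 2, a destination starting at v4 spans four
registers (v4..v7) and the index group spans two. An index group at
v8 would pass this check, while one at v6 would be rejected.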
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 22 +++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 91 +++++++++++++++++--
 target/riscv/vector_helper.c                  |  4 +-
 target/riscv/vector_internals.h               |  4 +
 target/riscv/xtheadvector_helper.c            | 82 +++++++++++++++++
 5 files changed, 194 insertions(+), 9 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index eb31784e18..733071bdc6 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1387,3 +1387,25 @@ DEF_HELPER_5(th_vse_v_w, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vse_v_w_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vse_v_d, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(th_vse_v_d_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_6(th_vlxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxbu_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxbu_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxbu_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxbu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxhu_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxhu_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxhu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vlxwu_v_w,
void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vlxwu_v_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 9b88ea2fa4..8148097de3 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -179,6 +179,20 @@ static bool th_check_nf(DisasContext *s, uint32_t vd, = uint32_t nf) return size <=3D 8 && vd + size <=3D 32; } =20 +/* + * The destination vector register group cannot overlap a source vector re= gister + * group of a different element width. + * + * The overlap check rule is different from RVV1.0. The function + * "require_noover" describes the check rule in RVV1.0. In general, in some + * situations, the destination vector register group can overlap the source + * vector register group of a different element width. While XTheadVector = not. + */ +static inline bool th_check_overlap_group(int rd, int dlen, int rs, int sl= en) +{ + return ((rd >=3D rs + slen) || (rs >=3D rd + dlen)); +} + /* * common translation macro * @@ -453,6 +467,76 @@ GEN_TH_TRANS(th_vsh_v, 1, r2nfvm, st_us_op_th, st_us_c= heck_th) GEN_TH_TRANS(th_vsw_v, 2, r2nfvm, st_us_op_th, st_us_check_th) GEN_TH_TRANS(th_vse_v, 3, r2nfvm, st_us_op_th, st_us_check_th) =20 +/* + * index load and store + */ + +#define gen_helper_ldst_index_th gen_helper_ldst_index + +/* + * This function is almost the copy of ld_index_op, except: + * 1) different data encoding + * 2) XTheadVector has more insns to handle zero/sign-extended. 
+ */ +static bool ld_index_op_th(DisasContext *s, arg_rnfvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_ldst_index_th *fn; + static gen_helper_ldst_index_th * const fns[7][4] =3D { + { gen_helper_th_vlxb_v_b, gen_helper_th_vlxb_v_h, + gen_helper_th_vlxb_v_w, gen_helper_th_vlxb_v_d }, + { NULL, gen_helper_th_vlxh_v_h, + gen_helper_th_vlxh_v_w, gen_helper_th_vlxh_v_d }, + { NULL, NULL, + gen_helper_th_vlxw_v_w, gen_helper_th_vlxw_v_d }, + { gen_helper_th_vlxe_v_b, gen_helper_th_vlxe_v_h, + gen_helper_th_vlxe_v_w, gen_helper_th_vlxe_v_d }, + { gen_helper_th_vlxbu_v_b, gen_helper_th_vlxbu_v_h, + gen_helper_th_vlxbu_v_w, gen_helper_th_vlxbu_v_d }, + { NULL, gen_helper_th_vlxhu_v_h, + gen_helper_th_vlxhu_v_w, gen_helper_th_vlxhu_v_d }, + { NULL, NULL, + gen_helper_th_vlxwu_v_w, gen_helper_th_vlxwu_v_d }, + }; + + fn =3D fns[seq][s->sew]; + if (fn =3D=3D NULL) { + return false; + } + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, NF, a->nf); + return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); +} + +/* + * For vector indexed segment loads, the destination vector register + * groups cannot overlap the source vector register group (specified by + * `vs2`), else an illegal instruction exception is raised. 
+ */ +static bool ld_index_check_th(DisasContext *s, arg_rnfvm* a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + th_check_nf(s, a->rd, a->nf) && + ((a->nf =3D=3D 1) || + th_check_overlap_group(a->rd, a->nf << s->lmul, + a->rs2, 1 << s->lmul))); +} + +GEN_TH_TRANS(th_vlxb_v, 0, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxh_v, 1, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxw_v, 2, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxe_v, 3, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxbu_v, 4, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxhu_v, 5, rnfvm, ld_index_op_th, ld_index_check_th) +GEN_TH_TRANS(th_vlxwu_v, 6, rnfvm, ld_index_op_th, ld_index_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ @@ -466,13 +550,6 @@ TH_TRANS_STUB(th_vleff_v) TH_TRANS_STUB(th_vlbuff_v) TH_TRANS_STUB(th_vlhuff_v) TH_TRANS_STUB(th_vlwuff_v) -TH_TRANS_STUB(th_vlxb_v) -TH_TRANS_STUB(th_vlxh_v) -TH_TRANS_STUB(th_vlxw_v) -TH_TRANS_STUB(th_vlxe_v) -TH_TRANS_STUB(th_vlxbu_v) -TH_TRANS_STUB(th_vlxhu_v) -TH_TRANS_STUB(th_vlxwu_v) TH_TRANS_STUB(th_vsxb_v) TH_TRANS_STUB(th_vsxh_v) TH_TRANS_STUB(th_vsxw_v) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 31660226dc..49b5027371 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -360,8 +360,8 @@ typedef target_ulong vext_get_index_addr(target_ulong b= ase, uint32_t idx, void *vs2); =20 #define GEN_VEXT_GET_INDEX_ADDR(NAME, ETYPE, H) \ -static target_ulong NAME(target_ulong base, \ - uint32_t idx, void *vs2) \ +target_ulong NAME(target_ulong base, \ + uint32_t idx, void *vs2) \ { \ return (base + *((ETYPE *)vs2 + H(idx))); \ } diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 
59c93f86bf..a692462bf1 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -239,4 +239,8 @@ static inline target_ulong adjust_addr(CPURISCVState *e= nv, target_ulong addr) return (addr & ~env->cur_pmmask) | env->cur_pmbase; } =20 +target_ulong idx_b(target_ulong base, uint32_t idx, void *vs2); +target_ulong idx_h(target_ulong base, uint32_t idx, void *vs2); +target_ulong idx_w(target_ulong base, uint32_t idx, void *vs2); +target_ulong idx_d(target_ulong base, uint32_t idx, void *vs2); #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 5c05cd5aa6..a9ae157296 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -436,3 +436,85 @@ GEN_TH_ST_US(th_vse_v_b, int8_t, int8_t , ste_b) GEN_TH_ST_US(th_vse_v_h, int16_t, int16_t, ste_h) GEN_TH_ST_US(th_vse_v_w, int32_t, int32_t, ste_w) GEN_TH_ST_US(th_vse_v_d, int64_t, int64_t, ste_d) + +/* + * index: access vector element from indexed memory + */ +typedef target_ulong th_get_index_addr(target_ulong base, + uint32_t idx, void *vs2); + +/* + * This function is almost the copy of vext_ldst_index, except: + * 1) different mask layout + * 2) different data encoding + * 3) different mask/tail elements process policy + */ +static inline void +th_ldst_index(void *vd, void *v0, target_ulong base, + void *vs2, CPURISCVState *env, uint32_t desc, + th_get_index_addr get_index_addr, + th_ldst_elem_fn *ldst_elem, + clear_fn *clear_elem, + uint32_t esz, uint32_t msz, uintptr_t ra) +{ + uint32_t i, k; + uint32_t nf =3D th_nf(desc); + uint32_t vm =3D th_vm(desc); + uint32_t mlen =3D th_mlen(desc); + uint32_t vlmax =3D th_maxsz(desc) / esz; + + VSTART_CHECK_EARLY_EXIT(env); + + /* load bytes from guest memory */ + for (i =3D env->vstart; i < env->vl; env->vstart =3D ++i) { + k =3D 0; + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + while (k < nf) { + abi_ptr addr =3D 
get_index_addr(base, i, vs2) + k * msz; + ldst_elem(env, adjust_addr(env, addr), i + k * vlmax, vd, ra); + k++; + } + } + env->vstart =3D 0; + /* clear tail elements */ + if (clear_elem) { + for (k =3D 0; k < nf; k++) { + clear_elem(vd, env->vl + k * vlmax, env->vl * esz, vlmax * esz= ); + } + } +} + +/* Similar to GEN_VEXT_LD_INDEX */ +#define GEN_TH_LD_INDEX(NAME, MTYPE, ETYPE, INDEX_FN, LOAD_FN, CLEAR_FN) = \ +void HELPER(NAME)(void *vd, void *v0, target_ulong base, = \ + void *vs2, CPURISCVState *env, uint32_t desc) = \ +{ = \ + th_ldst_index(vd, v0, base, vs2, env, desc, INDEX_FN, = \ + LOAD_FN, CLEAR_FN, sizeof(ETYPE), sizeof(MTYPE), = \ + GETPC()); = \ +} + +GEN_TH_LD_INDEX(th_vlxb_v_b, int8_t, int8_t, idx_b, ldb_b, clearb_th) +GEN_TH_LD_INDEX(th_vlxb_v_h, int8_t, int16_t, idx_h, ldb_h, clearh_th) +GEN_TH_LD_INDEX(th_vlxb_v_w, int8_t, int32_t, idx_w, ldb_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxb_v_d, int8_t, int64_t, idx_d, ldb_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxh_v_h, int16_t, int16_t, idx_h, ldh_h, clearh_th) +GEN_TH_LD_INDEX(th_vlxh_v_w, int16_t, int32_t, idx_w, ldh_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxh_v_d, int16_t, int64_t, idx_d, ldh_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxw_v_w, int32_t, int32_t, idx_w, ldw_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxw_v_d, int32_t, int64_t, idx_d, ldw_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxe_v_b, int8_t, int8_t, idx_b, lde_b, clearb_th) +GEN_TH_LD_INDEX(th_vlxe_v_h, int16_t, int16_t, idx_h, lde_h, clearh_th) +GEN_TH_LD_INDEX(th_vlxe_v_w, int32_t, int32_t, idx_w, lde_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxe_v_d, int64_t, int64_t, idx_d, lde_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxbu_v_b, uint8_t, uint8_t, idx_b, ldbu_b, clearb_th) +GEN_TH_LD_INDEX(th_vlxbu_v_h, uint8_t, uint16_t, idx_h, ldbu_h, clearh_th) +GEN_TH_LD_INDEX(th_vlxbu_v_w, uint8_t, uint32_t, idx_w, ldbu_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxbu_v_d, uint8_t, uint64_t, idx_d, ldbu_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxhu_v_h, uint16_t, uint16_t, idx_h, ldhu_h, clearh_th) 
+GEN_TH_LD_INDEX(th_vlxhu_v_w, uint16_t, uint32_t, idx_w, ldhu_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxhu_v_d, uint16_t, uint64_t, idx_d, ldhu_d, clearq_th) +GEN_TH_LD_INDEX(th_vlxwu_v_w, uint32_t, uint32_t, idx_w, ldwu_w, clearl_th) +GEN_TH_LD_INDEX(th_vlxwu_v_d, uint32_t, uint64_t, idx_d, ldwu_d, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909115; cv=none; d=zohomail.com; s=zohoarc; b=kscGcM/I567Lz5MC+Yvx9cM/9K3Zt52k9/RW6l9EwmUTDm2AOnzs+POgXdXLFYjgKVCmLTpuS3WDdw/gY4UZ9pDixL9+VJYBwEfr0pLQR3FoiXV6Y0/xG3gQKr75t+e9RUiP/CADjYY0Z0GmceetHkC5LJVHjn+3bacaKA4ffbA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909115; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=9ugy1gHKeOMe5mNFWAUsyeoGXw4JWDzoRj/8nAijsX8=; b=DPviDLgH1Aiofj8RF2rRiF4oms8eXDTSPkxkrsByDh2V/Ujs1npczL7JlrbSZ8v7ex/Q471qLKwWZlnoUOrKfxnBFebI4DeXoJYf2dxsgBlI21tB8qxMAms3FW/i0RQpCsDNtYkEKxbyYJW5+Vf1CeCuw3nJKMCDl2FR4X5EBEQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909115009500.9000506835745; Fri, 12 Apr 2024 01:05:15 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1rvBuD-0005PC-8l; Fri, 12 Apr 2024 04:05:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBu9-0005Ou-Dp; Fri, 12 Apr 2024 04:05:01 -0400 Received: from out30-124.freemail.mail.aliyun.com ([115.124.30.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBu5-0004WY-1H; Fri, 12 Apr 2024 04:05:00 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nah7i_1712909089) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:04:50 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712909091; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=9ugy1gHKeOMe5mNFWAUsyeoGXw4JWDzoRj/8nAijsX8=; b=Xkp9PBdbFIN9Kr3RuE2DgCdjTrLCZ78cR5VMXvGh317vWsGuiGQlwkP9BzzlEpYTTWYbq3ZXWxTf1P4Bj6uBhNUF0WDfmAf5eAgZCKcVEvXV8cVvSSqFZoM3U0URnrcPwidt7uUKsJPPdk6FmIeZfLlTsBS/urs75qoxkqIojYo= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R121e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nah7i_1712909089; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 13/65] target/riscv: Add indexed store instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:43 +0800 Message-ID: <20240412073735.76413-14-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted 
sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=115.124.30.124; envelope-from=eric.huang@linux.alibaba.com; helo=out30-124.freemail.mail.aliyun.com
X-Spam_score_int: -174
X-Spam_score: -17.5
X-Spam_bar: -----------------
X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @linux.alibaba.com)
X-ZM-MESSAGEID: 1712909115943100001
Content-Type: text/plain; charset="utf-8"

XTheadVector indexed store instructions differ from RVV1.0 in the
following points:
1. Different mask reg layout.
2. Different access width. The same as for the XTheadVector indexed
   load instructions, except that stores do not need to distinguish
   between zero and sign extension.
3. Different masked elements process policy.
4. Different check policy.
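For illustration only (not part of this patch): a sketch of why stores
need no zero/sign variants. It assumes that a narrow store keeps only
the low-order bits of each SEW-bit element, which is an interpretation
of the access-width description above rather than something this patch
states. All three function names are hypothetical and model a single
16-bit element with an 8-bit memory access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a byte-wide store of one 16-bit element:
 * only the low 8 bits are written, the upper bits never reach memory.
 */
static void store_low_byte(uint8_t *mem, uint16_t elem)
{
    assert(mem != NULL);
    *mem = (uint8_t)(elem & 0xff);
}

/* Reload the byte as a 16-bit element with sign extension. */
static uint16_t load_byte_signed(const uint8_t *mem)
{
    /* (x ^ 0x80) - 0x80 sign-extends an 8-bit value portably. */
    return (uint16_t)((*mem ^ 0x80) - 0x80);
}

/* Reload the byte as a 16-bit element with zero extension. */
static uint16_t load_byte_unsigned(const uint8_t *mem)
{
    return (uint16_t)*mem;
}
```

Storing 0xff80 and storing 0x0080 both leave 0x80 in memory, so a
single store form is enough. Only the reload has to choose between
sign extension (as th.vlxb.v does) and zero extension (as th.vlxbu.v
does).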
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 13 +++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 49 +++++++++++++++++--
 target/riscv/xtheadvector_helper.c            | 24 +++++++++
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 733071bdc6..fd81db2f74 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1409,3 +1409,16 @@ DEF_HELPER_6(th_vlxhu_v_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vlxhu_v_d, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vlxwu_v_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vlxwu_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxb_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxb_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxb_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxb_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxh_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxh_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxh_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxw_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxw_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxe_v_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxe_v_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxe_v_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsxe_v_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 8148097de3..68a2a9a0cf 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -537,6 +537,51 @@ GEN_TH_TRANS(th_vlxbu_v, 4, rnfvm, ld_index_op_th, ld_index_check_th)
 GEN_TH_TRANS(th_vlxhu_v, 5, rnfvm, ld_index_op_th, ld_index_check_th)
 GEN_TH_TRANS(th_vlxwu_v, 6, rnfvm, ld_index_op_th, ld_index_check_th)
 
+/*
+ * This function is almost the
copy of st_index_op, except: + * 1) different data encoding. + */ +static bool st_index_op_th(DisasContext *s, arg_rnfvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_ldst_index_th *fn; + static gen_helper_ldst_index_th * const fns[4][4] =3D { + { gen_helper_th_vsxb_v_b, gen_helper_th_vsxb_v_h, + gen_helper_th_vsxb_v_w, gen_helper_th_vsxb_v_d }, + { NULL, gen_helper_th_vsxh_v_h, + gen_helper_th_vsxh_v_w, gen_helper_th_vsxh_v_d }, + { NULL, NULL, + gen_helper_th_vsxw_v_w, gen_helper_th_vsxw_v_d }, + { gen_helper_th_vsxe_v_b, gen_helper_th_vsxe_v_h, + gen_helper_th_vsxe_v_w, gen_helper_th_vsxe_v_d } + }; + + fn =3D fns[seq][s->sew]; + if (fn =3D=3D NULL) { + return false; + } + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, NF, a->nf); + return ldst_index_trans(a->rd, a->rs1, a->rs2, data, fn, s); +} + +static bool st_index_check_th(DisasContext *s, arg_rnfvm* a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + th_check_nf(s, a->rd, a->nf)); +} + +GEN_TH_TRANS(th_vsxb_v, 0, rnfvm, st_index_op_th, st_index_check_th) +GEN_TH_TRANS(th_vsxh_v, 1, rnfvm, st_index_op_th, st_index_check_th) +GEN_TH_TRANS(th_vsxw_v, 2, rnfvm, st_index_op_th, st_index_check_th) +GEN_TH_TRANS(th_vsxe_v, 3, rnfvm, st_index_op_th, st_index_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ @@ -550,10 +595,6 @@ TH_TRANS_STUB(th_vleff_v) TH_TRANS_STUB(th_vlbuff_v) TH_TRANS_STUB(th_vlhuff_v) TH_TRANS_STUB(th_vlwuff_v) -TH_TRANS_STUB(th_vsxb_v) -TH_TRANS_STUB(th_vsxh_v) -TH_TRANS_STUB(th_vsxw_v) -TH_TRANS_STUB(th_vsxe_v) TH_TRANS_STUB(th_vamoswapw_v) TH_TRANS_STUB(th_vamoaddw_v) TH_TRANS_STUB(th_vamoxorw_v) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 
a9ae157296..22af4774df 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -518,3 +518,27 @@ GEN_TH_LD_INDEX(th_vlxhu_v_w, uint16_t, uint32_t, idx_= w, ldhu_w, clearl_th) GEN_TH_LD_INDEX(th_vlxhu_v_d, uint16_t, uint64_t, idx_d, ldhu_d, clearq_th) GEN_TH_LD_INDEX(th_vlxwu_v_w, uint32_t, uint32_t, idx_w, ldwu_w, clearl_th) GEN_TH_LD_INDEX(th_vlxwu_v_d, uint32_t, uint64_t, idx_d, ldwu_d, clearq_th) + +/* Similar to GEN_VEXT_ST_INDEX */ +#define GEN_TH_ST_INDEX(NAME, MTYPE, ETYPE, INDEX_FN, STORE_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong base, \ + void *vs2, CPURISCVState *env, uint32_t desc) \ +{ \ + th_ldst_index(vd, v0, base, vs2, env, desc, INDEX_FN, \ + STORE_FN, NULL, sizeof(ETYPE), sizeof(MTYPE), \ + GETPC()); \ +} + +GEN_TH_ST_INDEX(th_vsxb_v_b, int8_t, int8_t, idx_b, stb_b) +GEN_TH_ST_INDEX(th_vsxb_v_h, int8_t, int16_t, idx_h, stb_h) +GEN_TH_ST_INDEX(th_vsxb_v_w, int8_t, int32_t, idx_w, stb_w) +GEN_TH_ST_INDEX(th_vsxb_v_d, int8_t, int64_t, idx_d, stb_d) +GEN_TH_ST_INDEX(th_vsxh_v_h, int16_t, int16_t, idx_h, sth_h) +GEN_TH_ST_INDEX(th_vsxh_v_w, int16_t, int32_t, idx_w, sth_w) +GEN_TH_ST_INDEX(th_vsxh_v_d, int16_t, int64_t, idx_d, sth_d) +GEN_TH_ST_INDEX(th_vsxw_v_w, int32_t, int32_t, idx_w, stw_w) +GEN_TH_ST_INDEX(th_vsxw_v_d, int32_t, int64_t, idx_d, stw_d) +GEN_TH_ST_INDEX(th_vsxe_v_b, int8_t, int8_t, idx_b, ste_b) +GEN_TH_ST_INDEX(th_vsxe_v_h, int16_t, int16_t, idx_h, ste_h) +GEN_TH_ST_INDEX(th_vsxe_v_w, int32_t, int32_t, idx_w, ste_w) +GEN_TH_ST_INDEX(th_vsxe_v_d, int64_t, int64_t, idx_d, ste_d) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909243; cv=none; 
d=zohomail.com; s=zohoarc; b=JRAbSyVXlRlBr+NwluTH/4XraSsboEr7a5vUlNj6thWZArrxKTtBo1XseJg3Hgcj6G4YfqyigkFp0sJQgYR981NA3gXT5bbZwTNjQdjex/a7i0EsRL2RES+CXrRkSS0jptxGrwUFKjoUc5T0DvceeI8y6F0fE2cCnqXDVhVn2js= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909243; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=G99+1JUges/zyQw0rpQMt4VPdKw1dFUmH+MQFBw98yk=; b=nWL6FVs13EXszSDG3lhvKZLXMhZ7BBYWyHs+8B6vxj1l5E8eVNTqZ8PGrFyi8Ua7iAr5pTxdfW+Z7VWUGlqzkBI0xDz9IcU1CHFoJIUHl5+/PfKO9JsPhPDJahvzXwxoJ/45OIEhxe9gW9ojhs6HwsZ8uzaYrFh5Zr2aCoKl1i4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909243150358.687670908648; Fri, 12 Apr 2024 01:07:23 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvBw8-00069k-On; Fri, 12 Apr 2024 04:07:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBw6-00069W-Ap; Fri, 12 Apr 2024 04:07:02 -0400 Received: from out30-100.freemail.mail.aliyun.com ([115.124.30.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBw2-0004yV-VT; Fri, 12 Apr 2024 04:07:01 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nai7w_1712909211) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:06:51 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linux.alibaba.com; s=default; t=1712909213; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=G99+1JUges/zyQw0rpQMt4VPdKw1dFUmH+MQFBw98yk=; b=Afqrq3ZDSmRSBQXgZy2eo7L0voucSGsS/wkrfjT/TtG9YEc7xXyHvfR3E44nb/VAlQBCx4ytZ9POSGP5vwBFWjdiZWJ6LNUtzcQBGPtZrf+uOwxcJWJ0hRlEKL5n3c8GEXD/HisMkoqTcRkhrXsmiHk1jnFA2z5CwLp49BGfK2I= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R171e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046059; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nai7w_1712909211; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 14/65] target/riscv: Add unit-stride fault-only-first instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:44 +0800 Message-ID: <20240412073735.76413-15-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.100; envelope-from=eric.huang@linux.alibaba.com; helo=out30-100.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 
2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712909244505100001 Content-Type: text/plain; charset="utf-8" XTheadVector unit-stride fault-only-first instructions differ from RVV1.0 in the following points: 1. Different mask reg layout. 2. Different vector reg element width. 3. Different tail/masked element processing policy. 4. Different check policy. The details of these differences are the same as for the unit-stride load instructions, because unit-stride fault-only-first instructions are special cases of unit-stride load operations. Signed-off-by: Huang Tao --- target/riscv/helper.h | 22 ++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 57 +++++++-- target/riscv/vector_helper.c | 2 +- target/riscv/vector_internals.h | 5 + target/riscv/xtheadvector_helper.c | 119 ++++++++++++++++++ 5 files changed, 197 insertions(+), 8 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index fd81db2f74..1bf4c38c4b 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1422,3 +1422,25 @@ DEF_HELPER_6(th_vsxe_v_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vsxe_v_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vsxe_v_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vsxe_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_5(th_vlbff_v_b, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbff_v_h, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbff_v_d, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhff_v_h, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhff_v_d, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlwff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlwff_v_d,
void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vleff_v_b, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vleff_v_h, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vleff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vleff_v_d, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbuff_v_b, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbuff_v_h, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbuff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlbuff_v_d, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhuff_v_h, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhuff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlhuff_v_d, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlwuff_v_w, void, ptr, ptr, tl, env, i32) +DEF_HELPER_5(th_vlwuff_v_d, void, ptr, ptr, tl, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 68a2a9a0cf..3548a6c2cc 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -582,19 +582,62 @@ GEN_TH_TRANS(th_vsxh_v, 1, rnfvm, st_index_op_th, st_= index_check_th) GEN_TH_TRANS(th_vsxw_v, 2, rnfvm, st_index_op_th, st_index_check_th) GEN_TH_TRANS(th_vsxe_v, 3, rnfvm, st_index_op_th, st_index_check_th) =20 +/* + * unit stride fault-only-first load + */ + +/* + * This function is almost the copy of ldff_op, except: + * 1) different data encoding. + * 2) XTheadVector has more insns to handle zero/sign-extended. 
+ */ +static bool ldff_op_th(DisasContext *s, arg_r2nfvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_ldst_us_th *fn; + static gen_helper_ldst_us_th * const fns[7][4] =3D { + { gen_helper_th_vlbff_v_b, gen_helper_th_vlbff_v_h, + gen_helper_th_vlbff_v_w, gen_helper_th_vlbff_v_d }, + { NULL, gen_helper_th_vlhff_v_h, + gen_helper_th_vlhff_v_w, gen_helper_th_vlhff_v_d }, + { NULL, NULL, + gen_helper_th_vlwff_v_w, gen_helper_th_vlwff_v_d }, + { gen_helper_th_vleff_v_b, gen_helper_th_vleff_v_h, + gen_helper_th_vleff_v_w, gen_helper_th_vleff_v_d }, + { gen_helper_th_vlbuff_v_b, gen_helper_th_vlbuff_v_h, + gen_helper_th_vlbuff_v_w, gen_helper_th_vlbuff_v_d }, + { NULL, gen_helper_th_vlhuff_v_h, + gen_helper_th_vlhuff_v_w, gen_helper_th_vlhuff_v_d }, + { NULL, NULL, + gen_helper_th_vlwuff_v_w, gen_helper_th_vlwuff_v_d } + }; + + fn =3D fns[seq][s->sew]; + if (fn =3D=3D NULL) { + return false; + } + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, NF, a->nf); + return ldff_trans(a->rd, a->rs1, data, fn, s); +} + +GEN_TH_TRANS(th_vlbff_v, 0, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vlhff_v, 1, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vlwff_v, 2, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vleff_v, 3, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vlbuff_v, 4, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vlhuff_v, 5, r2nfvm, ldff_op_th, ld_us_check_th) +GEN_TH_TRANS(th_vlwuff_v, 6, r2nfvm, ldff_op_th, ld_us_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vlbff_v) -TH_TRANS_STUB(th_vlhff_v) -TH_TRANS_STUB(th_vlwff_v) -TH_TRANS_STUB(th_vleff_v) -TH_TRANS_STUB(th_vlbuff_v) -TH_TRANS_STUB(th_vlhuff_v) -TH_TRANS_STUB(th_vlwuff_v) TH_TRANS_STUB(th_vamoswapw_v) 
TH_TRANS_STUB(th_vamoaddw_v) TH_TRANS_STUB(th_vamoxorw_v) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 49b5027371..695cb7dfec 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -112,7 +112,7 @@ static inline uint32_t vext_max_elems(uint32_t desc, ui= nt32_t log2_esz) * and page table walk can't fill the TLB entry. Then the guest * software can return here after process the exception or never return. */ -static void probe_pages(CPURISCVState *env, target_ulong addr, +void probe_pages(CPURISCVState *env, target_ulong addr, target_ulong len, uintptr_t ra, MMUAccessType access_type) { diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index a692462bf1..ff10cd3806 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -243,4 +243,9 @@ target_ulong idx_b(target_ulong base, uint32_t idx, voi= d *vs2); target_ulong idx_h(target_ulong base, uint32_t idx, void *vs2); target_ulong idx_w(target_ulong base, uint32_t idx, void *vs2); target_ulong idx_d(target_ulong base, uint32_t idx, void *vs2); + +void probe_pages(CPURISCVState *env, target_ulong addr, + target_ulong len, uintptr_t ra, + MMUAccessType access_type); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 22af4774df..af814688b5 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -542,3 +542,122 @@ GEN_TH_ST_INDEX(th_vsxe_v_b, int8_t, int8_t, idx_b,= ste_b) GEN_TH_ST_INDEX(th_vsxe_v_h, int16_t, int16_t, idx_h, ste_h) GEN_TH_ST_INDEX(th_vsxe_v_w, int32_t, int32_t, idx_w, ste_w) GEN_TH_ST_INDEX(th_vsxe_v_d, int64_t, int64_t, idx_d, ste_d) + +/* + * unit-stride fault-only-first load instructions + */ + +/* + * This function is almost the copy of vext_ldff, except: + * 1) different mask layout + * 2) different data encoding + * 3) different mask/tail elements process policy 
+ */ +static inline void +th_ldff(void *vd, void *v0, target_ulong base, + CPURISCVState *env, uint32_t desc, + th_ldst_elem_fn *ldst_elem, + clear_fn *clear_elem, + uint32_t esz, uint32_t msz, uintptr_t ra) +{ + void *host; + uint32_t i, k, vl =3D 0; + uint32_t mlen =3D th_mlen(desc); + uint32_t nf =3D th_nf(desc); + uint32_t vm =3D th_vm(desc); + uint32_t vlmax =3D th_maxsz(desc) / esz; + target_ulong addr, offset, remain; + int mmu_index =3D riscv_env_mmu_index(env, false); + + VSTART_CHECK_EARLY_EXIT(env); + /* probe every access*/ + for (i =3D env->vstart; i < env->vl; i++) { + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + addr =3D adjust_addr(env, base + nf * i * msz); + if (i =3D=3D 0) { + probe_pages(env, addr, nf * msz, ra, MMU_DATA_LOAD); + } else { + /* if it triggers an exception, no need to check watchpoint */ + remain =3D nf * msz; + while (remain > 0) { + offset =3D -(addr | TARGET_PAGE_MASK); + host =3D tlb_vaddr_to_host(env, addr, MMU_DATA_LOAD, mmu_i= ndex); + if (host) { +#ifdef CONFIG_USER_ONLY + if (!page_check_range(addr, offset, PAGE_READ)) { + vl =3D i; + goto ProbeSuccess; + } +#else + probe_pages(env, addr, offset, ra, MMU_DATA_LOAD); +#endif + } else { + vl =3D i; + goto ProbeSuccess; + } + if (remain <=3D offset) { + break; + } + remain -=3D offset; + addr =3D adjust_addr(env, addr + offset); + } + } + } +ProbeSuccess: + /* load bytes from guest memory */ + if (vl !=3D 0) { + env->vl =3D vl; + } + for (i =3D env->vstart; i < env->vl; i++) { + k =3D 0; + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + while (k < nf) { + addr =3D base + (i * nf + k) * msz; + ldst_elem(env, adjust_addr(env, addr), i + k * vlmax, vd, ra); + k++; + } + } + env->vstart =3D 0; + /* clear tail elements */ + if (vl !=3D 0) { + return; + } + for (k =3D 0; k < nf; k++) { + clear_elem(vd, env->vl + k * vlmax, env->vl * esz, vlmax * esz); + } +} + +#define GEN_TH_LDFF(NAME, MTYPE, ETYPE, LOAD_FN, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void 
*v0, target_ulong base, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + th_ldff(vd, v0, base, env, desc, LOAD_FN, CLEAR_FN, \ + sizeof(ETYPE), sizeof(MTYPE), GETPC()); \ +} + +GEN_TH_LDFF(th_vlbff_v_b, int8_t, int8_t, ldb_b, clearb_th) +GEN_TH_LDFF(th_vlbff_v_h, int8_t, int16_t, ldb_h, clearh_th) +GEN_TH_LDFF(th_vlbff_v_w, int8_t, int32_t, ldb_w, clearl_th) +GEN_TH_LDFF(th_vlbff_v_d, int8_t, int64_t, ldb_d, clearq_th) +GEN_TH_LDFF(th_vlhff_v_h, int16_t, int16_t, ldh_h, clearh_th) +GEN_TH_LDFF(th_vlhff_v_w, int16_t, int32_t, ldh_w, clearl_th) +GEN_TH_LDFF(th_vlhff_v_d, int16_t, int64_t, ldh_d, clearq_th) +GEN_TH_LDFF(th_vlwff_v_w, int32_t, int32_t, ldw_w, clearl_th) +GEN_TH_LDFF(th_vlwff_v_d, int32_t, int64_t, ldw_d, clearq_th) +GEN_TH_LDFF(th_vleff_v_b, int8_t, int8_t, lde_b, clearb_th) +GEN_TH_LDFF(th_vleff_v_h, int16_t, int16_t, lde_h, clearh_th) +GEN_TH_LDFF(th_vleff_v_w, int32_t, int32_t, lde_w, clearl_th) +GEN_TH_LDFF(th_vleff_v_d, int64_t, int64_t, lde_d, clearq_th) +GEN_TH_LDFF(th_vlbuff_v_b, uint8_t, uint8_t, ldbu_b, clearb_th) +GEN_TH_LDFF(th_vlbuff_v_h, uint8_t, uint16_t, ldbu_h, clearh_th) +GEN_TH_LDFF(th_vlbuff_v_w, uint8_t, uint32_t, ldbu_w, clearl_th) +GEN_TH_LDFF(th_vlbuff_v_d, uint8_t, uint64_t, ldbu_d, clearq_th) +GEN_TH_LDFF(th_vlhuff_v_h, uint16_t, uint16_t, ldhu_h, clearh_th) +GEN_TH_LDFF(th_vlhuff_v_w, uint16_t, uint32_t, ldhu_w, clearl_th) +GEN_TH_LDFF(th_vlhuff_v_d, uint16_t, uint64_t, ldhu_d, clearq_th) +GEN_TH_LDFF(th_vlwuff_v_w, uint32_t, uint32_t, ldwu_w, clearl_th) +GEN_TH_LDFF(th_vlwuff_v_d, uint32_t, uint64_t, ldwu_d, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909375; cv=none; 
d=zohomail.com; s=zohoarc; b=kyTmXq9HUhYTTKZSeY5i7TW6JMCWbtUbz8kpquHWoqnFq/zMpE/r7A2laixmvvsuXBfXtlDTAL6/Ny2ar+r6tOjsPXtdKcE2F9HhlVnn2bY+s8QMfGiUjp6RUm2Eyc3PP2xbrYQa88LWCAWkY0bvYsrxvzNqccnI1eywUpNNJK4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909375; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=mn0p1OLAEnQ7KKDa+jWz4TOoWUuC4CYzS6oguLEqCkY=; b=VJah1175Z89e1IPCGXUjr/4OqGkb6UCgUFKm7+VuBY710igNsi5WN+OAfNDvZBJiYmRI+FcXwkA9YWs5iQhe56DguPOh8AhvKMO7be9YaccdIcfKVR/JM30g6gjZG9MvaUzs/9VsLT/bo3tlbOFpMO35IgfhjyoxDv3kyiOEvec= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909375960857.3358221759129; Fri, 12 Apr 2024 01:09:35 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvByF-00074b-Kl; Fri, 12 Apr 2024 04:09:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvByD-00074L-2l; Fri, 12 Apr 2024 04:09:13 -0400 Received: from out30-98.freemail.mail.aliyun.com ([115.124.30.98]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBy6-0005Fr-Dp; Fri, 12 Apr 2024 04:09:12 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nbm6z_1712909332) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:08:53 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linux.alibaba.com; s=default; t=1712909334; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=mn0p1OLAEnQ7KKDa+jWz4TOoWUuC4CYzS6oguLEqCkY=; b=nyNjSBklUv3N9/gM9q0hkQH3BP7HECBuqDq7waX5Y6uQd9OhtYI/BSefLiNSM7K2yZvIxCdRnlZY0swxAeflmRpOPkDg90h3f0V5JiJqerj/LZVPNT/wcKNN/cXi/aR+v4MsCqdUMulLBne4kFWS0Ap3MmlnC2EeyZadrel8cQo= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R131e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nbm6z_1712909332; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 15/65] target/riscv: Add vector amo operations for XTheadVector Date: Fri, 12 Apr 2024 15:36:45 +0800 Message-ID: <20240412073735.76413-16-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.98; envelope-from=eric.huang@linux.alibaba.com; helo=out30-98.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: 
list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712909376672100001 Content-Type: text/plain; charset="utf-8" In this patch, we add the vector AMO instructions (Zvamo) for XTheadVector. Zvamo is not supported by RVV1.0. The behavior of Zvamo is similar to that of Zaamo (atomic operations from the standard A extension). Signed-off-by: Huang Tao --- target/riscv/helper.h | 28 ++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 155 ++++++++++++++++-- target/riscv/xtheadvector_helper.c | 136 +++++++++++++++ 3 files changed, 301 insertions(+), 18 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 1bf4c38c4b..c2a26acabc 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1444,3 +1444,31 @@ DEF_HELPER_5(th_vlhuff_v_w, void, ptr, ptr, tl, env, i32) DEF_HELPER_5(th_vlhuff_v_d, void, ptr, ptr, tl, env, i32) DEF_HELPER_5(th_vlwuff_v_w, void, ptr, ptr, tl, env, i32) DEF_HELPER_5(th_vlwuff_v_d, void, ptr, ptr, tl, env, i32) + +DEF_HELPER_6(th_vamoswapw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoswapd_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoaddw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoaddd_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoxorw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoxord_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoandw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoandd_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoorw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoord_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamominw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomind_v_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vamomaxw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomaxd_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamominuw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamominud_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomaxuw_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomaxud_v_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoswapw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoaddw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoxorw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoandw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamoorw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamominw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomaxw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamominuw_v_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vamomaxuw_v_w, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 3548a6c2cc..2bcd9b0832 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -632,30 +632,149 @@ GEN_TH_TRANS(th_vlbuff_v, 4, r2nfvm, ldff_op_th, ld_= us_check_th) GEN_TH_TRANS(th_vlhuff_v, 5, r2nfvm, ldff_op_th, ld_us_check_th) GEN_TH_TRANS(th_vlwuff_v, 6, r2nfvm, ldff_op_th, ld_us_check_th) =20 + +/* + * vector atomic operation + */ +typedef void gen_helper_amo(TCGv_ptr, TCGv_ptr, TCGv, TCGv_ptr, + TCGv_env, TCGv_i32); +static bool amo_trans_th(uint32_t vd, uint32_t rs1, uint32_t vs2, + uint32_t data, gen_helper_amo *fn, DisasContext *= s) +{ + TCGv_ptr dest, mask, index; + TCGv base; + TCGv_i32 desc; + + dest =3D tcg_temp_new_ptr(); + mask =3D tcg_temp_new_ptr(); + index =3D tcg_temp_new_ptr(); + base =3D get_gpr(s, rs1, EXT_NONE); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + 
s->cfg_ptr->vlenb, data)); + + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd)); + tcg_gen_addi_ptr(index, tcg_env, vreg_ofs(s, vs2)); + tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0)); + + mark_vs_dirty(s); + + fn(dest, mask, base, index, tcg_env, desc); + + finalize_rvv_inst(s); + return true; +} + +static bool amo_op_th(DisasContext *s, arg_rwdvm *a, uint8_t seq) +{ + uint32_t data =3D 0; + gen_helper_amo *fn; + static gen_helper_amo *const fnsw[9] =3D { + /* no atomic operation */ + gen_helper_th_vamoswapw_v_w, + gen_helper_th_vamoaddw_v_w, + gen_helper_th_vamoxorw_v_w, + gen_helper_th_vamoandw_v_w, + gen_helper_th_vamoorw_v_w, + gen_helper_th_vamominw_v_w, + gen_helper_th_vamomaxw_v_w, + gen_helper_th_vamominuw_v_w, + gen_helper_th_vamomaxuw_v_w + }; + static gen_helper_amo *const fnsd[18] =3D { + gen_helper_th_vamoswapw_v_d, + gen_helper_th_vamoaddw_v_d, + gen_helper_th_vamoxorw_v_d, + gen_helper_th_vamoandw_v_d, + gen_helper_th_vamoorw_v_d, + gen_helper_th_vamominw_v_d, + gen_helper_th_vamomaxw_v_d, + gen_helper_th_vamominuw_v_d, + gen_helper_th_vamomaxuw_v_d, + gen_helper_th_vamoswapd_v_d, + gen_helper_th_vamoaddd_v_d, + gen_helper_th_vamoxord_v_d, + gen_helper_th_vamoandd_v_d, + gen_helper_th_vamoord_v_d, + gen_helper_th_vamomind_v_d, + gen_helper_th_vamomaxd_v_d, + gen_helper_th_vamominud_v_d, + gen_helper_th_vamomaxud_v_d + }; + + if (tb_cflags(s->base.tb) & CF_PARALLEL) { + gen_helper_exit_atomic(tcg_env); + s->base.is_jmp =3D DISAS_NORETURN; + return true; + } + switch (s->sew) { + case 0 ... 2: + assert(seq < ARRAY_SIZE(fnsw)); + fn =3D fnsw[seq]; + break; + case 3: + /* XLEN check done in amo_check(). 
*/ + assert(seq < ARRAY_SIZE(fnsd)); + fn =3D fnsd[seq]; + break; + default: + g_assert_not_reached(); + } + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + data =3D FIELD_DP32(data, VDATA_TH, WD, a->wd); + return amo_trans_th(a->rd, a->rs1, a->rs2, data, fn, s); +} +/* + * Two rules are checked here. + * + * 1. SEW must be at least as wide as the AMO memory element size. + * + * 2. If SEW is greater than XLEN, an illegal instruction exception is raised. + */ +static bool amo_check_th(DisasContext *s, arg_rwdvm* a) +{ + return (require_xtheadvector(s) && + !s->vill && has_ext(s, RVA) && + (!a->wd || th_check_overlap_mask(s, a->rd, a->vm, false)) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + ((1 << s->sew) <=3D (get_xlen(s) / 8)) && + ((1 << s->sew) >=3D 4)); +} + +static bool amo_check64_th(DisasContext *s, arg_rwdvm* a) +{ + REQUIRE_64BIT(s); + return amo_check_th(s, a); +} + +GEN_TH_TRANS(th_vamoswapw_v, 0, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamoaddw_v, 1, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamoxorw_v, 2, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamoandw_v, 3, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamoorw_v, 4, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamominw_v, 5, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamomaxw_v, 6, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamominuw_v, 7, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamomaxuw_v, 8, rwdvm, amo_op_th, amo_check_th) +GEN_TH_TRANS(th_vamoswapd_v, 9, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamoaddd_v, 10, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamoxord_v, 11, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamoandd_v, 12, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamoord_v, 13, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamomind_v, 14, rwdvm,
amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamomaxd_v, 15, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamominud_v, 16, rwdvm, amo_op_th, amo_check64_th) +GEN_TH_TRANS(th_vamomaxud_v, 17, rwdvm, amo_op_th, amo_check64_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vamoswapw_v) -TH_TRANS_STUB(th_vamoaddw_v) -TH_TRANS_STUB(th_vamoxorw_v) -TH_TRANS_STUB(th_vamoandw_v) -TH_TRANS_STUB(th_vamoorw_v) -TH_TRANS_STUB(th_vamominw_v) -TH_TRANS_STUB(th_vamomaxw_v) -TH_TRANS_STUB(th_vamominuw_v) -TH_TRANS_STUB(th_vamomaxuw_v) -TH_TRANS_STUB(th_vamoswapd_v) -TH_TRANS_STUB(th_vamoaddd_v) -TH_TRANS_STUB(th_vamoxord_v) -TH_TRANS_STUB(th_vamoandd_v) -TH_TRANS_STUB(th_vamoord_v) -TH_TRANS_STUB(th_vamomind_v) -TH_TRANS_STUB(th_vamomaxd_v) -TH_TRANS_STUB(th_vamominud_v) -TH_TRANS_STUB(th_vamomaxud_v) TH_TRANS_STUB(th_vadd_vv) TH_TRANS_STUB(th_vadd_vx) TH_TRANS_STUB(th_vadd_vi) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index af814688b5..1dced03ee3 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -51,6 +51,11 @@ static inline uint32_t th_lmul(uint32_t desc) return FIELD_EX32(simd_data(desc), VDATA_TH, LMUL); } =20 +static uint32_t th_wd(uint32_t desc) +{ + return (simd_data(desc) >> 11) & 0x1; +} + /* * Get vector group length in bytes. Its range is [64, 2048]. 
* @@ -661,3 +666,134 @@ GEN_TH_LDFF(th_vlhuff_v_w, uint16_t, uint32_t, ldhu_w= , clearl_th) GEN_TH_LDFF(th_vlhuff_v_d, uint16_t, uint64_t, ldhu_d, clearq_th) GEN_TH_LDFF(th_vlwuff_v_w, uint32_t, uint32_t, ldwu_w, clearl_th) GEN_TH_LDFF(th_vlwuff_v_d, uint32_t, uint64_t, ldwu_d, clearq_th) + +/* + * Vector AMO Operations (Zvamo) + */ +typedef void th_amo_noatomic_fn(void *vs3, target_ulong addr, + uint32_t wd, uint32_t idx, CPURISCVState *= env, + uintptr_t retaddr); + +#define TH_SWAP(N, M) (M) +#define TH_XOR(N, M) (N ^ M) +#define TH_OR(N, M) (N | M) +#define TH_AND(N, M) (N & M) +#define TH_ADD(N, M) (N + M) + +#define GEN_TH_AMO_NOATOMIC_OP(NAME, ESZ, MSZ, H, DO_OP, SUF) \ +static void \ +NAME##_noatomic_op(void *vs3, target_ulong addr, \ + uint32_t wd, uint32_t idx, \ + CPURISCVState *env, uintptr_t retaddr) \ +{ \ + typedef int##ESZ##_t ETYPE; \ + typedef int##MSZ##_t MTYPE; \ + typedef uint##MSZ##_t UMTYPE __attribute__((unused)); \ + ETYPE *pe3 =3D (ETYPE *)vs3 + H(idx); \ + MTYPE a =3D cpu_ld##SUF##_data(env, addr), b =3D *pe3; \ + \ + cpu_st##SUF##_data(env, addr, DO_OP(a, b)); \ + if (wd) { \ + *pe3 =3D a; \ + } \ +} + +#define TH_MAX(N, M) ((N) >=3D (M) ? (N) : (M)) +#define TH_MIN(N, M) ((N) >=3D (M) ? 
(M) : (N)) +#define TH_MAXU(N, M) TH_MAX((UMTYPE)N, (UMTYPE)M) +#define TH_MINU(N, M) TH_MIN((UMTYPE)N, (UMTYPE)M) + +GEN_TH_AMO_NOATOMIC_OP(th_vamoswapw_v_w, 32, 32, H4, TH_SWAP, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoaddw_v_w, 32, 32, H4, TH_ADD, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoxorw_v_w, 32, 32, H4, TH_XOR, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoandw_v_w, 32, 32, H4, TH_AND, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoorw_v_w, 32, 32, H4, TH_OR, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamominw_v_w, 32, 32, H4, TH_MIN, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxw_v_w, 32, 32, H4, TH_MAX, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamominuw_v_w, 32, 32, H4, TH_MINU, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxuw_v_w, 32, 32, H4, TH_MAXU, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoswapw_v_d, 64, 32, H8, TH_SWAP, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoswapd_v_d, 64, 64, H8, TH_SWAP, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamoaddw_v_d, 64, 32, H8, TH_ADD, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoaddd_v_d, 64, 64, H8, TH_ADD, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamoxorw_v_d, 64, 32, H8, TH_XOR, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoxord_v_d, 64, 64, H8, TH_XOR, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamoandw_v_d, 64, 32, H8, TH_AND, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoandd_v_d, 64, 64, H8, TH_AND, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamoorw_v_d, 64, 32, H8, TH_OR, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamoord_v_d, 64, 64, H8, TH_OR, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamominw_v_d, 64, 32, H8, TH_MIN, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamomind_v_d, 64, 64, H8, TH_MIN, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxw_v_d, 64, 32, H8, TH_MAX, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxd_v_d, 64, 64, H8, TH_MAX, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamominuw_v_d, 64, 32, H8, TH_MINU, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamominud_v_d, 64, 64, H8, TH_MINU, q) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxuw_v_d, 64, 32, H8, TH_MAXU, l) +GEN_TH_AMO_NOATOMIC_OP(th_vamomaxud_v_d, 64, 64, H8, TH_MAXU, q) + +static inline void +th_amo_noatomic(void *vs3, void *v0, target_ulong base, + void *vs2, CPURISCVState *env, uint32_t desc, + 
th_get_index_addr get_index_addr, + th_amo_noatomic_fn * noatomic_op, + clear_fn * clear_elem, + uint32_t esz, uint32_t msz, uintptr_t ra) +{ + uint32_t i; + target_long addr; + uint32_t wd =3D th_wd(desc); + uint32_t vm =3D th_vm(desc); + uint32_t mlen =3D th_mlen(desc); + uint32_t vlmax =3D th_maxsz(desc) / esz; + uint32_t vl =3D env->vl; + + VSTART_CHECK_EARLY_EXIT(env); + + for (i =3D env->vstart; i < vl; env->vstart =3D ++i) { + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + addr =3D get_index_addr(base, i, vs2); + noatomic_op(vs3, adjust_addr(env, addr), wd, i, env, ra); + } + env->vstart =3D 0; + clear_elem(vs3, env->vl, env->vl * esz, vlmax * esz); +} + +#define GEN_TH_AMO(NAME, MTYPE, ETYPE, INDEX_FN, CLEAR_FN) \ +void HELPER(NAME)(void *vs3, void *v0, target_ulong base, \ + void *vs2, CPURISCVState *env, uint32_t desc) \ +{ \ + th_amo_noatomic(vs3, v0, base, vs2, env, desc, \ + INDEX_FN, NAME##_noatomic_op, \ + CLEAR_FN, sizeof(ETYPE), sizeof(MTYPE), \ + GETPC()); \ +} + +GEN_TH_AMO(th_vamoswapw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoswapd_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoaddw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoaddd_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoxorw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoxord_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoandw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoandd_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoorw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoord_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamominw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamomind_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamomaxw_v_d, int32_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamomaxd_v_d, int64_t, int64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamominuw_v_d, uint32_t, uint64_t, idx_d, clearq_th) 
+GEN_TH_AMO(th_vamominud_v_d, uint64_t, uint64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamomaxuw_v_d, uint32_t, uint64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamomaxud_v_d, uint64_t, uint64_t, idx_d, clearq_th) +GEN_TH_AMO(th_vamoswapw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamoaddw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamoxorw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamoandw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamoorw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamominw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamomaxw_v_w, int32_t, int32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamominuw_v_w, uint32_t, uint32_t, idx_w, clearl_th) +GEN_TH_AMO(th_vamomaxuw_v_w, uint32_t, uint32_t, idx_w, clearl_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909512; cv=none; d=zohomail.com; s=zohoarc; b=Noo/wouWl6bgT1EfSlXMi0Dgi5+FShGXPHRpa/qTGB1TqzU0pgIRNQE2l5DQ1o9rkskj0+abWQn4zPwjh2++bvnpucEp3wCRCxUqAUlGedRTleHPuu4YT7XQGbU8skDTUh0ZN4y6aGy2jDXtLDFG8nb1veH+L+0egzXV3ifwecU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909512; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=y3+kOxM7NvGXOflMMRatwzobkasLbYixMzVleUxj15U=; b=ePUDWc010gfNOQ5oe2IlT0J6zEj+jk9ntwd1YSxWKQ4AvQul4gyWHi904TgNJBF2jY7OG3mCiafJGI2InwohB591Q/csbU+LD9riYze3xj1HweaOfu5cZHzbmeCeyavgCgW225c+mMkKzJKDuGMphynKxY30XKicmfurd8dM2NI= ARC-Authentication-Results: i=1; mx.zohomail.com; 
dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909512296243.29910363967247; Fri, 12 Apr 2024 01:11:52 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvC0L-0007uU-8B; Fri, 12 Apr 2024 04:11:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvC0E-0007u7-V7; Fri, 12 Apr 2024 04:11:18 -0400 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvBzy-0005jR-Af; Fri, 12 Apr 2024 04:11:18 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nbmri_1712909454) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:10:55 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712909455; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=y3+kOxM7NvGXOflMMRatwzobkasLbYixMzVleUxj15U=; b=Z1e4mUU2qjWfvasxXFQJhVJKfygI6Uq4Z5bo4f2I50bJhQaQwi3S0VUmF7LGoiYsuN/Ax2Kf9DX8jLizsElg8p02alsxNysRioE/ZBKb421UCI7nXwTGdaDiwUlq3IWaCI+gbw0P9N8joyIRR105JlsSlpn+Zw3zzUfPOMgCwnQ= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R131e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046050; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nbmri_1712909454; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 16/65] 
target/riscv: Add single-width integer add and subtract instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:46 +0800 Message-ID: <20240412073735.76413-17-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.131; envelope-from=eric.huang@linux.alibaba.com; helo=out30-131.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712909512900100001 Content-Type: text/plain; charset="utf-8" This patch adds the single-width integer add and subtract instructions th.vadd.vv/vx/vi, th.vsub.vv/vx and th.vrsub.vx/vi, and shows how the XTheadVector integer arithmetic instructions are implemented. These instructions differ from RVV1.0 in the following points: 1. Different mask register layout. For the mask bit of element i, XTheadVector locates it in bit[mlen * i], while RVV1.0 locates it in bit[i]. 2. Different tail/masked element processing policy. 
XTheadVector keeps the value of masked elements and clears the tail elements, whereas RVV1.0 uses vta and vma to select the policy: keep the value or overwrite it with 1s. 3. Different check policy. XTheadVector does not have fractional LMUL, so a simpler check function can be used. 4. XTheadVector simplifies the decision of whether to use GVEC acceleration, because it lacks fractional LMUL and vta. Signed-off-by: Huang Tao --- target/riscv/helper.h | 21 ++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 242 +++++++++++++++++- target/riscv/vector_helper.c | 4 - target/riscv/vector_internals.h | 4 + target/riscv/xtheadvector_helper.c | 153 +++++++++++ 5 files changed, 413 insertions(+), 11 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index c2a26acabc..6a7d2c0a78 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1472,3 +1472,24 @@ DEF_HELPER_6(th_vamominw_v_w, void, ptr, ptr, tl, p= tr, env, i32) DEF_HELPER_6(th_vamomaxw_v_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vamominuw_v_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vamomaxuw_v_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsub_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsub_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vadd_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vadd_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vadd_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vadd_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsub_vx_b, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsub_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vrsub_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vrsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vrsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vrsub_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 2bcd9b0832..6836e9a3b7 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -769,19 +769,247 @@ GEN_TH_TRANS(th_vamomaxd_v, 15, rwdvm, amo_op_th, am= o_check64_th) GEN_TH_TRANS(th_vamominud_v, 16, rwdvm, amo_op_th, amo_check64_th) GEN_TH_TRANS(th_vamomaxud_v, 17, rwdvm, amo_op_th, amo_check64_th) =20 +/* + * Vector Integer Arithmetic Instructions + */ + +/* + * check function + * 1) check overlap mask, XTheadVector can overlap mask reg v0 when + * lmul =3D=3D 1, while RVV1.0 can not. + * 2) check reg, XTheadVector Vector register numbers are multiples + * of integral LMUL, while RVV1.0 has fractional LMUL, which allows + * any vector register. + */ +static bool opivv_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false)); +} + +#define GVecGen3Fn_Th GVecGen3Fn +/* + * This function is almost the copy of do_opivv_gvec, except: + * 1) XTheadVector using different data encoding, add MLEN, + * delete VTA and VMA. + * 2) XTheadVector simplifies the judgment logic of whether + * to accelerate or not for its lack of fractional LMUL and + * VTA. 
+ */ +static inline bool +do_opivv_gvec_th(DisasContext *s, arg_rmrr *a, GVecGen3Fn_Th *gvec_fn, + gen_helper_gvec_4_ptr *fn) +{ + if (a->vm && s->vl_eq_vlmax) { + gvec_fn(s->sew, vreg_ofs(s, a->rd), + vreg_ofs(s, a->rs2), vreg_ofs(s, a->rs1), + MAXSZ(s), MAXSZ(s)); + } else { + uint32_t data =3D 0; + /* Need extra mlen to find the mask bit */ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), + vreg_ofs(s, a->rs1), vreg_ofs(s, a->rs2), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data, fn); + } + finalize_rvv_inst(s); + return true; +} + +/* + * OPIVV with GVEC IR + * + * GEN_OPIVV_GVEC_TRANS_TH is similar to GEN_OPIVV_GVEC_TRANS + * just change the check and do_ functions. + */ +#define GEN_OPIVV_GVEC_TRANS_TH(NAME, SUF) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] =3D { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + if (!opivv_check_th(s, a)) { \ + return false; \ + } \ + return do_opivv_gvec_th(s, a, tcg_gen_gvec_##SUF, fns[s->sew]);\ +} + +GEN_OPIVV_GVEC_TRANS_TH(th_vadd_vv, add) +GEN_OPIVV_GVEC_TRANS_TH(th_vsub_vv, sub) + +/* + * This function is almost the copy of opivx_trans, except: + * 1) XTheadVector using different data encoding, add MLEN, + * delete VTA and VMA. 
+ */ +#define gen_helper_opivx_th gen_helper_opivx +static bool opivx_trans_th(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32= _t vm, + gen_helper_opivx_th *fn, DisasContext *s) +{ + TCGv_ptr dest, src2, mask; + TCGv src1; + TCGv_i32 desc; + uint32_t data =3D 0; + + dest =3D tcg_temp_new_ptr(); + mask =3D tcg_temp_new_ptr(); + src2 =3D tcg_temp_new_ptr(); + src1 =3D get_gpr(s, rs1, EXT_SIGN); + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data)); + + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd)); + tcg_gen_addi_ptr(src2, tcg_env, vreg_ofs(s, vs2)); + tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0)); + + fn(dest, mask, src1, src2, tcg_env, desc); + + finalize_rvv_inst(s); + return true; +} + +static bool opivx_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false)); +} + +#define GVecGen2sFn_Th GVecGen2sFn +/* + * This function is almost the copy of do_opivx_gvec, except: + * 1) XTheadVector simplifies the judgment logic of whether + * to accelerate or not for its lack of fractional LMUL and + * VTA. 
+ */ +static inline bool +do_opivx_gvec_th(DisasContext *s, arg_rmrr *a, GVecGen2sFn_Th *gvec_fn, + gen_helper_opivx_th *fn) +{ + if (a->vm && s->vl_eq_vlmax) { + TCGv_i64 src1 =3D tcg_temp_new_i64(); + + tcg_gen_ext_tl_i64(src1, get_gpr(s, a->rs1, EXT_SIGN)); + gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), + src1, MAXSZ(s), MAXSZ(s)); + finalize_rvv_inst(s); + return true; + } + return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, fn, s); +} + +/* OPIVX with GVEC IR */ +#define GEN_OPIVX_GVEC_TRANS_TH(NAME, SUF) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_opivx_th * const fns[4] =3D { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + if (!opivx_check_th(s, a)) { \ + return false; \ + } \ + return do_opivx_gvec_th(s, a, tcg_gen_gvec_##SUF, fns[s->sew]);\ +} + +GEN_OPIVX_GVEC_TRANS_TH(th_vadd_vx, adds) +GEN_OPIVX_GVEC_TRANS_TH(th_vsub_vx, subs) +GEN_OPIVX_GVEC_TRANS_TH(th_vrsub_vx, rsubs) + +#define imm_mode_t_th imm_mode_t + +/* + * This function is almost the copy of opivi_trans, except: + * 1) XTheadVector using different data encoding, add MLEN, + * delete VTA and VMA. 
+ */ +static bool opivi_trans_th(uint32_t vd, uint32_t imm, uint32_t vs2, uint32= _t vm, + gen_helper_opivx_th *fn, DisasContext *s, + imm_mode_t_th imm_mode) +{ + TCGv_ptr dest, src2, mask; + TCGv src1; + TCGv_i32 desc; + uint32_t data =3D 0; + + dest =3D tcg_temp_new_ptr(); + mask =3D tcg_temp_new_ptr(); + src2 =3D tcg_temp_new_ptr(); + src1 =3D tcg_constant_tl(extract_imm(s, imm, imm_mode)); + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data)); + + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd)); + tcg_gen_addi_ptr(src2, tcg_env, vreg_ofs(s, vs2)); + tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0)); + + fn(dest, mask, src1, src2, tcg_env, desc); + + finalize_rvv_inst(s); + return true; +} +#define GVecGen2iFn_Th GVecGen2iFn +/* + * This function is almost the copy of do_opivi_gvec, except: + * 1) XTheadVector simplifies the judgment logic of whether + * to accelerate or not for its lack of fractional LMUL and + * VTA. 
+ */ +static inline bool +do_opivi_gvec_th(DisasContext *s, arg_rmrr *a, GVecGen2iFn_Th *gvec_fn, + gen_helper_opivx_th *fn, imm_mode_t_th imm_mode) +{ + if (a->vm && s->vl_eq_vlmax) { + gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), + extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s)); + finalize_rvv_inst(s); + return true; + } + return opivi_trans_th(a->rd, a->rs1, a->rs2, a->vm, fn, s, imm_mode); +} + +/* OPIVI with GVEC IR */ +#define GEN_OPIVI_GVEC_TRANS_TH(NAME, IMM_MODE, OPIVX, SUF) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_opivx_th * const fns[4] =3D { \ + gen_helper_##OPIVX##_b, gen_helper_##OPIVX##_h, \ + gen_helper_##OPIVX##_w, gen_helper_##OPIVX##_d, \ + }; \ + if (!opivx_check_th(s, a)) { \ + return false; \ + } \ + return do_opivi_gvec_th(s, a, tcg_gen_gvec_##SUF, \ + fns[s->sew], IMM_MODE); \ +} + +GEN_OPIVI_GVEC_TRANS_TH(th_vadd_vi, IMM_SX, th_vadd_vx, addi) +GEN_OPIVI_GVEC_TRANS_TH(th_vrsub_vi, IMM_SX, th_vrsub_vx, rsubi) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vadd_vv) -TH_TRANS_STUB(th_vadd_vx) -TH_TRANS_STUB(th_vadd_vi) -TH_TRANS_STUB(th_vsub_vv) -TH_TRANS_STUB(th_vsub_vx) -TH_TRANS_STUB(th_vrsub_vx) -TH_TRANS_STUB(th_vrsub_vi) TH_TRANS_STUB(th_vwaddu_vv) TH_TRANS_STUB(th_vwaddu_vx) TH_TRANS_STUB(th_vwadd_vv) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 695cb7dfec..8fb0b02976 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -647,10 +647,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) */ =20 /* (TD, T1, T2, TX1, TX2) */ -#define OP_SSS_B int8_t, int8_t, int8_t, int8_t, int8_t -#define OP_SSS_H int16_t, int16_t, int16_t, int16_t, int16_t -#define OP_SSS_W int32_t, int32_t, int32_t, int32_t, int32_t -#define OP_SSS_D int64_t, int64_t, int64_t, int64_t, int64_t #define OP_SUS_B int8_t, uint8_t, int8_t, uint8_t, int8_t 
#define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t #define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index ff10cd3806..1e118c6a17 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -138,6 +138,10 @@ void vext_set_elems_1s(void *base, uint32_t is_agnosti= c, uint32_t cnt, #define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t #define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t #define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t +#define OP_SSS_B int8_t, int8_t, int8_t, int8_t, int8_t +#define OP_SSS_H int16_t, int16_t, int16_t, int16_t, int16_t +#define OP_SSS_W int32_t, int32_t, int32_t, int32_t, int32_t +#define OP_SSS_D int64_t, int64_t, int64_t, int64_t, int64_t =20 #define OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ static void do_##NAME(void *vd, void *vs2, int i) \ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 1dced03ee3..1571d372a8 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -797,3 +797,156 @@ GEN_TH_AMO(th_vamominw_v_w, int32_t, int32_t, idx_= w, clearl_th) GEN_TH_AMO(th_vamomaxw_v_w, int32_t, int32_t, idx_w, clearl_th) GEN_TH_AMO(th_vamominuw_v_w, uint32_t, uint32_t, idx_w, clearl_th) GEN_TH_AMO(th_vamomaxuw_v_w, uint32_t, uint32_t, idx_w, clearl_th) + +/* + * Vector Integer Arithmetic Instructions + */ + +/* redefine macros to decouple */ + +#define THCALL(macro, ...) 
macro(__VA_ARGS__) + +/* operation of two vector elements */ +#define opivv2_fn_th opivv2_fn + +#define TH_OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ + OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) + +#define TH_SUB(N, M) (N - M) +#define TH_RSUB(N, M) (M - N) + +THCALL(TH_OPIVV2, th_vadd_vv_b, OP_SSS_B, H1, H1, H1, TH_ADD) +THCALL(TH_OPIVV2, th_vadd_vv_h, OP_SSS_H, H2, H2, H2, TH_ADD) +THCALL(TH_OPIVV2, th_vadd_vv_w, OP_SSS_W, H4, H4, H4, TH_ADD) +THCALL(TH_OPIVV2, th_vadd_vv_d, OP_SSS_D, H8, H8, H8, TH_ADD) +THCALL(TH_OPIVV2, th_vsub_vv_b, OP_SSS_B, H1, H1, H1, TH_SUB) +THCALL(TH_OPIVV2, th_vsub_vv_h, OP_SSS_H, H2, H2, H2, TH_SUB) +THCALL(TH_OPIVV2, th_vsub_vv_w, OP_SSS_W, H4, H4, H4, TH_SUB) +THCALL(TH_OPIVV2, th_vsub_vv_d, OP_SSS_D, H8, H8, H8, TH_SUB) + +/* + * This function is almost the copy of do_vext_vv, except: + * 1) XTheadVector has different mask layout, using th_elem_mask + * to get [MLEN*i] bit + * 2) XTheadVector using different data encoding, using th_ functions + * to parse. + * 3) XTheadVector keep the masked elements value, while RVV1.0 policy is + * determined by vma. + * 4) XTheadVector clear the tail elements, while RVV1.0 policy is to rath= er + * set all bits 1s or keep it, determined by vta. 
+ */ +static void do_th_vv(void *vd, void *v0, void *vs1, void *vs2, + CPURISCVState *env, uint32_t desc, + uint32_t esz, uint32_t dsz, + opivv2_fn_th *fn, clear_fn *clearfn) +{ + uint32_t vlmax =3D th_maxsz(desc) / esz; + uint32_t mlen =3D th_mlen(desc); + uint32_t vm =3D th_vm(desc); + uint32_t vl =3D env->vl; + uint32_t i; + + VSTART_CHECK_EARLY_EXIT(env); + + for (i =3D env->vstart; i < vl; i++) { + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + fn(vd, vs1, vs2, i); + } + env->vstart =3D 0; + clearfn(vd, vl, vl * dsz, vlmax * dsz); +} + +/* generate the helpers for OPIVV */ +#define GEN_TH_VV(NAME, ESZ, DSZ, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_th_vv(vd, v0, vs1, vs2, env, desc, ESZ, DSZ, \ + do_##NAME, CLEAR_FN); \ +} + +GEN_TH_VV(th_vadd_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vadd_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vadd_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vadd_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vsub_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vsub_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vsub_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vsub_vv_d, 8, 8, clearq_th) + +#define opivx2_fn_th opivx2_fn + +/* + * (T1)s1 gives the real operator type. + * (TX1)(T1)s1 expands the operator type of widen or narrow operations. 
+ */ +#define TH_OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ + OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) + +THCALL(TH_OPIVX2, th_vadd_vx_b, OP_SSS_B, H1, H1, TH_ADD) +THCALL(TH_OPIVX2, th_vadd_vx_h, OP_SSS_H, H2, H2, TH_ADD) +THCALL(TH_OPIVX2, th_vadd_vx_w, OP_SSS_W, H4, H4, TH_ADD) +THCALL(TH_OPIVX2, th_vadd_vx_d, OP_SSS_D, H8, H8, TH_ADD) +THCALL(TH_OPIVX2, th_vsub_vx_b, OP_SSS_B, H1, H1, TH_SUB) +THCALL(TH_OPIVX2, th_vsub_vx_h, OP_SSS_H, H2, H2, TH_SUB) +THCALL(TH_OPIVX2, th_vsub_vx_w, OP_SSS_W, H4, H4, TH_SUB) +THCALL(TH_OPIVX2, th_vsub_vx_d, OP_SSS_D, H8, H8, TH_SUB) +THCALL(TH_OPIVX2, th_vrsub_vx_b, OP_SSS_B, H1, H1, TH_RSUB) +THCALL(TH_OPIVX2, th_vrsub_vx_h, OP_SSS_H, H2, H2, TH_RSUB) +THCALL(TH_OPIVX2, th_vrsub_vx_w, OP_SSS_W, H4, H4, TH_RSUB) +THCALL(TH_OPIVX2, th_vrsub_vx_d, OP_SSS_D, H8, H8, TH_RSUB) + +/* + * This function is almost the copy of do_vext_vx, except: + * 1) different mask layout + * 2) different data encoding + * 3) different mask/tail elements process policy + */ +static void do_th_vx(void *vd, void *v0, target_long s1, void *vs2, + CPURISCVState *env, uint32_t desc, + uint32_t esz, uint32_t dsz, + opivx2_fn_th fn, clear_fn *clearfn) +{ + uint32_t vlmax =3D th_maxsz(desc) / esz; + uint32_t mlen =3D th_mlen(desc); + uint32_t vm =3D th_vm(desc); + uint32_t vl =3D env->vl; + uint32_t i; + + VSTART_CHECK_EARLY_EXIT(env); + + for (i =3D env->vstart; i < vl; i++) { + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + fn(vd, s1, vs2, i); + } + env->vstart =3D 0; + clearfn(vd, vl, vl * dsz, vlmax * dsz); +} + +/* generate the helpers for OPIVX */ +#define GEN_TH_VX(NAME, ESZ, DSZ, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_th_vx(vd, v0, s1, vs2, env, desc, ESZ, DSZ, \ + do_##NAME, CLEAR_FN); \ +} + +GEN_TH_VX(th_vadd_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vadd_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vadd_vx_w, 4, 4, clearl_th) 
+GEN_TH_VX(th_vadd_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vsub_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vsub_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vsub_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vsub_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vrsub_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vrsub_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vrsub_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vrsub_vx_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909618; cv=none; d=zohomail.com; s=zohoarc; b=QarcUst/5vkzPJSOY48N7s1G46LmoTdAPkYKanQaYE57uQwmVpJ+DPtJcSiKjhlCrGqXTMr3owi8qACSz8NIvqilB5J1yJ2AEpz9rLNLbydjTlw3kq5GH/U2BlMHIWAZpuQn6Hjc58ktY07HJUy7te6FFsuTYD//KavRV8XB0Oo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909618; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=h+T/DHk8h2koOMR/b8RpyQp8Ruu0bssJzW9yKOwKLNA=; b=JGwSK76KYXOXVeZSL8klgzA44gvKdUJrh/0I36Z9MrxPDV3H5sSPR41MbO6WLIjdMp4uHfcThbbN4OKCjwd87goExID7iDuXWzKkb80xnwYzV3Mh0/z8bK7ySrPYeqZ4OPjD/ggtMfFp38XZhy92feKfJHeg2iCj4ghg7DkuzHY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909618419281.8178904711011; Fri, 12 Apr 2024 01:13:38 -0700 (PDT) Received: from localhost 
([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvC2A-0000OC-3d; Fri, 12 Apr 2024 04:13:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvC20-0000NX-1a; Fri, 12 Apr 2024 04:13:08 -0400 Received: from out30-111.freemail.mail.aliyun.com ([115.124.30.111]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvC1u-00067y-V3; Fri, 12 Apr 2024 04:13:07 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NYa1k_1712909575) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:12:56 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712909577; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=h+T/DHk8h2koOMR/b8RpyQp8Ruu0bssJzW9yKOwKLNA=; b=V/tpTay+gK930eKiXkA3kwBxoH/neLs//uFQ/uTlLZhYPFAEFKP77E5fTqR3JgnrzTIEpQ3+7M++dOPUAyU2PjF9OKvf8aR2y5y6vtSMDqecPqKrR236MRanEo//PDFa6C2hk1tFjjqpRI1R355J0+KkuzYACLbZLW3IFgB55Dc= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R131e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046051; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NYa1k_1712909575; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 17/65] target/riscv: Add widening integer add/subtract instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:47 +0800 Message-ID: <20240412073735.76413-18-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.111; envelope-from=eric.huang@linux.alibaba.com; helo=out30-111.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712909618980100001 Content-Type: text/plain; charset="utf-8" This patch reuses many of the functions introduced for the single-width operations, except do_opivv_gvec_th. do_opivv_widen_th does not call do_opivv_gvec_th because widening operations cannot use GVEC to accelerate the vector operations. The differences between XTheadVector and RVV1.0 are the same as those described in the single-width operation patch. 
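For illustration only, here is a minimal standalone sketch of the behaviour a masked widening add is expected to have under the XTheadVector rules referred to above: the mask bit of element i is read at bit mlen * i, masked-off elements keep their old value, and tail elements are cleared. It is not the helper code from this series. The names sketch_elem_mask and sketch_vwadd_vv_b are hypothetical, and the byte-wise mask access is a simplification that assumes a little-endian mask layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for th_elem_mask(): the mask bit of element i
 * is read from bit (mlen * i) of v0. RVV1.0 would read bit i instead.
 */
static int sketch_elem_mask(const uint8_t *v0, uint32_t mlen, uint32_t i)
{
    uint32_t bit = mlen * i;

    return (v0[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Sketch of a masked widening signed add, SEW = 8 -> 16 bits.
 * vl is the active length, vlmax the capacity of the destination.
 */
static void sketch_vwadd_vv_b(int16_t *vd, const uint8_t *v0,
                              const int8_t *vs1, const int8_t *vs2,
                              uint32_t vm, uint32_t mlen,
                              uint32_t vl, uint32_t vlmax)
{
    uint32_t i;

    assert(vl <= vlmax);

    for (i = 0; i < vl; i++) {
        if (!vm && !sketch_elem_mask(v0, mlen, i)) {
            continue;               /* masked-off element keeps its value */
        }
        /* both sources are sign-extended to 16 bits before the add */
        vd[i] = (int16_t)((int16_t)vs2[i] + (int16_t)vs1[i]);
    }
    for (i = vl; i < vlmax; i++) {
        vd[i] = 0;                  /* tail elements are cleared */
    }
}
```

Calling the sketch with vl = 4, vlmax = 8, mlen = 8, vm = 0, only elements 0 and 2 enabled in v0, both sources holding 100 in those elements and a destination prefilled with -1 leaves 200 in elements 0 and 2 (no 8-bit wrap-around), -1 in elements 1 and 3, and 0 in elements 4 to 7.
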
Signed-off-by: Huang Tao --- target/riscv/helper.h | 49 +++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 197 ++++++++++++++++-- target/riscv/vector_helper.c | 15 -- target/riscv/vector_internals.h | 9 + target/riscv/xtheadvector_helper.c | 100 +++++++++ 5 files changed, 339 insertions(+), 31 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 6a7d2c0a78..3906f17079 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1493,3 +1493,52 @@ DEF_HELPER_6(th_vrsub_vx_b, void, ptr, ptr, tl, ptr,= env, i32) DEF_HELPER_6(th_vrsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vrsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vrsub_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vwaddu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwadd_vx_b, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vwadd_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwadd_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwaddu_wx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsubu_wx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwadd_wx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwsub_wx_w, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc 
index 6836e9a3b7..f6aea9deff 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1004,28 +1004,193 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr= *a) \ GEN_OPIVI_GVEC_TRANS_TH(th_vadd_vi, IMM_SX, th_vadd_vx, addi) GEN_OPIVI_GVEC_TRANS_TH(th_vrsub_vi, IMM_SX, th_vrsub_vx, rsubi) =20 +/* Vector Widening Integer Add/Subtract */ + +/* OPIVV with WIDEN */ +static bool opivv_widen_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2, + 1 << s->lmul) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs1, + 1 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + +/* + * This function is almost the copy of do_opivv_widen, except: + * 1) XTheadVector using different data encoding, add MLEN, + * delete VTA and VMA. 
+ */ +static bool do_opivv_widen_th(DisasContext *s, arg_rmrr *a, + gen_helper_gvec_4_ptr *fn, + bool (*checkfn)(DisasContext *, arg_rmrr *)) +{ + if (checkfn(s, a)) { + uint32_t data =3D 0; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), + vreg_ofs(s, a->rs1), + vreg_ofs(s, a->rs2), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, + data, fn); + finalize_rvv_inst(s); + return true; + } + return false; +} + +#define GEN_OPIVV_WIDEN_TRANS_TH(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w \ + }; \ + return do_opivv_widen_th(s, a, fns[s->sew], CHECK); \ +} + +GEN_OPIVV_WIDEN_TRANS_TH(th_vwaddu_vv, opivv_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwadd_vv, opivv_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwsubu_vv, opivv_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwsub_vv, opivv_widen_check_th) + +/* OPIVX with WIDEN */ +static bool opivx_widen_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, false) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2, + 1 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + +#define GEN_OPIVX_WIDEN_TRANS_TH(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx_th * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w \ + }; \ + return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, \ + fns[s->sew], s); \ + } \ + return false; \ +} + +GEN_OPIVX_WIDEN_TRANS_TH(th_vwaddu_vx, opivx_widen_check_th) 
+GEN_OPIVX_WIDEN_TRANS_TH(th_vwadd_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwsubu_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwsub_vx, opivx_widen_check_th) + +/* WIDEN OPIVV with WIDEN */ +static bool opiwv_widen_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, true) && + th_check_reg(s, a->rs1, false) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs1, + 1 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + +static bool do_opiwv_widen_th(DisasContext *s, arg_rmrr *a, + gen_helper_gvec_4_ptr *fn) +{ + if (opiwv_widen_check_th(s, a)) { + uint32_t data =3D 0; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), + vreg_ofs(s, a->rs1), + vreg_ofs(s, a->rs2), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data, fn); + finalize_rvv_inst(s); + return true; + } + return false; +} + +#define GEN_OPIWV_WIDEN_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w \ + }; \ + return do_opiwv_widen_th(s, a, fns[s->sew]); \ +} + +GEN_OPIWV_WIDEN_TRANS_TH(th_vwaddu_wv) +GEN_OPIWV_WIDEN_TRANS_TH(th_vwadd_wv) +GEN_OPIWV_WIDEN_TRANS_TH(th_vwsubu_wv) +GEN_OPIWV_WIDEN_TRANS_TH(th_vwsub_wv) + +/* WIDEN OPIVX with WIDEN */ +static bool opiwx_widen_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, true) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + + +static bool do_opiwx_widen_th(DisasContext *s, 
arg_rmrr *a, + gen_helper_opivx_th *fn) +{ + if (opiwx_widen_check_th(s, a)) { + return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, fn, s); + } + return false; +} + +#define GEN_OPIWX_WIDEN_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + static gen_helper_opivx_th * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w \ + }; \ + return do_opiwx_widen_th(s, a, fns[s->sew]); \ +} + +GEN_OPIWX_WIDEN_TRANS_TH(th_vwaddu_wx) +GEN_OPIWX_WIDEN_TRANS_TH(th_vwadd_wx) +GEN_OPIWX_WIDEN_TRANS_TH(th_vwsubu_wx) +GEN_OPIWX_WIDEN_TRANS_TH(th_vwsub_wx) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vwaddu_vv) -TH_TRANS_STUB(th_vwaddu_vx) -TH_TRANS_STUB(th_vwadd_vv) -TH_TRANS_STUB(th_vwadd_vx) -TH_TRANS_STUB(th_vwsubu_vv) -TH_TRANS_STUB(th_vwsubu_vx) -TH_TRANS_STUB(th_vwsub_vv) -TH_TRANS_STUB(th_vwsub_vx) -TH_TRANS_STUB(th_vwaddu_wv) -TH_TRANS_STUB(th_vwaddu_wx) -TH_TRANS_STUB(th_vwadd_wv) -TH_TRANS_STUB(th_vwadd_wx) -TH_TRANS_STUB(th_vwsubu_wv) -TH_TRANS_STUB(th_vwsubu_wx) -TH_TRANS_STUB(th_vwsub_wv) -TH_TRANS_STUB(th_vwsub_wx) TH_TRANS_STUB(th_vadc_vvm) TH_TRANS_STUB(th_vadc_vxm) TH_TRANS_STUB(th_vadc_vim) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 8fb0b02976..9774fc62c3 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -651,9 +651,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) #define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t #define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t #define OP_SUS_D int64_t, uint64_t, int64_t, uint64_t, int64_t -#define WOP_SSS_B int16_t, int8_t, int8_t, int16_t, int16_t -#define WOP_SSS_H int32_t, int16_t, int16_t, int32_t, int32_t -#define WOP_SSS_W int64_t, int32_t, int32_t, int64_t, int64_t #define WOP_SUS_B int16_t, uint8_t, int8_t, uint16_t, int16_t #define WOP_SUS_H int32_t, 
uint16_t, int16_t, uint32_t, int32_t #define WOP_SUS_W int64_t, uint32_t, int32_t, uint64_t, int64_t @@ -756,18 +753,6 @@ void HELPER(vec_rsubs64)(void *d, void *a, uint64_t b,= uint32_t desc) } =20 /* Vector Widening Integer Add/Subtract */ -#define WOP_UUU_B uint16_t, uint8_t, uint8_t, uint16_t, uint16_t -#define WOP_UUU_H uint32_t, uint16_t, uint16_t, uint32_t, uint32_t -#define WOP_UUU_W uint64_t, uint32_t, uint32_t, uint64_t, uint64_t -#define WOP_SSS_B int16_t, int8_t, int8_t, int16_t, int16_t -#define WOP_SSS_H int32_t, int16_t, int16_t, int32_t, int32_t -#define WOP_SSS_W int64_t, int32_t, int32_t, int64_t, int64_t -#define WOP_WUUU_B uint16_t, uint8_t, uint16_t, uint16_t, uint16_t -#define WOP_WUUU_H uint32_t, uint16_t, uint32_t, uint32_t, uint32_t -#define WOP_WUUU_W uint64_t, uint32_t, uint64_t, uint64_t, uint64_t -#define WOP_WSSS_B int16_t, int8_t, int16_t, int16_t, int16_t -#define WOP_WSSS_H int32_t, int16_t, int32_t, int32_t, int32_t -#define WOP_WSSS_W int64_t, int32_t, int64_t, int64_t, int64_t RVVCALL(OPIVV2, vwaddu_vv_b, WOP_UUU_B, H2, H1, H1, DO_ADD) RVVCALL(OPIVV2, vwaddu_vv_h, WOP_UUU_H, H4, H2, H2, DO_ADD) RVVCALL(OPIVV2, vwaddu_vv_w, WOP_UUU_W, H8, H4, H4, DO_ADD) diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 1e118c6a17..24e64c37d4 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -236,6 +236,15 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,= \ #define WOP_UUU_B uint16_t, uint8_t, uint8_t, uint16_t, uint16_t #define WOP_UUU_H uint32_t, uint16_t, uint16_t, uint32_t, uint32_t #define WOP_UUU_W uint64_t, uint32_t, uint32_t, uint64_t, uint64_t +#define WOP_SSS_B int16_t, int8_t, int8_t, int16_t, int16_t +#define WOP_SSS_H int32_t, int16_t, int16_t, int32_t, int32_t +#define WOP_SSS_W int64_t, int32_t, int32_t, int64_t, int64_t +#define WOP_WUUU_B uint16_t, uint8_t, uint16_t, uint16_t, uint16_t +#define WOP_WUUU_H uint32_t, uint16_t, uint32_t, uint32_t, 
uint32_t +#define WOP_WUUU_W uint64_t, uint32_t, uint64_t, uint64_t, uint64_t +#define WOP_WSSS_B int16_t, int8_t, int16_t, int16_t, int16_t +#define WOP_WSSS_H int32_t, int16_t, int32_t, int32_t, int32_t +#define WOP_WSSS_W int64_t, int32_t, int64_t, int64_t, int64_t =20 /* share functions */ static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 1571d372a8..5ebdb5a375 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -950,3 +950,103 @@ GEN_TH_VX(th_vrsub_vx_b, 1, 1, clearb_th) GEN_TH_VX(th_vrsub_vx_h, 2, 2, clearh_th) GEN_TH_VX(th_vrsub_vx_w, 4, 4, clearl_th) GEN_TH_VX(th_vrsub_vx_d, 8, 8, clearq_th) + +/* Vector Widening Integer Add/Subtract */ + +THCALL(TH_OPIVV2, th_vwaddu_vv_b, WOP_UUU_B, H2, H1, H1, TH_ADD) +THCALL(TH_OPIVV2, th_vwaddu_vv_h, WOP_UUU_H, H4, H2, H2, TH_ADD) +THCALL(TH_OPIVV2, th_vwaddu_vv_w, WOP_UUU_W, H8, H4, H4, TH_ADD) +THCALL(TH_OPIVV2, th_vwsubu_vv_b, WOP_UUU_B, H2, H1, H1, TH_SUB) +THCALL(TH_OPIVV2, th_vwsubu_vv_h, WOP_UUU_H, H4, H2, H2, TH_SUB) +THCALL(TH_OPIVV2, th_vwsubu_vv_w, WOP_UUU_W, H8, H4, H4, TH_SUB) +THCALL(TH_OPIVV2, th_vwadd_vv_b, WOP_SSS_B, H2, H1, H1, TH_ADD) +THCALL(TH_OPIVV2, th_vwadd_vv_h, WOP_SSS_H, H4, H2, H2, TH_ADD) +THCALL(TH_OPIVV2, th_vwadd_vv_w, WOP_SSS_W, H8, H4, H4, TH_ADD) +THCALL(TH_OPIVV2, th_vwsub_vv_b, WOP_SSS_B, H2, H1, H1, TH_SUB) +THCALL(TH_OPIVV2, th_vwsub_vv_h, WOP_SSS_H, H4, H2, H2, TH_SUB) +THCALL(TH_OPIVV2, th_vwsub_vv_w, WOP_SSS_W, H8, H4, H4, TH_SUB) +THCALL(TH_OPIVV2, th_vwaddu_wv_b, WOP_WUUU_B, H2, H1, H1, TH_ADD) +THCALL(TH_OPIVV2, th_vwaddu_wv_h, WOP_WUUU_H, H4, H2, H2, TH_ADD) +THCALL(TH_OPIVV2, th_vwaddu_wv_w, WOP_WUUU_W, H8, H4, H4, TH_ADD) +THCALL(TH_OPIVV2, th_vwsubu_wv_b, WOP_WUUU_B, H2, H1, H1, TH_SUB) +THCALL(TH_OPIVV2, th_vwsubu_wv_h, WOP_WUUU_H, H4, H2, H2, TH_SUB) +THCALL(TH_OPIVV2, th_vwsubu_wv_w, WOP_WUUU_W, H8, H4, H4, 
TH_SUB) +THCALL(TH_OPIVV2, th_vwadd_wv_b, WOP_WSSS_B, H2, H1, H1, TH_ADD) +THCALL(TH_OPIVV2, th_vwadd_wv_h, WOP_WSSS_H, H4, H2, H2, TH_ADD) +THCALL(TH_OPIVV2, th_vwadd_wv_w, WOP_WSSS_W, H8, H4, H4, TH_ADD) +THCALL(TH_OPIVV2, th_vwsub_wv_b, WOP_WSSS_B, H2, H1, H1, TH_SUB) +THCALL(TH_OPIVV2, th_vwsub_wv_h, WOP_WSSS_H, H4, H2, H2, TH_SUB) +THCALL(TH_OPIVV2, th_vwsub_wv_w, WOP_WSSS_W, H8, H4, H4, TH_SUB) +GEN_TH_VV(th_vwaddu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwaddu_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwaddu_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwsubu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwsubu_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwsubu_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwadd_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwadd_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwadd_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwsub_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwsub_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwsub_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwaddu_wv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwaddu_wv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwaddu_wv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwsubu_wv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwsubu_wv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwsubu_wv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwadd_wv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwadd_wv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwadd_wv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwsub_wv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwsub_wv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwsub_wv_w, 4, 8, clearq_th) + +THCALL(TH_OPIVX2, th_vwaddu_vx_b, WOP_UUU_B, H2, H1, TH_ADD) +THCALL(TH_OPIVX2, th_vwaddu_vx_h, WOP_UUU_H, H4, H2, TH_ADD) +THCALL(TH_OPIVX2, th_vwaddu_vx_w, WOP_UUU_W, H8, H4, TH_ADD) +THCALL(TH_OPIVX2, th_vwsubu_vx_b, WOP_UUU_B, H2, H1, TH_SUB) +THCALL(TH_OPIVX2, th_vwsubu_vx_h, WOP_UUU_H, H4, H2, TH_SUB) +THCALL(TH_OPIVX2, th_vwsubu_vx_w, WOP_UUU_W, H8, H4, TH_SUB) +THCALL(TH_OPIVX2, th_vwadd_vx_b, WOP_SSS_B, H2, H1, TH_ADD) +THCALL(TH_OPIVX2, th_vwadd_vx_h, WOP_SSS_H, H4, H2, TH_ADD) +THCALL(TH_OPIVX2, th_vwadd_vx_w, WOP_SSS_W, H8, H4, 
TH_ADD)
+THCALL(TH_OPIVX2, th_vwsub_vx_b, WOP_SSS_B, H2, H1, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsub_vx_h, WOP_SSS_H, H4, H2, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsub_vx_w, WOP_SSS_W, H8, H4, TH_SUB)
+THCALL(TH_OPIVX2, th_vwaddu_wx_b, WOP_WUUU_B, H2, H1, TH_ADD)
+THCALL(TH_OPIVX2, th_vwaddu_wx_h, WOP_WUUU_H, H4, H2, TH_ADD)
+THCALL(TH_OPIVX2, th_vwaddu_wx_w, WOP_WUUU_W, H8, H4, TH_ADD)
+THCALL(TH_OPIVX2, th_vwsubu_wx_b, WOP_WUUU_B, H2, H1, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsubu_wx_h, WOP_WUUU_H, H4, H2, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsubu_wx_w, WOP_WUUU_W, H8, H4, TH_SUB)
+THCALL(TH_OPIVX2, th_vwadd_wx_b, WOP_WSSS_B, H2, H1, TH_ADD)
+THCALL(TH_OPIVX2, th_vwadd_wx_h, WOP_WSSS_H, H4, H2, TH_ADD)
+THCALL(TH_OPIVX2, th_vwadd_wx_w, WOP_WSSS_W, H8, H4, TH_ADD)
+THCALL(TH_OPIVX2, th_vwsub_wx_b, WOP_WSSS_B, H2, H1, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsub_wx_h, WOP_WSSS_H, H4, H2, TH_SUB)
+THCALL(TH_OPIVX2, th_vwsub_wx_w, WOP_WSSS_W, H8, H4, TH_SUB)
+GEN_TH_VX(th_vwaddu_vx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwaddu_vx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwaddu_vx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwsubu_vx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwsubu_vx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwsubu_vx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwadd_vx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwadd_vx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwadd_vx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwsub_vx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwsub_vx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwsub_vx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwaddu_wx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwaddu_wx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwaddu_wx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwsubu_wx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwsubu_wx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwsubu_wx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwadd_wx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwadd_wx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwadd_wx_w, 4, 8, clearq_th)
+GEN_TH_VX(th_vwsub_wx_b, 1, 2, clearh_th)
+GEN_TH_VX(th_vwsub_wx_h, 2, 4, clearl_th)
+GEN_TH_VX(th_vwsub_wx_w, 4, 8, clearq_th)
--
2.44.0
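
For readers following the widening helpers in the patch above, here is a
minimal C sketch of what a th_vwaddu_vv_b or th_vwsub_vv_b style operation
computes per element. It is not the code that THCALL/GEN_TH_VV expand to:
the function names widen_addu_u8 and widen_sub_s8 are hypothetical, masking
and vstart handling are left out, and zeroing of the tail elements is an
assumption drawn from the clearh_th argument rather than something shown in
this part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a th_vwaddu_vv_b-style helper: each pair of
 * 8-bit sources is zero-extended and added into a 16-bit destination
 * (the WOP_UUU_B typing: uint16_t <- uint8_t, uint8_t).
 */
static void widen_addu_u8(uint16_t *vd, const uint8_t *vs2,
                          const uint8_t *vs1, size_t vl, size_t vlmax)
{
    assert(vl <= vlmax);
    for (size_t i = 0; i < vl; i++) {
        /* Widen before adding so that 255 + 255 does not wrap at 8 bits. */
        vd[i] = (uint16_t)((uint16_t)vs2[i] + (uint16_t)vs1[i]);
    }
    /* Assumed tail policy: zero elements vl..vlmax-1 (cf. clearh_th). */
    memset(vd + vl, 0, (vlmax - vl) * sizeof(*vd));
}

/*
 * Signed counterpart following the WOP_SSS_B typing
 * (int16_t <- int8_t, int8_t): sign-extend, then subtract vs1 from vs2.
 */
static void widen_sub_s8(int16_t *vd, const int8_t *vs2,
                         const int8_t *vs1, size_t vl, size_t vlmax)
{
    assert(vl <= vlmax);
    for (size_t i = 0; i < vl; i++) {
        vd[i] = (int16_t)((int16_t)vs2[i] - (int16_t)vs1[i]);
    }
    memset(vd + vl, 0, (vlmax - vl) * sizeof(*vd));
}
```

The destination element is twice the source width, which is why the
GEN_TH_VV/GEN_TH_VX invocations pass a (1, 2), (2, 4) or (4, 8) size pair and
why the check functions reject SEW and LMUL values that cannot be doubled.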
From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712909783; cv=none; d=zohomail.com; s=zohoarc; b=lDVJrneKGw7CKVGHjtD5Do1lO9k87Q8WYwOV5SHzW58Q/xdCZ70XLNSWGbSHDRo20NUM/EsXdcARClMlDAI7uFAZqJ8wKAi8yrmxkQ9Wu4L6tsEXx9GXCEG2qpvnUwhULwPxg2yRSm/TrPHQcez4QhEmRJQrqcsWwBeHs8zbkv4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712909783; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=qlt/5Mp05T6u3nwzd3/bnHiRuWYLiQjE9zreh2l8OK0=; b=VM0LXsHOTLJ2oPSdrd6q3j2TdUFVjpApe4n8NmxZ5gYeO5vAT1Az37lNvvGQuCEfX8M/IhleyDE3KoF4Bw/oJbJMxhtLP/83ekp1HHDoIEu+cnV+CcLlg9bFkd2VEjeXhn8aYpX/UeEelAE+Q4ivUOFIq0ZwJUtWk9ShRCZ2C/4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712909783549349.32975399207214; Fri, 12 Apr 2024 01:16:23 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvC4H-0001qI-Jp; Fri, 12 Apr 2024 04:15:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvC43-0001nD-VF; Fri, 12 Apr 2024 04:15:18 -0400 Received: from out30-130.freemail.mail.aliyun.com 
([115.124.30.130]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvC3w-0006U9-KJ; Fri, 12 Apr 2024 04:15:15 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NYaj5_1712909697) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:14:58 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712909698; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=qlt/5Mp05T6u3nwzd3/bnHiRuWYLiQjE9zreh2l8OK0=; b=fSP+IkDeh1AVrW5xWxfpxiCy9X6QAywrp3ra5RaXdbiVC2aPRkqwYEn8itnebqNW8VuLvLG9HoAdshNqm5Gtui2WT5H0lPvUQWBAmuGQC8yCCHLwT7xfzYSLL2ZLcYjT1Yhmmte9fsN38rdu+8ld8ByfirMeZ2qUQghxys7PpQg= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R191e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045170; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NYaj5_1712909697; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 18/65] target/riscv: Add integer add-with-carry/sub-with-borrow instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:48 +0800 Message-ID: <20240412073735.76413-19-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.130; envelope-from=eric.huang@linux.alibaba.com; helo=out30-130.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: 
----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712909785442100001 Content-Type: text/plain; charset="utf-8"

The XTheadVector adc/sbc instructions differ from RVV1.0 in the following
points:
1. Different mask register layout.
2. Different tail element processing policy.
3. Different check policy.
4. When vm = 1, RVV1.0 vmadc and vmsbc perform the computation without
   carry-in/borrow-in. XTheadVector has no such case.
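
Point 4 can be illustrated with a small C sketch. It assumes 8-bit elements
and uses hypothetical function names (madc_u8, msbc_u8 and the two *_ref
cross-checks); the expressions mirror the TH_MADC and TH_MSBC macros added by
this patch. The carry-in/borrow-in argument is always consumed, matching the
helpers in this patch, which read it from v0 for every element.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-out of n + m + carry_in on 8-bit elements (cf. TH_MADC). */
static inline uint8_t madc_u8(uint8_t n, uint8_t m, uint8_t carry_in)
{
    /* The sum is truncated to 8 bits; a wrapped result signals a carry. */
    return carry_in ? (uint8_t)(n + m + 1) <= n
                    : (uint8_t)(n + m) < n;
}

/* Borrow-out of n - m - borrow_in on 8-bit elements (cf. TH_MSBC). */
static inline uint8_t msbc_u8(uint8_t n, uint8_t m, uint8_t borrow_in)
{
    return borrow_in ? n <= m : n < m;
}

/* Reference versions using wider arithmetic, for cross-checking only. */
static inline uint8_t madc_u8_ref(uint8_t n, uint8_t m, uint8_t carry_in)
{
    return ((unsigned)n + (unsigned)m + (carry_in & 1u)) > 0xffu;
}

static inline uint8_t msbc_u8_ref(uint8_t n, uint8_t m, uint8_t borrow_in)
{
    return ((int)n - (int)m - (int)(borrow_in & 1u)) < 0;
}
```

Comparing the truncated sum against one operand avoids needing a wider type,
which matters for the 64-bit element variants where no wider integer type is
readily available.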
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  33 ++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 139 +++++++++++++-
 target/riscv/xtheadvector_helper.c            | 173 ++++++++++++++++++
 3 files changed, 335 insertions(+), 10 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 3906f17079..25fb8f81c7 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1542,3 +1542,36 @@ DEF_HELPER_6(th_vwadd_wx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsub_wx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsub_wx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsub_wx_w, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vadc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vxm_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vxm_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vadc_vxm_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vxm_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vxm_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsbc_vxm_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vxm_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vxm_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmadc_vxm_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vxm_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vxm_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmsbc_vxm_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index f6aea9deff..a9e20a6dcb 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1185,22 +1185,141 @@ GEN_OPIWX_WIDEN_TRANS_TH(th_vwadd_wx)
 GEN_OPIWX_WIDEN_TRANS_TH(th_vwsubu_wx)
 GEN_OPIWX_WIDEN_TRANS_TH(th_vwsub_wx)
 
+/* Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions */
+
+/*
+ * This function is almost the copy of opivv_trans, except:
+ * 1) XTheadVector using different data encoding, add MLEN,
+ *    delete VTA and VMA.
+ */ +static bool opivv_trans_th(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32= _t vm, + gen_helper_gvec_4_ptr *fn, DisasContext *s) +{ + uint32_t data =3D 0; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + tcg_gen_gvec_4_ptr(vreg_ofs(s, vd), vreg_ofs(s, 0), + vreg_ofs(s, vs1), vreg_ofs(s, vs2), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data, fn); + finalize_rvv_inst(s); + return true; +} + +/* OPIVV without GVEC IR */ +#define GEN_OPIVV_TRANS_TH(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_gvec_4_ptr * const fns[4] =3D { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + return opivv_trans_th(a->rd, a->rs1, a->rs2, a->vm, \ + fns[s->sew], s); \ + } \ + return false; \ +} + +/* + * For vadc and vsbc, an illegal instruction exception is raised if the + * destination vector register is v0 and LMUL > 1. + */ +static bool opivv_vadc_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false) && + ((a->rd !=3D 0) || (s->lmul =3D=3D 0))); +} + +GEN_OPIVV_TRANS_TH(th_vadc_vvm, opivv_vadc_check_th) +GEN_OPIVV_TRANS_TH(th_vsbc_vvm, opivv_vadc_check_th) + +/* + * For vmadc and vmsbc, an illegal instruction exception is raised if the + * destination vector register overlaps a source vector register group. 
+ */ +static bool opivv_vmadc_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false) && + th_check_overlap_group(a->rd, 1, a->rs1, 1 << s->lmul) && + th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)); +} + +GEN_OPIVV_TRANS_TH(th_vmadc_vvm, opivv_vmadc_check_th) +GEN_OPIVV_TRANS_TH(th_vmsbc_vvm, opivv_vmadc_check_th) + +static bool opivx_vadc_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + ((a->rd !=3D 0) || (s->lmul =3D=3D 0))); +} + +/* OPIVX without GVEC IR */ +#define GEN_OPIVX_TRANS_TH(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx * const fns[4] =3D { = \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + \ + return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, \ + fns[s->sew], s); \ + } \ + return false; \ +} + +GEN_OPIVX_TRANS_TH(th_vadc_vxm, opivx_vadc_check_th) +GEN_OPIVX_TRANS_TH(th_vsbc_vxm, opivx_vadc_check_th) + +static bool opivx_vmadc_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)); +} + +GEN_OPIVX_TRANS_TH(th_vmadc_vxm, opivx_vmadc_check_th) +GEN_OPIVX_TRANS_TH(th_vmsbc_vxm, opivx_vmadc_check_th) + +/* OPIVI without GVEC IR */ +#define GEN_OPIVI_TRANS_TH(NAME, ZX, OPIVX, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx * const fns[4] =3D { = \ + gen_helper_##OPIVX##_b, gen_helper_##OPIVX##_h, \ + gen_helper_##OPIVX##_w, gen_helper_##OPIVX##_d, \ + }; \ + return opivi_trans_th(a->rd, a->rs1, a->rs2, a->vm, \ + fns[s->sew], s, ZX); \ + } \ + 
return false; \ +} + +GEN_OPIVI_TRANS_TH(th_vadc_vim, IMM_SX, th_vadc_vxm, opivx_vadc_check_th) +GEN_OPIVI_TRANS_TH(th_vmadc_vim, IMM_SX, th_vmadc_vxm, opivx_vmadc_check_t= h) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vadc_vvm) -TH_TRANS_STUB(th_vadc_vxm) -TH_TRANS_STUB(th_vadc_vim) -TH_TRANS_STUB(th_vmadc_vvm) -TH_TRANS_STUB(th_vmadc_vxm) -TH_TRANS_STUB(th_vmadc_vim) -TH_TRANS_STUB(th_vsbc_vvm) -TH_TRANS_STUB(th_vsbc_vxm) -TH_TRANS_STUB(th_vmsbc_vvm) -TH_TRANS_STUB(th_vmsbc_vxm) TH_TRANS_STUB(th_vand_vv) TH_TRANS_STUB(th_vand_vx) TH_TRANS_STUB(th_vand_vi) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 5ebdb5a375..e5058d09f6 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -134,6 +134,15 @@ static inline int th_elem_mask(void *v0, int mlen, int= index) return (((uint64_t *)v0)[idx] >> pos) & 1; } =20 +static inline void th_set_elem_mask(void *v0, int mlen, int index, + uint8_t value) +{ + int idx =3D (index * mlen) / 64; + int pos =3D (index * mlen) % 64; + uint64_t old =3D ((uint64_t *)v0)[idx]; + ((uint64_t *)v0)[idx] =3D deposit64(old, pos, mlen, value); +} + /* elements operations for load and store */ typedef void th_ldst_elem_fn(CPURISCVState *env, abi_ptr addr, uint32_t idx, void *vd, uintptr_t retaddr); @@ -1050,3 +1059,167 @@ GEN_TH_VX(th_vwadd_wx_w, 4, 8, clearq_th) GEN_TH_VX(th_vwsub_wx_b, 1, 2, clearh_th) GEN_TH_VX(th_vwsub_wx_h, 2, 4, clearl_th) GEN_TH_VX(th_vwsub_wx_w, 4, 8, clearq_th) + +/* Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions */ + +#define TH_VADC(N, M, C) (N + M + C) +#define TH_VSBC(N, M, C) (N - M - C) +/* + * This function is almost the copy of GEN_VEXT_VADC_VVM, except: + * 1) different mask layout + * 2) different data encoding + * 3) different tail elements process policy + */ +#define GEN_TH_VADC_VVM(NAME, ETYPE, H, 
DO_OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s1 =3D *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + uint8_t carry =3D th_elem_mask(v0, mlen, i); \ + \ + *((ETYPE *)vd + H(i)) =3D DO_OP(s2, s1, carry); \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_TH_VADC_VVM(th_vadc_vvm_b, uint8_t, H1, TH_VADC, clearb_th) +GEN_TH_VADC_VVM(th_vadc_vvm_h, uint16_t, H2, TH_VADC, clearh_th) +GEN_TH_VADC_VVM(th_vadc_vvm_w, uint32_t, H4, TH_VADC, clearl_th) +GEN_TH_VADC_VVM(th_vadc_vvm_d, uint64_t, H8, TH_VADC, clearq_th) + +GEN_TH_VADC_VVM(th_vsbc_vvm_b, uint8_t, H1, TH_VSBC, clearb_th) +GEN_TH_VADC_VVM(th_vsbc_vvm_h, uint16_t, H2, TH_VSBC, clearh_th) +GEN_TH_VADC_VVM(th_vsbc_vvm_w, uint32_t, H4, TH_VSBC, clearl_th) +GEN_TH_VADC_VVM(th_vsbc_vvm_d, uint64_t, H8, TH_VSBC, clearq_th) +/* + * This function is almost the copy of GEN_VEXT_VADC_VXM, except: + * 1) different mask layout + * 2) different data encoding + * 3) different tail elements process policy + */ +#define GEN_TH_VADC_VXM(NAME, ETYPE, H, DO_OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vl =3D env->vl; = \ + uint32_t esz =3D sizeof(ETYPE); = \ + uint32_t vlmax =3D th_maxsz(desc) / esz; = \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { = \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); = \ + uint8_t carry =3D th_elem_mask(v0, mlen, i); = \ + \ + *((ETYPE *)vd + H(i)) =3D DO_OP(s2, (ETYPE)(target_long)s1, carry)= ;\ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * 
esz, vlmax * esz); \ +} + +GEN_TH_VADC_VXM(th_vadc_vxm_b, uint8_t, H1, TH_VADC, clearb_th) +GEN_TH_VADC_VXM(th_vadc_vxm_h, uint16_t, H2, TH_VADC, clearh_th) +GEN_TH_VADC_VXM(th_vadc_vxm_w, uint32_t, H4, TH_VADC, clearl_th) +GEN_TH_VADC_VXM(th_vadc_vxm_d, uint64_t, H8, TH_VADC, clearq_th) + +GEN_TH_VADC_VXM(th_vsbc_vxm_b, uint8_t, H1, TH_VSBC, clearb_th) +GEN_TH_VADC_VXM(th_vsbc_vxm_h, uint16_t, H2, TH_VSBC, clearh_th) +GEN_TH_VADC_VXM(th_vsbc_vxm_w, uint32_t, H4, TH_VSBC, clearl_th) +GEN_TH_VADC_VXM(th_vsbc_vxm_d, uint64_t, H8, TH_VSBC, clearq_th) + +#define TH_MADC(N, M, C) (C ? (__typeof(N))(N + M + 1) <=3D N : \ + (__typeof(N))(N + M) < N) +#define TH_MSBC(N, M, C) (C ? N <=3D M : N < M) +/* + * This function is almost the copy of GEN_VEXT_VMADC_VVM, except: + * 1) different mask layout + * 2) different data encoding + * 3) different tail elements process policy + * 4) When vm =3D 1, RVV1.0 vmadc and vmsbc perform the computation + * without carry-in/borrow-in. While XTheadVector does not have + * this kind of situation. 
+ */ +#define GEN_TH_VMADC_VVM(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t vlmax =3D th_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s1 =3D *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + uint8_t carry =3D th_elem_mask(v0, mlen, i); \ + \ + th_set_elem_mask(vd, mlen, i, DO_OP(s2, s1, carry)); \ + } \ + env->vstart =3D 0; \ + for (; i < vlmax; i++) { \ + th_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_TH_VMADC_VVM(th_vmadc_vvm_b, uint8_t, H1, TH_MADC) +GEN_TH_VMADC_VVM(th_vmadc_vvm_h, uint16_t, H2, TH_MADC) +GEN_TH_VMADC_VVM(th_vmadc_vvm_w, uint32_t, H4, TH_MADC) +GEN_TH_VMADC_VVM(th_vmadc_vvm_d, uint64_t, H8, TH_MADC) + +GEN_TH_VMADC_VVM(th_vmsbc_vvm_b, uint8_t, H1, TH_MSBC) +GEN_TH_VMADC_VVM(th_vmsbc_vvm_h, uint16_t, H2, TH_MSBC) +GEN_TH_VMADC_VVM(th_vmsbc_vvm_w, uint32_t, H4, TH_MSBC) +GEN_TH_VMADC_VVM(th_vmsbc_vvm_d, uint64_t, H8, TH_MSBC) +/* + * This function is almost the copy of GEN_VEXT_VMADC_VXM, except: + * 1) different mask layout + * 2) different data encoding + * 3) different tail elements process policy + * 4) When vm =3D 1, RVV1.0 vmadc and vmsbc perform the computation + * without carry-in/borrow-in. While XTheadVector does not have + * this kind of situation. 
+ */
+#define GEN_TH_VMADC_VXM(NAME, ETYPE, H, DO_OP) \
+void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
+                  void *vs2, CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t vlmax = th_maxsz(desc) / sizeof(ETYPE); \
+    uint32_t i; \
+ \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    for (i = env->vstart; i < vl; i++) { \
+        ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
+        uint8_t carry = th_elem_mask(v0, mlen, i); \
+ \
+        th_set_elem_mask(vd, mlen, i, \
+                         DO_OP(s2, (ETYPE)(target_long)s1, carry)); \
+    } \
+    env->vstart = 0; \
+    for (; i < vlmax; i++) { \
+        th_set_elem_mask(vd, mlen, i, 0); \
+    } \
+}
+
+GEN_TH_VMADC_VXM(th_vmadc_vxm_b, uint8_t, H1, TH_MADC)
+GEN_TH_VMADC_VXM(th_vmadc_vxm_h, uint16_t, H2, TH_MADC)
+GEN_TH_VMADC_VXM(th_vmadc_vxm_w, uint32_t, H4, TH_MADC)
+GEN_TH_VMADC_VXM(th_vmadc_vxm_d, uint64_t, H8, TH_MADC)
+
+GEN_TH_VMADC_VXM(th_vmsbc_vxm_b, uint8_t, H1, TH_MSBC)
+GEN_TH_VMADC_VXM(th_vmsbc_vxm_h, uint16_t, H2, TH_MSBC)
+GEN_TH_VMADC_VXM(th_vmsbc_vxm_w, uint32_t, H4, TH_MSBC)
+GEN_TH_VMADC_VXM(th_vmsbc_vxm_d, uint64_t, H8, TH_MSBC)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 19/65] target/riscv: Add bitwise logical instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:36:49 +0800
Message-ID: <20240412073735.76413-20-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

Add bitwise logical instructions by reusing the macros defined earlier.
The differences from RVV1.0 therefore lie in those macros, which were
committed in other patches.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 25 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 20 ++++----
 target/riscv/xtheadvector_helper.c            | 51 +++++++++++++++++++
 3 files changed, 87 insertions(+), 9 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 25fb8f81c7..6599b2f2f5 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1575,3 +1575,28 @@ DEF_HELPER_6(th_vmsbc_vxm_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vmsbc_vxm_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vmsbc_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vmsbc_vxm_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vand_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vand_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vand_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vand_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vor_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vor_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vor_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vor_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vand_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vand_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vand_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vand_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vor_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vor_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vor_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vor_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vxor_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index a9e20a6dcb..2b7b2cfe20 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1314,21 +1314,23 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 GEN_OPIVI_TRANS_TH(th_vadc_vim, IMM_SX, th_vadc_vxm, opivx_vadc_check_th)
 GEN_OPIVI_TRANS_TH(th_vmadc_vim, IMM_SX, th_vmadc_vxm, opivx_vmadc_check_th)
 
+/* Vector Bitwise Logical Instructions */
+GEN_OPIVV_GVEC_TRANS_TH(th_vand_vv, and)
+GEN_OPIVV_GVEC_TRANS_TH(th_vor_vv, or)
+GEN_OPIVV_GVEC_TRANS_TH(th_vxor_vv, xor)
+GEN_OPIVX_GVEC_TRANS_TH(th_vand_vx, ands)
+GEN_OPIVX_GVEC_TRANS_TH(th_vor_vx, ors)
+GEN_OPIVX_GVEC_TRANS_TH(th_vxor_vx, xors)
+GEN_OPIVI_GVEC_TRANS_TH(th_vand_vi, IMM_SX, th_vand_vx, andi)
+GEN_OPIVI_GVEC_TRANS_TH(th_vor_vi, IMM_SX, th_vor_vx, ori)
+GEN_OPIVI_GVEC_TRANS_TH(th_vxor_vi, IMM_SX, th_vxor_vx, xori)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vand_vv)
-TH_TRANS_STUB(th_vand_vx)
-TH_TRANS_STUB(th_vand_vi)
-TH_TRANS_STUB(th_vor_vv)
-TH_TRANS_STUB(th_vor_vx)
-TH_TRANS_STUB(th_vor_vi)
-TH_TRANS_STUB(th_vxor_vv)
-TH_TRANS_STUB(th_vxor_vx)
-TH_TRANS_STUB(th_vxor_vi)
 TH_TRANS_STUB(th_vsll_vv)
 TH_TRANS_STUB(th_vsll_vx)
 TH_TRANS_STUB(th_vsll_vi)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index e5058d09f6..85fa69dd82 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -1223,3 +1223,54 @@ GEN_TH_VMADC_VXM(th_vmsbc_vxm_b, uint8_t, H1, TH_MSBC)
 GEN_TH_VMADC_VXM(th_vmsbc_vxm_h, uint16_t, H2, TH_MSBC)
 GEN_TH_VMADC_VXM(th_vmsbc_vxm_w, uint32_t, H4, TH_MSBC)
 GEN_TH_VMADC_VXM(th_vmsbc_vxm_d, uint64_t, H8, TH_MSBC)
+
+/* Vector Bitwise Logical Instructions */
+THCALL(TH_OPIVV2, th_vand_vv_b, OP_SSS_B, H1, H1, H1, TH_AND)
+THCALL(TH_OPIVV2, th_vand_vv_h, OP_SSS_H, H2, H2, H2, TH_AND)
+THCALL(TH_OPIVV2, th_vand_vv_w, OP_SSS_W, H4, H4, H4, TH_AND)
+THCALL(TH_OPIVV2, th_vand_vv_d, OP_SSS_D, H8, H8, H8, TH_AND)
+THCALL(TH_OPIVV2, th_vor_vv_b, OP_SSS_B, H1, H1, H1, TH_OR)
+THCALL(TH_OPIVV2, th_vor_vv_h, OP_SSS_H, H2, H2, H2, TH_OR)
+THCALL(TH_OPIVV2, th_vor_vv_w, OP_SSS_W, H4, H4, H4, TH_OR)
+THCALL(TH_OPIVV2, th_vor_vv_d, OP_SSS_D, H8, H8, H8, TH_OR)
+THCALL(TH_OPIVV2, th_vxor_vv_b, OP_SSS_B, H1, H1, H1, TH_XOR)
+THCALL(TH_OPIVV2, th_vxor_vv_h, OP_SSS_H, H2, H2, H2, TH_XOR)
+THCALL(TH_OPIVV2, th_vxor_vv_w, OP_SSS_W, H4, H4, H4, TH_XOR)
+THCALL(TH_OPIVV2, th_vxor_vv_d, OP_SSS_D, H8, H8, H8, TH_XOR)
+GEN_TH_VV(th_vand_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vand_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vand_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vand_vv_d, 8, 8, clearq_th)
+GEN_TH_VV(th_vor_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vor_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vor_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vor_vv_d, 8, 8, clearq_th)
+GEN_TH_VV(th_vxor_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vxor_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vxor_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vxor_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2, th_vand_vx_b, OP_SSS_B, H1, H1, TH_AND)
+THCALL(TH_OPIVX2, th_vand_vx_h, OP_SSS_H, H2, H2, TH_AND)
+THCALL(TH_OPIVX2, th_vand_vx_w, OP_SSS_W, H4, H4, TH_AND)
+THCALL(TH_OPIVX2, th_vand_vx_d, OP_SSS_D, H8, H8, TH_AND)
+THCALL(TH_OPIVX2, th_vor_vx_b, OP_SSS_B, H1, H1, TH_OR)
+THCALL(TH_OPIVX2, th_vor_vx_h, OP_SSS_H, H2, H2, TH_OR)
+THCALL(TH_OPIVX2, th_vor_vx_w, OP_SSS_W, H4, H4, TH_OR)
+THCALL(TH_OPIVX2, th_vor_vx_d, OP_SSS_D, H8, H8, TH_OR)
+THCALL(TH_OPIVX2, th_vxor_vx_b, OP_SSS_B, H1, H1, TH_XOR)
+THCALL(TH_OPIVX2, th_vxor_vx_h, OP_SSS_H, H2, H2, TH_XOR)
+THCALL(TH_OPIVX2, th_vxor_vx_w, OP_SSS_W, H4, H4, TH_XOR)
+THCALL(TH_OPIVX2, th_vxor_vx_d, OP_SSS_D, H8, H8, TH_XOR)
+GEN_TH_VX(th_vand_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vand_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vand_vx_w, 4, 4, clearl_th)
+GEN_TH_VX(th_vand_vx_d, 8, 8, clearq_th)
+GEN_TH_VX(th_vor_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vor_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vor_vx_w, 4, 4, clearl_th)
+GEN_TH_VX(th_vor_vx_d, 8, 8, clearq_th)
+GEN_TH_VX(th_vxor_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vxor_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vxor_vx_w, 4, 4, clearl_th)
+GEN_TH_VX(th_vxor_vx_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 20/65] target/riscv: Add single-width bit shift instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:36:50 +0800
Message-ID: <20240412073735.76413-21-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The differences between XTheadVector and RVV1.0 are the same as in the
other patches:
1. Different mask register layout.
2. Different tail/masked element processing policy.
3. Simpler acceleration judgment logic.
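
For readers unfamiliar with points 1 and 2, here is a minimal standalone
sketch of what a 16-bit vector-vector left shift does under these rules.
It is not the helper from this series. The names sketch_elem_mask and
sketch_vsll_vv_h are hypothetical, the mask bit of element i is assumed
to sit at bit i * mlen of v0, and the tail is assumed to be zeroed
(which is what the clear*_th calls in the helpers appear to do).

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for th_elem_mask(): assumes the mask bit of
 * element i is stored at bit position i * mlen of the mask register.
 */
static inline int sketch_elem_mask(const uint64_t *v0, uint32_t mlen,
                                   uint32_t i)
{
    uint32_t bit = i * mlen;
    return (int)((v0[bit / 64] >> (bit % 64)) & 1);
}

/* Sketch of a 16-bit logical left shift, vector-vector form. */
static void sketch_vsll_vv_h(uint16_t *vd, const uint64_t *v0,
                             const uint16_t *vs1, const uint16_t *vs2,
                             uint32_t vm, uint32_t mlen,
                             uint32_t vl, uint32_t vlmax)
{
    uint32_t i;

    for (i = 0; i < vl; i++) {
        if (!vm && !sketch_elem_mask(v0, mlen, i)) {
            continue; /* masked-off element: destination left unchanged */
        }
        /* only the low log2(SEW) = 4 bits of the shift amount are used */
        vd[i] = (uint16_t)(vs2[i] << (vs1[i] & 0xf));
    }
    for (; i < vlmax; i++) {
        vd[i] = 0; /* assumed tail policy: tail elements are zeroed */
    }
}
```

With mlen = 8, elements 0, 1 and 2 read their mask bits from bits 0, 8
and 16 of v0, rather than from bits 0, 1 and 2 as in RVV1.0.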
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  25 ++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc |  61 ++++++++--
 target/riscv/xtheadvector_helper.c            | 115 ++++++++++++++++++
 3 files changed, 192 insertions(+), 9 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6599b2f2f5..77251af8c9 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1600,3 +1600,28 @@ DEF_HELPER_6(th_vxor_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vxor_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vxor_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vxor_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vsll_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsll_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsrl_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 2b7b2cfe20..d72320699c 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1325,21 +1325,64 @@ GEN_OPIVI_GVEC_TRANS_TH(th_vand_vi, IMM_SX, th_vand_vx, andi)
 GEN_OPIVI_GVEC_TRANS_TH(th_vor_vi, IMM_SX, th_vor_vx, ori)
 GEN_OPIVI_GVEC_TRANS_TH(th_vxor_vi, IMM_SX, th_vxor_vx, xori)
 
+/* Vector Single-Width Bit Shift Instructions */
+GEN_OPIVV_GVEC_TRANS_TH(th_vsll_vv, shlv)
+GEN_OPIVV_GVEC_TRANS_TH(th_vsrl_vv, shrv)
+GEN_OPIVV_GVEC_TRANS_TH(th_vsra_vv, sarv)
+
+#define GVecGen2sFn32_Th GVecGen2sFn32
+
+/*
+ * This function is almost a copy of do_opivx_gvec_shift, except:
+ * 1) XTheadVector simplifies the judgment logic of whether
+ *    to accelerate or not, because it lacks fractional LMUL and
+ *    VTA.
+ */
+static inline bool
+do_opivx_gvec_shift_th(DisasContext *s, arg_rmrr *a, GVecGen2sFn32_Th *gvec_fn,
+                       gen_helper_opivx_th *fn)
+{
+    if (a->vm && s->vl_eq_vlmax) {
+        TCGv_i32 src1 = tcg_temp_new_i32();
+
+        tcg_gen_trunc_tl_i32(src1, get_gpr(s, a->rs1, EXT_NONE));
+        tcg_gen_extract_i32(src1, src1, 0, s->sew + 3);
+        gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
+                src1, MAXSZ(s), MAXSZ(s));
+
+        finalize_rvv_inst(s);
+        return true;
+    }
+    return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, fn, s);
+}
+
+#define GEN_OPIVX_GVEC_SHIFT_TRANS_TH(NAME, SUF) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    static gen_helper_opivx * const fns[4] = { \
+        gen_helper_##NAME##_b, gen_helper_##NAME##_h, \
+        gen_helper_##NAME##_w, gen_helper_##NAME##_d, \
+    }; \
+    if (!opivx_check_th(s, a)) { \
+        return false; \
+    } \
+    return do_opivx_gvec_shift_th(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \
+}
+
+GEN_OPIVX_GVEC_SHIFT_TRANS_TH(th_vsll_vx, shls)
+GEN_OPIVX_GVEC_SHIFT_TRANS_TH(th_vsrl_vx, shrs)
+GEN_OPIVX_GVEC_SHIFT_TRANS_TH(th_vsra_vx, sars)
+
+GEN_OPIVI_GVEC_TRANS_TH(th_vsll_vi, IMM_TRUNC_SEW, th_vsll_vx, shli)
+GEN_OPIVI_GVEC_TRANS_TH(th_vsrl_vi, IMM_TRUNC_SEW, th_vsrl_vx, shri)
+GEN_OPIVI_GVEC_TRANS_TH(th_vsra_vi, IMM_TRUNC_SEW, th_vsra_vx, sari)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vsll_vv)
-TH_TRANS_STUB(th_vsll_vx)
-TH_TRANS_STUB(th_vsll_vi)
-TH_TRANS_STUB(th_vsrl_vv)
-TH_TRANS_STUB(th_vsrl_vx)
-TH_TRANS_STUB(th_vsrl_vi)
-TH_TRANS_STUB(th_vsra_vv)
-TH_TRANS_STUB(th_vsra_vx)
-TH_TRANS_STUB(th_vsra_vi)
 TH_TRANS_STUB(th_vnsrl_vv)
 TH_TRANS_STUB(th_vnsrl_vx)
 TH_TRANS_STUB(th_vnsrl_vi)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 85fa69dd82..d3f10ad873 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -1274,3 +1274,118 @@ GEN_TH_VX(th_vxor_vx_b, 1, 1, clearb_th)
 GEN_TH_VX(th_vxor_vx_h, 2, 2, clearh_th)
 GEN_TH_VX(th_vxor_vx_w, 4, 4, clearl_th)
 GEN_TH_VX(th_vxor_vx_d, 8, 8, clearq_th)
+
+/* Vector Single-Width Bit Shift Instructions */
+#define TH_SLL(N, M) (N << (M))
+#define TH_SRL(N, M) (N >> (M))
+
+/*
+ * generate the helpers for shift instructions with two vector operands
+ *
+ * GEN_TH_SHIFT_VV and GEN_TH_SHIFT_VX are almost copies of
+ * GEN_VEXT_SHIFT_VV and GEN_VEXT_SHIFT_VX, except:
+ * 1) different mask layout
+ * 2) different data encoding
+ * 3) different masked/tail element processing policy
+ */
+#define GEN_TH_SHIFT_VV(NAME, TS1, TS2, HS1, HS2, OP, MASK, CLEAR_FN) \
+void HELPER(NAME)(void *vd, void *v0, void *vs1, \
+                  void *vs2, CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vm = th_vm(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t esz = sizeof(TS1); \
+    uint32_t vlmax = th_maxsz(desc) / esz; \
+    uint32_t i; \
+ \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    for (i = env->vstart; i < vl; i++) { \
+        if (!vm && !th_elem_mask(v0, mlen, i)) { \
+            continue; \
+        } \
+        TS1 s1 = *((TS1 *)vs1 + HS1(i)); \
+        TS2 s2 = *((TS2 *)vs2 + HS2(i)); \
+        *((TS1 *)vd + HS1(i)) = OP(s2, s1 & MASK); \
+    } \
+    env->vstart = 0; \
+    CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \
+}
+
+GEN_TH_SHIFT_VV(th_vsll_vv_b, uint8_t, uint8_t, H1, H1, TH_SLL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VV(th_vsll_vv_h, uint16_t, uint16_t, H2, H2, TH_SLL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VV(th_vsll_vv_w, uint32_t, uint32_t, H4, H4, TH_SLL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VV(th_vsll_vv_d, uint64_t, uint64_t, H8, H8, TH_SLL,
+                0x3f, clearq_th)
+
+GEN_TH_SHIFT_VV(th_vsrl_vv_b, uint8_t, uint8_t, H1, H1, TH_SRL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VV(th_vsrl_vv_h, uint16_t, uint16_t, H2, H2, TH_SRL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VV(th_vsrl_vv_w, uint32_t, uint32_t, H4, H4, TH_SRL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VV(th_vsrl_vv_d, uint64_t, uint64_t, H8, H8, TH_SRL,
+                0x3f, clearq_th)
+
+GEN_TH_SHIFT_VV(th_vsra_vv_b, uint8_t, int8_t, H1, H1, TH_SRL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VV(th_vsra_vv_h, uint16_t, int16_t, H2, H2, TH_SRL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VV(th_vsra_vv_w, uint32_t, int32_t, H4, H4, TH_SRL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VV(th_vsra_vv_d, uint64_t, int64_t, H8, H8, TH_SRL,
+                0x3f, clearq_th)
+
+/* generate the helpers for shift instructions with one vector and one scalar */
+#define GEN_TH_SHIFT_VX(NAME, TD, TS2, HD, HS2, OP, MASK, CLEAR_FN) \
+void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
+                  void *vs2, CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vm = th_vm(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t esz = sizeof(TD); \
+    uint32_t vlmax = th_maxsz(desc) / esz; \
+    uint32_t i; \
+ \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    for (i = env->vstart; i < vl; i++) { \
+        if (!vm && !th_elem_mask(v0, mlen, i)) { \
+            continue; \
+        } \
+        TS2 s2 = *((TS2 *)vs2 + HS2(i)); \
+        *((TD *)vd + HD(i)) = OP(s2, s1 & MASK); \
+    } \
+    env->vstart = 0; \
+    CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \
+}
+
+GEN_TH_SHIFT_VX(th_vsll_vx_b, uint8_t, int8_t, H1, H1, TH_SLL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VX(th_vsll_vx_h, uint16_t, int16_t, H2, H2, TH_SLL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VX(th_vsll_vx_w, uint32_t, int32_t, H4, H4, TH_SLL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VX(th_vsll_vx_d, uint64_t, int64_t, H8, H8, TH_SLL,
+                0x3f, clearq_th)
+
+GEN_TH_SHIFT_VX(th_vsrl_vx_b, uint8_t, uint8_t, H1, H1, TH_SRL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VX(th_vsrl_vx_h, uint16_t, uint16_t, H2, H2, TH_SRL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VX(th_vsrl_vx_w, uint32_t, uint32_t, H4, H4, TH_SRL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VX(th_vsrl_vx_d, uint64_t, uint64_t, H8, H8, TH_SRL,
+                0x3f, clearq_th)
+
+GEN_TH_SHIFT_VX(th_vsra_vx_b, int8_t, int8_t, H1, H1, TH_SRL,
+                0x7, clearb_th)
+GEN_TH_SHIFT_VX(th_vsra_vx_h, int16_t, int16_t, H2, H2, TH_SRL,
+                0xf, clearh_th)
+GEN_TH_SHIFT_VX(th_vsra_vx_w, int32_t, int32_t, H4, H4, TH_SRL,
+                0x1f, clearl_th)
+GEN_TH_SHIFT_VX(th_vsra_vx_d, int64_t, int64_t, H8, H8, TH_SRL,
+                0x3f, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 21/65] target/riscv: Add narrowing integer right shift instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:36:51 +0800
Message-ID: <20240412073735.76413-22-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions have the same function as RVV1.0. Overall there are only
general differences between XTheadVector and RVV1.0.
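
As a rough illustration of the per-element operation, the sketch below
narrows a 32-bit source element to 16 bits. The function names are
hypothetical and not part of this series. The 0x1f shift mask and the
signed source type for the arithmetic variant follow the
GEN_TH_SHIFT_VV(th_vnsrl_vv_h, ...) and GEN_TH_SHIFT_VV(th_vnsra_vv_h, ...)
instantiations. The arithmetic variant assumes the compiler implements
>> on negative values as an arithmetic shift, which GCC and Clang do.

```c
#include <stdint.h>

/* Narrowing logical right shift: 32-bit source, 16-bit result. */
static inline uint16_t sketch_vnsrl_h(uint32_t s2, uint32_t s1)
{
    /* the shift amount is masked to log2(2 * SEW) = 5 bits */
    return (uint16_t)(s2 >> (s1 & 0x1f));
}

/*
 * Narrowing arithmetic right shift: the source is signed, so the
 * same >> operator replicates the sign bit (compiler assumption).
 */
static inline uint16_t sketch_vnsra_h(int32_t s2, uint32_t s1)
{
    return (uint16_t)(s2 >> (s1 & 0x1f));
}
```

The result keeps only the low 16 bits of the shifted value, so the two
variants differ only when sign bits are shifted into that range.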
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 13 +++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 99 +++++++++++++++++--
 target/riscv/xtheadvector_helper.c            | 26 +++++
 3 files changed, 132 insertions(+), 6 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 77251af8c9..d3170ba91f 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1625,3 +1625,16 @@ DEF_HELPER_6(th_vsra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vnsrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnsrl_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnsrl_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnsrl_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnsra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index d72320699c..68810ff0ec 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1377,18 +1377,105 @@ GEN_OPIVI_GVEC_TRANS_TH(th_vsll_vi, IMM_TRUNC_SEW, th_vsll_vx, shli)
 GEN_OPIVI_GVEC_TRANS_TH(th_vsrl_vi, IMM_TRUNC_SEW, th_vsrl_vx, shri)
 GEN_OPIVI_GVEC_TRANS_TH(th_vsra_vi, IMM_TRUNC_SEW, th_vsra_vx, sari)
 
+/* Vector Narrowing Integer Right Shift Instructions */
+static bool opivv_narrow_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, false) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_reg(s, a->rs2, true) &&
+            th_check_reg(s, a->rs1, false) &&
+            th_check_overlap_group(a->rd, 1 << s->lmul, a->rs2,
+                                   2 << s->lmul) &&
+            (s->lmul < 0x3) && (s->sew < 0x3));
+}
+
+/* OPIVV with NARROW */
+#define GEN_OPIVV_NARROW_TRANS_TH(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opivv_narrow_check_th(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_gvec_4_ptr * const fns[3] = { \
+            gen_helper_##NAME##_b, \
+            gen_helper_##NAME##_h, \
+            gen_helper_##NAME##_w, \
+        }; \
+ \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), \
+                           vreg_ofs(s, 0), \
+                           vreg_ofs(s, a->rs1), \
+                           vreg_ofs(s, a->rs2), tcg_env, \
+                           s->cfg_ptr->vlenb, \
+                           s->cfg_ptr->vlenb, data, \
+                           fns[s->sew]); \
+        finalize_rvv_inst(s); \
+        return true; \
+    } \
+    return false; \
+}
+GEN_OPIVV_NARROW_TRANS_TH(th_vnsra_vv)
+GEN_OPIVV_NARROW_TRANS_TH(th_vnsrl_vv)
+
+static bool opivx_narrow_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, false) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_reg(s, a->rs2, true) &&
+            th_check_overlap_group(a->rd, 1 << s->lmul, a->rs2,
+                                   2 << s->lmul) &&
+            (s->lmul < 0x3) && (s->sew < 0x3));
+}
+
+/* OPIVX with NARROW */
+#define GEN_OPIVX_NARROW_TRANS_TH(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opivx_narrow_check_th(s, a)) { \
+        static gen_helper_opivx * const fns[3] = { \
+            gen_helper_##NAME##_b, \
+            gen_helper_##NAME##_h, \
+            gen_helper_##NAME##_w, \
+        }; \
+        return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s); \
+    } \
+    return false; \
+}
+
+GEN_OPIVX_NARROW_TRANS_TH(th_vnsra_vx)
+GEN_OPIVX_NARROW_TRANS_TH(th_vnsrl_vx)
+
+/* OPIVI with NARROW */
+#define GEN_OPIVI_NARROW_TRANS_TH(NAME, ZX, OPIVX) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opivx_narrow_check_th(s, a)) { \
+        static gen_helper_opivx * const fns[3] = { \
+            gen_helper_##OPIVX##_b, \
+            gen_helper_##OPIVX##_h, \
+            gen_helper_##OPIVX##_w, \
+        }; \
+        return opivi_trans_th(a->rd, a->rs1, a->rs2, a->vm, \
+                              fns[s->sew], s, ZX); \
+    } \
+    return false; \
+}
+
+GEN_OPIVI_NARROW_TRANS_TH(th_vnsra_vi, IMM_ZX, th_vnsra_vx)
+GEN_OPIVI_NARROW_TRANS_TH(th_vnsrl_vi, IMM_ZX, th_vnsrl_vx)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vnsrl_vv)
-TH_TRANS_STUB(th_vnsrl_vx)
-TH_TRANS_STUB(th_vnsrl_vi)
-TH_TRANS_STUB(th_vnsra_vv)
-TH_TRANS_STUB(th_vnsra_vx)
-TH_TRANS_STUB(th_vnsra_vi)
 TH_TRANS_STUB(th_vmseq_vv)
 TH_TRANS_STUB(th_vmseq_vx)
 TH_TRANS_STUB(th_vmseq_vi)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index d3f10ad873..f4bd80349d 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -1389,3 +1389,29 @@ GEN_TH_SHIFT_VX(th_vsra_vx_w, int32_t, int32_t, H4, H4, TH_SRL,
                 0x1f, clearl_th)
 GEN_TH_SHIFT_VX(th_vsra_vx_d, int64_t, int64_t, H8, H8, TH_SRL,
                 0x3f, clearq_th)
+
+/* Vector Narrowing Integer Right Shift Instructions */
+GEN_TH_SHIFT_VV(th_vnsrl_vv_b, uint8_t, uint16_t, H1, H2, TH_SRL,
+                0xf, clearb_th)
+GEN_TH_SHIFT_VV(th_vnsrl_vv_h, uint16_t, uint32_t, H2, H4, TH_SRL,
+                0x1f, clearh_th)
+GEN_TH_SHIFT_VV(th_vnsrl_vv_w, uint32_t, uint64_t, H4, H8, TH_SRL,
+                0x3f, clearl_th)
+GEN_TH_SHIFT_VV(th_vnsra_vv_b, uint8_t, int16_t, H1, H2, TH_SRL,
+                0xf, clearb_th)
+GEN_TH_SHIFT_VV(th_vnsra_vv_h, uint16_t, int32_t, H2, H4, TH_SRL,
+                0x1f, clearh_th)
+GEN_TH_SHIFT_VV(th_vnsra_vv_w, uint32_t, int64_t, H4, H8, TH_SRL,
+                0x3f, clearl_th)
+GEN_TH_SHIFT_VX(th_vnsrl_vx_b, uint8_t, uint16_t, H1, H2, TH_SRL,
+                0xf, clearb_th)
+GEN_TH_SHIFT_VX(th_vnsrl_vx_h, uint16_t, uint32_t, H2, H4, TH_SRL,
+                0x1f, clearh_th)
+GEN_TH_SHIFT_VX(th_vnsrl_vx_w, uint32_t, uint64_t, H4, H8, TH_SRL,
+                0x3f, clearl_th)
+GEN_TH_SHIFT_VX(th_vnsra_vx_b, int8_t, int16_t, H1, H2, TH_SRL,
+                0xf, clearb_th)
+GEN_TH_SHIFT_VX(th_vnsra_vx_h, int16_t, int32_t, H2, H4, TH_SRL,
+                0x1f, clearh_th)
+GEN_TH_SHIFT_VX(th_vnsra_vx_w, int32_t, int64_t, H4, H8, TH_SRL,
+                0x3f, clearl_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910218600938.9992059968142; Fri, 12 Apr 2024 01:23:38 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCBo-0006PX-2K; Fri, 12 Apr 2024 04:23:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCBm-0006ON-I8; Fri, 12 Apr 2024 04:23:14 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCBj-00084P-61; Fri, 12 Apr 2024 04:23:14 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NfEHh_1712910183) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:23:04 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910185; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=Xw/B1iKhHAB1bZRFrLI7LnGzX6yfajlljeqcaYIlm5M=; b=LWxVqu4BO49avhlYIJFtOlq1KXPXVeC0+hwwRQ7RMMaXSfReGd0N5XmlLCOvLIWMA/K9UR7jUDqqTvkF4l6z8a9k8z5rqUwk0CcKLdkizeWq5SnJQ77H1XktERbvSTShlCWxQpRyNnb6ZG7HetmqtWO4TSI/piSk3cFOBMPBsdk= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R421e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046051; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NfEHh_1712910183; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 22/65] target/riscv: Add integer compare instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:52 +0800 Message-ID: <20240412073735.76413-23-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: 
<20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.133; envelope-from=eric.huang@linux.alibaba.com; helo=out30-133.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910220525100003 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
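For reviewers, the element loop that the comparison helpers in this patch share can be sketched as below. This is a simplified illustration, not the helper code itself: sketch_vmslt_vv and its one-byte-per-mask-element layout are hypothetical, whereas the patch stores mask elements MLEN bits apart through th_set_elem_mask(). The vstart handling and early exit are also left out. What the sketch keeps is the operand order (vs2 OP vs1), the rule that masked-off elements leave the destination untouched, and the zeroing of tail elements up to vlmax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of a masked "set if less than" compare
 * on 32-bit signed elements.  One byte per mask element is an assumption
 * made for readability only.
 */
static void sketch_vmslt_vv(uint8_t *vd, const uint8_t *v0,
                            const int32_t *vs1, const int32_t *vs2,
                            bool vm, size_t vl, size_t vlmax)
{
    size_t i;

    assert(vl <= vlmax);

    /* Body: elements in [0, vl). */
    for (i = 0; i < vl; i++) {
        if (!vm && !v0[i]) {
            continue;               /* masked off: destination is kept */
        }
        vd[i] = vs2[i] < vs1[i];    /* operand order is vs2 OP vs1 */
    }
    /* Tail: elements in [vl, vlmax) are cleared to zero. */
    for (; i < vlmax; i++) {
        vd[i] = 0;
    }
}
```

The unsigned variants would differ only in the element type, which is how the GEN_TH_CMP_VV instantiations in this patch select signed or unsigned behaviour.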
Signed-off-by: Huang Tao --- target/riscv/helper.h | 57 ++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 69 +++++++--- target/riscv/xtheadvector_helper.c | 127 ++++++++++++++++++ 3 files changed, 233 insertions(+), 20 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index d3170ba91f..8f2dec158b 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1638,3 +1638,60 @@ DEF_HELPER_6(th_vnsrl_vx_w, void, ptr, ptr, tl, ptr,= env, i32) DEF_HELPER_6(th_vnsra_vx_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vnsra_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vnsra_vx_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vmseq_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) 
+DEF_HELPER_6(th_vmsle_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmseq_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsne_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsltu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmslt_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsleu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsle_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgtu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgtu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgtu_vx_w, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vmsgtu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgt_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgt_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgt_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmsgt_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 68810ff0ec..049d9da0a5 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1470,32 +1470,61 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr = *a) \ GEN_OPIVI_NARROW_TRANS_TH(th_vnsra_vi, IMM_ZX, th_vnsra_vx) GEN_OPIVI_NARROW_TRANS_TH(th_vnsrl_vi, IMM_ZX, th_vnsrl_vx) =20 +/* Vector Integer Comparison Instructions */ + +/* + * For all comparison instructions, an illegal instruction exception is ra= ised + * if the destination vector register overlaps a source vector register gr= oup + * and LMUL > 1. 
+ */ +static bool opivv_cmp_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false) && + ((th_check_overlap_group(a->rd, 1, a->rs1, 1 << s->lmul) && + th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)) || + (s->lmul =3D=3D 0))); +} +GEN_OPIVV_TRANS_TH(th_vmseq_vv, opivv_cmp_check_th) +GEN_OPIVV_TRANS_TH(th_vmsne_vv, opivv_cmp_check_th) +GEN_OPIVV_TRANS_TH(th_vmsltu_vv, opivv_cmp_check_th) +GEN_OPIVV_TRANS_TH(th_vmslt_vv, opivv_cmp_check_th) +GEN_OPIVV_TRANS_TH(th_vmsleu_vv, opivv_cmp_check_th) +GEN_OPIVV_TRANS_TH(th_vmsle_vv, opivv_cmp_check_th) + +static bool opivx_cmp_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + (th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul) || + (s->lmul =3D=3D 0))); +} + +GEN_OPIVX_TRANS_TH(th_vmseq_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsne_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsltu_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmslt_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsleu_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsle_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsgtu_vx, opivx_cmp_check_th) +GEN_OPIVX_TRANS_TH(th_vmsgt_vx, opivx_cmp_check_th) + +GEN_OPIVI_TRANS_TH(th_vmseq_vi, IMM_SX, th_vmseq_vx, opivx_cmp_check_th) +GEN_OPIVI_TRANS_TH(th_vmsne_vi, IMM_SX, th_vmsne_vx, opivx_cmp_check_th) +GEN_OPIVI_TRANS_TH(th_vmsleu_vi, IMM_ZX, th_vmsleu_vx, opivx_cmp_check_th) +GEN_OPIVI_TRANS_TH(th_vmsle_vi, IMM_SX, th_vmsle_vx, opivx_cmp_check_th) +GEN_OPIVI_TRANS_TH(th_vmsgtu_vi, IMM_ZX, th_vmsgtu_vx, opivx_cmp_check_th) +GEN_OPIVI_TRANS_TH(th_vmsgt_vi, IMM_SX, th_vmsgt_vx, opivx_cmp_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmseq_vv) 
-TH_TRANS_STUB(th_vmseq_vx) -TH_TRANS_STUB(th_vmseq_vi) -TH_TRANS_STUB(th_vmsne_vv) -TH_TRANS_STUB(th_vmsne_vx) -TH_TRANS_STUB(th_vmsne_vi) -TH_TRANS_STUB(th_vmsltu_vv) -TH_TRANS_STUB(th_vmsltu_vx) -TH_TRANS_STUB(th_vmslt_vv) -TH_TRANS_STUB(th_vmslt_vx) -TH_TRANS_STUB(th_vmsleu_vv) -TH_TRANS_STUB(th_vmsleu_vx) -TH_TRANS_STUB(th_vmsleu_vi) -TH_TRANS_STUB(th_vmsle_vv) -TH_TRANS_STUB(th_vmsle_vx) -TH_TRANS_STUB(th_vmsle_vi) -TH_TRANS_STUB(th_vmsgtu_vx) -TH_TRANS_STUB(th_vmsgtu_vi) -TH_TRANS_STUB(th_vmsgt_vx) -TH_TRANS_STUB(th_vmsgt_vi) TH_TRANS_STUB(th_vminu_vv) TH_TRANS_STUB(th_vminu_vx) TH_TRANS_STUB(th_vmin_vv) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index f4bd80349d..827650b325 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1415,3 +1415,130 @@ GEN_TH_SHIFT_VX(th_vnsra_vx_h, int16_t, int32_t, H2= , H4, TH_SRL, 0x1f, clearh_th) GEN_TH_SHIFT_VX(th_vnsra_vx_w, int32_t, int64_t, H4, H8, TH_SRL, 0x3f, clearl_th) + +/* Vector Integer Comparison Instructions */ +#define TH_MSEQ(N, M) (N =3D=3D M) +#define TH_MSNE(N, M) (N !=3D M) +#define TH_MSLT(N, M) (N < M) +#define TH_MSLE(N, M) (N <=3D M) +#define TH_MSGT(N, M) (N > M) + +#define GEN_TH_CMP_VV(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t vlmax =3D th_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s1 =3D *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + th_set_elem_mask(vd, mlen, i, DO_OP(s2, s1)); \ + } \ + env->vstart =3D 0; \ + for (; i < vlmax; i++) { \ + th_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_TH_CMP_VV(th_vmseq_vv_b, uint8_t, H1, TH_MSEQ) 
+GEN_TH_CMP_VV(th_vmseq_vv_h, uint16_t, H2, TH_MSEQ) +GEN_TH_CMP_VV(th_vmseq_vv_w, uint32_t, H4, TH_MSEQ) +GEN_TH_CMP_VV(th_vmseq_vv_d, uint64_t, H8, TH_MSEQ) + +GEN_TH_CMP_VV(th_vmsne_vv_b, uint8_t, H1, TH_MSNE) +GEN_TH_CMP_VV(th_vmsne_vv_h, uint16_t, H2, TH_MSNE) +GEN_TH_CMP_VV(th_vmsne_vv_w, uint32_t, H4, TH_MSNE) +GEN_TH_CMP_VV(th_vmsne_vv_d, uint64_t, H8, TH_MSNE) + +GEN_TH_CMP_VV(th_vmsltu_vv_b, uint8_t, H1, TH_MSLT) +GEN_TH_CMP_VV(th_vmsltu_vv_h, uint16_t, H2, TH_MSLT) +GEN_TH_CMP_VV(th_vmsltu_vv_w, uint32_t, H4, TH_MSLT) +GEN_TH_CMP_VV(th_vmsltu_vv_d, uint64_t, H8, TH_MSLT) + +GEN_TH_CMP_VV(th_vmslt_vv_b, int8_t, H1, TH_MSLT) +GEN_TH_CMP_VV(th_vmslt_vv_h, int16_t, H2, TH_MSLT) +GEN_TH_CMP_VV(th_vmslt_vv_w, int32_t, H4, TH_MSLT) +GEN_TH_CMP_VV(th_vmslt_vv_d, int64_t, H8, TH_MSLT) + +GEN_TH_CMP_VV(th_vmsleu_vv_b, uint8_t, H1, TH_MSLE) +GEN_TH_CMP_VV(th_vmsleu_vv_h, uint16_t, H2, TH_MSLE) +GEN_TH_CMP_VV(th_vmsleu_vv_w, uint32_t, H4, TH_MSLE) +GEN_TH_CMP_VV(th_vmsleu_vv_d, uint64_t, H8, TH_MSLE) + +GEN_TH_CMP_VV(th_vmsle_vv_b, int8_t, H1, TH_MSLE) +GEN_TH_CMP_VV(th_vmsle_vv_h, int16_t, H2, TH_MSLE) +GEN_TH_CMP_VV(th_vmsle_vv_w, int32_t, H4, TH_MSLE) +GEN_TH_CMP_VV(th_vmsle_vv_d, int64_t, H8, TH_MSLE) + +#define GEN_TH_CMP_VX(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t vlmax =3D th_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + th_set_elem_mask(vd, mlen, i, \ + DO_OP(s2, (ETYPE)(target_long)s1)); \ + } \ + env->vstart =3D 0; \ + for (; i < vlmax; i++) { \ + th_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_TH_CMP_VX(th_vmseq_vx_b, uint8_t, H1, TH_MSEQ) 
+GEN_TH_CMP_VX(th_vmseq_vx_h, uint16_t, H2, TH_MSEQ) +GEN_TH_CMP_VX(th_vmseq_vx_w, uint32_t, H4, TH_MSEQ) +GEN_TH_CMP_VX(th_vmseq_vx_d, uint64_t, H8, TH_MSEQ) + +GEN_TH_CMP_VX(th_vmsne_vx_b, uint8_t, H1, TH_MSNE) +GEN_TH_CMP_VX(th_vmsne_vx_h, uint16_t, H2, TH_MSNE) +GEN_TH_CMP_VX(th_vmsne_vx_w, uint32_t, H4, TH_MSNE) +GEN_TH_CMP_VX(th_vmsne_vx_d, uint64_t, H8, TH_MSNE) + +GEN_TH_CMP_VX(th_vmsltu_vx_b, uint8_t, H1, TH_MSLT) +GEN_TH_CMP_VX(th_vmsltu_vx_h, uint16_t, H2, TH_MSLT) +GEN_TH_CMP_VX(th_vmsltu_vx_w, uint32_t, H4, TH_MSLT) +GEN_TH_CMP_VX(th_vmsltu_vx_d, uint64_t, H8, TH_MSLT) + +GEN_TH_CMP_VX(th_vmslt_vx_b, int8_t, H1, TH_MSLT) +GEN_TH_CMP_VX(th_vmslt_vx_h, int16_t, H2, TH_MSLT) +GEN_TH_CMP_VX(th_vmslt_vx_w, int32_t, H4, TH_MSLT) +GEN_TH_CMP_VX(th_vmslt_vx_d, int64_t, H8, TH_MSLT) + +GEN_TH_CMP_VX(th_vmsleu_vx_b, uint8_t, H1, TH_MSLE) +GEN_TH_CMP_VX(th_vmsleu_vx_h, uint16_t, H2, TH_MSLE) +GEN_TH_CMP_VX(th_vmsleu_vx_w, uint32_t, H4, TH_MSLE) +GEN_TH_CMP_VX(th_vmsleu_vx_d, uint64_t, H8, TH_MSLE) + +GEN_TH_CMP_VX(th_vmsle_vx_b, int8_t, H1, TH_MSLE) +GEN_TH_CMP_VX(th_vmsle_vx_h, int16_t, H2, TH_MSLE) +GEN_TH_CMP_VX(th_vmsle_vx_w, int32_t, H4, TH_MSLE) +GEN_TH_CMP_VX(th_vmsle_vx_d, int64_t, H8, TH_MSLE) + +GEN_TH_CMP_VX(th_vmsgtu_vx_b, uint8_t, H1, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgtu_vx_h, uint16_t, H2, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgtu_vx_w, uint32_t, H4, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgtu_vx_d, uint64_t, H8, TH_MSGT) + +GEN_TH_CMP_VX(th_vmsgt_vx_b, int8_t, H1, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgt_vx_h, int16_t, H2, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgt_vx_w, int32_t, H4, TH_MSGT) +GEN_TH_CMP_VX(th_vmsgt_vx_d, int64_t, H8, TH_MSGT) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com 
ARC-Seal: i=1; a=rsa-sha256; t=1712910340; cv=none; d=zohomail.com; s=zohoarc; b=IQq2+15GgOi5J8NClDHI+/XoB8k3T1nBKi7NcxvFs4S+WNmDQsuAoUrh9NsIAD1zt5H/wAPPIEw+1DX1LDH+riDKKXUTdsBfOYZa1WnfG7Fw3wY4g/Ertg3TX8qZE5x+BLX0B3Bc7sZYCs5n4OE1GT6XLEm0WakaXnaPPMAbOF8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910340; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=0o5z902MUZqe6fDYphamYfk/0or6RBtEHfsI6q67FOI=; b=gR9q3xq05gVBgkp0bVe07wq+jN+iaw3I3en4VHXkiFVMbTYzh+tRnPfcSoNIzLDmXWI5caK9QM7YCV4R6LlwmczTPuj+CLj8h4206cZIhnTpmLVFGOvPWtb+q5FaXiliVDXQztBR9ZAN8KYutxVVv3U9n67yEvrPuhCBcoZiBtg= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910340320585.4846339733244; Fri, 12 Apr 2024 01:25:40 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCDo-0007Es-GM; Fri, 12 Apr 2024 04:25:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCDm-0007DO-SP; Fri, 12 Apr 2024 04:25:18 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCDg-0008DG-Bm; Fri, 12 Apr 2024 04:25:18 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NgLXS_1712910304) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:25:05 +0800 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910306; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=0o5z902MUZqe6fDYphamYfk/0or6RBtEHfsI6q67FOI=; b=jQTLn0t5URE3SmiS0vxoKnqVWIKsmsRcgG+dU5uceDTpa+atmrZVHd9wfA2SBY09QWFcbry/iC62t4MIARUAAIWKpQYo/HgdZbm/fZmAqC8nimoNf8H4bt9YLGZsLh2PMh/sw8H6i2LN2UmZMVLRFmza5B2NRasrDGCFY4N7dIs= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R101e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NgLXS_1712910304; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 23/65] target/riscv: Add integer min/max instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:53 +0800 Message-ID: <20240412073735.76413-24-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910340759100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 +++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 18 ++--- target/riscv/xtheadvector_helper.c | 67 +++++++++++++++++++ 3 files changed, 110 insertions(+), 8 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 8f2dec158b..f3e4ab0f1f 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1695,3 +1695,36 @@ DEF_HELPER_6(th_vmsgt_vx_b, void, ptr, ptr, tl, ptr,= env, i32) DEF_HELPER_6(th_vmsgt_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmsgt_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmsgt_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vminu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vminu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vminu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vminu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmin_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmin_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmin_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmin_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmax_vv_b, void, ptr, ptr, ptr, ptr, 
env, i32) +DEF_HELPER_6(th_vmax_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmax_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmax_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vminu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vminu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vminu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vminu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmin_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmin_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmin_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmin_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmaxu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmax_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmax_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmax_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmax_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 049d9da0a5..f19a771b61 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1519,20 +1519,22 @@ GEN_OPIVI_TRANS_TH(th_vmsle_vi, IMM_SX, th_vmsle_vx= , opivx_cmp_check_th) GEN_OPIVI_TRANS_TH(th_vmsgtu_vi, IMM_ZX, th_vmsgtu_vx, opivx_cmp_check_th) GEN_OPIVI_TRANS_TH(th_vmsgt_vi, IMM_SX, th_vmsgt_vx, opivx_cmp_check_th) =20 +/* Vector Integer Min/Max Instructions */ +GEN_OPIVV_GVEC_TRANS_TH(th_vminu_vv, umin) +GEN_OPIVV_GVEC_TRANS_TH(th_vmin_vv, smin) +GEN_OPIVV_GVEC_TRANS_TH(th_vmaxu_vv, umax) +GEN_OPIVV_GVEC_TRANS_TH(th_vmax_vv, smax) +GEN_OPIVX_TRANS_TH(th_vminu_vx, opivx_check_th) 
+GEN_OPIVX_TRANS_TH(th_vmin_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vmaxu_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vmax_vx, opivx_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vminu_vv) -TH_TRANS_STUB(th_vminu_vx) -TH_TRANS_STUB(th_vmin_vv) -TH_TRANS_STUB(th_vmin_vx) -TH_TRANS_STUB(th_vmaxu_vv) -TH_TRANS_STUB(th_vmaxu_vx) -TH_TRANS_STUB(th_vmax_vv) -TH_TRANS_STUB(th_vmax_vx) TH_TRANS_STUB(th_vmul_vv) TH_TRANS_STUB(th_vmul_vx) TH_TRANS_STUB(th_vmulh_vv) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 827650b325..da869e1069 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1542,3 +1542,70 @@ GEN_TH_CMP_VX(th_vmsgt_vx_b, int8_t, H1, TH_MSGT) GEN_TH_CMP_VX(th_vmsgt_vx_h, int16_t, H2, TH_MSGT) GEN_TH_CMP_VX(th_vmsgt_vx_w, int32_t, H4, TH_MSGT) GEN_TH_CMP_VX(th_vmsgt_vx_d, int64_t, H8, TH_MSGT) + +/* Vector Integer Min/Max Instructions */ +THCALL(TH_OPIVV2, th_vminu_vv_b, OP_UUU_B, H1, H1, H1, TH_MIN) +THCALL(TH_OPIVV2, th_vminu_vv_h, OP_UUU_H, H2, H2, H2, TH_MIN) +THCALL(TH_OPIVV2, th_vminu_vv_w, OP_UUU_W, H4, H4, H4, TH_MIN) +THCALL(TH_OPIVV2, th_vminu_vv_d, OP_UUU_D, H8, H8, H8, TH_MIN) +THCALL(TH_OPIVV2, th_vmin_vv_b, OP_SSS_B, H1, H1, H1, TH_MIN) +THCALL(TH_OPIVV2, th_vmin_vv_h, OP_SSS_H, H2, H2, H2, TH_MIN) +THCALL(TH_OPIVV2, th_vmin_vv_w, OP_SSS_W, H4, H4, H4, TH_MIN) +THCALL(TH_OPIVV2, th_vmin_vv_d, OP_SSS_D, H8, H8, H8, TH_MIN) +THCALL(TH_OPIVV2, th_vmaxu_vv_b, OP_UUU_B, H1, H1, H1, TH_MAX) +THCALL(TH_OPIVV2, th_vmaxu_vv_h, OP_UUU_H, H2, H2, H2, TH_MAX) +THCALL(TH_OPIVV2, th_vmaxu_vv_w, OP_UUU_W, H4, H4, H4, TH_MAX) +THCALL(TH_OPIVV2, th_vmaxu_vv_d, OP_UUU_D, H8, H8, H8, TH_MAX) +THCALL(TH_OPIVV2, th_vmax_vv_b, OP_SSS_B, H1, H1, H1, TH_MAX) +THCALL(TH_OPIVV2, th_vmax_vv_h, OP_SSS_H, H2, H2, H2, TH_MAX) +THCALL(TH_OPIVV2, th_vmax_vv_w, OP_SSS_W, H4, H4, H4, 
TH_MAX) +THCALL(TH_OPIVV2, th_vmax_vv_d, OP_SSS_D, H8, H8, H8, TH_MAX) +GEN_TH_VV(th_vminu_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vminu_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vminu_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vminu_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmin_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmin_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmin_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmin_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmaxu_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmaxu_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmaxu_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmaxu_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmax_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmax_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmax_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmax_vv_d, 8, 8, clearq_th) + +THCALL(TH_OPIVX2, th_vminu_vx_b, OP_UUU_B, H1, H1, TH_MIN) +THCALL(TH_OPIVX2, th_vminu_vx_h, OP_UUU_H, H2, H2, TH_MIN) +THCALL(TH_OPIVX2, th_vminu_vx_w, OP_UUU_W, H4, H4, TH_MIN) +THCALL(TH_OPIVX2, th_vminu_vx_d, OP_UUU_D, H8, H8, TH_MIN) +THCALL(TH_OPIVX2, th_vmin_vx_b, OP_SSS_B, H1, H1, TH_MIN) +THCALL(TH_OPIVX2, th_vmin_vx_h, OP_SSS_H, H2, H2, TH_MIN) +THCALL(TH_OPIVX2, th_vmin_vx_w, OP_SSS_W, H4, H4, TH_MIN) +THCALL(TH_OPIVX2, th_vmin_vx_d, OP_SSS_D, H8, H8, TH_MIN) +THCALL(TH_OPIVX2, th_vmaxu_vx_b, OP_UUU_B, H1, H1, TH_MAX) +THCALL(TH_OPIVX2, th_vmaxu_vx_h, OP_UUU_H, H2, H2, TH_MAX) +THCALL(TH_OPIVX2, th_vmaxu_vx_w, OP_UUU_W, H4, H4, TH_MAX) +THCALL(TH_OPIVX2, th_vmaxu_vx_d, OP_UUU_D, H8, H8, TH_MAX) +THCALL(TH_OPIVX2, th_vmax_vx_b, OP_SSS_B, H1, H1, TH_MAX) +THCALL(TH_OPIVX2, th_vmax_vx_h, OP_SSS_H, H2, H2, TH_MAX) +THCALL(TH_OPIVX2, th_vmax_vx_w, OP_SSS_W, H4, H4, TH_MAX) +THCALL(TH_OPIVX2, th_vmax_vx_d, OP_SSS_D, H8, H8, TH_MAX) +GEN_TH_VX(th_vminu_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vminu_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vminu_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vminu_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmin_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmin_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmin_vx_w, 4, 4, 
clearl_th) +GEN_TH_VX(th_vmin_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmaxu_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmaxu_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmaxu_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmaxu_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmax_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmax_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmax_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmax_vx_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712910452; cv=none; d=zohomail.com; s=zohoarc; b=JoFk/c+J1Y5yxNq4v0Cf3fqdO/l57tBEYD3GTPD5LDKRT3czTEFCvktzKfGWICmIRdwtwbOxXNWN9vLqOMe4xAfqhdzsNMcA3/zQf5tkbiGjiSwCi72tN3tba5WNxDA6e+GTCF8epCbzbqWL6RsGsFs0rlh7QiaQ2+ov4jDV5ys= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910452; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=B+SstW1PWU92CRSM3sOGN34gzAi70P6R3/YF/ZZWmHI=; b=KsWp0cPDoo+uNDJ+PpnHNIbjERVh4Dp7hio5RikYE4EKm+Bwly14l3W/Z4Q/E199wjedAL++Iw9irFHrmScl+r68JNsI9/8vWlAtybyrVh/fXHtpulJE9KSPa0DlBB80FMdEkr3s/m8rmcUUajR+TjCnURq3lYDAEuo8ekZBjCY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910452952464.4564934069143; Fri, 12 Apr 2024 01:27:32 -0700 (PDT) Received: from 
localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCFo-0008EE-If; Fri, 12 Apr 2024 04:27:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCFk-0008Cu-ND; Fri, 12 Apr 2024 04:27:21 -0400 Received: from out30-100.freemail.mail.aliyun.com ([115.124.30.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCFg-000092-D5; Fri, 12 Apr 2024 04:27:19 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NfFtd_1712910426) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:27:07 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910428; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=B+SstW1PWU92CRSM3sOGN34gzAi70P6R3/YF/ZZWmHI=; b=h70SayjaAGtC55PFCxZMsLJqdTkvNN76B4ZZRxGfavt+p99TFxIsx2afpaLYx1qbws7dgAXs/ccbsNcizaO+QT6bNr1VZJJUscKsj6oygfwy3ZURfeS+J9ZsN1svjl+Vtt2rVlxMfBw36DzfPpz1ugQQDhQZt9ipBMgI2DVRFV8= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R151e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NfFtd_1712910426; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 24/65] target/riscv: Add single-width integer multiply instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:54 +0800 Message-ID: <20240412073735.76413-25-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.100; envelope-from=eric.huang@linux.alibaba.com; helo=out30-100.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910455196100003 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
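For readers unfamiliar with the RVV 1.0 behaviour referred to above, the
high-half multiplies can be sketched for 8-bit elements as follows. This is
only an illustration under stated assumptions: the sketch_* names are
hypothetical and not part of this patch, and the bodies mirror the
do_mulh_b/do_mulhu_b/do_mulhsu_b helpers that this patch exports from
vector_helper.c.

```c
#include <assert.h>   /* only needed for the assert() checks in the note below */
#include <stdint.h>

/*
 * Illustrative only: high 8 bits of an 8x8 -> 16-bit product.
 * Assumes '>>' on a negative int is an arithmetic shift, which the
 * existing QEMU helpers also rely on.
 */

/* signed x signed (vmulh) */
int8_t sketch_mulh_b(int8_t s2, int8_t s1)
{
    return (int16_t)s2 * (int16_t)s1 >> 8;
}

/* unsigned x unsigned (vmulhu) */
uint8_t sketch_mulhu_b(uint8_t s2, uint8_t s1)
{
    return (uint16_t)s2 * (uint16_t)s1 >> 8;
}

/* signed vs2 x unsigned vs1 (vmulhsu) */
int8_t sketch_mulhsu_b(int8_t s2, uint8_t s1)
{
    return (int16_t)s2 * (uint16_t)s1 >> 8;
}
```

For example, assert(sketch_mulh_b(-128, -128) == 64) holds because
16384 >> 8 is 64, and assert(sketch_mulhsu_b(-1, 255) == -1) holds because
-255 >> 8 is -1 under an arithmetic shift.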
Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 +++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 18 ++--- target/riscv/vector_helper.c | 28 ++++---- target/riscv/vector_internals.h | 17 +++++ target/riscv/xtheadvector_helper.c | 69 +++++++++++++++++++ 5 files changed, 141 insertions(+), 24 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index f3e4ab0f1f..e678dd5385 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1728,3 +1728,36 @@ DEF_HELPER_6(th_vmax_vx_b, void, ptr, ptr, tl, ptr, = env, i32) DEF_HELPER_6(th_vmax_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmax_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmax_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vmul_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmul_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmul_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmul_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmul_vx_w, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vmul_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulh_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmulhsu_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index f19a771b61..15f36ba98a 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1529,20 +1529,22 @@ GEN_OPIVX_TRANS_TH(th_vmin_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vmaxu_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vmax_vx, opivx_check_th) =20 +/* Vector Single-Width Integer Multiply Instructions */ +GEN_OPIVV_GVEC_TRANS_TH(th_vmul_vv, mul) +GEN_OPIVV_TRANS_TH(th_vmulh_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vmulhu_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vmulhsu_vv, opivv_check_th) +GEN_OPIVX_GVEC_TRANS_TH(th_vmul_vx, muls) +GEN_OPIVX_TRANS_TH(th_vmulh_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vmulhu_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vmulhsu_vx, opivx_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmul_vv) -TH_TRANS_STUB(th_vmul_vx) -TH_TRANS_STUB(th_vmulh_vv) -TH_TRANS_STUB(th_vmulh_vx) -TH_TRANS_STUB(th_vmulhu_vv) 
-TH_TRANS_STUB(th_vmulhu_vx) -TH_TRANS_STUB(th_vmulhsu_vv) -TH_TRANS_STUB(th_vmulhsu_vx) TH_TRANS_STUB(th_vdivu_vv) TH_TRANS_STUB(th_vdivu_vx) TH_TRANS_STUB(th_vdiv_vv) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 9774fc62c3..5aba3f238f 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -647,10 +647,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) */ =20 /* (TD, T1, T2, TX1, TX2) */ -#define OP_SUS_B int8_t, uint8_t, int8_t, uint8_t, int8_t -#define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t -#define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t -#define OP_SUS_D int64_t, uint64_t, int64_t, uint64_t, int64_t #define WOP_SUS_B int16_t, uint8_t, int8_t, uint16_t, int16_t #define WOP_SUS_H int32_t, uint16_t, int16_t, uint32_t, int32_t #define WOP_SUS_W int64_t, uint32_t, int32_t, uint64_t, int64_t @@ -1399,22 +1395,22 @@ GEN_VEXT_VV(vmul_vv_h, 2) GEN_VEXT_VV(vmul_vv_w, 4) GEN_VEXT_VV(vmul_vv_d, 8) =20 -static int8_t do_mulh_b(int8_t s2, int8_t s1) +int8_t do_mulh_b(int8_t s2, int8_t s1) { return (int16_t)s2 * (int16_t)s1 >> 8; } =20 -static int16_t do_mulh_h(int16_t s2, int16_t s1) +int16_t do_mulh_h(int16_t s2, int16_t s1) { return (int32_t)s2 * (int32_t)s1 >> 16; } =20 -static int32_t do_mulh_w(int32_t s2, int32_t s1) +int32_t do_mulh_w(int32_t s2, int32_t s1) { return (int64_t)s2 * (int64_t)s1 >> 32; } =20 -static int64_t do_mulh_d(int64_t s2, int64_t s1) +int64_t do_mulh_d(int64_t s2, int64_t s1) { uint64_t hi_64, lo_64; =20 @@ -1422,22 +1418,22 @@ static int64_t do_mulh_d(int64_t s2, int64_t s1) return hi_64; } =20 -static uint8_t do_mulhu_b(uint8_t s2, uint8_t s1) +uint8_t do_mulhu_b(uint8_t s2, uint8_t s1) { return (uint16_t)s2 * (uint16_t)s1 >> 8; } =20 -static uint16_t do_mulhu_h(uint16_t s2, uint16_t s1) +uint16_t do_mulhu_h(uint16_t s2, uint16_t s1) { return (uint32_t)s2 * (uint32_t)s1 >> 16; } =20 -static uint32_t do_mulhu_w(uint32_t s2, uint32_t s1) +uint32_t 
do_mulhu_w(uint32_t s2, uint32_t s1) { return (uint64_t)s2 * (uint64_t)s1 >> 32; } =20 -static uint64_t do_mulhu_d(uint64_t s2, uint64_t s1) +uint64_t do_mulhu_d(uint64_t s2, uint64_t s1) { uint64_t hi_64, lo_64; =20 @@ -1445,17 +1441,17 @@ static uint64_t do_mulhu_d(uint64_t s2, uint64_t s1) return hi_64; } =20 -static int8_t do_mulhsu_b(int8_t s2, uint8_t s1) +int8_t do_mulhsu_b(int8_t s2, uint8_t s1) { return (int16_t)s2 * (uint16_t)s1 >> 8; } =20 -static int16_t do_mulhsu_h(int16_t s2, uint16_t s1) +int16_t do_mulhsu_h(int16_t s2, uint16_t s1) { return (int32_t)s2 * (uint32_t)s1 >> 16; } =20 -static int32_t do_mulhsu_w(int32_t s2, uint32_t s1) +int32_t do_mulhsu_w(int32_t s2, uint32_t s1) { return (int64_t)s2 * (uint64_t)s1 >> 32; } @@ -1479,7 +1475,7 @@ static int32_t do_mulhsu_w(int32_t s2, uint32_t s1) * HI_P -=3D (A < 0 ? B : 0) */ =20 -static int64_t do_mulhsu_d(int64_t s2, uint64_t s1) +int64_t do_mulhsu_d(int64_t s2, uint64_t s1) { uint64_t hi_64, lo_64; =20 diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 24e64c37d4..4cbd7f972a 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -142,6 +142,10 @@ void vext_set_elems_1s(void *base, uint32_t is_agnosti= c, uint32_t cnt, #define OP_SSS_H int16_t, int16_t, int16_t, int16_t, int16_t #define OP_SSS_W int32_t, int32_t, int32_t, int32_t, int32_t #define OP_SSS_D int64_t, int64_t, int64_t, int64_t, int64_t +#define OP_SUS_B int8_t, uint8_t, int8_t, uint8_t, int8_t +#define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t +#define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t +#define OP_SUS_D int64_t, uint64_t, int64_t, uint64_t, int64_t =20 #define OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ static void do_##NAME(void *vd, void *vs2, int i) \ @@ -261,4 +265,17 @@ void probe_pages(CPURISCVState *env, target_ulong addr, target_ulong len, uintptr_t ra, MMUAccessType access_type); =20 +int8_t do_mulh_b(int8_t s2, int8_t s1); 
+int16_t do_mulh_h(int16_t s2, int16_t s1); +int32_t do_mulh_w(int32_t s2, int32_t s1); +int64_t do_mulh_d(int64_t s2, int64_t s1); +uint8_t do_mulhu_b(uint8_t s2, uint8_t s1); +uint16_t do_mulhu_h(uint16_t s2, uint16_t s1); +uint32_t do_mulhu_w(uint32_t s2, uint32_t s1); +uint64_t do_mulhu_d(uint64_t s2, uint64_t s1); +int8_t do_mulhsu_b(int8_t s2, uint8_t s1); +int16_t do_mulhsu_h(int16_t s2, uint16_t s1); +int32_t do_mulhsu_w(int32_t s2, uint32_t s1); +int64_t do_mulhsu_d(int64_t s2, uint64_t s1); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index da869e1069..9d8129750c 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1609,3 +1609,72 @@ GEN_TH_VX(th_vmax_vx_b, 1, 1, clearb_th) GEN_TH_VX(th_vmax_vx_h, 2, 2, clearh_th) GEN_TH_VX(th_vmax_vx_w, 4, 4, clearl_th) GEN_TH_VX(th_vmax_vx_d, 8, 8, clearq_th) + +/* Vector Single-Width Integer Multiply Instructions */ +#define TH_MUL(N, M) (N * M) +THCALL(TH_OPIVV2, th_vmul_vv_b, OP_SSS_B, H1, H1, H1, TH_MUL) +THCALL(TH_OPIVV2, th_vmul_vv_h, OP_SSS_H, H2, H2, H2, TH_MUL) +THCALL(TH_OPIVV2, th_vmul_vv_w, OP_SSS_W, H4, H4, H4, TH_MUL) +THCALL(TH_OPIVV2, th_vmul_vv_d, OP_SSS_D, H8, H8, H8, TH_MUL) +GEN_TH_VV(th_vmul_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmul_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmul_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmul_vv_d, 8, 8, clearq_th) + +THCALL(TH_OPIVV2, th_vmulh_vv_b, OP_SSS_B, H1, H1, H1, do_mulh_b) +THCALL(TH_OPIVV2, th_vmulh_vv_h, OP_SSS_H, H2, H2, H2, do_mulh_h) +THCALL(TH_OPIVV2, th_vmulh_vv_w, OP_SSS_W, H4, H4, H4, do_mulh_w) +THCALL(TH_OPIVV2, th_vmulh_vv_d, OP_SSS_D, H8, H8, H8, do_mulh_d) +THCALL(TH_OPIVV2, th_vmulhu_vv_b, OP_UUU_B, H1, H1, H1, do_mulhu_b) +THCALL(TH_OPIVV2, th_vmulhu_vv_h, OP_UUU_H, H2, H2, H2, do_mulhu_h) +THCALL(TH_OPIVV2, th_vmulhu_vv_w, OP_UUU_W, H4, H4, H4, do_mulhu_w) +THCALL(TH_OPIVV2, th_vmulhu_vv_d, OP_UUU_D, H8, H8, H8, do_mulhu_d) 
+THCALL(TH_OPIVV2, th_vmulhsu_vv_b, OP_SUS_B, H1, H1, H1, do_mulhsu_b) +THCALL(TH_OPIVV2, th_vmulhsu_vv_h, OP_SUS_H, H2, H2, H2, do_mulhsu_h) +THCALL(TH_OPIVV2, th_vmulhsu_vv_w, OP_SUS_W, H4, H4, H4, do_mulhsu_w) +THCALL(TH_OPIVV2, th_vmulhsu_vv_d, OP_SUS_D, H8, H8, H8, do_mulhsu_d) +GEN_TH_VV(th_vmulh_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmulh_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmulh_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmulh_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmulhu_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmulhu_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmulhu_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmulhu_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmulhsu_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmulhsu_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmulhsu_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmulhsu_vv_d, 8, 8, clearq_th) + +THCALL(TH_OPIVX2, th_vmul_vx_b, OP_SSS_B, H1, H1, TH_MUL) +THCALL(TH_OPIVX2, th_vmul_vx_h, OP_SSS_H, H2, H2, TH_MUL) +THCALL(TH_OPIVX2, th_vmul_vx_w, OP_SSS_W, H4, H4, TH_MUL) +THCALL(TH_OPIVX2, th_vmul_vx_d, OP_SSS_D, H8, H8, TH_MUL) +THCALL(TH_OPIVX2, th_vmulh_vx_b, OP_SSS_B, H1, H1, do_mulh_b) +THCALL(TH_OPIVX2, th_vmulh_vx_h, OP_SSS_H, H2, H2, do_mulh_h) +THCALL(TH_OPIVX2, th_vmulh_vx_w, OP_SSS_W, H4, H4, do_mulh_w) +THCALL(TH_OPIVX2, th_vmulh_vx_d, OP_SSS_D, H8, H8, do_mulh_d) +THCALL(TH_OPIVX2, th_vmulhu_vx_b, OP_UUU_B, H1, H1, do_mulhu_b) +THCALL(TH_OPIVX2, th_vmulhu_vx_h, OP_UUU_H, H2, H2, do_mulhu_h) +THCALL(TH_OPIVX2, th_vmulhu_vx_w, OP_UUU_W, H4, H4, do_mulhu_w) +THCALL(TH_OPIVX2, th_vmulhu_vx_d, OP_UUU_D, H8, H8, do_mulhu_d) +THCALL(TH_OPIVX2, th_vmulhsu_vx_b, OP_SUS_B, H1, H1, do_mulhsu_b) +THCALL(TH_OPIVX2, th_vmulhsu_vx_h, OP_SUS_H, H2, H2, do_mulhsu_h) +THCALL(TH_OPIVX2, th_vmulhsu_vx_w, OP_SUS_W, H4, H4, do_mulhsu_w) +THCALL(TH_OPIVX2, th_vmulhsu_vx_d, OP_SUS_D, H8, H8, do_mulhsu_d) +GEN_TH_VX(th_vmul_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmul_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmul_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmul_vx_d, 8, 8, clearq_th) 
+GEN_TH_VX(th_vmulh_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmulh_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmulh_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmulh_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmulhu_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmulhu_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmulhu_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmulhu_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmulhsu_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmulhsu_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmulhsu_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmulhsu_vx_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712910629; cv=none; d=zohomail.com; s=zohoarc; b=lkm3toYs55W43/KtJBrHVNVXQ2JrZZNCoYVmwXmDZs7h+7Ix45mzHnibQUJR+9DIN61BbWuj+XBubwTtKL9cqPXd/AIINZW4H356Ap/snUumfPZBZupnFDtVq7+uMoDOFzvBgJoQ2RA3WfupRCb5yuGEIuVr9aA9QT6D3Y6DKdw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910629; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=R2VLeOTOYRd4LZ+wxfRIz5wt9B2uW4e9zZUeCg2zcXI=; b=kq39IYsqfJ4rbI2W8cbhb3lLHbaU58qGiD3/LD/sq2Dyb8wI2hCmas6w6jkXUq8zauP/NwiiS3J4Y6ZZM+0LExmWR2Gxf9m6dbKOY9lJsy5cCmxFM+e8klOltrohlKMP5uFvITWyhfNw4RWa5Sp0PQGPWeQsu+/cjx8OU4kK9yE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910629379896.3009668849472; Fri, 12 Apr 2024 01:30:29 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCIF-0000hV-Jn; Fri, 12 Apr 2024 04:29:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCHe-0000dm-Dm; Fri, 12 Apr 2024 04:29:20 -0400 Received: from out30-110.freemail.mail.aliyun.com ([115.124.30.110]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCHb-0000RD-RE; Fri, 12 Apr 2024 04:29:18 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NarA3_1712910548) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:29:09 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910550; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=R2VLeOTOYRd4LZ+wxfRIz5wt9B2uW4e9zZUeCg2zcXI=; b=MY+aTmpKunYmnnLHQnNitpMU1GZmo301jrdFJMRKjil9bvKyJi0jSGy96180vKuRa4XFmEVAnQybgLwX256HxAT3pdJhA8zm9UDv/LMLp7TM5goZQ0ksE7gJ1qg0Niqbjc0N2BmyYP0NbnBCIqAl+nixULNLCooVFjUEsWk2eKg= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R381e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NarA3_1712910548; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 25/65] target/riscv: Add integer divide instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:55 +0800 Message-ID: <20240412073735.76413-26-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> 
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.110; envelope-from=eric.huang@linux.alibaba.com; helo=out30-110.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910631496100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
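As a reminder of the RVV 1.0 divide semantics that these helpers follow,
here is a minimal sketch for 8-bit elements. It is an illustration only:
the sketch_* names are hypothetical, and the overflow case is written as
an explicit INT8_MIN check rather than the (N == -N) test used by the
TH_DIV/TH_REM macros in this patch. Neither division by zero nor signed
overflow traps.

```c
#include <assert.h>   /* only needed for the assert() checks in the note below */
#include <stdint.h>

/* Signed divide: x / 0 gives -1 (all ones); INT8_MIN / -1 gives INT8_MIN. */
int8_t sketch_div_b(int8_t n, int8_t m)
{
    if (m == 0) {
        return -1;
    }
    if (n == INT8_MIN && m == -1) {
        return n;
    }
    return (int8_t)(n / m);
}

/* Signed remainder: x % 0 gives x; INT8_MIN % -1 gives 0. */
int8_t sketch_rem_b(int8_t n, int8_t m)
{
    if (m == 0) {
        return n;
    }
    if (n == INT8_MIN && m == -1) {
        return 0;
    }
    return (int8_t)(n % m);
}

/* Unsigned divide: x / 0 gives all ones. */
uint8_t sketch_divu_b(uint8_t n, uint8_t m)
{
    return m == 0 ? UINT8_MAX : (uint8_t)(n / m);
}

/* Unsigned remainder: x % 0 gives x. */
uint8_t sketch_remu_b(uint8_t n, uint8_t m)
{
    return m == 0 ? n : (uint8_t)(n % m);
}
```

For example, assert(sketch_div_b(7, 0) == -1) and
assert(sketch_rem_b(INT8_MIN, -1) == 0) both hold.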
Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 +++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 18 +++-- target/riscv/xtheadvector_helper.c | 74 +++++++++++++++++++ 3 files changed, 117 insertions(+), 8 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index e678dd5385..3d5ad2847e 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1761,3 +1761,36 @@ DEF_HELPER_6(th_vmulhsu_vx_b, void, ptr, ptr, tl, pt= r, env, i32) DEF_HELPER_6(th_vmulhsu_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmulhsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vmulhsu_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vdivu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdiv_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdiv_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdiv_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdiv_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vremu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vremu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vremu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vremu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vrem_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vrem_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vrem_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vrem_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vdivu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vdiv_vx_b, void, ptr, 
ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vdiv_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vdiv_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vdiv_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vremu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vremu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vremu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vremu_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrem_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrem_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrem_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrem_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 15f36ba98a..a609b7faf3 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1539,20 +1539,22 @@ GEN_OPIVX_TRANS_TH(th_vmulh_vx, opivx_check_th)
 GEN_OPIVX_TRANS_TH(th_vmulhu_vx, opivx_check_th)
 GEN_OPIVX_TRANS_TH(th_vmulhsu_vx, opivx_check_th)
 
+/* Vector Integer Divide Instructions */
+GEN_OPIVV_TRANS_TH(th_vdivu_vv, opivv_check_th)
+GEN_OPIVV_TRANS_TH(th_vdiv_vv, opivv_check_th)
+GEN_OPIVV_TRANS_TH(th_vremu_vv, opivv_check_th)
+GEN_OPIVV_TRANS_TH(th_vrem_vv, opivv_check_th)
+GEN_OPIVX_TRANS_TH(th_vdivu_vx, opivx_check_th)
+GEN_OPIVX_TRANS_TH(th_vdiv_vx, opivx_check_th)
+GEN_OPIVX_TRANS_TH(th_vremu_vx, opivx_check_th)
+GEN_OPIVX_TRANS_TH(th_vrem_vx, opivx_check_th)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vdivu_vv)
-TH_TRANS_STUB(th_vdivu_vx)
-TH_TRANS_STUB(th_vdiv_vv)
-TH_TRANS_STUB(th_vdiv_vx)
-TH_TRANS_STUB(th_vremu_vv)
-TH_TRANS_STUB(th_vremu_vx)
-TH_TRANS_STUB(th_vrem_vv)
-TH_TRANS_STUB(th_vrem_vx)
 TH_TRANS_STUB(th_vwmulu_vv)
 TH_TRANS_STUB(th_vwmulu_vx)
 TH_TRANS_STUB(th_vwmulsu_vv)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 9d8129750c..4af66b047a 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -1678,3 +1678,77 @@ GEN_TH_VX(th_vmulhsu_vx_b, 1, 1, clearb_th)
 GEN_TH_VX(th_vmulhsu_vx_h, 2, 2, clearh_th)
 GEN_TH_VX(th_vmulhsu_vx_w, 4, 4, clearl_th)
 GEN_TH_VX(th_vmulhsu_vx_d, 8, 8, clearq_th)
+
+/* Vector Integer Divide Instructions */
+#define TH_DIVU(N, M) (unlikely(M == 0) ? (__typeof(N))(-1) : N / M)
+#define TH_REMU(N, M) (unlikely(M == 0) ? N : N % M)
+#define TH_DIV(N, M) (unlikely(M == 0) ? (__typeof(N))(-1) :\
+        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M)
+#define TH_REM(N, M) (unlikely(M == 0) ? N :\
+        unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M)
+
+THCALL(TH_OPIVV2, th_vdivu_vv_b, OP_UUU_B, H1, H1, H1, TH_DIVU)
+THCALL(TH_OPIVV2, th_vdivu_vv_h, OP_UUU_H, H2, H2, H2, TH_DIVU)
+THCALL(TH_OPIVV2, th_vdivu_vv_w, OP_UUU_W, H4, H4, H4, TH_DIVU)
+THCALL(TH_OPIVV2, th_vdivu_vv_d, OP_UUU_D, H8, H8, H8, TH_DIVU)
+THCALL(TH_OPIVV2, th_vdiv_vv_b, OP_SSS_B, H1, H1, H1, TH_DIV)
+THCALL(TH_OPIVV2, th_vdiv_vv_h, OP_SSS_H, H2, H2, H2, TH_DIV)
+THCALL(TH_OPIVV2, th_vdiv_vv_w, OP_SSS_W, H4, H4, H4, TH_DIV)
+THCALL(TH_OPIVV2, th_vdiv_vv_d, OP_SSS_D, H8, H8, H8, TH_DIV)
+THCALL(TH_OPIVV2, th_vremu_vv_b, OP_UUU_B, H1, H1, H1, TH_REMU)
+THCALL(TH_OPIVV2, th_vremu_vv_h, OP_UUU_H, H2, H2, H2, TH_REMU)
+THCALL(TH_OPIVV2, th_vremu_vv_w, OP_UUU_W, H4, H4, H4, TH_REMU)
+THCALL(TH_OPIVV2, th_vremu_vv_d, OP_UUU_D, H8, H8, H8, TH_REMU)
+THCALL(TH_OPIVV2, th_vrem_vv_b, OP_SSS_B, H1, H1, H1, TH_REM)
+THCALL(TH_OPIVV2, th_vrem_vv_h, OP_SSS_H, H2, H2, H2, TH_REM)
+THCALL(TH_OPIVV2, th_vrem_vv_w, OP_SSS_W, H4, H4, H4, TH_REM)
+THCALL(TH_OPIVV2, th_vrem_vv_d, OP_SSS_D, H8, H8, H8, TH_REM)
+GEN_TH_VV(th_vdivu_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vdivu_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vdivu_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vdivu_vv_d, 8, 8, clearq_th)
+GEN_TH_VV(th_vdiv_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vdiv_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vdiv_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vdiv_vv_d, 8, 8, clearq_th)
+GEN_TH_VV(th_vremu_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vremu_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vremu_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vremu_vv_d, 8, 8, clearq_th)
+GEN_TH_VV(th_vrem_vv_b, 1, 1, clearb_th)
+GEN_TH_VV(th_vrem_vv_h, 2, 2, clearh_th)
+GEN_TH_VV(th_vrem_vv_w, 4, 4, clearl_th)
+GEN_TH_VV(th_vrem_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2, th_vdivu_vx_b, OP_UUU_B, H1, H1, TH_DIVU)
+THCALL(TH_OPIVX2, th_vdivu_vx_h, OP_UUU_H, H2, H2, TH_DIVU)
+THCALL(TH_OPIVX2, th_vdivu_vx_w, OP_UUU_W, H4, H4, TH_DIVU)
+THCALL(TH_OPIVX2, th_vdivu_vx_d, OP_UUU_D, H8, H8, TH_DIVU)
+THCALL(TH_OPIVX2, th_vdiv_vx_b, OP_SSS_B, H1, H1, TH_DIV)
+THCALL(TH_OPIVX2, th_vdiv_vx_h, OP_SSS_H, H2, H2, TH_DIV)
+THCALL(TH_OPIVX2, th_vdiv_vx_w, OP_SSS_W, H4, H4, TH_DIV)
+THCALL(TH_OPIVX2, th_vdiv_vx_d, OP_SSS_D, H8, H8, TH_DIV)
+THCALL(TH_OPIVX2, th_vremu_vx_b, OP_UUU_B, H1, H1, TH_REMU)
+THCALL(TH_OPIVX2, th_vremu_vx_h, OP_UUU_H, H2, H2, TH_REMU)
+THCALL(TH_OPIVX2, th_vremu_vx_w, OP_UUU_W, H4, H4, TH_REMU)
+THCALL(TH_OPIVX2, th_vremu_vx_d, OP_UUU_D, H8, H8, TH_REMU)
+THCALL(TH_OPIVX2, th_vrem_vx_b, OP_SSS_B, H1, H1, TH_REM)
+THCALL(TH_OPIVX2, th_vrem_vx_h, OP_SSS_H, H2, H2, TH_REM)
+THCALL(TH_OPIVX2, th_vrem_vx_w, OP_SSS_W, H4, H4, TH_REM)
+THCALL(TH_OPIVX2, th_vrem_vx_d, OP_SSS_D, H8, H8, TH_REM)
+GEN_TH_VX(th_vdivu_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vdivu_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vdivu_vx_w, 4, 4, clearl_th)
+GEN_TH_VX(th_vdivu_vx_d, 8, 8, clearq_th)
+GEN_TH_VX(th_vdiv_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vdiv_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vdiv_vx_w, 4, 4, clearl_th)
+GEN_TH_VX(th_vdiv_vx_d, 8, 8, clearq_th)
+GEN_TH_VX(th_vremu_vx_b, 1, 1, clearb_th)
+GEN_TH_VX(th_vremu_vx_h, 2, 2, clearh_th)
+GEN_TH_VX(th_vremu_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vremu_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vrem_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vrem_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vrem_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vrem_vx_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712910711; cv=none; d=zohomail.com; s=zohoarc; b=nWPNpnWWXCndVXJuGXVpiYIY6I4rtKZeireXPJoUgKsCOxZRFdH0LkPBEWgXqp9hkbfT7dklvJEsjZMEl46frUu7u8UAKw1QsdTZjjaK8Senq301Aj1l0ZdITZKp23Fq/T41ujwFCtWOYL1w+mhB0DVXn4SrCXmYdiqDUUrzXeo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910711; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=Cet7U4bxzdatnKI+ukDBeStaMpJT25FiLg6Jf4qTISM=; b=d4Su8Gfyj6tmJQQ5ubNOnwsRR/UjTM456QV/wcPWMSUQrb41EmwVty71+uCDCvl5AktRvJzZZn+djk0xr1Ua/ibZte2jww5YA1+/f95jHhOY/rVXZt62lryNHokCnnEjyLzO9bfKDKZgZYj+z03534Wr+zp3QOKHQ4YUok8+998= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910711370786.4891251624872; Fri, 12 Apr 2024 01:31:51 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCJl-0001gO-JJ; Fri, 12 Apr 2024 
04:31:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCJj-0001ft-J7; Fri, 12 Apr 2024 04:31:27 -0400 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCJf-0000se-4W; Fri, 12 Apr 2024 04:31:27 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NbCCo_1712910670) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:31:10 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910672; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=Cet7U4bxzdatnKI+ukDBeStaMpJT25FiLg6Jf4qTISM=; b=saExLYIVHK4SPOfXdP76/8X44lyxVVWX0BvGhL+QGJgoK8xJlCLq6gXEMlqVm3khHWyQwInnrYj65im7X6CLOTvG4WTkdNX7EJGb4MwEiArnupOMHWwSp9zvwZ9v9aoiY9pWDU/KjQYuFiGpDbd2v2TltGYRsMMrsdB14tKx+JE= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R491e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NbCCo_1712910670; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 26/65] target/riscv: Add widening integer multiply instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:56 +0800 Message-ID: <20240412073735.76413-27-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.130; envelope-from=eric.huang@linux.alibaba.com; helo=out30-130.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910711866100001 Content-Type: text/plain; charset="utf-8" The widening integer multiply instructions behave the same as their RVV 1.0 counterparts. Only the general differences between XTheadVector and RVV 1.0 apply.
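For readers unfamiliar with the widening forms, the following is a minimal sketch of the element-wise arithmetic for SEW = 8, where each product is written to a 16-bit destination element. It is not the QEMU helper: the model_* names are hypothetical, and masking, vl/vstart handling and tail clearing are left out. The signedness of each operand follows the WOP_SSS_B, WOP_UUU_B and WOP_SUS_B type lists used by the helpers in this patch (vs2 signed and vs1 unsigned for th.vwmulsu).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar model for SEW = 8 (illustrative names, not QEMU's).
 * Masking, vl/vstart and tail clearing are deliberately omitted.
 */

/* th.vwmul.vv: signed vs2 * signed vs1, 16-bit result */
static void model_vwmul_b(int16_t *vd, const int8_t *vs2,
                          const int8_t *vs1, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        vd[i] = (int16_t)((int16_t)vs2[i] * (int16_t)vs1[i]);
    }
}

/* th.vwmulu.vv: unsigned vs2 * unsigned vs1, 16-bit result */
static void model_vwmulu_b(uint16_t *vd, const uint8_t *vs2,
                           const uint8_t *vs1, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        vd[i] = (uint16_t)((uint16_t)vs2[i] * (uint16_t)vs1[i]);
    }
}

/* th.vwmulsu.vv: signed vs2 * unsigned vs1, 16-bit signed result */
static void model_vwmulsu_b(int16_t *vd, const int8_t *vs2,
                            const uint8_t *vs1, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        vd[i] = (int16_t)((int16_t)vs2[i] * (uint16_t)vs1[i]);
    }
}

/* Usage check with hand-computed extreme values. */
static void model_widen_mul_selfcheck(void)
{
    int8_t s2[1] = { -128 }, s1[1] = { -128 };
    uint8_t u2[1] = { 255 }, u1[1] = { 255 };
    int16_t sd[1];
    uint16_t ud[1];

    model_vwmul_b(sd, s2, s1, 1);
    assert(sd[0] == 16384);      /* would not fit in 8 bits */
    model_vwmulu_b(ud, u2, u1, 1);
    assert(ud[0] == 65025);
    model_vwmulsu_b(sd, s2, u1, 1);
    assert(sd[0] == -32640);     /* -128 * 255 */
}
```

Every 8-bit by 8-bit product fits in 16 bits, so none of the three variants can overflow the destination element. The same holds for the 16-to-32 and 32-to-64 bit cases.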
Signed-off-by: Huang Tao --- target/riscv/helper.h | 19 +++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 14 ++++--- target/riscv/vector_helper.c | 3 -- target/riscv/vector_internals.h | 3 ++ target/riscv/xtheadvector_helper.c | 39 +++++++++++++++++++ 5 files changed, 69 insertions(+), 9 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 3d5ad2847e..93e6a3f33d 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1794,3 +1794,22 @@ DEF_HELPER_6(th_vrem_vx_b, void, ptr, ptr, tl, ptr, = env, i32) DEF_HELPER_6(th_vrem_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vrem_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vrem_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vwmul_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmul_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmul_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmul_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmulsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= 
v/insn_trans/trans_xtheadvector.c.inc index a609b7faf3..681e967078 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1549,18 +1549,20 @@ GEN_OPIVX_TRANS_TH(th_vdiv_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vremu_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vrem_vx, opivx_check_th) =20 +/* Vector Widening Integer Multiply Instructions */ +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmul_vv, opivv_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmulu_vv, opivv_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmulsu_vv, opivv_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmul_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmulu_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmulsu_vx, opivx_widen_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vwmulu_vv) -TH_TRANS_STUB(th_vwmulu_vx) -TH_TRANS_STUB(th_vwmulsu_vv) -TH_TRANS_STUB(th_vwmulsu_vx) -TH_TRANS_STUB(th_vwmul_vv) -TH_TRANS_STUB(th_vwmul_vx) TH_TRANS_STUB(th_vmacc_vv) TH_TRANS_STUB(th_vmacc_vx) TH_TRANS_STUB(th_vnmsac_vv) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 5aba3f238f..b312d67f87 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -647,9 +647,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) */ =20 /* (TD, T1, T2, TX1, TX2) */ -#define WOP_SUS_B int16_t, uint8_t, int8_t, uint16_t, int16_t -#define WOP_SUS_H int32_t, uint16_t, int16_t, uint32_t, int32_t -#define WOP_SUS_W int64_t, uint32_t, int32_t, uint64_t, int64_t #define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t #define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t #define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 4cbd7f972a..c3d9752e2e 100644 --- a/target/riscv/vector_internals.h +++ 
b/target/riscv/vector_internals.h @@ -249,6 +249,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, = \ #define WOP_WSSS_B int16_t, int8_t, int16_t, int16_t, int16_t #define WOP_WSSS_H int32_t, int16_t, int32_t, int32_t, int32_t #define WOP_WSSS_W int64_t, int32_t, int64_t, int64_t, int64_t +#define WOP_SUS_B int16_t, uint8_t, int8_t, uint16_t, int16_t +#define WOP_SUS_H int32_t, uint16_t, int16_t, uint32_t, int32_t +#define WOP_SUS_W int64_t, uint32_t, int32_t, uint64_t, int64_t =20 /* share functions */ static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 4af66b047a..b5b1e55452 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1752,3 +1752,42 @@ GEN_TH_VX(th_vrem_vx_b, 1, 1, clearb_th) GEN_TH_VX(th_vrem_vx_h, 2, 2, clearh_th) GEN_TH_VX(th_vrem_vx_w, 4, 4, clearl_th) GEN_TH_VX(th_vrem_vx_d, 8, 8, clearq_th) + +/* Vector Widening Integer Multiply Instructions */ +THCALL(TH_OPIVV2, th_vwmul_vv_b, WOP_SSS_B, H2, H1, H1, TH_MUL) +THCALL(TH_OPIVV2, th_vwmul_vv_h, WOP_SSS_H, H4, H2, H2, TH_MUL) +THCALL(TH_OPIVV2, th_vwmul_vv_w, WOP_SSS_W, H8, H4, H4, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulu_vv_b, WOP_UUU_B, H2, H1, H1, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulu_vv_h, WOP_UUU_H, H4, H2, H2, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulu_vv_w, WOP_UUU_W, H8, H4, H4, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulsu_vv_b, WOP_SUS_B, H2, H1, H1, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulsu_vv_h, WOP_SUS_H, H4, H2, H2, TH_MUL) +THCALL(TH_OPIVV2, th_vwmulsu_vv_w, WOP_SUS_W, H8, H4, H4, TH_MUL) +GEN_TH_VV(th_vwmul_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmul_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwmul_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwmulu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmulu_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwmulu_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwmulsu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmulsu_vv_h, 2, 4, clearl_th) 
+GEN_TH_VV(th_vwmulsu_vv_w, 4, 8, clearq_th) + +THCALL(TH_OPIVX2, th_vwmul_vx_b, WOP_SSS_B, H2, H1, TH_MUL) +THCALL(TH_OPIVX2, th_vwmul_vx_h, WOP_SSS_H, H4, H2, TH_MUL) +THCALL(TH_OPIVX2, th_vwmul_vx_w, WOP_SSS_W, H8, H4, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulu_vx_b, WOP_UUU_B, H2, H1, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulu_vx_h, WOP_UUU_H, H4, H2, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulu_vx_w, WOP_UUU_W, H8, H4, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulsu_vx_b, WOP_SUS_B, H2, H1, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulsu_vx_h, WOP_SUS_H, H4, H2, TH_MUL) +THCALL(TH_OPIVX2, th_vwmulsu_vx_w, WOP_SUS_W, H8, H4, TH_MUL) +GEN_TH_VX(th_vwmul_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmul_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmul_vx_w, 4, 8, clearq_th) +GEN_TH_VX(th_vwmulu_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmulu_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmulu_vx_w, 4, 8, clearq_th) +GEN_TH_VX(th_vwmulsu_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmulsu_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmulsu_vx_w, 4, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712910843; cv=none; d=zohomail.com; s=zohoarc; b=oIbUzXnV3/nqSS5w8Lwt58WcmgPfN7ivAK7S809c/X249ZuFAJI+YuZCJdfXgoRQS4hnuZBj4s0MCaNH46FFX9ucJyET3T34ps3XcmcHgTaI36R8FjfBBnU5nv2KhLQdYRUe4qfc713ZU12vS+GkIBiQp0q7z9f/0cyFVVEMya4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910843; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=1ju8qwYBWtviK5bZttluimjs+DUO08y/3B/TO2LfaVU=; 
b=nZJ+68mAHUM9gj5/PtW169n1ucPjn5tiD5XeYrppz5tv8He6DWEJe8r053cvA/GvzwL0Gp1C0Sy+xoh5XWtlBeeFHG07SBUtqI1SvBFud0yJC49FnUVyYFtEkZgU2lBXp8/XYJ4mvdp3Up1LZSbZHZaRemYnBMoBmoIc/TU7pog= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910843700107.70844853380811; Fri, 12 Apr 2024 01:34:03 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCLd-0002YV-Vb; Fri, 12 Apr 2024 04:33:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCLb-0002YB-CR; Fri, 12 Apr 2024 04:33:23 -0400 Received: from out30-124.freemail.mail.aliyun.com ([115.124.30.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCLY-0001Ox-9G; Fri, 12 Apr 2024 04:33:23 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nasex_1712910791) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:33:12 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910792; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=1ju8qwYBWtviK5bZttluimjs+DUO08y/3B/TO2LfaVU=; b=bqeUpNRrIGfNh7wiDJdsrlDm7NXtpwnNuGNyVnJEm9ioXSCJHg0TMH8Jg7vrXibe4DReiQkI7Eq6zFCz3crwqr36z082eTljrFBwPu5B4qUqF7Ww61dBQelUhC/Sr1TH/YC8WogJtKbtsrNVGwszDTN7FMGdhXiloV8neom7S2E= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R101e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045170; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nasex_1712910791; From: Huang Tao To: 
qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 27/65] target/riscv: Add single-width integer multiply-add instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:57 +0800 Message-ID: <20240412073735.76413-28-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.124; envelope-from=eric.huang@linux.alibaba.com; helo=out30-124.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910845777100003 Content-Type: text/plain; charset="utf-8" The single-width integer multiply-add instructions behave the same as their RVV 1.0 counterparts. Only the general differences between XTheadVector and RVV 1.0 apply.
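The four operations differ only in which operand is overwritten and in the sign of the product, which is easy to get wrong when reading the TH_MACC, TH_NMSAC, TH_MADD and TH_NMSUB macros in this patch. Below is a minimal scalar sketch of that operand order for SEW = 32. The model_* and *32 names are hypothetical and are not part of QEMU. The sketch does its arithmetic in unsigned types so that wraparound is well defined without relying on compiler flags.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Wrapping 32-bit arithmetic carried out in unsigned types to avoid
 * signed-overflow undefined behaviour. The cast back to int32_t assumes
 * two's-complement narrowing, which GCC and Clang provide.
 */
static int32_t mul32(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a * (uint32_t)b);
}

static int32_t add32(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

static int32_t neg32(int32_t a)
{
    return (int32_t)(0u - (uint32_t)a);
}

/* Arguments are (vs2, vs1, vd), matching OP(s2, s1, d) in the patch. */
static int32_t model_vmacc(int32_t vs2, int32_t vs1, int32_t vd)
{
    return add32(mul32(vs1, vs2), vd);          /* vd = vs1 * vs2 + vd */
}

static int32_t model_vnmsac(int32_t vs2, int32_t vs1, int32_t vd)
{
    return add32(neg32(mul32(vs1, vs2)), vd);   /* vd = -(vs1 * vs2) + vd */
}

static int32_t model_vmadd(int32_t vs2, int32_t vs1, int32_t vd)
{
    return add32(mul32(vs1, vd), vs2);          /* vd = vs1 * vd + vs2 */
}

static int32_t model_vnmsub(int32_t vs2, int32_t vs1, int32_t vd)
{
    return add32(neg32(mul32(vs1, vd)), vs2);   /* vd = -(vs1 * vd) + vs2 */
}

/* Usage check with hand-computed values for vs2 = 3, vs1 = 4, vd = 5. */
static void model_muladd_selfcheck(void)
{
    assert(model_vmacc(3, 4, 5) == 17);
    assert(model_vnmsac(3, 4, 5) == -7);
    assert(model_vmadd(3, 4, 5) == 23);
    assert(model_vnmsub(3, 4, 5) == -17);
}
```

In the macc/nmsac pair the destination acts as the accumulator, while in the madd/nmsub pair the destination is one of the multiplicands and vs2 is the addend.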
Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 +++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 18 ++-- target/riscv/xtheadvector_helper.c | 87 +++++++++++++++++++ 3 files changed, 130 insertions(+), 8 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 93e6a3f33d..a6abb48b55 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1813,3 +1813,36 @@ DEF_HELPER_6(th_vwmulu_vx_w, void, ptr, ptr, tl, ptr= , env, i32) DEF_HELPER_6(th_vwmulsu_vx_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vwmulsu_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vwmulsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmacc_vx_d, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vnmsac_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsac_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmadd_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vnmsub_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 681e967078..d84edd90ca 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1557,20 +1557,22 @@ GEN_OPIVX_WIDEN_TRANS_TH(th_vwmul_vx, opivx_widen_c= heck_th) GEN_OPIVX_WIDEN_TRANS_TH(th_vwmulu_vx, opivx_widen_check_th) GEN_OPIVX_WIDEN_TRANS_TH(th_vwmulsu_vx, opivx_widen_check_th) =20 +/* Vector Single-Width Integer Multiply-Add Instructions */ +GEN_OPIVV_TRANS_TH(th_vmacc_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vnmsac_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vmadd_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vnmsub_vv, opivv_check_th) +GEN_OPIVX_TRANS_TH(th_vmacc_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vnmsac_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vmadd_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vnmsub_vx, opivx_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmacc_vv) -TH_TRANS_STUB(th_vmacc_vx) -TH_TRANS_STUB(th_vnmsac_vv) -TH_TRANS_STUB(th_vnmsac_vx) -TH_TRANS_STUB(th_vmadd_vv) 
-TH_TRANS_STUB(th_vmadd_vx) -TH_TRANS_STUB(th_vnmsub_vv) -TH_TRANS_STUB(th_vnmsub_vx) TH_TRANS_STUB(th_vwmaccu_vv) TH_TRANS_STUB(th_vwmaccu_vx) TH_TRANS_STUB(th_vwmacc_vv) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index b5b1e55452..ccf6eb8a43 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1791,3 +1791,90 @@ GEN_TH_VX(th_vwmulu_vx_w, 4, 8, clearq_th) GEN_TH_VX(th_vwmulsu_vx_b, 1, 2, clearh_th) GEN_TH_VX(th_vwmulsu_vx_h, 2, 4, clearl_th) GEN_TH_VX(th_vwmulsu_vx_w, 4, 8, clearq_th) + +/* Vector Single-Width Integer Multiply-Add Instructions */ +#define TH_OPIVV3(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ +static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \ +{ \ + TX1 s1 =3D *((T1 *)vs1 + HS1(i)); \ + TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ + TD d =3D *((TD *)vd + HD(i)); \ + *((TD *)vd + HD(i)) =3D OP(s2, s1, d); \ +} +#define TH_MACC(N, M, D) (M * N + D) +#define TH_NMSAC(N, M, D) (-(M * N) + D) +#define TH_MADD(N, M, D) (M * D + N) +#define TH_NMSUB(N, M, D) (-(M * D) + N) +THCALL(TH_OPIVV3, th_vmacc_vv_b, OP_SSS_B, H1, H1, H1, TH_MACC) +THCALL(TH_OPIVV3, th_vmacc_vv_h, OP_SSS_H, H2, H2, H2, TH_MACC) +THCALL(TH_OPIVV3, th_vmacc_vv_w, OP_SSS_W, H4, H4, H4, TH_MACC) +THCALL(TH_OPIVV3, th_vmacc_vv_d, OP_SSS_D, H8, H8, H8, TH_MACC) +THCALL(TH_OPIVV3, th_vnmsac_vv_b, OP_SSS_B, H1, H1, H1, TH_NMSAC) +THCALL(TH_OPIVV3, th_vnmsac_vv_h, OP_SSS_H, H2, H2, H2, TH_NMSAC) +THCALL(TH_OPIVV3, th_vnmsac_vv_w, OP_SSS_W, H4, H4, H4, TH_NMSAC) +THCALL(TH_OPIVV3, th_vnmsac_vv_d, OP_SSS_D, H8, H8, H8, TH_NMSAC) +THCALL(TH_OPIVV3, th_vmadd_vv_b, OP_SSS_B, H1, H1, H1, TH_MADD) +THCALL(TH_OPIVV3, th_vmadd_vv_h, OP_SSS_H, H2, H2, H2, TH_MADD) +THCALL(TH_OPIVV3, th_vmadd_vv_w, OP_SSS_W, H4, H4, H4, TH_MADD) +THCALL(TH_OPIVV3, th_vmadd_vv_d, OP_SSS_D, H8, H8, H8, TH_MADD) +THCALL(TH_OPIVV3, th_vnmsub_vv_b, OP_SSS_B, H1, H1, H1, TH_NMSUB) +THCALL(TH_OPIVV3, th_vnmsub_vv_h, OP_SSS_H, H2, H2, 
H2, TH_NMSUB) +THCALL(TH_OPIVV3, th_vnmsub_vv_w, OP_SSS_W, H4, H4, H4, TH_NMSUB) +THCALL(TH_OPIVV3, th_vnmsub_vv_d, OP_SSS_D, H8, H8, H8, TH_NMSUB) +GEN_TH_VV(th_vmacc_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmacc_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmacc_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmacc_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vnmsac_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vnmsac_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vnmsac_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vnmsac_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vmadd_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vmadd_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vmadd_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vmadd_vv_d, 8, 8, clearq_th) +GEN_TH_VV(th_vnmsub_vv_b, 1, 1, clearb_th) +GEN_TH_VV(th_vnmsub_vv_h, 2, 2, clearh_th) +GEN_TH_VV(th_vnmsub_vv_w, 4, 4, clearl_th) +GEN_TH_VV(th_vnmsub_vv_d, 8, 8, clearq_th) + +#define TH_OPIVX3(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ +static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \ +{ \ + TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ + TD d =3D *((TD *)vd + HD(i)); \ + *((TD *)vd + HD(i)) =3D OP(s2, (TX1)(T1)s1, d); \ +} + +THCALL(TH_OPIVX3, th_vmacc_vx_b, OP_SSS_B, H1, H1, TH_MACC) +THCALL(TH_OPIVX3, th_vmacc_vx_h, OP_SSS_H, H2, H2, TH_MACC) +THCALL(TH_OPIVX3, th_vmacc_vx_w, OP_SSS_W, H4, H4, TH_MACC) +THCALL(TH_OPIVX3, th_vmacc_vx_d, OP_SSS_D, H8, H8, TH_MACC) +THCALL(TH_OPIVX3, th_vnmsac_vx_b, OP_SSS_B, H1, H1, TH_NMSAC) +THCALL(TH_OPIVX3, th_vnmsac_vx_h, OP_SSS_H, H2, H2, TH_NMSAC) +THCALL(TH_OPIVX3, th_vnmsac_vx_w, OP_SSS_W, H4, H4, TH_NMSAC) +THCALL(TH_OPIVX3, th_vnmsac_vx_d, OP_SSS_D, H8, H8, TH_NMSAC) +THCALL(TH_OPIVX3, th_vmadd_vx_b, OP_SSS_B, H1, H1, TH_MADD) +THCALL(TH_OPIVX3, th_vmadd_vx_h, OP_SSS_H, H2, H2, TH_MADD) +THCALL(TH_OPIVX3, th_vmadd_vx_w, OP_SSS_W, H4, H4, TH_MADD) +THCALL(TH_OPIVX3, th_vmadd_vx_d, OP_SSS_D, H8, H8, TH_MADD) +THCALL(TH_OPIVX3, th_vnmsub_vx_b, OP_SSS_B, H1, H1, TH_NMSUB) +THCALL(TH_OPIVX3, th_vnmsub_vx_h, OP_SSS_H, H2, H2, TH_NMSUB) +THCALL(TH_OPIVX3, 
th_vnmsub_vx_w, OP_SSS_W, H4, H4, TH_NMSUB) +THCALL(TH_OPIVX3, th_vnmsub_vx_d, OP_SSS_D, H8, H8, TH_NMSUB) +GEN_TH_VX(th_vmacc_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmacc_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmacc_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmacc_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vnmsac_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vnmsac_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vnmsac_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vnmsac_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vmadd_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vmadd_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vmadd_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vmadd_vx_d, 8, 8, clearq_th) +GEN_TH_VX(th_vnmsub_vx_b, 1, 1, clearb_th) +GEN_TH_VX(th_vnmsub_vx_h, 2, 2, clearh_th) +GEN_TH_VX(th_vnmsub_vx_w, 4, 4, clearl_th) +GEN_TH_VX(th_vnmsub_vx_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712910962; cv=none; d=zohomail.com; s=zohoarc; b=G1WXnbXvODB5YMqkums2tE2Ayr7D7CHpYgxH3d7Hb/NJwQi1w7NQpxnxu7IRyNXMcE9nlwALfAWWtXdBkZY2yNOZXnHE6ZFzdHw5qs+KZwPVOIWveHQvgROA5XR7BNanLsfzFJtIO+OcYP0GYU7chXc3ThJeIXzMCkS4Ee9cBlA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712910962; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=Agt9GBO/lBPUx+8bOSXPzQaLoEc1ZRP8fQYxpEsTH6c=; b=fPdhdyusXGw0uzd0ArJA9loiQotFkt6MT/XXYbj/td2q2H18OGzcpIDBCCJbp+tfm40sCNu5/OS04CanyfHNjXYr8xgTRG9HHi5t0lzWn+85CrJVTyAof2LL756swmK++dA4Y2d4gZTZG2n/xbULp0qnkbkQm4PnmYB9FTM0Bkk= ARC-Authentication-Results: i=1; 
mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712910962189572.1773810052865; Fri, 12 Apr 2024 01:36:02 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCNg-0003M1-NB; Fri, 12 Apr 2024 04:35:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCNa-0003LX-2O; Fri, 12 Apr 2024 04:35:26 -0400 Received: from out30-98.freemail.mail.aliyun.com ([115.124.30.98]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCNV-0001cS-TK; Fri, 12 Apr 2024 04:35:24 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NbDgh_1712910912) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:35:13 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712910914; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=Agt9GBO/lBPUx+8bOSXPzQaLoEc1ZRP8fQYxpEsTH6c=; b=PMh9ipcF7xj4A3dE8ke9Rc9tJe5iN/W08GfX+FAfbbACAZQu8aEMp/h5ErBZxMgyXickjbZWAR2ykez4PBnlB97uOVQgPrT7wDxuPeuLmcshFbFDJsqjTS3CtBWggo+pZjmP60VsfWwPBKiZZ+a1CexUkdTsfg/zSvdjpNOuMF8= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R151e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NbDgh_1712910912; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 
28/65] target/riscv: Add widening integer multiply-add instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:58 +0800 Message-ID: <20240412073735.76413-29-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.98; envelope-from=eric.huang@linux.alibaba.com; helo=out30-98.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712910964134100003 Content-Type: text/plain; charset="utf-8" The widening integer multiply-add instructions behave the same as their RVV 1.0 counterparts. Only the general differences between XTheadVector and RVV 1.0 apply.
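The mixed-signedness variants are the least obvious part of this group, so here is a minimal scalar sketch for SEW = 8 with a 16-bit accumulator. It follows the WOP_SSU_B type list for th.vwmaccsu (vs1 or rs1 signed, vs2 unsigned) and WOP_SUS_B for th.vwmaccus (rs1 unsigned, vs2 signed). The model_* names are hypothetical and not part of QEMU, and masking, vl/vstart and tail handling are omitted. The .vx model also shows the scalar being truncated to the source element width before it is widened, as the (TX1)(T1)s1 cast in TH_OPIVX3 does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical scalar models for SEW = 8 with a 16-bit accumulator.
 * Sums are formed in int and narrowed with wraparound. The narrowing
 * cast assumes two's-complement behaviour (true for GCC and Clang).
 */

/* th.vwmaccu: unsigned vs1 * unsigned vs2 + vd */
static uint16_t model_vwmaccu_b(uint16_t vd, uint8_t vs1, uint8_t vs2)
{
    return (uint16_t)(vs1 * vs2 + vd);
}

/* th.vwmacc: signed vs1 * signed vs2 + vd */
static int16_t model_vwmacc_b(int16_t vd, int8_t vs1, int8_t vs2)
{
    return (int16_t)(uint16_t)(vs1 * vs2 + vd);
}

/* th.vwmaccsu: signed vs1 * unsigned vs2 + vd */
static int16_t model_vwmaccsu_b(int16_t vd, int8_t vs1, uint8_t vs2)
{
    return (int16_t)(uint16_t)(vs1 * vs2 + vd);
}

/* th.vwmaccus.vx: unsigned scalar * signed vs2 + vd (scalar form only) */
static int16_t model_vwmaccus_vx_b(int16_t vd, long rs1, int8_t vs2)
{
    uint8_t s1 = (uint8_t)rs1;   /* keep only the low SEW bits of rs1 */
    return (int16_t)(uint16_t)(s1 * vs2 + vd);
}

/* Usage check with hand-computed values. */
static void model_widen_macc_selfcheck(void)
{
    assert(model_vwmaccu_b(1, 255, 255) == 65026);
    assert(model_vwmacc_b(-1, -128, 127) == -16257);
    assert(model_vwmaccsu_b(10, -2, 200) == -390);
    /* 0x1C8 truncates to 0xC8 = 200 before the multiply */
    assert(model_vwmaccus_vx_b(10, 0x1C8, -2) == -390);
}
```

Unlike the plain widening multiply, the accumulate step can wrap in the 2*SEW-wide destination, which is why the sketch narrows through an unsigned type.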
Signed-off-by: Huang Tao --- target/riscv/helper.h | 22 +++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 16 ++++--- target/riscv/vector_helper.c | 3 -- target/riscv/vector_internals.h | 3 ++ target/riscv/xtheadvector_helper.c | 45 +++++++++++++++++++ 5 files changed, 79 insertions(+), 10 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index a6abb48b55..8b8dd62761 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1846,3 +1846,25 @@ DEF_HELPER_6(th_vnmsub_vx_b, void, ptr, ptr, tl, ptr= , env, i32) DEF_HELPER_6(th_vnmsub_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vnmsub_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vnmsub_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vwmaccu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vwmaccu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmacc_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccus_vx_b, void, ptr, ptr, tl, ptr, 
env, i32) +DEF_HELPER_6(th_vwmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vwmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index d84edd90ca..bfa3a26f78 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1567,19 +1567,21 @@ GEN_OPIVX_TRANS_TH(th_vnmsac_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vmadd_vx, opivx_check_th) GEN_OPIVX_TRANS_TH(th_vnmsub_vx, opivx_check_th) =20 +/* Vector Widening Integer Multiply-Add Instructions */ +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmaccu_vv, opivx_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmacc_vv, opivx_widen_check_th) +GEN_OPIVV_WIDEN_TRANS_TH(th_vwmaccsu_vv, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmaccu_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmacc_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmaccsu_vx, opivx_widen_check_th) +GEN_OPIVX_WIDEN_TRANS_TH(th_vwmaccus_vx, opivx_widen_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vwmaccu_vv) -TH_TRANS_STUB(th_vwmaccu_vx) -TH_TRANS_STUB(th_vwmacc_vv) -TH_TRANS_STUB(th_vwmacc_vx) -TH_TRANS_STUB(th_vwmaccsu_vv) -TH_TRANS_STUB(th_vwmaccsu_vx) -TH_TRANS_STUB(th_vwmaccus_vx) TH_TRANS_STUB(th_vmv_v_v) TH_TRANS_STUB(th_vmv_v_x) TH_TRANS_STUB(th_vmv_v_i) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index b312d67f87..06ca77691d 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -647,9 +647,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) */ =20 /* (TD, T1, T2, TX1, TX2) */ -#define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t -#define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t -#define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t #define NOP_SSS_B int8_t, 
int8_t, int16_t, int8_t, int16_t #define NOP_SSS_H int16_t, int16_t, int32_t, int16_t, int32_t #define NOP_SSS_W int32_t, int32_t, int64_t, int32_t, int64_t diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index c3d9752e2e..e99caa8e2d 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -252,6 +252,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, = \ #define WOP_SUS_B int16_t, uint8_t, int8_t, uint16_t, int16_t #define WOP_SUS_H int32_t, uint16_t, int16_t, uint32_t, int32_t #define WOP_SUS_W int64_t, uint32_t, int32_t, uint64_t, int64_t +#define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t +#define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t +#define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t =20 /* share functions */ static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index ccf6eb8a43..19aad626c9 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1878,3 +1878,48 @@ GEN_TH_VX(th_vnmsub_vx_b, 1, 1, clearb_th) GEN_TH_VX(th_vnmsub_vx_h, 2, 2, clearh_th) GEN_TH_VX(th_vnmsub_vx_w, 4, 4, clearl_th) GEN_TH_VX(th_vnmsub_vx_d, 8, 8, clearq_th) + +/* Vector Widening Integer Multiply-Add Instructions */ +THCALL(TH_OPIVV3, th_vwmaccu_vv_b, WOP_UUU_B, H2, H1, H1, TH_MACC) +THCALL(TH_OPIVV3, th_vwmaccu_vv_h, WOP_UUU_H, H4, H2, H2, TH_MACC) +THCALL(TH_OPIVV3, th_vwmaccu_vv_w, WOP_UUU_W, H8, H4, H4, TH_MACC) +THCALL(TH_OPIVV3, th_vwmacc_vv_b, WOP_SSS_B, H2, H1, H1, TH_MACC) +THCALL(TH_OPIVV3, th_vwmacc_vv_h, WOP_SSS_H, H4, H2, H2, TH_MACC) +THCALL(TH_OPIVV3, th_vwmacc_vv_w, WOP_SSS_W, H8, H4, H4, TH_MACC) +THCALL(TH_OPIVV3, th_vwmaccsu_vv_b, WOP_SSU_B, H2, H1, H1, TH_MACC) +THCALL(TH_OPIVV3, th_vwmaccsu_vv_h, WOP_SSU_H, H4, H2, H2, TH_MACC) +THCALL(TH_OPIVV3, th_vwmaccsu_vv_w, WOP_SSU_W, H8, H4, H4, TH_MACC) 
+GEN_TH_VV(th_vwmaccu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmaccu_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwmaccu_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwmacc_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmacc_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwmacc_vv_w, 4, 8, clearq_th) +GEN_TH_VV(th_vwmaccsu_vv_b, 1, 2, clearh_th) +GEN_TH_VV(th_vwmaccsu_vv_h, 2, 4, clearl_th) +GEN_TH_VV(th_vwmaccsu_vv_w, 4, 8, clearq_th) + +THCALL(TH_OPIVX3, th_vwmaccu_vx_b, WOP_UUU_B, H2, H1, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccu_vx_h, WOP_UUU_H, H4, H2, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccu_vx_w, WOP_UUU_W, H8, H4, TH_MACC) +THCALL(TH_OPIVX3, th_vwmacc_vx_b, WOP_SSS_B, H2, H1, TH_MACC) +THCALL(TH_OPIVX3, th_vwmacc_vx_h, WOP_SSS_H, H4, H2, TH_MACC) +THCALL(TH_OPIVX3, th_vwmacc_vx_w, WOP_SSS_W, H8, H4, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccsu_vx_b, WOP_SSU_B, H2, H1, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccsu_vx_h, WOP_SSU_H, H4, H2, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccsu_vx_w, WOP_SSU_W, H8, H4, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccus_vx_b, WOP_SUS_B, H2, H1, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccus_vx_h, WOP_SUS_H, H4, H2, TH_MACC) +THCALL(TH_OPIVX3, th_vwmaccus_vx_w, WOP_SUS_W, H8, H4, TH_MACC) +GEN_TH_VX(th_vwmaccu_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmaccu_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmaccu_vx_w, 4, 8, clearq_th) +GEN_TH_VX(th_vwmacc_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmacc_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmacc_vx_w, 4, 8, clearq_th) +GEN_TH_VX(th_vwmaccsu_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmaccsu_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmaccsu_vx_w, 4, 8, clearq_th) +GEN_TH_VX(th_vwmaccus_vx_b, 1, 2, clearh_th) +GEN_TH_VX(th_vwmaccus_vx_h, 2, 4, clearl_th) +GEN_TH_VX(th_vwmaccus_vx_w, 4, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712911063; cv=none; d=zohomail.com; s=zohoarc; b=Hue5DaauqP9Xs7pRhNF4eBbO6Drx8WSzzqhBSKB63/Nrwk2DY92TTXbrK2mogh79wS48NBgUdgV+UZ7xEE/lpcgT5JjdrPs1WRbgCXSw4RP5cuPp4vnIj6BwIEt9LO/2otQosMNjqK+T5ULviBt54cQ+LA3EK67jMJREtdzdnyo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712911063; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=tbVKlp+UKVB3nmIVN6O3Pf2UE5h2FFT6aCNuFKLTbOc=; b=XvyN90GTyjMlC6N9PuceF4oqZtvvcPFaw74uKaRv3e/DyJxDNGYelS5JPrppddYzEtpWOCrZEMPiMftLMOywpkA7aoBVLqSwFk1d7UYxGauxUnrgS8Hn1TmN8YDZ7gFKkboPIydfZwyag/7e8EDvr+8vl6Uiz427+q8kYdu7+yY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712911063800256.41282866047413; Fri, 12 Apr 2024 01:37:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCPV-0004R1-8A; Fri, 12 Apr 2024 04:37:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCPT-0004QV-Tv; Fri, 12 Apr 2024 04:37:23 -0400 Received: from out30-110.freemail.mail.aliyun.com ([115.124.30.110]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCPQ-00021S-PW; Fri, 12 Apr 2024 04:37:23 -0400 Received: from 
localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NfJWY_1712911034) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:37:14 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712911036; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=tbVKlp+UKVB3nmIVN6O3Pf2UE5h2FFT6aCNuFKLTbOc=; b=GjMVALnASny8hbUF0Yi66TQ7REV+BkDFfdq2JbWJ0wMWWi1sh0/8+jyvVHRHjFm2ChZj195b3kFSdqisFHicMV7/qTgjF5n0I6+gGeHTBCZN3/Vin8TID+q5Xtu+3AxDTkVZsupR2H9YBa6Bo6TuA/gzwokCEL759SbGvNn/LIc= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R191e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045176; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NfJWY_1712911034; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 29/65] target/riscv: Add integer merge and move instructions for XTheadVector Date: Fri, 12 Apr 2024 15:36:59 +0800 Message-ID: <20240412073735.76413-30-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.110; envelope-from=eric.huang@linux.alibaba.com; helo=out30-110.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, 
SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712911064354100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as in RVV1.0. Overall, only the general differences between XTheadVector and RVV1.0 apply. The exception is th.vmv.v.x: XTheadVector does not restrict SEW to the range 8 to 64, so it is not suitable to use acceleration when xlen < SEW. Signed-off-by: Huang Tao --- target/riscv/helper.h | 17 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 124 +++++++++++++++++- target/riscv/xtheadvector_helper.c | 104 +++++++++++++++ 3 files changed, 239 insertions(+), 6 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 8b8dd62761..ba548ebdc9 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1868,3 +1868,20 @@ DEF_HELPER_6(th_vwmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vwmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vwmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(th_vwmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(th_vmerge_vvm_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vvm_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vvm_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vvm_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vxm_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vxm_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vmerge_vxm_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vmerge_vxm_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_4(th_vmv_v_v_b, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vmv_v_v_h, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vmv_v_v_w, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vmv_v_v_d, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vmv_v_x_b, void, ptr, i64, env, i32) +DEF_HELPER_4(th_vmv_v_x_h, void, ptr, i64, env, i32) +DEF_HELPER_4(th_vmv_v_x_w, void, ptr, i64, env, i32) +DEF_HELPER_4(th_vmv_v_x_d, void, ptr, i64, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index bfa3a26f78..6d0ce9f966 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1576,18 +1576,130 @@ GEN_OPIVX_WIDEN_TRANS_TH(th_vwmacc_vx, opivx_widen= _check_th) GEN_OPIVX_WIDEN_TRANS_TH(th_vwmaccsu_vx, opivx_widen_check_th) GEN_OPIVX_WIDEN_TRANS_TH(th_vwmaccus_vx, opivx_widen_check_th) =20 +/* Vector Integer Merge and Move Instructions */ + +/* + * This function is almost the copy of trans_vmv_v_v, except: + * 1) XTheadVector simplifies the judgment logic of whether + * to accelerate or not for its lack of fractional LMUL and + * VTA. 
+ */ +static bool trans_th_vmv_v_v(DisasContext *s, arg_th_vmv_v_v *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs1, false)) { + + if (s->vl_eq_vlmax) { + tcg_gen_gvec_mov(s->sew, vreg_ofs(s, a->rd), + vreg_ofs(s, a->rs1), + MAXSZ(s), MAXSZ(s)); + } else { + uint32_t data =3D FIELD_DP32(0, VDATA_TH, LMUL, s->lmul); + static gen_helper_gvec_2_ptr * const fns[4] =3D { + gen_helper_th_vmv_v_v_b, gen_helper_th_vmv_v_v_h, + gen_helper_th_vmv_v_v_w, gen_helper_th_vmv_v_v_d, + }; + + tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data, + fns[s->sew]); + } + finalize_rvv_inst(s); + return true; + } + return false; +} + + +#define gen_helper_vmv_vx_th gen_helper_vmv_vx +/* + * This function is almost a copy of trans_vmv_v_x, except: + * 1) Simpler logic for deciding whether to accelerate + * 2) XTheadVector does not restrict SEW to the range 8 to 64, so it is + * not suitable to use acceleration when xlen < SEW.
+ */ +static bool trans_th_vmv_v_x(DisasContext *s, arg_th_vmv_v_x *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false)) { + + TCGv s1; + s1 =3D get_gpr(s, a->rs1, EXT_SIGN); + + if (s->vl_eq_vlmax && (8 << s->sew) <=3D get_xlen(s)) { + tcg_gen_gvec_dup_tl(s->sew, vreg_ofs(s, a->rd), + MAXSZ(s), MAXSZ(s), s1); + } else { + TCGv_i32 desc; + TCGv_i64 s1_i64 =3D tcg_temp_new_i64(); + TCGv_ptr dest =3D tcg_temp_new_ptr(); + uint32_t data =3D FIELD_DP32(0, VDATA_TH, LMUL, s->lmul); + static gen_helper_vmv_vx_th * const fns[4] =3D { + gen_helper_th_vmv_v_x_b, gen_helper_th_vmv_v_x_h, + gen_helper_th_vmv_v_x_w, gen_helper_th_vmv_v_x_d, + }; + + tcg_gen_ext_tl_i64(s1_i64, s1); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data)); + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); + fns[s->sew](dest, s1_i64, tcg_env, desc); + } + + finalize_rvv_inst(s); + return true; + } + return false; +} + +/* The difference is same as trans_th_vmv_v_v */ +static bool trans_th_vmv_v_i(DisasContext *s, arg_th_vmv_v_i *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false)) { + + int64_t simm =3D sextract64(a->rs1, 0, 5); + if (s->vl_eq_vlmax) { + tcg_gen_gvec_dup_imm(s->sew, vreg_ofs(s, a->rd), + MAXSZ(s), MAXSZ(s), simm); + } else { + TCGv_i32 desc; + TCGv_i64 s1; + TCGv_ptr dest; + uint32_t data =3D FIELD_DP32(0, VDATA_TH, LMUL, s->lmul); + static gen_helper_vmv_vx_th * const fns[4] =3D { + gen_helper_th_vmv_v_x_b, gen_helper_th_vmv_v_x_h, + gen_helper_th_vmv_v_x_w, gen_helper_th_vmv_v_x_d, + }; + + s1 =3D tcg_constant_i64(simm); + dest =3D tcg_temp_new_ptr(); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data)); + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); + fns[s->sew](dest, s1, tcg_env, desc); + } + finalize_rvv_inst(s); + return true; + } + return false; +} + +GEN_OPIVV_TRANS_TH(th_vmerge_vvm, opivv_vadc_check_th) 
+GEN_OPIVX_TRANS_TH(th_vmerge_vxm, opivx_vadc_check_th) +GEN_OPIVI_TRANS_TH(th_vmerge_vim, IMM_SX, th_vmerge_vxm, opivx_vadc_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmv_v_v) -TH_TRANS_STUB(th_vmv_v_x) -TH_TRANS_STUB(th_vmv_v_i) -TH_TRANS_STUB(th_vmerge_vvm) -TH_TRANS_STUB(th_vmerge_vxm) -TH_TRANS_STUB(th_vmerge_vim) TH_TRANS_STUB(th_vsaddu_vv) TH_TRANS_STUB(th_vsaddu_vx) TH_TRANS_STUB(th_vsaddu_vi) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c index 19aad626c9..d8a0e3af90 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -1923,3 +1923,107 @@ GEN_TH_VX(th_vwmaccsu_vx_w, 4, 8, clearq_th) GEN_TH_VX(th_vwmaccus_vx_b, 1, 2, clearh_th) GEN_TH_VX(th_vwmaccus_vx_h, 2, 4, clearl_th) GEN_TH_VX(th_vwmaccus_vx_w, 4, 8, clearq_th) + +/* Vector Integer Merge and Move Instructions */ + +/* + * The vmv and vmerge functions below are copies of the RVV1.0 functions, + * except: + * 1) different desc encoding + * 2) different tail/masked element processing policy + * 3) different mask layout + */ +#define GEN_TH_VMV_VV(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s1 =3D *((ETYPE *)vs1 + H(i)); \ + *((ETYPE *)vd + H(i)) =3D s1; \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_TH_VMV_VV(th_vmv_v_v_b, int8_t, H1, clearb_th) +GEN_TH_VMV_VV(th_vmv_v_v_h, int16_t, H2, clearh_th) +GEN_TH_VMV_VV(th_vmv_v_v_w, int32_t, H4, clearl_th) +GEN_TH_VMV_VV(th_vmv_v_v_d, int64_t, H8, clearq_th) + +#define GEN_TH_VMV_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd,
uint64_t s1, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + *((ETYPE *)vd + H(i)) =3D (ETYPE)s1; \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_TH_VMV_VX(th_vmv_v_x_b, int8_t, H1, clearb_th) +GEN_TH_VMV_VX(th_vmv_v_x_h, int16_t, H2, clearh_th) +GEN_TH_VMV_VX(th_vmv_v_x_w, int32_t, H4, clearl_th) +GEN_TH_VMV_VX(th_vmv_v_x_d, int64_t, H8, clearq_th) + +#define GEN_TH_VMERGE_VV(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE *vt =3D (!th_elem_mask(v0, mlen, i) ? vs2 : vs1); \ + *((ETYPE *)vd + H(i)) =3D *(vt + H(i)); \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_TH_VMERGE_VV(th_vmerge_vvm_b, int8_t, H1, clearb_th) +GEN_TH_VMERGE_VV(th_vmerge_vvm_h, int16_t, H2, clearh_th) +GEN_TH_VMERGE_VV(th_vmerge_vvm_w, int32_t, H4, clearl_th) +GEN_TH_VMERGE_VV(th_vmerge_vvm_d, int64_t, H8, clearq_th) + +#define GEN_TH_VMERGE_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ + void *vs2, CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + ETYPE d =3D (!th_elem_mask(v0, mlen, i) ? 
s2 : \ + (ETYPE)(target_long)s1); \ + *((ETYPE *)vd + H(i)) =3D d; \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_TH_VMERGE_VX(th_vmerge_vxm_b, int8_t, H1, clearb_th) +GEN_TH_VMERGE_VX(th_vmerge_vxm_h, int16_t, H2, clearh_th) +GEN_TH_VMERGE_VX(th_vmerge_vxm_w, int32_t, H4, clearl_th) +GEN_TH_VMERGE_VX(th_vmerge_vxm_d, int64_t, H8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712911193; cv=none; d=zohomail.com; s=zohoarc; b=eqsvwe0PP7F9/7aMF3ZFF4YrnBlxSXHt8Ob//BwR35yhZCXNB/C1Dw7kZNgL+KbkhUHjElkvfDLs4Q56gBzSXXusEo9//A/bmlffG775bTPoAhaM/uKWEigv++4m7yH12Hz1lyvmZo2tYDia+9Ta+ZO429x3V/hfuqnsOofz4/A= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712911193; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=3X09Fw7LZBmQAAAQn0XZA0PH23O/oWDyy/6nqPoXjAg=; b=K+JmvvKlSys5NLjFHwbAS3q4ScYjqJOHLJNM+hCplbumPoGpl5LOaFVe+Xntt5XRjxtCOBY+2Nj7twhJG3UMqhQLuHoYnokuq7HXTVFQPjTMbjzXYw/2Jomr0g/TbkGNuIQ6O4B2jtJ4e7SE9BSwVkhKCuUnQ2B9TDvOOG1tSQs= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712911193508721.8134384880378; Fri, 12 Apr 2024 01:39:53 -0700 (PDT) Received: from localhost 
([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvCRb-0005O7-UJ; Fri, 12 Apr 2024 04:39:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCRa-0005Nq-4N; Fri, 12 Apr 2024 04:39:34 -0400 Received: from out30-111.freemail.mail.aliyun.com ([115.124.30.111]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvCRT-0002Kh-Kf; Fri, 12 Apr 2024 04:39:32 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NbF43_1712911155) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 16:39:16 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712911157; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=3X09Fw7LZBmQAAAQn0XZA0PH23O/oWDyy/6nqPoXjAg=; b=rPPYq3QDvY2eMtjwFfEP9GfFtT+H9Onk3o6HmDosg+sAPlQ2PaR4sHKzLnCdmp2btJ1KUu3NPR3o4PKU0oFQ41shJt/dY1SXLm9ZUDz1R4wBvT4407dGPuFzITMi5eMDOLpKDzoVn1kU1DIQV/CPd664OazbD9xchEs0MxznAOc= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R521e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045170; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NbF43_1712911155; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 30/65] target/riscv: Add single-width saturating add and sub instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:00 +0800 Message-ID: <20240412073735.76413-31-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.111; envelope-from=eric.huang@linux.alibaba.com; helo=out30-111.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712911194523100003 Content-Type: text/plain; charset="utf-8" This patch adds the single-width saturating add and subtract instructions to show how the XTheadVector fixed-point arithmetic instructions are implemented. The instructions have the same function as in RVV1.0. Overall, only the general differences between XTheadVector and RVV1.0 apply.
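For illustration only, a minimal standalone sketch of the saturating semantics these helpers implement. The names sat_addu8, sat_add8 and sat_subu8 and the bool flag are hypothetical stand-ins for the saddu8/sadd8/ssubu8 helpers and env->vxsat, and the signed case clamps in a wider type rather than using the sign-bit test found in vector_helper.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned saturating add: a wrapped result is smaller than either operand. */
static uint8_t sat_addu8(uint8_t a, uint8_t b, bool *sat)
{
    uint8_t res = (uint8_t)(a + b);
    if (res < a) {
        res = UINT8_MAX;
        *sat = true;    /* stands in for setting env->vxsat */
    }
    return res;
}

/* Signed saturating add: add in int, then clamp to the int8_t range. */
static int8_t sat_add8(int8_t a, int8_t b, bool *sat)
{
    int wide = (int)a + (int)b;
    if (wide > INT8_MAX) {
        *sat = true;
        return INT8_MAX;
    }
    if (wide < INT8_MIN) {
        *sat = true;
        return INT8_MIN;
    }
    return (int8_t)wide;
}

/* Unsigned saturating subtract: clamp to zero on borrow. */
static uint8_t sat_subu8(uint8_t a, uint8_t b, bool *sat)
{
    if (a < b) {
        *sat = true;
        return 0;
    }
    return (uint8_t)(a - b);
}
```

The wider element sizes follow the same pattern with the corresponding limits.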
Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 26 +- target/riscv/vector_helper.c | 32 +-- target/riscv/vector_internals.h | 19 ++ target/riscv/xtheadvector_helper.c | 231 ++++++++++++++++++ 5 files changed, 315 insertions(+), 26 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index ba548ebdc9..c5156d9939 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1885,3 +1885,36 @@ DEF_HELPER_4(th_vmv_v_x_b, void, ptr, i64, env, i32) DEF_HELPER_4(th_vmv_v_x_h, void, ptr, i64, env, i32) DEF_HELPER_4(th_vmv_v_x_w, void, ptr, i64, env, i32) DEF_HELPER_4(th_vmv_v_x_d, void, ptr, i64, env, i32) + +DEF_HELPER_6(th_vsaddu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssub_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssub_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssub_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vssub_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsaddu_vx_d, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vsadd_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vsadd_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssubu_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssub_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssub_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssub_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vssub_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 6d0ce9f966..e60da5b237 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -1694,22 +1694,28 @@ GEN_OPIVV_TRANS_TH(th_vmerge_vvm, opivv_vadc_check_= th) GEN_OPIVX_TRANS_TH(th_vmerge_vxm, opivx_vadc_check_th) GEN_OPIVI_TRANS_TH(th_vmerge_vim, IMM_SX, th_vmerge_vxm, opivx_vadc_check_= th) =20 +/* + * Vector Fixed-Point Arithmetic Instructions + */ + +/* Vector Single-Width Saturating Add and Subtract */ +GEN_OPIVV_TRANS_TH(th_vsaddu_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vsadd_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vssubu_vv, opivv_check_th) +GEN_OPIVV_TRANS_TH(th_vssub_vv, opivv_check_th) +GEN_OPIVX_TRANS_TH(th_vsaddu_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vsadd_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vssubu_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vssub_vx, opivx_check_th) +GEN_OPIVI_TRANS_TH(th_vsaddu_vi, IMM_ZX, th_vsaddu_vx, opivx_check_th) +GEN_OPIVI_TRANS_TH(th_vsadd_vi, IMM_SX, th_vsadd_vx, opivx_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ 
return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vsaddu_vv) -TH_TRANS_STUB(th_vsaddu_vx) -TH_TRANS_STUB(th_vsaddu_vi) -TH_TRANS_STUB(th_vsadd_vv) -TH_TRANS_STUB(th_vsadd_vx) -TH_TRANS_STUB(th_vsadd_vi) -TH_TRANS_STUB(th_vssubu_vv) -TH_TRANS_STUB(th_vssubu_vx) -TH_TRANS_STUB(th_vssub_vv) -TH_TRANS_STUB(th_vssub_vx) TH_TRANS_STUB(th_vaadd_vv) TH_TRANS_STUB(th_vaadd_vx) TH_TRANS_STUB(th_vaadd_vi) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 06ca77691d..8664a3d4ef 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -1974,7 +1974,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void= *vs2, \ do_##NAME, ESZ); \ } =20 -static inline uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, +uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) { uint8_t res =3D a + b; @@ -1985,7 +1985,7 @@ static inline uint8_t saddu8(CPURISCVState *env, int = vxrm, uint8_t a, return res; } =20 -static inline uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, +uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) { uint16_t res =3D a + b; @@ -1996,7 +1996,7 @@ static inline uint16_t saddu16(CPURISCVState *env, in= t vxrm, uint16_t a, return res; } =20 -static inline uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, +uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) { uint32_t res =3D a + b; @@ -2007,7 +2007,7 @@ static inline uint32_t saddu32(CPURISCVState *env, in= t vxrm, uint32_t a, return res; } =20 -static inline uint64_t saddu64(CPURISCVState *env, int vxrm, uint64_t a, +uint64_t saddu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) { uint64_t res =3D a + b; @@ -2111,7 +2111,7 @@ GEN_VEXT_VX_RM(vsaddu_vx_h, 2) GEN_VEXT_VX_RM(vsaddu_vx_w, 4) GEN_VEXT_VX_RM(vsaddu_vx_d, 8) =20 -static inline int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t = b) +int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) { int8_t res =3D a 
+ b; if ((res ^ a) & (res ^ b) & INT8_MIN) { @@ -2121,7 +2121,7 @@ static inline int8_t sadd8(CPURISCVState *env, int vx= rm, int8_t a, int8_t b) return res; } =20 -static inline int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, +int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) { int16_t res =3D a + b; @@ -2132,7 +2132,7 @@ static inline int16_t sadd16(CPURISCVState *env, int = vxrm, int16_t a, return res; } =20 -static inline int32_t sadd32(CPURISCVState *env, int vxrm, int32_t a, +int32_t sadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) { int32_t res =3D a + b; @@ -2143,7 +2143,7 @@ static inline int32_t sadd32(CPURISCVState *env, int = vxrm, int32_t a, return res; } =20 -static inline int64_t sadd64(CPURISCVState *env, int vxrm, int64_t a, +int64_t sadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) { int64_t res =3D a + b; @@ -2172,7 +2172,7 @@ GEN_VEXT_VX_RM(vsadd_vx_h, 2) GEN_VEXT_VX_RM(vsadd_vx_w, 4) GEN_VEXT_VX_RM(vsadd_vx_d, 8) =20 -static inline uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, +uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b) { uint8_t res =3D a - b; @@ -2183,7 +2183,7 @@ static inline uint8_t ssubu8(CPURISCVState *env, int = vxrm, uint8_t a, return res; } =20 -static inline uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, +uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b) { uint16_t res =3D a - b; @@ -2194,7 +2194,7 @@ static inline uint16_t ssubu16(CPURISCVState *env, in= t vxrm, uint16_t a, return res; } =20 -static inline uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, +uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b) { uint32_t res =3D a - b; @@ -2205,7 +2205,7 @@ static inline uint32_t ssubu32(CPURISCVState *env, in= t vxrm, uint32_t a, return res; } =20 -static inline uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, +uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b) { 
uint64_t res =3D a - b; @@ -2234,7 +2234,7 @@ GEN_VEXT_VX_RM(vssubu_vx_h, 2) GEN_VEXT_VX_RM(vssubu_vx_w, 4) GEN_VEXT_VX_RM(vssubu_vx_d, 8) =20 -static inline int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t = b) +int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t b) { int8_t res =3D a - b; if ((res ^ a) & (a ^ b) & INT8_MIN) { @@ -2244,7 +2244,7 @@ static inline int8_t ssub8(CPURISCVState *env, int vx= rm, int8_t a, int8_t b) return res; } =20 -static inline int16_t ssub16(CPURISCVState *env, int vxrm, int16_t a, +int16_t ssub16(CPURISCVState *env, int vxrm, int16_t a, int16_t b) { int16_t res =3D a - b; @@ -2255,7 +2255,7 @@ static inline int16_t ssub16(CPURISCVState *env, int = vxrm, int16_t a, return res; } =20 -static inline int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, +int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b) { int32_t res =3D a - b; @@ -2266,7 +2266,7 @@ static inline int32_t ssub32(CPURISCVState *env, int = vxrm, int32_t a, return res; } =20 -static inline int64_t ssub64(CPURISCVState *env, int vxrm, int64_t a, +int64_t ssub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b) { int64_t res =3D a - b; diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index e99caa8e2d..a70ebdabe4 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -284,4 +284,23 @@ int16_t do_mulhsu_h(int16_t s2, uint16_t s1); int32_t do_mulhsu_w(int32_t s2, uint32_t s1); int64_t do_mulhsu_d(int64_t s2, uint64_t s1); =20 +uint8_t saddu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b); +uint16_t saddu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b); +uint32_t saddu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b); +uint64_t saddu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b); + +int8_t sadd8(CPURISCVState *env, int vxrm, int8_t a, int8_t b); +int16_t sadd16(CPURISCVState *env, int vxrm, int16_t a, int16_t b); +int32_t sadd32(CPURISCVState *env, 
int vxrm, int32_t a, int32_t b); +int64_t sadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b); + +int8_t ssub8(CPURISCVState *env, int vxrm, int8_t a, int8_t b); +int16_t ssub16(CPURISCVState *env, int vxrm, int16_t a, int16_t b); +int32_t ssub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b); +int64_t ssub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b); + +uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b); +uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b); +uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b); +uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b); #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c index d8a0e3af90..5e21ab2e07 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -2027,3 +2027,234 @@ GEN_TH_VMERGE_VX(th_vmerge_vxm_b, int8_t, H1, clearb_th) GEN_TH_VMERGE_VX(th_vmerge_vxm_h, int16_t, H2, clearh_th) GEN_TH_VMERGE_VX(th_vmerge_vxm_w, int32_t, H4, clearl_th) GEN_TH_VMERGE_VX(th_vmerge_vxm_d, int64_t, H8, clearq_th) + +/* + * Vector Fixed-Point Arithmetic Instructions + */ + +/* Vector Single-Width Saturating Add and Subtract */ + +/* + * As fixed-point instructions may have a rounding mode and saturation, + * define common macros for fixed point here. + */ +typedef void opivv2_rm_fn_th(void *vd, void *vs1, void *vs2, int i, + CPURISCVState *env, int vxrm); + +/* + * The fixed-point functions below are just copies of the + * functions in RVV1.0.
+ * The changes in these functions are:
+ * 1) different desc encoding
+ * 2) different tail/masked element process policy
+ * 3) different mask layout
+ */
+#define TH_OPIVV2_RM(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP)     \
+static inline void                                                     \
+do_##NAME(void *vd, void *vs1, void *vs2, int i,                       \
+          CPURISCVState *env, int vxrm)                                \
+{                                                                      \
+    TX1 s1 = *((T1 *)vs1 + HS1(i));                                    \
+    TX2 s2 = *((T2 *)vs2 + HS2(i));                                    \
+    *((TD *)vd + HD(i)) = OP(env, vxrm, s2, s1);                       \
+}
+
+static inline void
+th_vv_rm_1(void *vd, void *v0, void *vs1, void *vs2,
+           CPURISCVState *env,
+           uint32_t vl, uint32_t vm, uint32_t mlen, int vxrm,
+           opivv2_rm_fn_th *fn)
+{
+    VSTART_CHECK_EARLY_EXIT(env);
+    for (uint32_t i = env->vstart; i < vl; i++) {
+        if (!vm && !th_elem_mask(v0, mlen, i)) {
+            continue;
+        }
+        fn(vd, vs1, vs2, i, env, vxrm);
+    }
+    env->vstart = 0;
+}
+
+static inline void
+th_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
+           CPURISCVState *env,
+           uint32_t desc, uint32_t esz, uint32_t dsz,
+           opivv2_rm_fn_th *fn, clear_fn *clearfn)
+{
+    uint32_t vlmax = th_maxsz(desc) / esz;
+    uint32_t mlen = th_mlen(desc);
+    uint32_t vm = th_vm(desc);
+    uint32_t vl = env->vl;
+
+    switch (env->vxrm) {
+    case 0: /* rnu */
+        th_vv_rm_1(vd, v0, vs1, vs2,
+                   env, vl, vm, mlen, 0, fn);
+        break;
+    case 1: /* rne */
+        th_vv_rm_1(vd, v0, vs1, vs2,
+                   env, vl, vm, mlen, 1, fn);
+        break;
+    case 2: /* rdn */
+        th_vv_rm_1(vd, v0, vs1, vs2,
+                   env, vl, vm, mlen, 2, fn);
+        break;
+    default: /* rod */
+        th_vv_rm_1(vd, v0, vs1, vs2,
+                   env, vl, vm, mlen, 3, fn);
+        break;
+    }
+
+    clearfn(vd, vl, vl * dsz, vlmax * dsz);
+}
+
+/* generate helpers for fixed point instructions with OPIVV format */
+#define GEN_TH_VV_RM(NAME, ESZ, DSZ, CLEAR_FN)                  \
+void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,     \
+                  CPURISCVState *env, uint32_t desc)            \
+{                                                               \
+    th_vv_rm_2(vd, v0, vs1, vs2, env, desc, ESZ, DSZ,           \
+               do_##NAME, CLEAR_FN);                            \
+}
+
+THCALL(TH_OPIVV2_RM, th_vsaddu_vv_b, OP_UUU_B, H1, H1, H1, saddu8)
+THCALL(TH_OPIVV2_RM, th_vsaddu_vv_h, OP_UUU_H, H2, H2, H2, saddu16)
+THCALL(TH_OPIVV2_RM, th_vsaddu_vv_w, OP_UUU_W, H4, H4, H4, saddu32)
+THCALL(TH_OPIVV2_RM, th_vsaddu_vv_d, OP_UUU_D, H8, H8, H8, saddu64)
+GEN_TH_VV_RM(th_vsaddu_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vsaddu_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vsaddu_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vsaddu_vv_d, 8, 8, clearq_th)
+
+typedef void opivx2_rm_fn_th(void *vd, target_long s1, void *vs2, int i,
+                             CPURISCVState *env, int vxrm);
+
+#define TH_OPIVX2_RM(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP)          \
+static inline void                                                     \
+do_##NAME(void *vd, target_long s1, void *vs2, int i,                  \
+          CPURISCVState *env, int vxrm)                                \
+{                                                                      \
+    TX2 s2 = *((T2 *)vs2 + HS2(i));                                    \
+    *((TD *)vd + HD(i)) = OP(env, vxrm, s2, (TX1)(T1)s1);              \
+}
+
+static inline void
+th_vx_rm_1(void *vd, void *v0, target_long s1, void *vs2,
+           CPURISCVState *env,
+           uint32_t vl, uint32_t vm, uint32_t mlen, int vxrm,
+           opivx2_rm_fn_th *fn)
+{
+    VSTART_CHECK_EARLY_EXIT(env);
+    for (uint32_t i = env->vstart; i < vl; i++) {
+        if (!vm && !th_elem_mask(v0, mlen, i)) {
+            continue;
+        }
+        fn(vd, s1, vs2, i, env, vxrm);
+    }
+    env->vstart = 0;
+}
+
+static inline void
+th_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
+           CPURISCVState *env,
+           uint32_t desc, uint32_t esz, uint32_t dsz,
+           opivx2_rm_fn_th *fn, clear_fn *clearfn)
+{
+    uint32_t vlmax = th_maxsz(desc) / esz;
+    uint32_t mlen = th_mlen(desc);
+    uint32_t vm = th_vm(desc);
+    uint32_t vl = env->vl;
+
+    switch (env->vxrm) {
+    case 0: /* rnu */
+        th_vx_rm_1(vd, v0, s1, vs2,
+                   env, vl, vm, mlen, 0, fn);
+        break;
+    case 1: /* rne */
+        th_vx_rm_1(vd, v0, s1, vs2,
+                   env, vl, vm, mlen, 1, fn);
+        break;
+    case 2: /* rdn */
+        th_vx_rm_1(vd, v0, s1, vs2,
+                   env, vl, vm, mlen, 2, fn);
+        break;
+    default: /* rod */
+        th_vx_rm_1(vd, v0, s1, vs2,
+                   env, vl, vm, mlen, 3, fn);
+        break;
+    }
+
+    clearfn(vd, vl, vl * dsz, vlmax * dsz);
+}
+
+/* generate helpers for fixed point instructions with OPIVX format */
+#define GEN_TH_VX_RM(NAME, ESZ, DSZ, CLEAR_FN)                     \
+void HELPER(NAME)(void *vd, void *v0, target_ulong s1,             \
+                  void *vs2, CPURISCVState *env, uint32_t desc)    \
+{                                                                  \
+    th_vx_rm_2(vd, v0, s1, vs2, env, desc, ESZ, DSZ,               \
+               do_##NAME, CLEAR_FN);                               \
+}
+
+THCALL(TH_OPIVX2_RM, th_vsaddu_vx_b, OP_UUU_B, H1, H1, saddu8)
+THCALL(TH_OPIVX2_RM, th_vsaddu_vx_h, OP_UUU_H, H2, H2, saddu16)
+THCALL(TH_OPIVX2_RM, th_vsaddu_vx_w, OP_UUU_W, H4, H4, saddu32)
+THCALL(TH_OPIVX2_RM, th_vsaddu_vx_d, OP_UUU_D, H8, H8, saddu64)
+GEN_TH_VX_RM(th_vsaddu_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vsaddu_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vsaddu_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vsaddu_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vsadd_vv_b, OP_SSS_B, H1, H1, H1, sadd8)
+THCALL(TH_OPIVV2_RM, th_vsadd_vv_h, OP_SSS_H, H2, H2, H2, sadd16)
+THCALL(TH_OPIVV2_RM, th_vsadd_vv_w, OP_SSS_W, H4, H4, H4, sadd32)
+THCALL(TH_OPIVV2_RM, th_vsadd_vv_d, OP_SSS_D, H8, H8, H8, sadd64)
+GEN_TH_VV_RM(th_vsadd_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vsadd_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vsadd_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vsadd_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vsadd_vx_b, OP_SSS_B, H1, H1, sadd8)
+THCALL(TH_OPIVX2_RM, th_vsadd_vx_h, OP_SSS_H, H2, H2, sadd16)
+THCALL(TH_OPIVX2_RM, th_vsadd_vx_w, OP_SSS_W, H4, H4, sadd32)
+THCALL(TH_OPIVX2_RM, th_vsadd_vx_d, OP_SSS_D, H8, H8, sadd64)
+GEN_TH_VX_RM(th_vsadd_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vsadd_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vsadd_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vsadd_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vssubu_vv_b, OP_UUU_B, H1, H1, H1, ssubu8)
+THCALL(TH_OPIVV2_RM, th_vssubu_vv_h, OP_UUU_H, H2, H2, H2, ssubu16)
+THCALL(TH_OPIVV2_RM, th_vssubu_vv_w, OP_UUU_W, H4, H4, H4, ssubu32)
+THCALL(TH_OPIVV2_RM, th_vssubu_vv_d, OP_UUU_D, H8, H8, H8, ssubu64)
+GEN_TH_VV_RM(th_vssubu_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vssubu_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vssubu_vv_w, 4,
4, clearl_th)
+GEN_TH_VV_RM(th_vssubu_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vssubu_vx_b, OP_UUU_B, H1, H1, ssubu8)
+THCALL(TH_OPIVX2_RM, th_vssubu_vx_h, OP_UUU_H, H2, H2, ssubu16)
+THCALL(TH_OPIVX2_RM, th_vssubu_vx_w, OP_UUU_W, H4, H4, ssubu32)
+THCALL(TH_OPIVX2_RM, th_vssubu_vx_d, OP_UUU_D, H8, H8, ssubu64)
+GEN_TH_VX_RM(th_vssubu_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vssubu_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vssubu_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vssubu_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vssub_vv_b, OP_SSS_B, H1, H1, H1, ssub8)
+THCALL(TH_OPIVV2_RM, th_vssub_vv_h, OP_SSS_H, H2, H2, H2, ssub16)
+THCALL(TH_OPIVV2_RM, th_vssub_vv_w, OP_SSS_W, H4, H4, H4, ssub32)
+THCALL(TH_OPIVV2_RM, th_vssub_vv_d, OP_SSS_D, H8, H8, H8, ssub64)
+GEN_TH_VV_RM(th_vssub_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vssub_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vssub_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vssub_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vssub_vx_b, OP_SSS_B, H1, H1, ssub8)
+THCALL(TH_OPIVX2_RM, th_vssub_vx_h, OP_SSS_H, H2, H2, ssub16)
+THCALL(TH_OPIVX2_RM, th_vssub_vx_w, OP_SSS_W, H4, H4, ssub32)
+THCALL(TH_OPIVX2_RM, th_vssub_vx_d, OP_SSS_D, H8, H8, ssub64)
+GEN_TH_VX_RM(th_vssub_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vssub_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vssub_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vssub_vx_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 31/65] target/riscv: Add single-width average add and sub instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:01 +0800
Message-ID: <20240412073735.76413-32-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

These instructions behave the same as their RVV1.0 counterparts.
Only the general differences between XTheadVector and RVV1.0 apply.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 17 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 12 ++++---
 target/riscv/vector_helper.c                  |  8 ++---
 target/riscv/vector_internals.h               |  5 +++
 target/riscv/xtheadvector_helper.c            | 36 +++++++++++++++++++
 5 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c5156d9939..aab2979328 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1918,3 +1918,20 @@ DEF_HELPER_6(th_vssub_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssub_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssub_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssub_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vaadd_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vaadd_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vasub_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index e60da5b237..59da1e4b3f 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1710,17 +1710,19 @@ GEN_OPIVX_TRANS_TH(th_vssub_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vsaddu_vi, IMM_ZX, th_vsaddu_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vsadd_vi, IMM_SX, th_vsadd_vx, opivx_check_th)
 
+/* Vector Single-Width Averaging Add and Subtract */
+GEN_OPIVV_TRANS_TH(th_vaadd_vv, opivv_check_th)
+GEN_OPIVV_TRANS_TH(th_vasub_vv, opivv_check_th)
+GEN_OPIVX_TRANS_TH(th_vaadd_vx, opivx_check_th)
+GEN_OPIVX_TRANS_TH(th_vasub_vx, opivx_check_th)
+GEN_OPIVI_TRANS_TH(th_vaadd_vi, IMM_SX, th_vaadd_vx, opivx_check_th)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vaadd_vv)
-TH_TRANS_STUB(th_vaadd_vx)
-TH_TRANS_STUB(th_vaadd_vi)
-TH_TRANS_STUB(th_vasub_vv)
-TH_TRANS_STUB(th_vasub_vx)
 TH_TRANS_STUB(th_vsmul_vv)
 TH_TRANS_STUB(th_vsmul_vx)
 TH_TRANS_STUB(th_vwsmaccu_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 8664a3d4ef..ea1e449174 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2323,7 +2323,7 @@ static inline uint8_t get_round(int vxrm, uint64_t v, uint8_t shift)
     return 0; /* round-down (truncate) */
 }
 
-static inline int32_t aadd32(CPURISCVState *env, int vxrm, int32_t a,
+int32_t aadd32(CPURISCVState *env, int vxrm, int32_t a,
                              int32_t b)
 {
     int64_t res = (int64_t)a + b;
@@ -2332,7 +2332,7 @@ static inline int32_t aadd32(CPURISCVState *env, int vxrm, int32_t a,
     return (res >> 1) + round;
 }
 
-static inline int64_t aadd64(CPURISCVState *env, int vxrm, int64_t a,
+int64_t aadd64(CPURISCVState *env, int vxrm, int64_t a,
                              int64_t b)
 {
     int64_t res = a + b;
@@ -2398,7 +2398,7 @@ GEN_VEXT_VX_RM(vaaddu_vx_h, 2)
 GEN_VEXT_VX_RM(vaaddu_vx_w, 4)
 GEN_VEXT_VX_RM(vaaddu_vx_d, 8)
 
-static inline int32_t asub32(CPURISCVState *env, int vxrm, int32_t a,
+int32_t asub32(CPURISCVState *env, int vxrm, int32_t a,
                              int32_t b)
 {
     int64_t res = (int64_t)a - b;
@@ -2407,7 +2407,7 @@ static inline int32_t asub32(CPURISCVState *env, int vxrm, int32_t a,
     return (res >> 1) + round;
 }
 
-static inline int64_t asub64(CPURISCVState *env, int vxrm, int64_t a,
+int64_t asub64(CPURISCVState *env, int vxrm, int64_t a,
                              int64_t b)
 {
     int64_t res = (int64_t)a - b;
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index a70ebdabe4..19f174f4c8 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -303,4 +303,9 @@ uint8_t ssubu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b);
 uint16_t ssubu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b);
 uint32_t ssubu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b);
 uint64_t ssubu64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b);
+
+int32_t aadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
+int64_t aadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
+int32_t asub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
+int64_t asub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 5e21ab2e07..06ac5940b7 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2258,3 +2258,39 @@ GEN_TH_VX_RM(th_vssub_vx_b, 1, 1, clearb_th)
 GEN_TH_VX_RM(th_vssub_vx_h, 2, 2, clearh_th)
 GEN_TH_VX_RM(th_vssub_vx_w, 4, 4, clearl_th)
 GEN_TH_VX_RM(th_vssub_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vaadd_vv_b, OP_SSS_B, H1, H1, H1, aadd32)
+THCALL(TH_OPIVV2_RM, th_vaadd_vv_h, OP_SSS_H, H2, H2, H2, aadd32)
+THCALL(TH_OPIVV2_RM, th_vaadd_vv_w, OP_SSS_W, H4, H4, H4, aadd32)
+THCALL(TH_OPIVV2_RM, th_vaadd_vv_d, OP_SSS_D, H8, H8, H8, aadd64)
+GEN_TH_VV_RM(th_vaadd_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vaadd_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vaadd_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vaadd_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vaadd_vx_b, OP_SSS_B, H1, H1, aadd32)
+THCALL(TH_OPIVX2_RM, th_vaadd_vx_h, OP_SSS_H, H2, H2, aadd32)
+THCALL(TH_OPIVX2_RM, th_vaadd_vx_w, OP_SSS_W, H4, H4, aadd32)
+THCALL(TH_OPIVX2_RM, th_vaadd_vx_d, OP_SSS_D, H8, H8, aadd64)
+GEN_TH_VX_RM(th_vaadd_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vaadd_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vaadd_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vaadd_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vasub_vv_b, OP_SSS_B, H1, H1, H1, asub32)
+THCALL(TH_OPIVV2_RM, th_vasub_vv_h, OP_SSS_H, H2, H2, H2, asub32)
+THCALL(TH_OPIVV2_RM, th_vasub_vv_w, OP_SSS_W, H4, H4, H4, asub32)
+THCALL(TH_OPIVV2_RM, th_vasub_vv_d, OP_SSS_D, H8, H8, H8, asub64)
+GEN_TH_VV_RM(th_vasub_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vasub_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vasub_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vasub_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vasub_vx_b, OP_SSS_B, H1, H1, asub32)
+THCALL(TH_OPIVX2_RM, th_vasub_vx_h, OP_SSS_H, H2, H2, asub32)
+THCALL(TH_OPIVX2_RM, th_vasub_vx_w, OP_SSS_W, H4, H4, asub32)
+THCALL(TH_OPIVX2_RM, th_vasub_vx_d, OP_SSS_D, H8, H8, asub64)
+GEN_TH_VX_RM(th_vasub_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vasub_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vasub_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vasub_vx_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 32/65] target/riscv: Add single-width fractional mul with rounding and saturation for XTheadVector
Date: Fri, 12 Apr 2024 15:37:02 +0800
Message-ID: <20240412073735.76413-33-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

These instructions behave the same as their RVV1.0 counterparts.
Only the general differences between XTheadVector and RVV1.0 apply.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  9 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc |  6 ++++--
 target/riscv/vector_helper.c                  |  8 ++++----
 target/riscv/vector_internals.h               |  6 ++++++
 target/riscv/xtheadvector_helper.c            | 19 +++++++++++++++++++
 5 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index aab2979328..85962f7253 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1935,3 +1935,12 @@ DEF_HELPER_6(th_vasub_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vasub_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vasub_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vasub_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vsmul_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vsmul_vx_d, void, ptr, ptr, tl, ptr, env, i32)
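
Reviewer aid, not part of the patch: the th_vsmul helpers declared above name a per-element fractional multiply with rounding and saturation. The following is a minimal standalone sketch of that operation for 8-bit (Q7) elements, written under stated assumptions rather than taken from this series. The names `round_inc`, `smul8_sketch` and the `RM_*` constants are hypothetical, the saturation flag is reported through a plain `bool` instead of `env->vxsat`, and the rounding-mode numbering follows the rnu/rne/rdn/rod case comments in th_vv_rm_2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the vxrm values 0..3 (rnu, rne, rdn, rod). */
enum { RM_RNU = 0, RM_RNE = 1, RM_RDN = 2, RM_ROD = 3 };

/*
 * Rounding increment (0 or 1) to add after shifting v right by `shift`
 * bits. Illustrative helper, assumed to mirror the usual fixed-point
 * rounding rules for the four modes.
 */
static uint8_t round_inc(int vxrm, uint64_t v, unsigned shift)
{
    if (shift == 0 || shift > 63) {
        return 0;
    }
    uint64_t lsb = (v >> shift) & 1;        /* lowest bit that is kept */
    uint64_t half = (v >> (shift - 1)) & 1; /* highest bit that is dropped */
    uint64_t below = v & ((UINT64_C(1) << (shift - 1)) - 1); /* rest */

    switch (vxrm) {
    case RM_RNU: /* round to nearest, ties up */
        return half;
    case RM_RNE: /* round to nearest, ties to even */
        return half & ((below != 0) | lsb);
    case RM_ROD: /* round to odd: set the LSB if anything was dropped */
        return !lsb & (half | (below != 0));
    default:     /* RM_RDN: truncate */
        return 0;
    }
}

/*
 * Q7 x Q7 fractional multiply with rounding and saturation.
 * Assumes arithmetic right shift of negative values, which is
 * implementation-defined in C but is what GCC and Clang provide.
 */
static int8_t smul8_sketch(int vxrm, int8_t a, int8_t b, bool *sat)
{
    int16_t res = (int16_t)a * (int16_t)b;            /* Q14 product */
    uint8_t round = round_inc(vxrm, (uint64_t)res, 7);

    res = (res >> 7) + round;                         /* back to Q7 */
    if (res > INT8_MAX) {
        *sat = true;
        return INT8_MAX;
    }
    if (res < INT8_MIN) {
        *sat = true;
        return INT8_MIN;
    }
    return (int8_t)res;
}
```

With Q7 operands, `smul8_sketch(RM_RNU, 64, 64, &sat)` (0.5 * 0.5) yields 32 (0.25), and `smul8_sketch(RM_RNU, -128, -128, &sat)` saturates to 127 and sets `sat`.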
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 59da1e4b3f..df653bd1c9 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1717,14 +1717,16 @@ GEN_OPIVX_TRANS_TH(th_vaadd_vx, opivx_check_th)
 GEN_OPIVX_TRANS_TH(th_vasub_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vaadd_vi, IMM_SX, th_vaadd_vx, opivx_check_th)
 
+/* Vector Single-Width Fractional Multiply with Rounding and Saturation */
+GEN_OPIVV_TRANS_TH(th_vsmul_vv, opivv_check_th)
+GEN_OPIVX_TRANS_TH(th_vsmul_vx, opivx_check_th)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vsmul_vv)
-TH_TRANS_STUB(th_vsmul_vx)
 TH_TRANS_STUB(th_vwsmaccu_vv)
 TH_TRANS_STUB(th_vwsmaccu_vx)
 TH_TRANS_STUB(th_vwsmacc_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ea1e449174..331a9a9c7a 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2474,7 +2474,7 @@ GEN_VEXT_VX_RM(vasubu_vx_w, 4)
 GEN_VEXT_VX_RM(vasubu_vx_d, 8)
 
 /* Vector Single-Width Fractional Multiply with Rounding and Saturation */
-static inline int8_t vsmul8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
+int8_t vsmul8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
 {
     uint8_t round;
     int16_t res;
@@ -2494,7 +2494,7 @@ static inline int8_t vsmul8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
     }
 }
 
-static int16_t vsmul16(CPURISCVState *env, int vxrm, int16_t a, int16_t b)
+int16_t vsmul16(CPURISCVState *env, int vxrm, int16_t a, int16_t b)
 {
     uint8_t round;
     int32_t res;
@@ -2514,7 +2514,7 @@ static int16_t vsmul16(CPURISCVState *env, int vxrm, int16_t a, int16_t b)
     }
 }
 
-static int32_t vsmul32(CPURISCVState *env, int vxrm, int32_t a, int32_t b)
+int32_t vsmul32(CPURISCVState *env, int vxrm, int32_t a, int32_t b)
 {
     uint8_t round;
     int64_t res;
@@ -2534,7 +2534,7 @@ static int32_t vsmul32(CPURISCVState *env, int vxrm, int32_t a, int32_t b)
     }
 }
 
-static int64_t vsmul64(CPURISCVState *env, int vxrm, int64_t a, int64_t b)
+int64_t vsmul64(CPURISCVState *env, int vxrm, int64_t a, int64_t b)
 {
     uint8_t round;
     uint64_t hi_64, lo_64;
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 19f174f4c8..c76ff5abac 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -308,4 +308,10 @@ int32_t aadd32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
 int64_t aadd64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
 int32_t asub32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
 int64_t asub64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
+
+int8_t vsmul8(CPURISCVState *env, int vxrm, int8_t a, int8_t b);
+int16_t vsmul16(CPURISCVState *env, int vxrm, int16_t a, int16_t b);
+int32_t vsmul32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
+int64_t vsmul64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 06ac5940b7..e4acb4d176 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2294,3 +2294,22 @@ GEN_TH_VX_RM(th_vasub_vx_b, 1, 1, clearb_th)
 GEN_TH_VX_RM(th_vasub_vx_h, 2, 2, clearh_th)
 GEN_TH_VX_RM(th_vasub_vx_w, 4, 4, clearl_th)
 GEN_TH_VX_RM(th_vasub_vx_d, 8, 8, clearq_th)
+
+/* Vector Single-Width Fractional Multiply with Rounding and Saturation */
+THCALL(TH_OPIVV2_RM, th_vsmul_vv_b, OP_SSS_B, H1, H1, H1, vsmul8)
+THCALL(TH_OPIVV2_RM, th_vsmul_vv_h, OP_SSS_H, H2, H2, H2, vsmul16)
+THCALL(TH_OPIVV2_RM, th_vsmul_vv_w, OP_SSS_W, H4, H4, H4, vsmul32)
+THCALL(TH_OPIVV2_RM, th_vsmul_vv_d, OP_SSS_D, H8, H8, H8, vsmul64)
+GEN_TH_VV_RM(th_vsmul_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vsmul_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vsmul_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vsmul_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vsmul_vx_b, OP_SSS_B, H1, H1, vsmul8)
+THCALL(TH_OPIVX2_RM, th_vsmul_vx_h, OP_SSS_H, H2, H2, vsmul16)
+THCALL(TH_OPIVX2_RM, th_vsmul_vx_w, OP_SSS_W, H4, H4, vsmul32)
+THCALL(TH_OPIVX2_RM, th_vsmul_vx_d, OP_SSS_D, H8, H8, vsmul64)
+GEN_TH_VX_RM(th_vsmul_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vsmul_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vsmul_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vsmul_vx_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 33/65] target/riscv: Add widening saturating scaled multiply-add instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:03 +0800
Message-ID: <20240412073735.76413-34-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

RVV1.0 has no counterparts to these instructions, so they are
implemented with dedicated helper functions rather than with code
copied from RVV1.0.
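
As a rough illustration of what the new helpers compute, here is a
minimal sketch of the unsigned 8-bit case. It is not part of the patch:
the names round_rnu, sat_addu16 and wsmaccu8_sketch are hypothetical,
only the round-to-nearest-up mode is modelled (the real get_round()
handles every vxrm mode), and saturation is reported through a plain
int flag, which is an assumption standing in for the vxsat update that
saddu16() performs through env.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rounding increment for a right shift by `shift` bits.
 * Assumption: round-to-nearest-up only, i.e. the increment is the most
 * significant bit that the shift discards.
 */
static uint8_t round_rnu(uint64_t v, uint8_t shift)
{
    assert(shift < 64);
    if (shift == 0) {
        return 0;
    }
    return (v >> (shift - 1)) & 1;
}

/* Saturating unsigned 16-bit add; sets *sat when the sum is clamped. */
static uint16_t sat_addu16(uint16_t a, uint16_t b, int *sat)
{
    uint16_t res = a + b;

    if (res < a) {
        res = UINT16_MAX;
        *sat = 1;
    }
    return res;
}

/*
 * Widening saturating scaled multiply-add, SEW = 8:
 * widen the product to 16 bits, scale it down by SEW/2 = 4 bits with
 * rounding, then accumulate into the 16-bit destination with saturation.
 */
static uint16_t wsmaccu8_sketch(uint8_t a, uint8_t b, uint16_t acc, int *sat)
{
    uint16_t prod = (uint16_t)a * b;
    uint16_t scaled = (prod >> 4) + round_rnu(prod, 4);

    return sat_addu16(acc, scaled, sat);
}
```

For example, wsmaccu8_sketch(3, 8, 10, &sat) forms the product 24,
shifts it right by 4 to get 1, adds the rounding bit 1 and accumulates
the result into 10, giving 12 with the flag left untouched.
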
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  22 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc |  16 +-
 target/riscv/vector_helper.c                  |   2 +-
 target/riscv/vector_internals.h               |   2 +
 target/riscv/xtheadvector_helper.c            | 210 ++++++++++++++++++
 5 files changed, 244 insertions(+), 8 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 85962f7253..d45477ee1b 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1944,3 +1944,25 @@ DEF_HELPER_6(th_vsmul_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsmul_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsmul_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vsmul_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vwsmaccu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmacc_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vwsmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index df653bd1c9..175516e3a7 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1721,19 +1721,21 @@ GEN_OPIVI_TRANS_TH(th_vaadd_vi, IMM_SX, th_vaadd_vx, opivx_check_th)
 GEN_OPIVV_TRANS_TH(th_vsmul_vv, opivv_check_th)
 GEN_OPIVX_TRANS_TH(th_vsmul_vx, opivx_check_th)
 
+/* Vector Widening Saturating Scaled Multiply-Add */
+GEN_OPIVV_WIDEN_TRANS_TH(th_vwsmaccu_vv, opivv_widen_check_th)
+GEN_OPIVV_WIDEN_TRANS_TH(th_vwsmacc_vv, opivv_widen_check_th)
+GEN_OPIVV_WIDEN_TRANS_TH(th_vwsmaccsu_vv, opivv_widen_check_th)
+GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmaccu_vx, opivx_widen_check_th)
+GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmacc_vx, opivx_widen_check_th)
+GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmaccsu_vx, opivx_widen_check_th)
+GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmaccus_vx, opivx_widen_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vwsmaccu_vv)
-TH_TRANS_STUB(th_vwsmaccu_vx)
-TH_TRANS_STUB(th_vwsmacc_vv)
-TH_TRANS_STUB(th_vwsmacc_vx)
-TH_TRANS_STUB(th_vwsmaccsu_vv)
-TH_TRANS_STUB(th_vwsmaccsu_vx)
-TH_TRANS_STUB(th_vwsmaccus_vx)
 TH_TRANS_STUB(th_vssrl_vv)
 TH_TRANS_STUB(th_vssrl_vx)
 TH_TRANS_STUB(th_vssrl_vi)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 331a9a9c7a..ec11acf487 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2296,7 +2296,7 @@ GEN_VEXT_VX_RM(vssub_vx_w, 4)
 GEN_VEXT_VX_RM(vssub_vx_d, 8)
 
 /* Vector Single-Width Averaging Add and Subtract */
-static inline uint8_t get_round(int vxrm, uint64_t v, uint8_t shift)
+uint8_t get_round(int vxrm, uint64_t v, uint8_t shift)
 {
     uint8_t d = extract64(v, shift, 1);
     uint8_t d1;
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index c76ff5abac..99f69ef8fa 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -314,4 +314,6 @@ int16_t vsmul16(CPURISCVState *env, int vxrm, int16_t a, int16_t b);
 int32_t vsmul32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
 int64_t vsmul64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
 
+uint8_t get_round(int vxrm, uint64_t v, uint8_t shift);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index e4acb4d176..1964855d2d 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2313,3 +2313,213 @@ GEN_TH_VX_RM(th_vsmul_vx_b, 1, 1, clearb_th)
 GEN_TH_VX_RM(th_vsmul_vx_h, 2, 2, clearh_th)
 GEN_TH_VX_RM(th_vsmul_vx_w, 4, 4, clearl_th)
 GEN_TH_VX_RM(th_vsmul_vx_d, 8, 8, clearq_th)
+
+/*
+ * Vector Widening Saturating Scaled Multiply-Add
+ *
+ * RVV1.0 does not have similar instructions
+ */
+
+static inline uint16_t
+vwsmaccu8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b,
+          uint16_t c)
+{
+    uint8_t round;
+    uint16_t res = (uint16_t)a * b;
+
+    round = get_round(vxrm, res, 4);
+    res = (res >> 4) + round;
+    return saddu16(env, vxrm, c, res);
+}
+
+static inline uint32_t
+vwsmaccu16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b,
+           uint32_t c)
+{
+    uint8_t round;
+    uint32_t res = (uint32_t)a * b;
+
+    round = get_round(vxrm, res, 8);
+    res = (res >> 8) + round;
+    return saddu32(env, vxrm, c, res);
+}
+
+static inline uint64_t
+vwsmaccu32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b,
+           uint64_t c)
+{
+    uint8_t round;
+    uint64_t res = (uint64_t)a * b;
+
+    round = get_round(vxrm, res, 16);
+    res = (res >> 16) + round;
+    return saddu64(env, vxrm, c, res);
+}
+
+#define TH_OPIVV3_RM(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \
+static inline void \
+do_##NAME(void *vd, void *vs1, void *vs2, int i, \
+          CPURISCVState *env, int vxrm) \
+{ \
+    TX1 s1 = *((T1 *)vs1 + HS1(i)); \
+    TX2 s2 = *((T2 *)vs2 + HS2(i)); \
+    TD d = *((TD *)vd + HD(i)); \
+    *((TD *)vd + HD(i)) = OP(env, vxrm, s2, s1, d); \
+}
+
+THCALL(TH_OPIVV3_RM, th_vwsmaccu_vv_b, WOP_UUU_B, H2, H1, H1, vwsmaccu8)
+THCALL(TH_OPIVV3_RM, th_vwsmaccu_vv_h, WOP_UUU_H, H4, H2, H2, vwsmaccu16)
+THCALL(TH_OPIVV3_RM, th_vwsmaccu_vv_w, WOP_UUU_W, H8, H4, H4, vwsmaccu32)
+GEN_TH_VV_RM(th_vwsmaccu_vv_b, 1, 2, clearh_th)
+GEN_TH_VV_RM(th_vwsmaccu_vv_h, 2, 4, clearl_th)
+GEN_TH_VV_RM(th_vwsmaccu_vv_w, 4, 8, clearq_th)
+
+#define TH_OPIVX3_RM(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \
+static inline void \
+do_##NAME(void *vd, target_long s1, void *vs2, int i, \
+          CPURISCVState *env, int vxrm) \
+{ \
+    TX2 s2 = *((T2 *)vs2 + HS2(i)); \
+    TD d = *((TD *)vd + HD(i)); \
+    *((TD *)vd + HD(i)) = OP(env, vxrm, s2, (TX1)(T1)s1, d); \
+}
+
+THCALL(TH_OPIVX3_RM, th_vwsmaccu_vx_b, WOP_UUU_B, H2, H1, vwsmaccu8)
+THCALL(TH_OPIVX3_RM, th_vwsmaccu_vx_h, WOP_UUU_H, H4, H2, vwsmaccu16)
+THCALL(TH_OPIVX3_RM, th_vwsmaccu_vx_w, WOP_UUU_W, H8, H4, vwsmaccu32)
+GEN_TH_VX_RM(th_vwsmaccu_vx_b, 1, 2, clearh_th)
+GEN_TH_VX_RM(th_vwsmaccu_vx_h, 2, 4, clearl_th)
+GEN_TH_VX_RM(th_vwsmaccu_vx_w, 4, 8, clearq_th)
+
+static inline int16_t
+vwsmacc8(CPURISCVState *env, int vxrm, int8_t a, int8_t b, int16_t c)
+{
+    uint8_t round;
+    int16_t res = (int16_t)a * b;
+
+    round = get_round(vxrm, res, 4);
+    res = (res >> 4) + round;
+    return sadd16(env, vxrm, c, res);
+}
+
+static inline int32_t
+vwsmacc16(CPURISCVState *env, int vxrm, int16_t a, int16_t b, int32_t c)
+{
+    uint8_t round;
+    int32_t res = (int32_t)a * b;
+
+    round = get_round(vxrm, res, 8);
+    res = (res >> 8) + round;
+    return sadd32(env, vxrm, c, res);
+
+}
+
+static inline int64_t
+vwsmacc32(CPURISCVState *env, int vxrm, int32_t a, int32_t b, int64_t c)
+{
+    uint8_t round;
+    int64_t res = (int64_t)a * b;
+
+    round = get_round(vxrm, res, 16);
+    res = (res >> 16) + round;
+    return sadd64(env, vxrm, c, res);
+}
+
+THCALL(TH_OPIVV3_RM, th_vwsmacc_vv_b, WOP_SSS_B, H2, H1, H1, vwsmacc8)
+THCALL(TH_OPIVV3_RM, th_vwsmacc_vv_h, WOP_SSS_H, H4, H2, H2, vwsmacc16)
+THCALL(TH_OPIVV3_RM, th_vwsmacc_vv_w, WOP_SSS_W, H8, H4, H4, vwsmacc32)
+GEN_TH_VV_RM(th_vwsmacc_vv_b, 1, 2, clearh_th)
+GEN_TH_VV_RM(th_vwsmacc_vv_h, 2, 4, clearl_th)
+GEN_TH_VV_RM(th_vwsmacc_vv_w, 4, 8, clearq_th)
+THCALL(TH_OPIVX3_RM, th_vwsmacc_vx_b, WOP_SSS_B, H2, H1, vwsmacc8)
+THCALL(TH_OPIVX3_RM, th_vwsmacc_vx_h, WOP_SSS_H, H4, H2, vwsmacc16)
+THCALL(TH_OPIVX3_RM, th_vwsmacc_vx_w, WOP_SSS_W, H8, H4, vwsmacc32)
+GEN_TH_VX_RM(th_vwsmacc_vx_b, 1, 2, clearh_th)
+GEN_TH_VX_RM(th_vwsmacc_vx_h, 2, 4, clearl_th)
+GEN_TH_VX_RM(th_vwsmacc_vx_w, 4, 8, clearq_th)
+
+static inline int16_t
+vwsmaccsu8(CPURISCVState *env, int vxrm, uint8_t a, int8_t b, int16_t c)
+{
+    uint8_t round;
+    int16_t res = a * (int16_t)b;
+
+    round = get_round(vxrm, res, 4);
+    res = (res >> 4) + round;
+    return ssub16(env, vxrm, c, res);
+}
+
+static inline int32_t
+vwsmaccsu16(CPURISCVState *env, int vxrm, uint16_t a, int16_t b, int32_t c)
+{
+    uint8_t round;
+    int32_t res = a * (int32_t)b;
+
+    round = get_round(vxrm, res, 8);
+    res = (res >> 8) + round;
+    return ssub32(env, vxrm, c, res);
+}
+
+static inline int64_t
+vwsmaccsu32(CPURISCVState *env, int vxrm, uint32_t a, int32_t b, int64_t c)
+{
+    uint8_t round;
+    int64_t res = a * (int64_t)b;
+
+    round = get_round(vxrm, res, 16);
+    res = (res >> 16) + round;
+    return ssub64(env, vxrm, c, res);
+}
+
+THCALL(TH_OPIVV3_RM, th_vwsmaccsu_vv_b, WOP_SSU_B, H2, H1, H1, vwsmaccsu8)
+THCALL(TH_OPIVV3_RM, th_vwsmaccsu_vv_h, WOP_SSU_H, H4, H2, H2, vwsmaccsu16)
+THCALL(TH_OPIVV3_RM, th_vwsmaccsu_vv_w, WOP_SSU_W, H8, H4, H4, vwsmaccsu32)
+GEN_TH_VV_RM(th_vwsmaccsu_vv_b, 1, 2, clearh_th)
+GEN_TH_VV_RM(th_vwsmaccsu_vv_h, 2, 4, clearl_th)
+GEN_TH_VV_RM(th_vwsmaccsu_vv_w, 4, 8, clearq_th)
+THCALL(TH_OPIVX3_RM, th_vwsmaccsu_vx_b, WOP_SSU_B, H2, H1, vwsmaccsu8)
+THCALL(TH_OPIVX3_RM, th_vwsmaccsu_vx_h, WOP_SSU_H, H4, H2, vwsmaccsu16)
+THCALL(TH_OPIVX3_RM, th_vwsmaccsu_vx_w, WOP_SSU_W, H8, H4, vwsmaccsu32)
+GEN_TH_VX_RM(th_vwsmaccsu_vx_b, 1, 2, clearh_th)
+GEN_TH_VX_RM(th_vwsmaccsu_vx_h, 2, 4, clearl_th)
+GEN_TH_VX_RM(th_vwsmaccsu_vx_w, 4, 8, clearq_th)
+
+static inline int16_t
+vwsmaccus8(CPURISCVState *env, int vxrm, int8_t a, uint8_t b, int16_t c)
+{
+    uint8_t round;
+    int16_t res = (int16_t)a * b;
+
+    round = get_round(vxrm, res, 4);
+    res = (res >> 4) + round;
+    return ssub16(env, vxrm, c, res);
+}
+
+static inline int32_t
+vwsmaccus16(CPURISCVState *env, int vxrm, int16_t a, uint16_t b, int32_t c)
+{
+    uint8_t round;
+    int32_t res = (int32_t)a * b;
+
+    round = get_round(vxrm, res, 8);
+    res = (res >> 8) + round;
+    return ssub32(env, vxrm, c, res);
+}
+
+static inline int64_t
+vwsmaccus32(CPURISCVState *env, int vxrm, int32_t a, uint32_t b, int64_t c)
+{
+    uint8_t round;
+    int64_t res = (int64_t)a * b;
+
+    round = get_round(vxrm, res, 16);
+    res = (res >> 16) + round;
+    return ssub64(env, vxrm, c, res);
+}
+
+THCALL(TH_OPIVX3_RM, th_vwsmaccus_vx_b, WOP_SUS_B, H2, H1, vwsmaccus8)
+THCALL(TH_OPIVX3_RM, th_vwsmaccus_vx_h, WOP_SUS_H, H4, H2, vwsmaccus16)
+THCALL(TH_OPIVX3_RM, th_vwsmaccus_vx_w, WOP_SUS_W, H8, H4, vwsmaccus32)
+GEN_TH_VX_RM(th_vwsmaccus_vx_b, 1, 2, clearh_th)
+GEN_TH_VX_RM(th_vwsmaccus_vx_h, 2, 4, clearl_th)
+GEN_TH_VX_RM(th_vwsmaccus_vx_w, 4, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 34/65] target/riscv: Add single-width scaling shift instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:04 +0800
Message-ID: <20240412073735.76413-35-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

These instructions behave the same as their RVV1.0 counterparts; only
the general differences between XTheadVector and RVV1.0 apply.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 17 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 14 ++++---
 target/riscv/vector_helper.c                  | 24 ++++--------
 target/riscv/vector_internals.h               | 10 +++++
 target/riscv/xtheadvector_helper.c            | 38 +++++++++++++++++++
 5 files changed, 81 insertions(+), 22 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index d45477ee1b..70d3f34a59 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1966,3 +1966,20 @@ DEF_HELPER_6(th_vwsmaccsu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsmaccus_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsmaccus_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vwsmaccus_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vssrl_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssrl_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vssra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 175516e3a7..d1f523832b 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1730,18 +1730,20 @@ GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmacc_vx, opivx_widen_check_th)
 GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmaccsu_vx, opivx_widen_check_th)
 GEN_OPIVX_WIDEN_TRANS_TH(th_vwsmaccus_vx, opivx_widen_check_th)
 
+/* Vector Single-Width Scaling Shift Instructions */
+GEN_OPIVV_TRANS_TH(th_vssrl_vv, opivv_check_th)
+GEN_OPIVV_TRANS_TH(th_vssra_vv, opivv_check_th)
+GEN_OPIVX_TRANS_TH(th_vssrl_vx, opivx_check_th)
+GEN_OPIVX_TRANS_TH(th_vssra_vx, opivx_check_th)
+GEN_OPIVI_TRANS_TH(th_vssrl_vi, IMM_TRUNC_SEW, th_vssrl_vx, opivx_check_th)
+GEN_OPIVI_TRANS_TH(th_vssra_vi, IMM_TRUNC_SEW, th_vssra_vx, opivx_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vssrl_vv)
-TH_TRANS_STUB(th_vssrl_vx)
-TH_TRANS_STUB(th_vssrl_vi)
-TH_TRANS_STUB(th_vssra_vv)
-TH_TRANS_STUB(th_vssra_vx)
-TH_TRANS_STUB(th_vssra_vi)
 TH_TRANS_STUB(th_vnclipu_vv)
 TH_TRANS_STUB(th_vnclipu_vx)
 TH_TRANS_STUB(th_vnclipu_vi)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ec11acf487..be1f1bc8e2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2581,8 +2581,7 @@ GEN_VEXT_VX_RM(vsmul_vx_w, 4)
 GEN_VEXT_VX_RM(vsmul_vx_d, 8)
 
 /* Vector Single-Width Scaling Shift Instructions */
-static inline uint8_t
-vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b)
+uint8_t vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b)
 {
     uint8_t round, shift = b & 0x7;
     uint8_t res;
@@ -2591,24 +2590,21 @@ vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b)
     res = (a >> shift) + round;
     return res;
 }
-static inline uint16_t
-vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b)
+uint16_t vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b)
 {
     uint8_t round, shift = b & 0xf;
 
     round = get_round(vxrm, a, shift);
     return (a >> shift) + round;
 }
-static inline uint32_t
-vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b)
+uint32_t vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b)
 {
     uint8_t round, shift = b & 0x1f;
 
     round = get_round(vxrm, a, shift);
     return (a >> shift) + round;
 }
-static inline uint64_t
-vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b)
+uint64_t vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b)
 {
     uint8_t round, shift = b & 0x3f;
 
@@ -2633,32 +2629,28 @@ GEN_VEXT_VX_RM(vssrl_vx_h, 2)
 GEN_VEXT_VX_RM(vssrl_vx_w, 4)
 GEN_VEXT_VX_RM(vssrl_vx_d, 8)
 
-static inline int8_t
-vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
+int8_t vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b)
 {
     uint8_t round, shift = b & 0x7;
 
     round = get_round(vxrm, a, shift);
     return (a >> shift) + round;
 }
-static inline int16_t
-vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b)
+int16_t vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b)
 {
     uint8_t round, shift = b & 0xf;
 
     round = get_round(vxrm, a, shift);
     return (a >> shift) + round;
 }
-static inline int32_t
-vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b)
+int32_t vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b)
 {
     uint8_t round, shift = b & 0x1f;
 
     round = get_round(vxrm, a, shift);
     return (a >> shift) + round;
 }
-static inline int64_t
-vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b)
+int64_t vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b)
 {
     uint8_t round, shift = b & 0x3f;
 
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 99f69ef8fa..02b5fd49f0 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -316,4 +316,14 @@ int64_t vsmul64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
 
 uint8_t get_round(int vxrm, uint64_t v, uint8_t shift);
 
+uint8_t vssrl8(CPURISCVState *env, int vxrm, uint8_t a, uint8_t b);
+uint16_t vssrl16(CPURISCVState *env, int vxrm, uint16_t a, uint16_t b);
+uint32_t vssrl32(CPURISCVState *env, int vxrm, uint32_t a, uint32_t b);
+uint64_t vssrl64(CPURISCVState *env, int vxrm, uint64_t a, uint64_t b);
+
+int8_t vssra8(CPURISCVState *env, int vxrm, int8_t a, int8_t b);
+int16_t vssra16(CPURISCVState *env, int vxrm, int16_t a, int16_t b);
+int32_t vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b);
+int64_t vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 1964855d2d..8cd3fd028b 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2523,3 +2523,41 @@ THCALL(TH_OPIVX3_RM, th_vwsmaccus_vx_w, WOP_SUS_W, H8, H4, vwsmaccus32)
 GEN_TH_VX_RM(th_vwsmaccus_vx_b, 1, 2, clearh_th)
 GEN_TH_VX_RM(th_vwsmaccus_vx_h, 2, 4, clearl_th)
 GEN_TH_VX_RM(th_vwsmaccus_vx_w, 4, 8, clearq_th)
+
+/* Vector Single-Width Scaling Shift Instructions */
+
+THCALL(TH_OPIVV2_RM, th_vssrl_vv_b, OP_UUU_B, H1, H1, H1, vssrl8)
+THCALL(TH_OPIVV2_RM, th_vssrl_vv_h, OP_UUU_H, H2, H2, H2, vssrl16)
+THCALL(TH_OPIVV2_RM, th_vssrl_vv_w, OP_UUU_W, H4, H4, H4, vssrl32)
+THCALL(TH_OPIVV2_RM, th_vssrl_vv_d, OP_UUU_D, H8, H8, H8, vssrl64)
+GEN_TH_VV_RM(th_vssrl_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vssrl_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vssrl_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vssrl_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vssrl_vx_b, OP_UUU_B, H1, H1, vssrl8)
+THCALL(TH_OPIVX2_RM, th_vssrl_vx_h, OP_UUU_H, H2, H2, vssrl16)
+THCALL(TH_OPIVX2_RM, th_vssrl_vx_w, OP_UUU_W, H4, H4, vssrl32)
+THCALL(TH_OPIVX2_RM, th_vssrl_vx_d, OP_UUU_D, H8, H8, vssrl64)
+GEN_TH_VX_RM(th_vssrl_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vssrl_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vssrl_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vssrl_vx_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVV2_RM, th_vssra_vv_b, OP_SSS_B, H1, H1, H1, vssra8)
+THCALL(TH_OPIVV2_RM, th_vssra_vv_h, OP_SSS_H, H2, H2, H2, vssra16)
+THCALL(TH_OPIVV2_RM, th_vssra_vv_w, OP_SSS_W, H4, H4, H4, vssra32)
+THCALL(TH_OPIVV2_RM, th_vssra_vv_d, OP_SSS_D, H8, H8, H8, vssra64)
+GEN_TH_VV_RM(th_vssra_vv_b, 1, 1, clearb_th)
+GEN_TH_VV_RM(th_vssra_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_RM(th_vssra_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_RM(th_vssra_vv_d, 8, 8, clearq_th)
+
+THCALL(TH_OPIVX2_RM, th_vssra_vx_b, OP_SSS_B, H1, H1, vssra8)
+THCALL(TH_OPIVX2_RM, th_vssra_vx_h, OP_SSS_H, H2, H2, vssra16)
+THCALL(TH_OPIVX2_RM, th_vssra_vx_w, OP_SSS_W, H4, H4, vssra32)
+THCALL(TH_OPIVX2_RM, th_vssra_vx_d, OP_SSS_D, H8, H8, vssra64)
+GEN_TH_VX_RM(th_vssra_vx_b, 1, 1, clearb_th)
+GEN_TH_VX_RM(th_vssra_vx_h, 2, 2, clearh_th)
+GEN_TH_VX_RM(th_vssra_vx_w, 4, 4, clearl_th)
+GEN_TH_VX_RM(th_vssra_vx_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 35/65] target/riscv: Add narrowing fixed-point clip instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:05 +0800
Message-ID: <20240412073735.76413-36-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

These instructions behave the same as their RVV1.0 counterparts; only
the general differences between XTheadVector and RVV1.0 apply.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 13 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 14 +++++----
 target/riscv/vector_helper.c                  | 26 ++++-------------
 target/riscv/vector_internals.h               | 14 +++++++++
 target/riscv/xtheadvector_helper.c            | 29 +++++++++++++++++++
 5 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 70d3f34a59..6254be771f 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1983,3 +1983,16 @@ DEF_HELPER_6(th_vssra_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssra_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssra_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vssra_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vnclip_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclip_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclip_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnclipu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnclip_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnclip_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vnclip_vx_w, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index d1f523832b..108f3249d0 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1738,18 +1738,20 @@ GEN_OPIVX_TRANS_TH(th_vssra_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vssrl_vi, IMM_TRUNC_SEW, th_vssrl_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vssra_vi, IMM_TRUNC_SEW, th_vssra_vx, opivx_check_th)
 
+/* Vector Narrowing Fixed-Point Clip Instructions */
+GEN_OPIVV_NARROW_TRANS_TH(th_vnclipu_vv)
+GEN_OPIVV_NARROW_TRANS_TH(th_vnclip_vv)
+GEN_OPIVX_NARROW_TRANS_TH(th_vnclipu_vx)
+GEN_OPIVX_NARROW_TRANS_TH(th_vnclip_vx)
+GEN_OPIVI_NARROW_TRANS_TH(th_vnclipu_vi, IMM_ZX, th_vnclipu_vx)
+GEN_OPIVI_NARROW_TRANS_TH(th_vnclip_vi, IMM_ZX, th_vnclip_vx)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vnclipu_vv)
-TH_TRANS_STUB(th_vnclipu_vx)
-TH_TRANS_STUB(th_vnclipu_vi)
-TH_TRANS_STUB(th_vnclip_vv)
-TH_TRANS_STUB(th_vnclip_vx)
-TH_TRANS_STUB(th_vnclip_vi)
 TH_TRANS_STUB(th_vfadd_vv)
 TH_TRANS_STUB(th_vfadd_vf)
 TH_TRANS_STUB(th_vfsub_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index be1f1bc8e2..262cb28824 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -646,14 +646,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b)
  * Vector Integer Arithmetic Instructions
  */
 
-/* (TD, T1, T2, TX1, TX2) */
-#define NOP_SSS_B int8_t, int8_t, int16_t, int8_t, int16_t
-#define NOP_SSS_H int16_t, int16_t, int32_t, int16_t, int32_t
-#define NOP_SSS_W int32_t, int32_t, int64_t, int32_t, int64_t
-#define NOP_UUU_B uint8_t, uint8_t, uint16_t, uint8_t, uint16_t
-#define NOP_UUU_H uint16_t, uint16_t, uint32_t, uint16_t, uint32_t
-#define NOP_UUU_W uint32_t, uint32_t, uint64_t, uint32_t, uint64_t
-
 #define DO_SUB(N, M) (N - M)
 #define DO_RSUB(N, M) (M - N)
 
@@ -2677,8 +2669,7 @@ GEN_VEXT_VX_RM(vssra_vx_w, 4)
GEN_VEXT_VX_RM(vssra_vx_d, 8) =20 /* Vector Narrowing Fixed-Point Clip Instructions */ -static inline int8_t -vnclip8(CPURISCVState *env, int vxrm, int16_t a, int8_t b) +int8_t vnclip8(CPURISCVState *env, int vxrm, int16_t a, int8_t b) { uint8_t round, shift =3D b & 0xf; int16_t res; @@ -2696,8 +2687,7 @@ vnclip8(CPURISCVState *env, int vxrm, int16_t a, int8= _t b) } } =20 -static inline int16_t -vnclip16(CPURISCVState *env, int vxrm, int32_t a, int16_t b) +int16_t vnclip16(CPURISCVState *env, int vxrm, int32_t a, int16_t b) { uint8_t round, shift =3D b & 0x1f; int32_t res; @@ -2715,8 +2705,7 @@ vnclip16(CPURISCVState *env, int vxrm, int32_t a, int= 16_t b) } } =20 -static inline int32_t -vnclip32(CPURISCVState *env, int vxrm, int64_t a, int32_t b) +int32_t vnclip32(CPURISCVState *env, int vxrm, int64_t a, int32_t b) { uint8_t round, shift =3D b & 0x3f; int64_t res; @@ -2748,8 +2737,7 @@ GEN_VEXT_VX_RM(vnclip_wx_b, 1) GEN_VEXT_VX_RM(vnclip_wx_h, 2) GEN_VEXT_VX_RM(vnclip_wx_w, 4) =20 -static inline uint8_t -vnclipu8(CPURISCVState *env, int vxrm, uint16_t a, uint8_t b) +uint8_t vnclipu8(CPURISCVState *env, int vxrm, uint16_t a, uint8_t b) { uint8_t round, shift =3D b & 0xf; uint16_t res; @@ -2764,8 +2752,7 @@ vnclipu8(CPURISCVState *env, int vxrm, uint16_t a, ui= nt8_t b) } } =20 -static inline uint16_t -vnclipu16(CPURISCVState *env, int vxrm, uint32_t a, uint16_t b) +uint16_t vnclipu16(CPURISCVState *env, int vxrm, uint32_t a, uint16_t b) { uint8_t round, shift =3D b & 0x1f; uint32_t res; @@ -2780,8 +2767,7 @@ vnclipu16(CPURISCVState *env, int vxrm, uint32_t a, u= int16_t b) } } =20 -static inline uint32_t -vnclipu32(CPURISCVState *env, int vxrm, uint64_t a, uint32_t b) +uint32_t vnclipu32(CPURISCVState *env, int vxrm, uint64_t a, uint32_t b) { uint8_t round, shift =3D b & 0x3f; uint64_t res; diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 02b5fd49f0..a42dc080ec 100644 --- a/target/riscv/vector_internals.h +++ 
b/target/riscv/vector_internals.h @@ -255,6 +255,12 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,= \ #define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t #define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t #define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t +#define NOP_SSS_B int8_t, int8_t, int16_t, int8_t, int16_t +#define NOP_SSS_H int16_t, int16_t, int32_t, int16_t, int32_t +#define NOP_SSS_W int32_t, int32_t, int64_t, int32_t, int64_t +#define NOP_UUU_B uint8_t, uint8_t, uint16_t, uint8_t, uint16_t +#define NOP_UUU_H uint16_t, uint16_t, uint32_t, uint16_t, uint32_t +#define NOP_UUU_W uint32_t, uint32_t, uint64_t, uint32_t, uint64_t =20 /* share functions */ static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) @@ -326,4 +332,12 @@ int16_t vssra16(CPURISCVState *env, int vxrm, int16_t = a, int16_t b); int32_t vssra32(CPURISCVState *env, int vxrm, int32_t a, int32_t b); int64_t vssra64(CPURISCVState *env, int vxrm, int64_t a, int64_t b); =20 +int8_t vnclip8(CPURISCVState *env, int vxrm, int16_t a, int8_t b); +int16_t vnclip16(CPURISCVState *env, int vxrm, int32_t a, int16_t b); +int32_t vnclip32(CPURISCVState *env, int vxrm, int64_t a, int32_t b); + +uint8_t vnclipu8(CPURISCVState *env, int vxrm, uint16_t a, uint8_t b); +uint16_t vnclipu16(CPURISCVState *env, int vxrm, uint32_t a, uint16_t b); +uint32_t vnclipu32(CPURISCVState *env, int vxrm, uint64_t a, uint32_t b); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 8cd3fd028b..2e97a95392 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -2561,3 +2561,32 @@ GEN_TH_VX_RM(th_vssra_vx_b, 1, 1, clearb_th) GEN_TH_VX_RM(th_vssra_vx_h, 2, 2, clearh_th) GEN_TH_VX_RM(th_vssra_vx_w, 4, 4, clearl_th) GEN_TH_VX_RM(th_vssra_vx_d, 8, 8, clearq_th) + +/* Vector Narrowing Fixed-Point Clip Instructions */ +THCALL(TH_OPIVV2_RM, 
th_vnclip_vv_b, NOP_SSS_B, H1, H2, H1, vnclip8) +THCALL(TH_OPIVV2_RM, th_vnclip_vv_h, NOP_SSS_H, H2, H4, H2, vnclip16) +THCALL(TH_OPIVV2_RM, th_vnclip_vv_w, NOP_SSS_W, H4, H8, H4, vnclip32) +GEN_TH_VV_RM(th_vnclip_vv_b, 1, 1, clearb_th) +GEN_TH_VV_RM(th_vnclip_vv_h, 2, 2, clearh_th) +GEN_TH_VV_RM(th_vnclip_vv_w, 4, 4, clearl_th) + +THCALL(TH_OPIVX2_RM, th_vnclip_vx_b, NOP_SSS_B, H1, H2, vnclip8) +THCALL(TH_OPIVX2_RM, th_vnclip_vx_h, NOP_SSS_H, H2, H4, vnclip16) +THCALL(TH_OPIVX2_RM, th_vnclip_vx_w, NOP_SSS_W, H4, H8, vnclip32) +GEN_TH_VX_RM(th_vnclip_vx_b, 1, 1, clearb_th) +GEN_TH_VX_RM(th_vnclip_vx_h, 2, 2, clearh_th) +GEN_TH_VX_RM(th_vnclip_vx_w, 4, 4, clearl_th) + +THCALL(TH_OPIVV2_RM, th_vnclipu_vv_b, NOP_UUU_B, H1, H2, H1, vnclipu8) +THCALL(TH_OPIVV2_RM, th_vnclipu_vv_h, NOP_UUU_H, H2, H4, H2, vnclipu16) +THCALL(TH_OPIVV2_RM, th_vnclipu_vv_w, NOP_UUU_W, H4, H8, H4, vnclipu32) +GEN_TH_VV_RM(th_vnclipu_vv_b, 1, 1, clearb_th) +GEN_TH_VV_RM(th_vnclipu_vv_h, 2, 2, clearh_th) +GEN_TH_VV_RM(th_vnclipu_vv_w, 4, 4, clearl_th) + +THCALL(TH_OPIVX2_RM, th_vnclipu_vx_b, NOP_UUU_B, H1, H2, vnclipu8) +THCALL(TH_OPIVX2_RM, th_vnclipu_vx_h, NOP_UUU_H, H2, H4, vnclipu16) +THCALL(TH_OPIVX2_RM, th_vnclipu_vx_w, NOP_UUU_W, H4, H8, vnclipu32) +GEN_TH_VX_RM(th_vnclipu_vx_b, 1, 1, clearb_th) +GEN_TH_VX_RM(th_vnclipu_vx_h, 2, 2, clearh_th) +GEN_TH_VX_RM(th_vnclipu_vx_w, 4, 4, clearl_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712911941; cv=none; d=zohomail.com; s=zohoarc; b=RhlmBKg2DQ071AY+wdaE9T+XhiNUtvEg1ARVQqGCGInsOHH3yaoqjkDJLMdRBhBmOjPBX+mZxTWXOLEwBKqFOv6wuGxHZy+WUJZJP7HdiS6P6oicfiigG13rNFSj8EhfQIYyg0mMk41Z+8H0iRXMrWS4W22W4yqX8aBVYt/wJxg= 
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 36/65] target/riscv: Add single-width floating-point add/sub instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:06 +0800
Message-ID: <20240412073735.76413-37-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch adds the single-width floating-point add/sub instructions to
show how XTheadVector floating-point arithmetic instructions are
implemented.

XTheadVector differs from RVV1.0 in the following points:
1. Different mask register layout.
2. Different policy for processing tail and masked-off elements.
3. Different check policy. XTheadVector does not have fractional LMUL, so
   a simpler check function can be used.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  16 +++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 113 +++++++++++++++++-
 target/riscv/vector_helper.c                  |   6 +-
 target/riscv/vector_internals.h               |   4 +
 target/riscv/xtheadvector_helper.c            | 106 ++++++++++++++++
 5 files changed, 237 insertions(+), 8 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6254be771f..04bd363ac0 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1996,3 +1996,19 @@ DEF_HELPER_6(th_vnclipu_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vnclip_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vnclip_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vnclip_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vfadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfadd_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfadd_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfadd_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 108f3249d0..a18c661f24 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1746,17 +1746,120 @@ GEN_OPIVX_NARROW_TRANS_TH(th_vnclip_vx)
 GEN_OPIVI_NARROW_TRANS_TH(th_vnclipu_vi, IMM_ZX, th_vnclipu_vx)
 GEN_OPIVI_NARROW_TRANS_TH(th_vnclip_vi, IMM_ZX, th_vnclip_vx)
 
+/*
+ * Vector Floating-Point Arithmetic Instructions
+ */
+
+/* Vector Single-Width Floating-Point Add/Subtract Instructions */
+
+/*
+ * If the current SEW does not correspond to a supported IEEE floating-point
+ * type, an illegal instruction exception is raised.
+ */
+static bool opfvv_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return require_xtheadvector(s) &&
+           vext_check_isa_ill(s) &&
+           th_check_overlap_mask(s, a->rd, a->vm, false) &&
+           th_check_reg(s, a->rd, false) &&
+           th_check_reg(s, a->rs2, false) &&
+           th_check_reg(s, a->rs1, false) &&
+           (s->sew != 0);
+}
+
+/*
+ * The macros below, including GEN_OPFVV_TRANS_TH and GEN_OPFVF_TRANS_TH,
+ * only change the data encoding compared to RVV1.0.
+ */
+
+/* OPFVV without GVEC IR */
+#define GEN_OPFVV_TRANS_TH(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (CHECK(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_gvec_4_ptr * const fns[3] = { \
+            gen_helper_##NAME##_h, \
+            gen_helper_##NAME##_w, \
+            gen_helper_##NAME##_d, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), \
+                           vreg_ofs(s, 0), \
+                           vreg_ofs(s, a->rs1), \
+                           vreg_ofs(s, a->rs2), tcg_env, \
+                           s->cfg_ptr->vlenb, \
+                           s->cfg_ptr->vlenb, data, \
+                           fns[s->sew - 1]); \
+        finalize_rvv_inst(s); \
+        return true; \
+    } \
+    return false; \
+}
+
+GEN_OPFVV_TRANS_TH(th_vfadd_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfsub_vv, opfvv_check_th)
+
+#define gen_helper_opfvf_th gen_helper_opfvf
+
+static bool opfvf_check_th(DisasContext *s, arg_rmrr *a)
+{
+/*
+ * If the current SEW does not correspond to a supported IEEE floating-point
+ * type, an illegal instruction exception is raised.
+ */
+    return require_xtheadvector(s) &&
+           vext_check_isa_ill(s) &&
+           th_check_overlap_mask(s, a->rd, a->vm, false) &&
+           th_check_reg(s, a->rd, false) &&
+           th_check_reg(s, a->rs2, false) &&
+           (s->sew != 0);
+}
+
+/*
+ * OPFVF without GVEC IR
+ *
+ * XTheadVector handles the FLEN < SEW case differently from RVV1.0.
+ * In XTheadVector, when FLEN < SEW, the value in the freg should be
+ * NaN-boxed, while in RVV1.0 this situation is reserved.
+ * However, RVF-only CPUs always have values NaN-boxed to 64 bits, so
+ * we do not have to deal with this situation differently. We can just
+ * use the RVV function opfvf_trans.
+ */
+#define GEN_OPFVF_TRANS_TH(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (CHECK(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_opfvf_th *const fns[3] = { \
+            gen_helper_##NAME##_h, \
+            gen_helper_##NAME##_w, \
+            gen_helper_##NAME##_d, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        return opfvf_trans(a->rd, a->rs1, a->rs2, data, \
+                           fns[s->sew - 1], s); \
+    } \
+    return false; \
+}
+
+GEN_OPFVF_TRANS_TH(th_vfadd_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfsub_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfrsub_vf, opfvf_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfadd_vv)
-TH_TRANS_STUB(th_vfadd_vf)
-TH_TRANS_STUB(th_vfsub_vv)
-TH_TRANS_STUB(th_vfsub_vf)
-TH_TRANS_STUB(th_vfrsub_vf)
 TH_TRANS_STUB(th_vfwadd_vv)
 TH_TRANS_STUB(th_vfwadd_vf)
 TH_TRANS_STUB(th_vfwadd_wv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 262cb28824..3784096da2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2904,17 +2904,17 @@ GEN_VEXT_VF(vfsub_vf_h, 2)
 GEN_VEXT_VF(vfsub_vf_w, 4)
 GEN_VEXT_VF(vfsub_vf_d, 8)
 
-static uint16_t float16_rsub(uint16_t a, uint16_t b, float_status *s)
+uint16_t float16_rsub(uint16_t a, uint16_t b, float_status *s)
 {
     return float16_sub(b, a, s);
 }
 
-static uint32_t float32_rsub(uint32_t a, uint32_t b, float_status *s)
+uint32_t float32_rsub(uint32_t a, uint32_t b, float_status *s)
 {
     return float32_sub(b, a, s);
 }
 
-static uint64_t float64_rsub(uint64_t a, uint64_t b, float_status *s)
+uint64_t float64_rsub(uint64_t a, uint64_t b, float_status *s)
 {
     return float64_sub(b, a, s);
 }
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index a42dc080ec..5f250ab7ba 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -340,4 +340,8 @@ uint8_t vnclipu8(CPURISCVState *env, int vxrm, uint16_t a, uint8_t b);
 uint16_t vnclipu16(CPURISCVState *env, int vxrm, uint32_t a, uint16_t b);
 uint32_t vnclipu32(CPURISCVState *env, int vxrm, uint64_t a, uint32_t b);
 
+uint16_t float16_rsub(uint16_t a, uint16_t b, float_status *s);
+uint32_t float32_rsub(uint32_t a, uint32_t b, float_status *s);
+uint64_t float64_rsub(uint64_t a, uint64_t b, float_status *s);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 2e97a95392..60811ca813 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2590,3 +2590,109 @@ THCALL(TH_OPIVX2_RM, th_vnclipu_vx_w, NOP_UUU_W, H4, H8, vnclipu32)
 GEN_TH_VX_RM(th_vnclipu_vx_b, 1, 1, clearb_th)
 GEN_TH_VX_RM(th_vnclipu_vx_h, 2, 2, clearh_th)
 GEN_TH_VX_RM(th_vnclipu_vx_w, 4, 4, clearl_th)
+
+/*
+ * Vector Floating-Point Arithmetic Instructions
+ */
+
+/* Vector Single-Width Floating-Point Add/Subtract Instructions */
+
+/*
+ * Some functions and macros are exactly the same as in RVV1.0, but it is
+ * not worth extracting them from the RVV1.0 code, so we just copy
+ * them.
+ */
+#define TH_OPFVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \
+static void do_##NAME(void *vd, void *vs1, void *vs2, int i, \
+                      CPURISCVState *env) \
+{ \
+    TX1 s1 = *((T1 *)vs1 + HS1(i)); \
+    TX2 s2 = *((T2 *)vs2 + HS2(i)); \
+    *((TD *)vd + HD(i)) = OP(s2, s1, &env->fp_status); \
+}
+
+#define GEN_TH_VV_ENV(NAME, ESZ, DSZ, CLEAR_FN) \
+void HELPER(NAME)(void *vd, void *v0, void *vs1, \
+                  void *vs2, CPURISCVState *env, \
+                  uint32_t desc) \
+{ \
+    uint32_t vlmax = th_maxsz(desc) / ESZ; \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vm = th_vm(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t i; \
+    \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    for (i = env->vstart; i < vl; i++) { \
+        if (!vm && !th_elem_mask(v0, mlen, i)) { \
+            continue; \
+        } \
+        do_##NAME(vd, vs1, vs2, i, env); \
+    } \
+    env->vstart = 0; \
+    CLEAR_FN(vd, vl, vl * DSZ, vlmax * DSZ); \
+}
+
+THCALL(TH_OPFVV2, th_vfadd_vv_h, OP_UUU_H, H2, H2, H2, float16_add)
+THCALL(TH_OPFVV2, th_vfadd_vv_w, OP_UUU_W, H4, H4, H4, float32_add)
+THCALL(TH_OPFVV2, th_vfadd_vv_d, OP_UUU_D, H8, H8, H8, float64_add)
+GEN_TH_VV_ENV(th_vfadd_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfadd_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfadd_vv_d, 8, 8, clearq_th)
+
+#define TH_OPFVF2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \
+static void do_##NAME(void *vd, uint64_t s1, void *vs2, int i, \
+                      CPURISCVState *env) \
+{ \
+    TX2 s2 = *((T2 *)vs2 + HS2(i)); \
+    *((TD *)vd + HD(i)) = OP(s2, (TX1)(T1)s1, &env->fp_status); \
+}
+
+#define GEN_TH_VF(NAME, ESZ, DSZ, CLEAR_FN) \
+void HELPER(NAME)(void *vd, void *v0, uint64_t s1, \
+                  void *vs2, CPURISCVState *env, \
+                  uint32_t desc) \
+{ \
+    uint32_t vlmax = th_maxsz(desc) / ESZ; \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vm = th_vm(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t i; \
+    \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    for (i = env->vstart; i < vl; i++) { \
+        if (!vm && !th_elem_mask(v0, mlen, i)) { \
+            continue; \
+        } \
+        do_##NAME(vd, s1, vs2, i, env); \
+    } \
+    env->vstart = 0; \
+    CLEAR_FN(vd, vl, vl * DSZ, vlmax * DSZ); \
+}
+
+THCALL(TH_OPFVF2, th_vfadd_vf_h, OP_UUU_H, H2, H2, float16_add)
+THCALL(TH_OPFVF2, th_vfadd_vf_w, OP_UUU_W, H4, H4, float32_add)
+THCALL(TH_OPFVF2, th_vfadd_vf_d, OP_UUU_D, H8, H8, float64_add)
+GEN_TH_VF(th_vfadd_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfadd_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfadd_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV2, th_vfsub_vv_h, OP_UUU_H, H2, H2, H2, float16_sub)
+THCALL(TH_OPFVV2, th_vfsub_vv_w, OP_UUU_W, H4, H4, H4, float32_sub)
+THCALL(TH_OPFVV2, th_vfsub_vv_d, OP_UUU_D, H8, H8, H8, float64_sub)
+GEN_TH_VV_ENV(th_vfsub_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfsub_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfsub_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfsub_vf_h, OP_UUU_H, H2, H2, float16_sub)
+THCALL(TH_OPFVF2, th_vfsub_vf_w, OP_UUU_W, H4, H4, float32_sub)
+THCALL(TH_OPFVF2, th_vfsub_vf_d, OP_UUU_D, H8, H8, float64_sub)
+GEN_TH_VF(th_vfsub_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfsub_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfsub_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVF2, th_vfrsub_vf_h, OP_UUU_H, H2, H2, float16_rsub)
+THCALL(TH_OPFVF2, th_vfrsub_vf_w, OP_UUU_W, H4, H4, float32_rsub)
+THCALL(TH_OPFVF2, th_vfrsub_vf_d, OP_UUU_D, H8, H8, float64_rsub)
+GEN_TH_VF(th_vfrsub_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfrsub_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfrsub_vf_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
ARC-Seal: i=1; a=rsa-sha256; t=1712912042; cv=none; d=zohomail.com; s=zohoarc;
    b=PcRpueAwFze+5guuQl0z21DZYdbXfbz90Ls8uGCPaZNETFzapm1d3+rayroKVksXy2K4i2hJ5W/9OxWFICl+ubq1lHJmTFGpkWw6/dI6y/PThvGNr1Mmj3q31WieheVuz6xJjil/SgYxRsqu6Tvh2347Xp6MydiV5Tkbilo/gFk=
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 37/65] target/riscv: Add widening floating-point add/sub instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:07 +0800
Message-ID: <20240412073735.76413-38-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

These instructions behave the same as their RVV1.0 counterparts.
The only differences are the general ones between XTheadVector and
RVV1.0.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  17 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 162 +++++++++++++++++-
 target/riscv/vector_helper.c                  |  16 +-
 target/riscv/vector_internals.h               |   9 +
 target/riscv/xtheadvector_helper.c            |  38 ++++
 5 files changed, 226 insertions(+), 16 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 04bd363ac0..21916e9e3c 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2012,3 +2012,20 @@ DEF_HELPER_6(th_vfsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfwadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_wv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_wv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_wf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwadd_wf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_wf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwsub_wf_w, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index a18c661f24..64d7a7fb76 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -1854,20 +1854,166 @@ GEN_OPFVF_TRANS_TH(th_vfadd_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfsub_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfrsub_vf, opfvf_check_th)
 
+/* Vector Widening Floating-Point Add/Subtract Instructions */
+static bool opfvv_widen_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, true) &&
+            th_check_reg(s, a->rs2, false) &&
+            th_check_reg(s, a->rs1, false) &&
+            th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2,
+                                   1 << s->lmul) &&
+            th_check_overlap_group(a->rd, 2 << s->lmul, a->rs1,
+                                   1 << s->lmul) &&
+            (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew != 0));
+}
+
+/* OPFVV with WIDEN */
+#define GEN_OPFVV_WIDEN_TRANS_TH(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (CHECK(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_gvec_4_ptr * const fns[2] = { \
+            gen_helper_##NAME##_h, gen_helper_##NAME##_w, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \
+                           vreg_ofs(s, a->rs1), \
+                           vreg_ofs(s, a->rs2), tcg_env, \
+                           s->cfg_ptr->vlenb, \
+                           s->cfg_ptr->vlenb, data, \
+                           fns[s->sew - 1]); \
+        finalize_rvv_inst(s); \
+        return true; \
+    } \
+    return false; \
+}
+
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwadd_vv, opfvv_widen_check_th)
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwsub_vv, opfvv_widen_check_th)
+
+static bool opfvf_widen_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, true) &&
+            th_check_reg(s, a->rs2, false) &&
+            th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2,
+                                   1 << s->lmul) &&
+            (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew != 0));
+}
+
+/* OPFVF with WIDEN */
+#define GEN_OPFVF_WIDEN_TRANS_TH(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opfvf_widen_check_th(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_opfvf *const fns[2] = { \
+            gen_helper_##NAME##_h, gen_helper_##NAME##_w, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        return opfvf_trans(a->rd, a->rs1, a->rs2, data, \
+                           fns[s->sew - 1], s); \
+    } \
+    return false; \
+}
+
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwadd_vf)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwsub_vf)
+
+static bool opfwv_widen_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, true) &&
+            th_check_reg(s, a->rs2, true) &&
+            th_check_reg(s, a->rs1, false) &&
+            th_check_overlap_group(a->rd, 2 << s->lmul, a->rs1,
+                                   1 << s->lmul) &&
+            (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew != 0));
+}
+
+/* WIDEN OPFVV with WIDEN */
+#define GEN_OPFWV_WIDEN_TRANS_TH(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opfwv_widen_check_th(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_gvec_4_ptr * const fns[2] = { \
+            gen_helper_##NAME##_h, gen_helper_##NAME##_w, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), \
+                           vreg_ofs(s, 0), \
+                           vreg_ofs(s, a->rs1), \
+                           vreg_ofs(s, a->rs2), tcg_env, \
+                           s->cfg_ptr->vlenb, \
+                           s->cfg_ptr->vlenb, data, \
+                           fns[s->sew - 1]); \
+        finalize_rvv_inst(s); \
+        return true; \
+    } \
+    return false; \
+}
+
+GEN_OPFWV_WIDEN_TRANS_TH(th_vfwadd_wv)
+GEN_OPFWV_WIDEN_TRANS_TH(th_vfwsub_wv)
+
+static bool opfwf_widen_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, true) &&
+            th_check_reg(s, a->rs2, true) &&
+            (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew != 0));
+}
+
+/* WIDEN OPFVF with WIDEN */
+#define GEN_OPFWF_WIDEN_TRANS_TH(NAME) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+    if (opfwf_widen_check_th(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_opfvf *const fns[2] = { \
+            gen_helper_##NAME##_h, gen_helper_##NAME##_w, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        return opfvf_trans(a->rd, a->rs1, a->rs2, data, \
+                           fns[s->sew - 1], s); \
+    } \
+    return false; \
+}
+
+GEN_OPFWF_WIDEN_TRANS_TH(th_vfwadd_wf)
+GEN_OPFWF_WIDEN_TRANS_TH(th_vfwsub_wf)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfwadd_vv)
-TH_TRANS_STUB(th_vfwadd_vf)
-TH_TRANS_STUB(th_vfwadd_wv)
-TH_TRANS_STUB(th_vfwadd_wf)
-TH_TRANS_STUB(th_vfwsub_vv)
-TH_TRANS_STUB(th_vfwsub_vf)
-TH_TRANS_STUB(th_vfwsub_wv)
-TH_TRANS_STUB(th_vfwsub_wf)
 TH_TRANS_STUB(th_vfmul_vv)
 TH_TRANS_STUB(th_vfmul_vf)
 TH_TRANS_STUB(th_vfdiv_vv)
diff --git a/target/riscv/vector_helper.c
b/target/riscv/vector_helper.c index 3784096da2..6d0358876a 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -2927,13 +2927,13 @@ GEN_VEXT_VF(vfrsub_vf_w, 4) GEN_VEXT_VF(vfrsub_vf_d, 8) =20 /* Vector Widening Floating-Point Add/Subtract Instructions */ -static uint32_t vfwadd16(uint16_t a, uint16_t b, float_status *s) +uint32_t vfwadd16(uint16_t a, uint16_t b, float_status *s) { return float32_add(float16_to_float32(a, true, s), float16_to_float32(b, true, s), s); } =20 -static uint64_t vfwadd32(uint32_t a, uint32_t b, float_status *s) +uint64_t vfwadd32(uint32_t a, uint32_t b, float_status *s) { return float64_add(float32_to_float64(a, s), float32_to_float64(b, s), s); @@ -2949,13 +2949,13 @@ RVVCALL(OPFVF2, vfwadd_vf_w, WOP_UUU_W, H8, H4, vfw= add32) GEN_VEXT_VF(vfwadd_vf_h, 4) GEN_VEXT_VF(vfwadd_vf_w, 8) =20 -static uint32_t vfwsub16(uint16_t a, uint16_t b, float_status *s) +uint32_t vfwsub16(uint16_t a, uint16_t b, float_status *s) { return float32_sub(float16_to_float32(a, true, s), float16_to_float32(b, true, s), s); } =20 -static uint64_t vfwsub32(uint32_t a, uint32_t b, float_status *s) +uint64_t vfwsub32(uint32_t a, uint32_t b, float_status *s) { return float64_sub(float32_to_float64(a, s), float32_to_float64(b, s), s); @@ -2971,12 +2971,12 @@ RVVCALL(OPFVF2, vfwsub_vf_w, WOP_UUU_W, H8, H4, vfw= sub32) GEN_VEXT_VF(vfwsub_vf_h, 4) GEN_VEXT_VF(vfwsub_vf_w, 8) =20 -static uint32_t vfwaddw16(uint32_t a, uint16_t b, float_status *s) +uint32_t vfwaddw16(uint32_t a, uint16_t b, float_status *s) { return float32_add(a, float16_to_float32(b, true, s), s); } =20 -static uint64_t vfwaddw32(uint64_t a, uint32_t b, float_status *s) +uint64_t vfwaddw32(uint64_t a, uint32_t b, float_status *s) { return float64_add(a, float32_to_float64(b, s), s); } @@ -2990,12 +2990,12 @@ RVVCALL(OPFVF2, vfwadd_wf_w, WOP_WUUU_W, H8, H4, vf= waddw32) GEN_VEXT_VF(vfwadd_wf_h, 4) GEN_VEXT_VF(vfwadd_wf_w, 8) =20 -static uint32_t vfwsubw16(uint32_t a, uint16_t 
b, float_status *s) +uint32_t vfwsubw16(uint32_t a, uint16_t b, float_status *s) { return float32_sub(a, float16_to_float32(b, true, s), s); } =20 -static uint64_t vfwsubw32(uint64_t a, uint32_t b, float_status *s) +uint64_t vfwsubw32(uint64_t a, uint32_t b, float_status *s) { return float64_sub(a, float32_to_float64(b, s), s); } diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 5f250ab7ba..0786f5a4e1 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -344,4 +344,13 @@ uint16_t float16_rsub(uint16_t a, uint16_t b, float_st= atus *s); uint32_t float32_rsub(uint32_t a, uint32_t b, float_status *s); uint64_t float64_rsub(uint64_t a, uint64_t b, float_status *s); =20 +uint32_t vfwadd16(uint16_t a, uint16_t b, float_status *s); +uint64_t vfwadd32(uint32_t a, uint32_t b, float_status *s); +uint32_t vfwsub16(uint16_t a, uint16_t b, float_status *s); +uint64_t vfwsub32(uint32_t a, uint32_t b, float_status *s); +uint32_t vfwaddw16(uint32_t a, uint16_t b, float_status *s); +uint64_t vfwaddw32(uint64_t a, uint32_t b, float_status *s); +uint32_t vfwsubw16(uint32_t a, uint16_t b, float_status *s); +uint64_t vfwsubw32(uint64_t a, uint32_t b, float_status *s); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 60811ca813..cab489a4ae 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -2696,3 +2696,41 @@ THCALL(TH_OPFVF2, th_vfrsub_vf_d, OP_UUU_D, H8, H8, = float64_rsub) GEN_TH_VF(th_vfrsub_vf_h, 2, 2, clearh_th) GEN_TH_VF(th_vfrsub_vf_w, 4, 4, clearl_th) GEN_TH_VF(th_vfrsub_vf_d, 8, 8, clearq_th) + +/* Vector Widening Floating-Point Add/Subtract Instructions */ + +THCALL(TH_OPFVV2, th_vfwadd_vv_h, WOP_UUU_H, H4, H2, H2, vfwadd16) +THCALL(TH_OPFVV2, th_vfwadd_vv_w, WOP_UUU_W, H8, H4, H4, vfwadd32) +GEN_TH_VV_ENV(th_vfwadd_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwadd_vv_w, 4, 8, 
clearq_th) +THCALL(TH_OPFVF2, th_vfwadd_vf_h, WOP_UUU_H, H4, H2, vfwadd16) +THCALL(TH_OPFVF2, th_vfwadd_vf_w, WOP_UUU_W, H8, H4, vfwadd32) +GEN_TH_VF(th_vfwadd_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwadd_vf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV2, th_vfwsub_vv_h, WOP_UUU_H, H4, H2, H2, vfwsub16) +THCALL(TH_OPFVV2, th_vfwsub_vv_w, WOP_UUU_W, H8, H4, H4, vfwsub32) +GEN_TH_VV_ENV(th_vfwsub_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwsub_vv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF2, th_vfwsub_vf_h, WOP_UUU_H, H4, H2, vfwsub16) +THCALL(TH_OPFVF2, th_vfwsub_vf_w, WOP_UUU_W, H8, H4, vfwsub32) +GEN_TH_VF(th_vfwsub_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwsub_vf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV2, th_vfwadd_wv_h, WOP_WUUU_H, H4, H2, H2, vfwaddw16) +THCALL(TH_OPFVV2, th_vfwadd_wv_w, WOP_WUUU_W, H8, H4, H4, vfwaddw32) +GEN_TH_VV_ENV(th_vfwadd_wv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwadd_wv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF2, th_vfwadd_wf_h, WOP_WUUU_H, H4, H2, vfwaddw16) +THCALL(TH_OPFVF2, th_vfwadd_wf_w, WOP_WUUU_W, H8, H4, vfwaddw32) +GEN_TH_VF(th_vfwadd_wf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwadd_wf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV2, th_vfwsub_wv_h, WOP_WUUU_H, H4, H2, H2, vfwsubw16) +THCALL(TH_OPFVV2, th_vfwsub_wv_w, WOP_WUUU_W, H8, H4, H4, vfwsubw32) +GEN_TH_VV_ENV(th_vfwsub_wv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwsub_wv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF2, th_vfwsub_wf_h, WOP_WUUU_H, H4, H2, vfwsubw16) +THCALL(TH_OPFVF2, th_vfwsub_wf_w, WOP_WUUU_W, H8, H4, vfwsubw32) +GEN_TH_VF(th_vfwsub_wf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwsub_wf_w, 4, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712912179; cv=none; 
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 38/65] target/riscv: Add single-width floating-point
 multiply/divide instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:08 +0800
Message-ID: <20240412073735.76413-39-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions have the same function as RVV1.0. Overall there are only
general differences between XTheadVector and RVV1.0.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 16 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 12 ++++---
 target/riscv/vector_helper.c                  |  6 ++--
 target/riscv/vector_internals.h               |  4 +++
 target/riscv/xtheadvector_helper.c            | 34 +++++++++++++++++++
 5 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 21916e9e3c..f63239676a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2029,3 +2029,19 @@ DEF_HELPER_6(th_vfwadd_wf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwadd_wf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwsub_wf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwsub_wf_w, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmul_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmul_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmul_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmul_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfdiv_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrdiv_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrdiv_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfrdiv_vf_d, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 64d7a7fb76..940b212f5e 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2008,17 +2008,19 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
 GEN_OPFWF_WIDEN_TRANS_TH(th_vfwadd_wf)
 GEN_OPFWF_WIDEN_TRANS_TH(th_vfwsub_wf)
 
+/* Vector Single-Width Floating-Point Multiply/Divide Instructions */
+GEN_OPFVV_TRANS_TH(th_vfmul_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfdiv_vv, opfvv_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmul_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfdiv_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfrdiv_vf, opfvf_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfmul_vv)
-TH_TRANS_STUB(th_vfmul_vf)
-TH_TRANS_STUB(th_vfdiv_vv)
-TH_TRANS_STUB(th_vfdiv_vf)
-TH_TRANS_STUB(th_vfrdiv_vf)
 TH_TRANS_STUB(th_vfwmul_vv)
 TH_TRANS_STUB(th_vfwmul_vf)
 TH_TRANS_STUB(th_vfmacc_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 6d0358876a..d65b32c584 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3036,17 +3036,17 @@ GEN_VEXT_VF(vfdiv_vf_h, 2)
 GEN_VEXT_VF(vfdiv_vf_w, 4)
 GEN_VEXT_VF(vfdiv_vf_d, 8)
 
-static uint16_t float16_rdiv(uint16_t a, uint16_t b, float_status *s)
+uint16_t float16_rdiv(uint16_t a, uint16_t b, float_status *s)
 {
     return float16_div(b, a, s);
 }
 
-static uint32_t float32_rdiv(uint32_t a, uint32_t b, float_status *s)
+uint32_t float32_rdiv(uint32_t a, uint32_t b, float_status *s)
 {
     return float32_div(b, a, s);
 }
 
-static uint64_t float64_rdiv(uint64_t a, uint64_t b, float_status *s)
+uint64_t float64_rdiv(uint64_t a, uint64_t b, float_status *s)
 {
     return float64_div(b, a, s);
 }
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 0786f5a4e1..29263c6a53 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -353,4 +353,8 @@ uint64_t vfwaddw32(uint64_t a, uint32_t b, float_status *s);
 uint32_t vfwsubw16(uint32_t a, uint16_t b, float_status *s);
 uint64_t vfwsubw32(uint64_t a, uint32_t b, float_status *s);
 
+uint16_t float16_rdiv(uint16_t a, uint16_t b, float_status *s);
+uint32_t float32_rdiv(uint32_t a, uint32_t b, float_status *s);
+uint64_t float64_rdiv(uint64_t a, uint64_t b, float_status *s);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index cab489a4ae..770f36346f 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2734,3 +2734,37 @@ THCALL(TH_OPFVF2, th_vfwsub_wf_h, WOP_WUUU_H, H4, H2, vfwsubw16)
 THCALL(TH_OPFVF2, th_vfwsub_wf_w, WOP_WUUU_W, H8, H4, vfwsubw32)
 GEN_TH_VF(th_vfwsub_wf_h, 2, 4, clearl_th)
 GEN_TH_VF(th_vfwsub_wf_w, 4, 8, clearq_th)
+
+/* Vector Single-Width Floating-Point Multiply/Divide Instructions */
+THCALL(TH_OPFVV2, th_vfmul_vv_h, OP_UUU_H, H2, H2, H2, float16_mul)
+THCALL(TH_OPFVV2, th_vfmul_vv_w, OP_UUU_W, H4, H4, H4, float32_mul)
+THCALL(TH_OPFVV2, th_vfmul_vv_d, OP_UUU_D, H8, H8, H8, float64_mul)
+GEN_TH_VV_ENV(th_vfmul_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmul_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmul_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfmul_vf_h, OP_UUU_H, H2, H2, float16_mul)
+THCALL(TH_OPFVF2, th_vfmul_vf_w, OP_UUU_W, H4, H4, float32_mul)
+THCALL(TH_OPFVF2, th_vfmul_vf_d, OP_UUU_D, H8, H8, float64_mul)
+GEN_TH_VF(th_vfmul_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmul_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmul_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV2, th_vfdiv_vv_h, OP_UUU_H, H2, H2, H2, float16_div)
+THCALL(TH_OPFVV2, th_vfdiv_vv_w, OP_UUU_W, H4, H4, H4, float32_div)
+THCALL(TH_OPFVV2, th_vfdiv_vv_d, OP_UUU_D, H8, H8, H8, float64_div)
+GEN_TH_VV_ENV(th_vfdiv_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfdiv_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfdiv_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfdiv_vf_h, OP_UUU_H, H2, H2, float16_div)
+THCALL(TH_OPFVF2, th_vfdiv_vf_w, OP_UUU_W, H4, H4, float32_div)
+THCALL(TH_OPFVF2, th_vfdiv_vf_d, OP_UUU_D, H8, H8, float64_div)
+GEN_TH_VF(th_vfdiv_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfdiv_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfdiv_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVF2, th_vfrdiv_vf_h, OP_UUU_H, H2, H2, float16_rdiv)
+THCALL(TH_OPFVF2, th_vfrdiv_vf_w, OP_UUU_W, H4, H4, float32_rdiv)
+THCALL(TH_OPFVF2, th_vfrdiv_vf_d, OP_UUU_D, H8, H8, float64_rdiv)
+GEN_TH_VF(th_vfrdiv_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfrdiv_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfrdiv_vf_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 39/65] target/riscv: Add widening floating-point multiply
 instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:09 +0800
Message-ID: <20240412073735.76413-40-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions have the same function as RVV1.0. Overall there are only
general differences between XTheadVector and RVV1.0.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                            |  5 +++++
 target/riscv/insn_trans/trans_xtheadvector.c.inc |  6 ++++--
 target/riscv/vector_helper.c                     |  4 ++--
 target/riscv/vector_internals.h                  |  3 +++
 target/riscv/xtheadvector_helper.c               | 11 +++++++++++
 5 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f63239676a..3102b078e4 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2045,3 +2045,8 @@ DEF_HELPER_6(th_vfdiv_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrdiv_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrdiv_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfrdiv_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfwmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmul_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwmul_vf_w, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 940b212f5e..3d0370f220 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2015,14 +2015,16 @@ GEN_OPFVF_TRANS_TH(th_vfmul_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfdiv_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfrdiv_vf, opfvf_check_th)
 
+/* Vector Widening Floating-Point Multiply */
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwmul_vv, opfvv_widen_check_th)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwmul_vf)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfwmul_vv)
-TH_TRANS_STUB(th_vfwmul_vf)
 TH_TRANS_STUB(th_vfmacc_vv)
 TH_TRANS_STUB(th_vfnmacc_vv)
 TH_TRANS_STUB(th_vfnmacc_vf)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index d65b32c584..aa7714d651 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3059,13 +3059,13 @@ GEN_VEXT_VF(vfrdiv_vf_w, 4)
 GEN_VEXT_VF(vfrdiv_vf_d, 8)
 
 /* Vector Widening Floating-Point Multiply */
-static uint32_t vfwmul16(uint16_t a, uint16_t b, float_status *s)
+uint32_t vfwmul16(uint16_t a, uint16_t b, float_status *s)
 {
     return float32_mul(float16_to_float32(a, true, s),
                        float16_to_float32(b, true, s), s);
 }
 
-static uint64_t vfwmul32(uint32_t a, uint32_t b, float_status *s)
+uint64_t vfwmul32(uint32_t a, uint32_t b, float_status *s)
 {
     return float64_mul(float32_to_float64(a, s),
                        float32_to_float64(b, s), s);
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 29263c6a53..8903a894d7 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -357,4 +357,7 @@ uint16_t float16_rdiv(uint16_t a, uint16_t b, float_status *s);
 uint32_t float32_rdiv(uint32_t a, uint32_t b, float_status *s);
 uint64_t float64_rdiv(uint64_t a, uint64_t b, float_status *s);
 
+uint32_t vfwmul16(uint16_t a, uint16_t b, float_status *s);
+uint64_t vfwmul32(uint32_t a, uint32_t b, float_status *s);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 770f36346f..dd01d66933 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2768,3 +2768,14 @@ THCALL(TH_OPFVF2, th_vfrdiv_vf_d, OP_UUU_D, H8, H8, float64_rdiv)
 GEN_TH_VF(th_vfrdiv_vf_h, 2, 2, clearh_th)
 GEN_TH_VF(th_vfrdiv_vf_w, 4, 4, clearl_th)
 GEN_TH_VF(th_vfrdiv_vf_d, 8, 8, clearq_th)
+
+/* Vector Widening Floating-Point Multiply */
+
+THCALL(TH_OPFVV2, th_vfwmul_vv_h, WOP_UUU_H, H4, H2, H2, vfwmul16)
+THCALL(TH_OPFVV2, th_vfwmul_vv_w, WOP_UUU_W, H8, H4, H4, vfwmul32)
+GEN_TH_VV_ENV(th_vfwmul_vv_h, 2, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfwmul_vv_w, 4, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfwmul_vf_h, WOP_UUU_H, H4, H2, vfwmul16)
+THCALL(TH_OPFVF2, th_vfwmul_vf_w, WOP_UUU_W, H8, H4, vfwmul32)
+GEN_TH_VF(th_vfwmul_vf_h, 2, 4, clearl_th)
+GEN_TH_VF(th_vfwmul_vf_w, 4, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvD0V-0002Yk-Np; Fri, 12 Apr 2024 05:15:40 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvD0M-0001Q4-H5; Fri, 12 Apr 2024 05:15:39 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NsVn2_1712913317) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:15:18 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712913319; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=eUWnNkwi46NYpGCTk/QbRjEaR50zY2NyLtuuMrUhs5E=; b=SgKW7XKJx7/3pSguqRl0qsInI+znpSMubMFfir2YRuz3gYaS9ZVWwZKhq0T32/mu9DXqKz7J6tgi4FR0FR7GxyvwSGO1u5TdhyKiVdVlQBuUsTMeecL4i+iIzj83ZUF911qZT+RS5uJxttA8FiplfMKiepE29KJ0g+ldQTuFFNk= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R421e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046050; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NsVn2_1712913317; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 40/65] target/riscv: Add single-width floating-point fused multiply-add instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:10 +0800 Message-ID: <20240412073735.76413-41-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; 
Received-SPF: pass client-ip=115.124.30.133; envelope-from=eric.huang@linux.alibaba.com; helo=out30-133.freemail.mail.aliyun.com X-Spam_score_int: -94 X-Spam_score: -9.5 X-Spam_bar: --------- X-Spam_report: (-9.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712913400492100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  49 +++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc |  34 ++---
 target/riscv/vector_helper.c                  |  48 +++----
 target/riscv/vector_internals.h               |  25 ++++
 target/riscv/xtheadvector_helper.c            | 125 ++++++++++++++++++
 5 files changed, 241 insertions(+), 40 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 3102b078e4..88e3a18e17 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2050,3 +2050,52 @@ DEF_HELPER_6(th_vfwmul_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfwmul_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfwmul_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwmul_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmacc_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmacc_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmacc_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmacc_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmacc_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsac_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsac_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmadd_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmadd_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfnmsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 3d0370f220..af512c489b 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2019,28 +2019,30 @@ GEN_OPFVF_TRANS_TH(th_vfrdiv_vf, opfvf_check_th)
 GEN_OPFVV_WIDEN_TRANS_TH(th_vfwmul_vv, opfvv_widen_check_th)
 GEN_OPFVF_WIDEN_TRANS_TH(th_vfwmul_vf)
 
+/* Vector Single-Width Floating-Point Fused Multiply-Add Instructions */
+GEN_OPFVV_TRANS_TH(th_vfmacc_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfnmacc_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfmsac_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfnmsac_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfmadd_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfnmadd_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfmsub_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfnmsub_vv, opfvv_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmacc_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfnmacc_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmsac_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfnmsac_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmadd_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfnmadd_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmsub_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfnmsub_vf, opfvf_check_th)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vfmacc_vv)
-TH_TRANS_STUB(th_vfnmacc_vv)
-TH_TRANS_STUB(th_vfnmacc_vf)
-TH_TRANS_STUB(th_vfmacc_vf)
-TH_TRANS_STUB(th_vfmsac_vv)
-TH_TRANS_STUB(th_vfmsac_vf)
-TH_TRANS_STUB(th_vfnmsac_vv)
-TH_TRANS_STUB(th_vfnmsac_vf)
-TH_TRANS_STUB(th_vfmadd_vv)
-TH_TRANS_STUB(th_vfmadd_vf)
-TH_TRANS_STUB(th_vfnmadd_vv)
-TH_TRANS_STUB(th_vfnmadd_vf)
-TH_TRANS_STUB(th_vfmsub_vv)
-TH_TRANS_STUB(th_vfmsub_vf)
-TH_TRANS_STUB(th_vfnmsub_vv)
-TH_TRANS_STUB(th_vfnmsub_vf)
 TH_TRANS_STUB(th_vfwmacc_vv)
 TH_TRANS_STUB(th_vfwmacc_vf)
 TH_TRANS_STUB(th_vfwnmacc_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index aa7714d651..165221e08b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3091,17 +3091,17 @@ static void do_##NAME(void *vd, void *vs1, void *vs2, int i,       \
     *((TD *)vd + HD(i)) = OP(s2, s1, d, &env->fp_status);          \
 }
 
-static uint16_t fmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(a, b, d, 0, s);
 }
 
-static uint32_t fmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(a, b, d, 0, s);
 }
 
-static uint64_t fmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(a, b, d, 0, s);
 }
@@ -3129,19 +3129,19 @@ GEN_VEXT_VF(vfmacc_vf_h, 2)
 GEN_VEXT_VF(vfmacc_vf_w, 4)
 GEN_VEXT_VF(vfmacc_vf_d, 8)
 
-static uint16_t fnmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fnmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(a, b, d, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
 }
 
-static uint32_t fnmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fnmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(a, b, d, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
 }
 
-static uint64_t fnmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fnmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(a, b, d, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
@@ -3160,17 +3160,17 @@ GEN_VEXT_VF(vfnmacc_vf_h, 2)
 GEN_VEXT_VF(vfnmacc_vf_w, 4)
 GEN_VEXT_VF(vfnmacc_vf_d, 8)
 
-static uint16_t fmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(a, b, d, float_muladd_negate_c, s);
 }
 
-static uint32_t fmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(a, b, d, float_muladd_negate_c, s);
 }
 
-static uint64_t fmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(a, b, d, float_muladd_negate_c, s);
 }
@@ -3188,17 +3188,17 @@ GEN_VEXT_VF(vfmsac_vf_h, 2)
 GEN_VEXT_VF(vfmsac_vf_w, 4)
 GEN_VEXT_VF(vfmsac_vf_d, 8)
 
-static uint16_t fnmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fnmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(a, b, d, float_muladd_negate_product, s);
 }
 
-static uint32_t fnmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fnmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(a, b, d, float_muladd_negate_product, s);
 }
 
-static uint64_t fnmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fnmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(a, b, d, float_muladd_negate_product, s);
 }
@@ -3216,17 +3216,17 @@ GEN_VEXT_VF(vfnmsac_vf_h, 2)
 GEN_VEXT_VF(vfnmsac_vf_w, 4)
 GEN_VEXT_VF(vfnmsac_vf_d, 8)
 
-static uint16_t fmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(d, b, a, 0, s);
 }
 
-static uint32_t fmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(d, b, a, 0, s);
 }
 
-static uint64_t fmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(d, b, a, 0, s);
 }
@@ -3244,19 +3244,19 @@ GEN_VEXT_VF(vfmadd_vf_h, 2)
 GEN_VEXT_VF(vfmadd_vf_w, 4)
 GEN_VEXT_VF(vfmadd_vf_d, 8)
 
-static uint16_t fnmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fnmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(d, b, a, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
 }
 
-static uint32_t fnmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fnmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(d, b, a, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
 }
 
-static uint64_t fnmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fnmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(d, b, a, float_muladd_negate_c |
                                    float_muladd_negate_product, s);
@@ -3275,17 +3275,17 @@ GEN_VEXT_VF(vfnmadd_vf_h, 2)
 GEN_VEXT_VF(vfnmadd_vf_w, 4)
 GEN_VEXT_VF(vfnmadd_vf_d, 8)
 
-static uint16_t fmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(d, b, a, float_muladd_negate_c, s);
 }
 
-static uint32_t fmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(d, b, a, float_muladd_negate_c, s);
 }
 
-static uint64_t fmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(d, b, a, float_muladd_negate_c, s);
 }
@@ -3303,17 +3303,17 @@ GEN_VEXT_VF(vfmsub_vf_h, 2)
 GEN_VEXT_VF(vfmsub_vf_w, 4)
 GEN_VEXT_VF(vfmsub_vf_d, 8)
 
-static uint16_t fnmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
+uint16_t fnmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s)
 {
     return float16_muladd(d, b, a, float_muladd_negate_product, s);
 }
 
-static uint32_t fnmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
+uint32_t fnmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(d, b, a, float_muladd_negate_product, s);
 }
 
-static uint64_t fnmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
+uint64_t fnmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(d, b, a, float_muladd_negate_product, s);
 }
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 8903a894d7..5733640e0d 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -360,4 +360,29 @@ uint64_t float64_rdiv(uint64_t a, uint64_t b, float_status *s);
 uint32_t vfwmul16(uint16_t a, uint16_t b, float_status *s);
 uint64_t vfwmul32(uint32_t a, uint32_t b, float_status *s);
 
+uint16_t fmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fnmacc16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fnmacc32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fnmacc64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fnmsac16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fnmsac32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fnmsac64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fnmadd16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fnmadd32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fnmadd64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+uint16_t fnmsub16(uint16_t a, uint16_t b, uint16_t d, float_status *s);
+uint32_t fnmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s);
+uint64_t fnmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index dd01d66933..1d2da6ffb7 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2779,3 +2779,128 @@ THCALL(TH_OPFVF2, th_vfwmul_vf_h, WOP_UUU_H, H4, H2, vfwmul16)
 THCALL(TH_OPFVF2, th_vfwmul_vf_w, WOP_UUU_W, H8, H4, vfwmul32)
 GEN_TH_VF(th_vfwmul_vf_h, 2, 4, clearl_th)
 GEN_TH_VF(th_vfwmul_vf_w, 4, 8, clearq_th)
+
+/* Vector Single-Width Floating-Point Fused Multiply-Add Instructions */
+#define TH_OPFVV3(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP)    \
+static void do_##NAME(void *vd, void *vs1, void *vs2, int i,       \
+                      CPURISCVState *env)                          \
+{                                                                  \
+    TX1 s1 = *((T1 *)vs1 + HS1(i));                                \
+    TX2 s2 = *((T2 *)vs2 + HS2(i));                                \
+    TD d = *((TD *)vd + HD(i));                                    \
+    *((TD *)vd + HD(i)) = OP(s2, s1, d, &env->fp_status);          \
+}
+
+THCALL(TH_OPFVV3, th_vfmacc_vv_h, OP_UUU_H, H2, H2, H2, fmacc16)
+THCALL(TH_OPFVV3, th_vfmacc_vv_w, OP_UUU_W, H4, H4, H4, fmacc32)
+THCALL(TH_OPFVV3, th_vfmacc_vv_d, OP_UUU_D, H8, H8, H8, fmacc64)
+GEN_TH_VV_ENV(th_vfmacc_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmacc_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmacc_vv_d, 8, 8, clearq_th)
+
+#define TH_OPFVF3(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP)           \
+static void do_##NAME(void *vd, uint64_t s1, void *vs2, int i,       \
+                      CPURISCVState *env)                            \
+{                                                                    \
+    TX2 s2 = *((T2 *)vs2 + HS2(i));                                  \
+    TD d = *((TD *)vd + HD(i));                                      \
+    *((TD *)vd + HD(i)) = OP(s2, (TX1)(T1)s1, d, &env->fp_status);\
+}
+
+THCALL(TH_OPFVF3, th_vfmacc_vf_h, OP_UUU_H, H2, H2, fmacc16)
+THCALL(TH_OPFVF3, th_vfmacc_vf_w, OP_UUU_W, H4, H4, fmacc32)
+THCALL(TH_OPFVF3, th_vfmacc_vf_d, OP_UUU_D, H8, H8, fmacc64)
+GEN_TH_VF(th_vfmacc_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmacc_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmacc_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfnmacc_vv_h, OP_UUU_H, H2, H2, H2, fnmacc16)
+THCALL(TH_OPFVV3, th_vfnmacc_vv_w, OP_UUU_W, H4, H4, H4, fnmacc32)
+THCALL(TH_OPFVV3, th_vfnmacc_vv_d, OP_UUU_D, H8, H8, H8, fnmacc64)
+GEN_TH_VV_ENV(th_vfnmacc_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfnmacc_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfnmacc_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfnmacc_vf_h, OP_UUU_H, H2, H2, fnmacc16)
+THCALL(TH_OPFVF3, th_vfnmacc_vf_w, OP_UUU_W, H4, H4, fnmacc32)
+THCALL(TH_OPFVF3, th_vfnmacc_vf_d, OP_UUU_D, H8, H8, fnmacc64)
+GEN_TH_VF(th_vfnmacc_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfnmacc_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfnmacc_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfmsac_vv_h, OP_UUU_H, H2, H2, H2, fmsac16)
+THCALL(TH_OPFVV3, th_vfmsac_vv_w, OP_UUU_W, H4, H4, H4, fmsac32)
+THCALL(TH_OPFVV3, th_vfmsac_vv_d, OP_UUU_D, H8, H8, H8, fmsac64)
+GEN_TH_VV_ENV(th_vfmsac_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmsac_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmsac_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfmsac_vf_h, OP_UUU_H, H2, H2, fmsac16)
+THCALL(TH_OPFVF3, th_vfmsac_vf_w, OP_UUU_W, H4, H4, fmsac32)
+THCALL(TH_OPFVF3, th_vfmsac_vf_d, OP_UUU_D, H8, H8, fmsac64)
+GEN_TH_VF(th_vfmsac_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmsac_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmsac_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfnmsac_vv_h, OP_UUU_H, H2, H2, H2, fnmsac16)
+THCALL(TH_OPFVV3, th_vfnmsac_vv_w, OP_UUU_W, H4, H4, H4, fnmsac32)
+THCALL(TH_OPFVV3, th_vfnmsac_vv_d, OP_UUU_D, H8, H8, H8, fnmsac64)
+GEN_TH_VV_ENV(th_vfnmsac_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfnmsac_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfnmsac_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfnmsac_vf_h, OP_UUU_H, H2, H2, fnmsac16)
+THCALL(TH_OPFVF3, th_vfnmsac_vf_w, OP_UUU_W, H4, H4, fnmsac32)
+THCALL(TH_OPFVF3, th_vfnmsac_vf_d, OP_UUU_D, H8, H8, fnmsac64)
+GEN_TH_VF(th_vfnmsac_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfnmsac_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfnmsac_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfmadd_vv_h, OP_UUU_H, H2, H2, H2, fmadd16)
+THCALL(TH_OPFVV3, th_vfmadd_vv_w, OP_UUU_W, H4, H4, H4, fmadd32)
+THCALL(TH_OPFVV3, th_vfmadd_vv_d, OP_UUU_D, H8, H8, H8, fmadd64)
+GEN_TH_VV_ENV(th_vfmadd_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmadd_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmadd_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfmadd_vf_h, OP_UUU_H, H2, H2, fmadd16)
+THCALL(TH_OPFVF3, th_vfmadd_vf_w, OP_UUU_W, H4, H4, fmadd32)
+THCALL(TH_OPFVF3, th_vfmadd_vf_d, OP_UUU_D, H8, H8, fmadd64)
+GEN_TH_VF(th_vfmadd_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmadd_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmadd_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfnmadd_vv_h, OP_UUU_H, H2, H2, H2, fnmadd16)
+THCALL(TH_OPFVV3, th_vfnmadd_vv_w, OP_UUU_W, H4, H4, H4, fnmadd32)
+THCALL(TH_OPFVV3, th_vfnmadd_vv_d, OP_UUU_D, H8, H8, H8, fnmadd64)
+GEN_TH_VV_ENV(th_vfnmadd_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfnmadd_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfnmadd_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfnmadd_vf_h, OP_UUU_H, H2, H2, fnmadd16)
+THCALL(TH_OPFVF3, th_vfnmadd_vf_w, OP_UUU_W, H4, H4, fnmadd32)
+THCALL(TH_OPFVF3, th_vfnmadd_vf_d, OP_UUU_D, H8, H8, fnmadd64)
+GEN_TH_VF(th_vfnmadd_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfnmadd_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfnmadd_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfmsub_vv_h, OP_UUU_H, H2, H2, H2, fmsub16)
+THCALL(TH_OPFVV3, th_vfmsub_vv_w, OP_UUU_W, H4, H4, H4, fmsub32)
+THCALL(TH_OPFVV3, th_vfmsub_vv_d, OP_UUU_D, H8, H8, H8, fmsub64)
+GEN_TH_VV_ENV(th_vfmsub_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmsub_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmsub_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfmsub_vf_h, OP_UUU_H, H2, H2, fmsub16)
+THCALL(TH_OPFVF3, th_vfmsub_vf_w, OP_UUU_W, H4, H4, fmsub32)
+THCALL(TH_OPFVF3, th_vfmsub_vf_d, OP_UUU_D, H8, H8, fmsub64)
+GEN_TH_VF(th_vfmsub_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmsub_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmsub_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV3, th_vfnmsub_vv_h, OP_UUU_H, H2, H2, H2, fnmsub16)
+THCALL(TH_OPFVV3, th_vfnmsub_vv_w, OP_UUU_W, H4, H4, H4, fnmsub32)
+THCALL(TH_OPFVV3, th_vfnmsub_vv_d, OP_UUU_D, H8, H8, H8, fnmsub64)
+GEN_TH_VV_ENV(th_vfnmsub_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfnmsub_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfnmsub_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF3, th_vfnmsub_vf_h, OP_UUU_H, H2, H2, fnmsub16)
+THCALL(TH_OPFVF3, th_vfnmsub_vf_w, OP_UUU_W, H4, H4, fnmsub32)
+THCALL(TH_OPFVF3, th_vfnmsub_vf_d, OP_UUU_D, H8, H8, fnmsub64)
+GEN_TH_VF(th_vfnmsub_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfnmsub_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfnmsub_vf_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 41/65] target/riscv: Add widening floating-point fused mul-add instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:11 +0800
Message-ID: <20240412073735.76413-42-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

These instructions behave the same as their RVV1.0 counterparts.
The only differences are the general ones between XTheadVector and RVV1.0.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 17 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 18 +++++----
 target/riscv/vector_helper.c                  | 16 ++++----
 target/riscv/vector_internals.h               |  9 +++++
 target/riscv/xtheadvector_helper.c            | 38 +++++++++++++++++++
 5 files changed, 82 insertions(+), 16 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 88e3a18e17..12b5e4573a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2099,3 +2099,20 @@ DEF_HELPER_6(th_vfmsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfnmsub_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfnmsub_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfnmsub_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfwmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmacc_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmacc_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmsac_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmsac_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmsac_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmsac_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwmacc_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwmacc_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmacc_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmacc_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfwnmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index af512c489b..7220b7d607 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2037,20 +2037,22 @@ GEN_OPFVF_TRANS_TH(th_vfnmadd_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfmsub_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfnmsub_vf, opfvf_check_th)
 
+/* Vector Widening Floating-Point Fused Multiply-Add Instructions */
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwmacc_vv, opfvv_widen_check_th)
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwnmacc_vv, opfvv_widen_check_th)
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwmsac_vv, opfvv_widen_check_th)
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwnmsac_vv, opfvv_widen_check_th)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwmacc_vf)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwnmacc_vf)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwmsac_vf)
+GEN_OPFVF_WIDEN_TRANS_TH(th_vfwnmsac_vf)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vfwmacc_vv)
-TH_TRANS_STUB(th_vfwmacc_vf)
-TH_TRANS_STUB(th_vfwnmacc_vv)
-TH_TRANS_STUB(th_vfwnmacc_vf)
-TH_TRANS_STUB(th_vfwmsac_vv)
-TH_TRANS_STUB(th_vfwmsac_vf)
-TH_TRANS_STUB(th_vfwnmsac_vv)
-TH_TRANS_STUB(th_vfwnmsac_vf)
 TH_TRANS_STUB(th_vfsqrt_v)
 TH_TRANS_STUB(th_vfmin_vv)
 TH_TRANS_STUB(th_vfmin_vf)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 165221e08b..ef89794bdd 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3332,13 +3332,13 @@ GEN_VEXT_VF(vfnmsub_vf_w, 4)
 GEN_VEXT_VF(vfnmsub_vf_d, 8)
 
 /* Vector Widening Floating-Point Fused Multiply-Add Instructions */
-static uint32_t fwmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
+uint32_t fwmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(float16_to_float32(a, true, s),
                           float16_to_float32(b, true, s), d, 0, s);
 }
 
-static uint64_t fwmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
+uint64_t fwmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(float32_to_float64(a, s),
                           float32_to_float64(b, s), d, 0, s);
@@ -3364,7 +3364,7 @@ GEN_VEXT_VV_ENV(vfwmaccbf16_vv, 4)
 RVVCALL(OPFVF3, vfwmaccbf16_vf, WOP_UUU_H, H4, H2, fwmaccbf16)
 GEN_VEXT_VF(vfwmaccbf16_vf, 4)
 
-static uint32_t fwnmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
+uint32_t fwnmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(float16_to_float32(a, true, s),
                           float16_to_float32(b, true, s), d,
@@ -3372,7 +3372,7 @@ static uint32_t fwnmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
                           s);
 }
 
-static uint64_t fwnmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
+uint64_t fwnmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(float32_to_float64(a, s), float32_to_float64(b, s),
                           d, float_muladd_negate_c |
@@ -3388,14 +3388,14 @@ RVVCALL(OPFVF3, vfwnmacc_vf_w, WOP_UUU_W, H8, H4, fwnmacc32)
 GEN_VEXT_VF(vfwnmacc_vf_h, 4)
 GEN_VEXT_VF(vfwnmacc_vf_w, 8)
 
-static uint32_t fwmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
+uint32_t fwmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s)
 {
     return float32_muladd(float16_to_float32(a, true, s),
                           float16_to_float32(b, true, s), d,
                           float_muladd_negate_c, s);
 }
 
-static uint64_t fwmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
+uint64_t fwmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s)
 {
     return float64_muladd(float32_to_float64(a, s),
                           float32_to_float64(b, s), d,
@@ -3411,14 +3411,14 @@ RVVCALL(OPFVF3, vfwmsac_vf_w, WOP_UUU_W, H8, H4, fwmsac32)
 GEN_VEXT_VF(vfwmsac_vf_h, 4)
GEN_VEXT_VF(vfwmsac_vf_w, 8) =20 -static uint32_t fwnmsac16(uint16_t a, uint16_t b, uint32_t d, float_status= *s) +uint32_t fwnmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s) { return float32_muladd(float16_to_float32(a, true, s), float16_to_float32(b, true, s), d, float_muladd_negate_product, s); } =20 -static uint64_t fwnmsac32(uint32_t a, uint32_t b, uint64_t d, float_status= *s) +uint64_t fwnmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s) { return float64_muladd(float32_to_float64(a, s), float32_to_float64(b, s), d, diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index 5733640e0d..535d31007d 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -385,4 +385,13 @@ uint16_t fnmsub16(uint16_t a, uint16_t b, uint16_t d, = float_status *s); uint32_t fnmsub32(uint32_t a, uint32_t b, uint32_t d, float_status *s); uint64_t fnmsub64(uint64_t a, uint64_t b, uint64_t d, float_status *s); =20 +uint32_t fwmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s); +uint64_t fwmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s); +uint32_t fwnmacc16(uint16_t a, uint16_t b, uint32_t d, float_status *s); +uint64_t fwnmacc32(uint32_t a, uint32_t b, uint64_t d, float_status *s); +uint32_t fwmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s); +uint64_t fwmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s); +uint32_t fwnmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s); +uint64_t fwnmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 1d2da6ffb7..ac8e576c49 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -2904,3 +2904,41 @@ THCALL(TH_OPFVF3, th_vfnmsub_vf_d, OP_UUU_D, H8, H8,= fnmsub64) GEN_TH_VF(th_vfnmsub_vf_h, 2, 2, clearh_th) GEN_TH_VF(th_vfnmsub_vf_w, 4, 4, 
clearl_th) GEN_TH_VF(th_vfnmsub_vf_d, 8, 8, clearq_th) + +/* Vector Widening Floating-Point Fused Multiply-Add Instructions */ + +THCALL(TH_OPFVV3, th_vfwmacc_vv_h, WOP_UUU_H, H4, H2, H2, fwmacc16) +THCALL(TH_OPFVV3, th_vfwmacc_vv_w, WOP_UUU_W, H8, H4, H4, fwmacc32) +GEN_TH_VV_ENV(th_vfwmacc_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwmacc_vv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF3, th_vfwmacc_vf_h, WOP_UUU_H, H4, H2, fwmacc16) +THCALL(TH_OPFVF3, th_vfwmacc_vf_w, WOP_UUU_W, H8, H4, fwmacc32) +GEN_TH_VF(th_vfwmacc_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwmacc_vf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV3, th_vfwnmacc_vv_h, WOP_UUU_H, H4, H2, H2, fwnmacc16) +THCALL(TH_OPFVV3, th_vfwnmacc_vv_w, WOP_UUU_W, H8, H4, H4, fwnmacc32) +GEN_TH_VV_ENV(th_vfwnmacc_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwnmacc_vv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF3, th_vfwnmacc_vf_h, WOP_UUU_H, H4, H2, fwnmacc16) +THCALL(TH_OPFVF3, th_vfwnmacc_vf_w, WOP_UUU_W, H8, H4, fwnmacc32) +GEN_TH_VF(th_vfwnmacc_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwnmacc_vf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV3, th_vfwmsac_vv_h, WOP_UUU_H, H4, H2, H2, fwmsac16) +THCALL(TH_OPFVV3, th_vfwmsac_vv_w, WOP_UUU_W, H8, H4, H4, fwmsac32) +GEN_TH_VV_ENV(th_vfwmsac_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwmsac_vv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF3, th_vfwmsac_vf_h, WOP_UUU_H, H4, H2, fwmsac16) +THCALL(TH_OPFVF3, th_vfwmsac_vf_w, WOP_UUU_W, H8, H4, fwmsac32) +GEN_TH_VF(th_vfwmsac_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwmsac_vf_w, 4, 8, clearq_th) + +THCALL(TH_OPFVV3, th_vfwnmsac_vv_h, WOP_UUU_H, H4, H2, H2, fwnmsac16) +THCALL(TH_OPFVV3, th_vfwnmsac_vv_w, WOP_UUU_W, H8, H4, H4, fwnmsac32) +GEN_TH_VV_ENV(th_vfwnmsac_vv_h, 2, 4, clearl_th) +GEN_TH_VV_ENV(th_vfwnmsac_vv_w, 4, 8, clearq_th) +THCALL(TH_OPFVF3, th_vfwnmsac_vf_h, WOP_UUU_H, H4, H2, fwnmsac16) +THCALL(TH_OPFVF3, th_vfwnmsac_vf_w, WOP_UUU_W, H8, H4, fwnmsac32) +GEN_TH_VF(th_vfwnmsac_vf_h, 2, 4, clearl_th) +GEN_TH_VF(th_vfwnmsac_vf_w, 4, 8, clearq_th) --=20 2.44.0 
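Note for readers of the series: the sign conventions of the four widening multiply-add flavours exported by this patch can be illustrated with a small sketch. This is not QEMU code. The sketch_ names are hypothetical, host float and double stand in for the softfloat types, and rounding modes, exception flags and NaN handling are ignored. It assumes the usual RVV operand convention in which d is both accumulator and destination, which is what the float_muladd_negate_c and float_muladd_negate_product flags in the helpers above suggest. On hosts with IEEE 754 float and double, the product of two single-precision values is exactly representable in double precision, so the plain multiply-then-add below rounds only once, as a fused operation would.

```c
/*
 * Illustrative only: host-arithmetic stand-ins for the f32 -> f64
 * widening multiply-add helpers. All sketch_ names are hypothetical
 * and do not exist in QEMU.
 */

/* vfwmacc: d = +(a * b) + d  (no negate flags) */
double sketch_fwmacc(float a, float b, double d)
{
    return (double)a * (double)b + d;
}

/* vfwnmacc: d = -(a * b) - d  (negate product and addend) */
double sketch_fwnmacc(float a, float b, double d)
{
    return -((double)a * (double)b) - d;
}

/* vfwmsac: d = +(a * b) - d  (negate addend only) */
double sketch_fwmsac(float a, float b, double d)
{
    return (double)a * (double)b - d;
}

/* vfwnmsac: d = -(a * b) + d  (negate product only) */
double sketch_fwnmsac(float a, float b, double d)
{
    return -((double)a * (double)b) + d;
}
```

For instance, with a = 2.0f, b = 3.0f and d = 1.0, sketch_fwmsac yields 5.0 and sketch_fwnmsac yields -5.0. The real helpers additionally honour the guest rounding mode and accumulate exception flags through float_status, which this sketch leaves out.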
From nobody Thu May 16 11:28:54 2024
From: Huang Tao <eric.huang@linux.alibaba.com>
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao <eric.huang@linux.alibaba.com>
Subject: [PATCH 42/65] target/riscv: Add floating-point square-root instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:12 +0800
Message-ID: <20240412073735.76413-43-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions behave the same as in RVV 1.0; only the general
differences between XTheadVector and RVV 1.0 apply.

Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
---
 target/riscv/helper.h                         |  4 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 46 ++++++++++++++++++-
 target/riscv/xtheadvector_helper.c            | 41 +++++++++++++++++
 3 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 12b5e4573a..5aa12f3719 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2116,3 +2116,7 @@ DEF_HELPER_6(th_vfwmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwnmsac_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfwnmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_5(th_vfsqrt_v_h, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(th_vfsqrt_v_w, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(th_vfsqrt_v_d, void, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 7220b7d607..e709444e9f 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2047,13 +2047,57 @@ GEN_OPFVF_WIDEN_TRANS_TH(th_vfwnmacc_vf)
 GEN_OPFVF_WIDEN_TRANS_TH(th_vfwmsac_vf)
 GEN_OPFVF_WIDEN_TRANS_TH(th_vfwnmsac_vf)
 
+/* Vector Floating-Point Square-Root Instruction */
+
+/*
+ * If the current SEW does not correspond to a supported IEEE floating-point
+ * type, an illegal instruction exception is raised
+ */
+static bool opfv_check_th(DisasContext *s, arg_rmr *a)
+{
+    return require_xtheadvector(s) &&
+           vext_check_isa_ill(s) &&
+           th_check_overlap_mask(s, a->rd, a->vm, false) &&
+           th_check_reg(s, a->rd, false) &&
+           th_check_reg(s, a->rs2, false) &&
+           (s->sew != 0);
+}
+
+#define GEN_OPFV_TRANS_TH(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmr *a) \
+{ \
+    if (CHECK(s, a)) { \
+        uint32_t data = 0; \
+        static gen_helper_gvec_3_ptr * const fns[3] = { \
+            gen_helper_##NAME##_h, \
+            gen_helper_##NAME##_w, \
+            gen_helper_##NAME##_d, \
+        }; \
+        gen_set_rm(s, RISCV_FRM_DYN); \
+ \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm); \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \
+        tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \
+                           vreg_ofs(s, 0), \
+                           vreg_ofs(s, a->rs2), tcg_env, \
+                           s->cfg_ptr->vlenb, \
+                           s->cfg_ptr->vlenb, data, \
+                           fns[s->sew - 1]); \
+        finalize_rvv_inst(s); \
+        return true; \
+    } \
+    return false; \
+}
+
+GEN_OPFV_TRANS_TH(th_vfsqrt_v, opfv_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfsqrt_v)
 TH_TRANS_STUB(th_vfmin_vv)
 TH_TRANS_STUB(th_vfmin_vf)
 TH_TRANS_STUB(th_vfmax_vv)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index ac8e576c49..7274e7aedb 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2942,3 +2942,44 @@ THCALL(TH_OPFVF3, th_vfwnmsac_vf_h, WOP_UUU_H, H4, H2, fwnmsac16)
 THCALL(TH_OPFVF3, th_vfwnmsac_vf_w, WOP_UUU_W, H8, H4, fwnmsac32)
 GEN_TH_VF(th_vfwnmsac_vf_h, 2, 4, clearl_th)
 GEN_TH_VF(th_vfwnmsac_vf_w, 4, 8, clearq_th)
+
+/* Vector Floating-Point Square-Root Instruction */
+
+#define TH_OPFVV1(NAME, TD, T2, TX2, HD, HS2, OP) \
+static void do_##NAME(void *vd, void *vs2, int i, \
+                      CPURISCVState *env) \
+{ \
+    TX2 s2 = *((T2 *)vs2 + HS2(i)); \
+    *((TD *)vd + HD(i)) = OP(s2, &env->fp_status); \
+}
+
+#define GEN_TH_V_ENV(NAME, ESZ, DSZ, CLEAR_FN) \
+void HELPER(NAME)(void *vd, void *v0, void *vs2, \
+                  CPURISCVState *env, uint32_t desc) \
+{ \
+    uint32_t vlmax = th_maxsz(desc) / ESZ; \
+    uint32_t mlen = th_mlen(desc); \
+    uint32_t vm = th_vm(desc); \
+    uint32_t vl = env->vl; \
+    uint32_t i; \
+ \
+    VSTART_CHECK_EARLY_EXIT(env); \
+    if (vl == 0) { \
+        return; \
+    } \
+    for (i = env->vstart; i < vl; i++) { \
+        if (!vm && !th_elem_mask(v0, mlen, i)) { \
+            continue; \
+        } \
+        do_##NAME(vd, vs2, i, env); \
+    } \
+    env->vstart = 0; \
+    CLEAR_FN(vd, vl, vl * DSZ, vlmax * DSZ); \
+}
+
+THCALL(TH_OPFVV1, th_vfsqrt_v_h, OP_UU_H, H2, H2, float16_sqrt)
+THCALL(TH_OPFVV1, th_vfsqrt_v_w, OP_UU_W, H4, H4, float32_sqrt)
+THCALL(TH_OPFVV1, th_vfsqrt_v_d, OP_UU_D, H8, H8, float64_sqrt)
+GEN_TH_V_ENV(th_vfsqrt_v_h, 2, 2, clearh_th)
+GEN_TH_V_ENV(th_vfsqrt_v_w, 4, 4, clearl_th)
+GEN_TH_V_ENV(th_vfsqrt_v_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao <eric.huang@linux.alibaba.com>
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao <eric.huang@linux.alibaba.com>
Subject: [PATCH 43/65] target/riscv: Add floating-point MIN/MAX instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:13 +0800
Message-ID: <20240412073735.76413-44-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions behave the same as in RVV 1.0; only the general
differences between XTheadVector and RVV 1.0 apply.

Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
---
 target/riscv/helper.h                         | 13 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 10 ++++---
 target/riscv/xtheadvector_helper.c            | 27 +++++++++++++++++++
 3 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 5aa12f3719..86ae984430 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2120,3 +2120,16 @@ DEF_HELPER_6(th_vfwnmsac_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_5(th_vfsqrt_v_h, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(th_vfsqrt_v_w, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(th_vfsqrt_v_d, void, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(th_vfmin_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmin_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmin_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfmin_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmin_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmin_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfmax_vf_d, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index e709444e9f..d3205ce2a0 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2092,16 +2092,18 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \
 
 GEN_OPFV_TRANS_TH(th_vfsqrt_v, opfv_check_th)
 
+/* Vector Floating-Point MIN/MAX Instructions */
+GEN_OPFVV_TRANS_TH(th_vfmin_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfmax_vv, opfvv_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmin_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfmax_vf, opfvf_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfmin_vv)
-TH_TRANS_STUB(th_vfmin_vf)
-TH_TRANS_STUB(th_vfmax_vv)
-TH_TRANS_STUB(th_vfmax_vf)
 TH_TRANS_STUB(th_vfsgnj_vv)
 TH_TRANS_STUB(th_vfsgnj_vf)
 TH_TRANS_STUB(th_vfsgnjn_vv)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 7274e7aedb..5593cace78 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -2983,3 +2983,30 @@ THCALL(TH_OPFVV1, th_vfsqrt_v_d, OP_UU_D, H8, H8, float64_sqrt)
 GEN_TH_V_ENV(th_vfsqrt_v_h, 2, 2, clearh_th)
 GEN_TH_V_ENV(th_vfsqrt_v_w, 4, 4, clearl_th)
 GEN_TH_V_ENV(th_vfsqrt_v_d, 8, 8, clearq_th)
+
+/* Vector Floating-Point MIN/MAX Instructions */
+THCALL(TH_OPFVV2, th_vfmin_vv_h, OP_UUU_H, H2, H2, H2, float16_minnum)
+THCALL(TH_OPFVV2, th_vfmin_vv_w, OP_UUU_W, H4, H4, H4, float32_minnum)
+THCALL(TH_OPFVV2, th_vfmin_vv_d, OP_UUU_D, H8, H8, H8, float64_minnum)
+GEN_TH_VV_ENV(th_vfmin_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmin_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmin_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfmin_vf_h, OP_UUU_H, H2, H2, float16_minnum)
+THCALL(TH_OPFVF2, th_vfmin_vf_w, OP_UUU_W, H4, H4, float32_minnum)
+THCALL(TH_OPFVF2, th_vfmin_vf_d, OP_UUU_D, H8, H8, float64_minnum)
+GEN_TH_VF(th_vfmin_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmin_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmin_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV2, th_vfmax_vv_h, OP_UUU_H, H2, H2, H2, float16_maxnum)
+THCALL(TH_OPFVV2, th_vfmax_vv_w, OP_UUU_W, H4, H4, H4, float32_maxnum)
+THCALL(TH_OPFVV2, th_vfmax_vv_d, OP_UUU_D, H8, H8, H8, float64_maxnum)
+GEN_TH_VV_ENV(th_vfmax_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfmax_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfmax_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfmax_vf_h, OP_UUU_H, H2, H2, float16_maxnum)
+THCALL(TH_OPFVF2, th_vfmax_vf_w, OP_UUU_W, H4, H4, float32_maxnum)
+THCALL(TH_OPFVF2, th_vfmax_vf_d, OP_UUU_D, H8, H8, float64_maxnum)
+GEN_TH_VF(th_vfmax_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfmax_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfmax_vf_d, 8, 8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao <eric.huang@linux.alibaba.com>
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao <eric.huang@linux.alibaba.com>
Subject: [PATCH 44/65] target/riscv: Add floating-point sign-injection instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:14 +0800
Message-ID: <20240412073735.76413-45-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

The instructions behave the same as in RVV 1.0; only the general
differences between XTheadVector and RVV 1.0 apply.

Signed-off-by: Huang Tao <eric.huang@linux.alibaba.com>
---
 target/riscv/helper.h                         | 19 +++++++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 14 ++++---
 target/riscv/vector_helper.c                  | 18 ++++----
 target/riscv/vector_internals.h               | 10 +++++
 target/riscv/xtheadvector_helper.c            | 41 +++++++++++++++++++
 5 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 86ae984430..2b9d7fa2b6 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2133,3 +2133,22 @@ DEF_HELPER_6(th_vfmin_vf_d, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfmax_vf_h, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfmax_vf_w, void, ptr, ptr, i64, ptr, env, i32)
 DEF_HELPER_6(th_vfmax_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+
+DEF_HELPER_6(th_vfsgnj_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnj_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnj_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnj_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnj_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnj_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjn_vf_d, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vf_h, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vf_w, void, ptr, ptr, i64, ptr, env, i32)
+DEF_HELPER_6(th_vfsgnjx_vf_d, void, ptr, ptr, i64, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index d3205ce2a0..1374bad5b9 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2098,18 +2098,20 @@ GEN_OPFVV_TRANS_TH(th_vfmax_vv, opfvv_check_th)
 GEN_OPFVF_TRANS_TH(th_vfmin_vf, opfvf_check_th)
 GEN_OPFVF_TRANS_TH(th_vfmax_vf, opfvf_check_th)
 
+/* Vector Floating-Point Sign-Injection Instructions */
+GEN_OPFVV_TRANS_TH(th_vfsgnj_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfsgnjn_vv, opfvv_check_th)
+GEN_OPFVV_TRANS_TH(th_vfsgnjx_vv, opfvv_check_th)
+GEN_OPFVF_TRANS_TH(th_vfsgnj_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfsgnjn_vf, opfvf_check_th)
+GEN_OPFVF_TRANS_TH(th_vfsgnjx_vf, opfvf_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
 
-TH_TRANS_STUB(th_vfsgnj_vv)
-TH_TRANS_STUB(th_vfsgnj_vf)
-TH_TRANS_STUB(th_vfsgnjn_vv)
-TH_TRANS_STUB(th_vfsgnjn_vf)
-TH_TRANS_STUB(th_vfsgnjx_vv)
-TH_TRANS_STUB(th_vfsgnjx_vf)
 TH_TRANS_STUB(th_vmfeq_vv)
 TH_TRANS_STUB(th_vmfeq_vf)
 TH_TRANS_STUB(th_vmfne_vv)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ef89794bdd..d0ebda5445 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -3882,17 +3882,17 @@ GEN_VEXT_VF(vfmax_vf_w, 4)
 GEN_VEXT_VF(vfmax_vf_d, 8)
 
 /* Vector Floating-Point Sign-Injection Instructions */
-static uint16_t fsgnj16(uint16_t a, uint16_t b, float_status *s)
+uint16_t fsgnj16(uint16_t a, uint16_t b, float_status *s)
 {
     return deposit64(b, 0, 15, a);
 }
 
-static uint32_t fsgnj32(uint32_t a, uint32_t b, float_status *s)
+uint32_t fsgnj32(uint32_t a, uint32_t b, float_status *s)
 {
     return deposit64(b, 0, 31, a);
 }
 
-static uint64_t fsgnj64(uint64_t a, uint64_t b, float_status *s)
+uint64_t fsgnj64(uint64_t a, uint64_t b, float_status *s)
 {
     return deposit64(b, 0, 63, a);
 }
@@ -3910,17 +3910,17 @@ GEN_VEXT_VF(vfsgnj_vf_h, 2)
 GEN_VEXT_VF(vfsgnj_vf_w, 4)
 GEN_VEXT_VF(vfsgnj_vf_d, 8)
 
-static uint16_t fsgnjn16(uint16_t a, uint16_t b, float_status *s)
+uint16_t fsgnjn16(uint16_t a, uint16_t b, float_status *s)
 {
     return deposit64(~b, 0, 15, a);
 }
 
-static uint32_t fsgnjn32(uint32_t a, uint32_t b, float_status *s)
+uint32_t fsgnjn32(uint32_t a, uint32_t b, float_status *s)
 {
     return deposit64(~b, 0, 31, a);
 }
 
-static uint64_t fsgnjn64(uint64_t a, uint64_t b, float_status *s)
+uint64_t fsgnjn64(uint64_t a, uint64_t b, float_status *s)
 {
     return deposit64(~b, 0, 63, a);
 }
@@ -3938,17 +3938,17 @@ GEN_VEXT_VF(vfsgnjn_vf_h, 2)
 GEN_VEXT_VF(vfsgnjn_vf_w, 4)
 GEN_VEXT_VF(vfsgnjn_vf_d, 8)
 
-static uint16_t fsgnjx16(uint16_t a, uint16_t b, float_status *s)
+uint16_t fsgnjx16(uint16_t a, uint16_t b, float_status *s)
 {
     return deposit64(b ^ a, 0, 15, a);
 }
 
-static uint32_t fsgnjx32(uint32_t a, uint32_t b, float_status *s)
+uint32_t fsgnjx32(uint32_t a, uint32_t b, float_status *s)
 {
     return deposit64(b ^ a, 0, 31, a);
 }
 
-static uint64_t fsgnjx64(uint64_t a, uint64_t b, float_status *s)
+uint64_t fsgnjx64(uint64_t a, uint64_t b, float_status *s)
 {
     return deposit64(b ^ a, 0, 63, a);
 }
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 535d31007d..bcc7d0edd6 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -394,4 +394,14 @@ uint64_t fwmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s);
 uint32_t fwnmsac16(uint16_t a, uint16_t b, uint32_t d, float_status *s);
 uint64_t fwnmsac32(uint32_t a, uint32_t b, uint64_t d, float_status *s);
 
+uint16_t fsgnj16(uint16_t a, uint16_t b, float_status *s);
+uint32_t fsgnj32(uint32_t a, uint32_t b, float_status *s);
+uint64_t fsgnj64(uint64_t a, uint64_t b, float_status *s);
+uint16_t fsgnjn16(uint16_t a, uint16_t b, float_status *s);
+uint32_t fsgnjn32(uint32_t a, uint32_t b, float_status *s);
+uint64_t fsgnjn64(uint64_t a, uint64_t b, float_status *s);
+uint16_t fsgnjx16(uint16_t a, uint16_t b, float_status *s);
+uint32_t fsgnjx32(uint32_t a, uint32_t b, float_status *s);
+uint64_t fsgnjx64(uint64_t a, uint64_t b, float_status *s);
+
 #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 5593cace78..38476900a6 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3010,3 +3010,44 @@ THCALL(TH_OPFVF2, th_vfmax_vf_d, OP_UUU_D, H8, H8, float64_maxnum)
 GEN_TH_VF(th_vfmax_vf_h, 2, 2, clearh_th)
 GEN_TH_VF(th_vfmax_vf_w, 4, 4, clearl_th)
 GEN_TH_VF(th_vfmax_vf_d, 8, 8, clearq_th)
+
+/* Vector Floating-Point Sign-Injection Instructions */
+
+THCALL(TH_OPFVV2, th_vfsgnj_vv_h, OP_UUU_H, H2, H2, H2, fsgnj16)
+THCALL(TH_OPFVV2, th_vfsgnj_vv_w, OP_UUU_W, H4, H4, H4, fsgnj32)
+THCALL(TH_OPFVV2, th_vfsgnj_vv_d, OP_UUU_D, H8, H8, H8, fsgnj64)
+GEN_TH_VV_ENV(th_vfsgnj_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfsgnj_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfsgnj_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfsgnj_vf_h, OP_UUU_H, H2, H2, fsgnj16)
+THCALL(TH_OPFVF2, th_vfsgnj_vf_w, OP_UUU_W, H4, H4, fsgnj32)
+THCALL(TH_OPFVF2, th_vfsgnj_vf_d, OP_UUU_D, H8, H8, fsgnj64)
+GEN_TH_VF(th_vfsgnj_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfsgnj_vf_w, 4, 4, clearl_th)
+GEN_TH_VF(th_vfsgnj_vf_d, 8, 8, clearq_th)
+
+THCALL(TH_OPFVV2, th_vfsgnjn_vv_h, OP_UUU_H, H2, H2, H2, fsgnjn16)
+THCALL(TH_OPFVV2, th_vfsgnjn_vv_w, OP_UUU_W, H4, H4, H4, fsgnjn32)
+THCALL(TH_OPFVV2, th_vfsgnjn_vv_d, OP_UUU_D, H8, H8, H8, fsgnjn64)
+GEN_TH_VV_ENV(th_vfsgnjn_vv_h, 2, 2, clearh_th)
+GEN_TH_VV_ENV(th_vfsgnjn_vv_w, 4, 4, clearl_th)
+GEN_TH_VV_ENV(th_vfsgnjn_vv_d, 8, 8, clearq_th)
+THCALL(TH_OPFVF2, th_vfsgnjn_vf_h, OP_UUU_H, H2, H2, fsgnjn16)
+THCALL(TH_OPFVF2, th_vfsgnjn_vf_w, OP_UUU_W, H4, H4, fsgnjn32)
+THCALL(TH_OPFVF2, th_vfsgnjn_vf_d, OP_UUU_D, H8, H8, fsgnjn64)
+GEN_TH_VF(th_vfsgnjn_vf_h, 2, 2, clearh_th)
+GEN_TH_VF(th_vfsgnjn_vf_w, 4, 4, clearl_th) +GEN_TH_VF(th_vfsgnjn_vf_d, 8, 8, clearq_th) + +THCALL(TH_OPFVV2, th_vfsgnjx_vv_h, OP_UUU_H, H2, H2, H2, fsgnjx16) +THCALL(TH_OPFVV2, th_vfsgnjx_vv_w, OP_UUU_W, H4, H4, H4, fsgnjx32) +THCALL(TH_OPFVV2, th_vfsgnjx_vv_d, OP_UUU_D, H8, H8, H8, fsgnjx64) +GEN_TH_VV_ENV(th_vfsgnjx_vv_h, 2, 2, clearh_th) +GEN_TH_VV_ENV(th_vfsgnjx_vv_w, 4, 4, clearl_th) +GEN_TH_VV_ENV(th_vfsgnjx_vv_d, 8, 8, clearq_th) +THCALL(TH_OPFVF2, th_vfsgnjx_vf_h, OP_UUU_H, H2, H2, fsgnjx16) +THCALL(TH_OPFVF2, th_vfsgnjx_vf_w, OP_UUU_W, H4, H4, fsgnjx32) +THCALL(TH_OPFVF2, th_vfsgnjx_vf_d, OP_UUU_D, H8, H8, fsgnjx64) +GEN_TH_VF(th_vfsgnjx_vf_h, 2, 2, clearh_th) +GEN_TH_VF(th_vfsgnjx_vf_w, 4, 4, clearl_th) +GEN_TH_VF(th_vfsgnjx_vf_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712913978; cv=none; d=zohomail.com; s=zohoarc; b=mG3aDLjZjI2QdOH2eqTIIJCnRjHzKufXS9UC5g4M6lGjhwutm3fmkVmvGFQ7N50XzvrtLOgCCwd9BH457jDYcleNn5eMhuUgjZngV4YXAoZ/4eC37oyvLu/eJAA6/UdME33Yl+6GOD8lN9AAtlPrqBe8zqLCbCF/8LjSXBawWwc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712913978; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=8jiVqgjUlKLKgH92tksdoxhczfZlcWXYxmxNVCzrspQ=; b=CHxR+C1qbervzNTd/Jjk9NV1ubPZHLx4iTPVhcws/zFlz1AnlKiXMCaofQX+c+bZjV+Aelb6kSlef31jd55h4PAPgnCAich3o1fmSBZFZMV9haKnn79n+Nu6HBjPky77R9Rx7mVNLQCX6P/xZCZjj7lPs4/j+GloxG7aMMMmaQw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712913978175320.78909897676226; Fri, 12 Apr 2024 02:26:18 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDAG-0000RH-IU; Fri, 12 Apr 2024 05:25:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDAE-0000Qq-Lb; Fri, 12 Apr 2024 05:25:42 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDAA-0003Og-QJ; Fri, 12 Apr 2024 05:25:42 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NsZN6_1712913931) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:25:32 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712913933; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=8jiVqgjUlKLKgH92tksdoxhczfZlcWXYxmxNVCzrspQ=; b=H9QNdN0Y5y2HhoDIbFxLy9eSczrfjFWFiS35t+T2hzWX0zk8+sEMKQEoLRv9uk8YTFPg/NNG34osNFEB3BV9oLd4z6N0YIFFuiHN+DnF2046juRj5D62b2xIjhQ4/Q7GEtxI0vJR1HIiH+GG/x068ey2749M7ImWe1/jBaK10Bk= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R171e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046050; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NsZN6_1712913931; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 45/65] target/riscv: Add 
floating-point compare instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:15 +0800 Message-ID: <20240412073735.76413-46-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712913979470100003 Content-Type: text/plain; charset="utf-8"

RVV1.0 has no instruction equivalent to th.vmford in XTheadVector.
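Note for reviewers: judging from the helpers in this patch, which pass !floatN_unordered_quiet as the compare operation, th.vmford appears to set the mask bit when both operands are ordered, i.e. when neither is NaN. The following is only a minimal sketch of that predicate on raw binary32 bit patterns. The function names are hypothetical, nothing here is QEMU code, and exception flags are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the binary32 bit pattern encodes a NaN
 * (exponent all ones and a non-zero fraction). */
static bool f32_bits_is_nan(uint32_t bits)
{
    return (bits & 0x7fffffffu) > 0x7f800000u;
}

/* Hypothetical stand-in for the "ordered" compare: true only when
 * neither operand is a NaN. Infinities count as ordered. */
static bool f32_bits_ordered(uint32_t a, uint32_t b)
{
    return !f32_bits_is_nan(a) && !f32_bits_is_nan(b);
}
```

For example, 1.0 (0x3f800000) and 2.0 (0x40000000) are ordered, while any pair that includes a quiet NaN such as 0x7fc00000 is not.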
Signed-off-by: Huang Tao --- target/riscv/helper.h | 37 +++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 49 +++++++--- target/riscv/vector_helper.c | 18 ++-- target/riscv/vector_internals.h | 10 ++ target/riscv/xtheadvector_helper.c | 96 +++++++++++++++++++ 5 files changed, 189 insertions(+), 21 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 2b9d7fa2b6..5771a4fa8a 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2152,3 +2152,40 @@ DEF_HELPER_6(th_vfsgnjn_vf_d, void, ptr, ptr, i64, p= tr, env, i32) DEF_HELPER_6(th_vfsgnjx_vf_h, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vfsgnjx_vf_w, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vfsgnjx_vf_d, void, ptr, ptr, i64, ptr, env, i32) + +DEF_HELPER_6(th_vmfeq_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfeq_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfeq_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmflt_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmflt_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmflt_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmfeq_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfeq_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfeq_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfne_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmflt_vf_h, void, ptr, ptr, i64, ptr, env, i32) 
+DEF_HELPER_6(th_vmflt_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmflt_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfle_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfgt_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfgt_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfgt_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfge_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfge_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmfge_vf_d, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmford_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmford_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmford_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vmford_vf_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmford_vf_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vmford_vf_d, void, ptr, ptr, i64, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 1374bad5b9..1e773c673e 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2106,24 +2106,49 @@ GEN_OPFVF_TRANS_TH(th_vfsgnj_vf, opfvf_check_th) GEN_OPFVF_TRANS_TH(th_vfsgnjn_vf, opfvf_check_th) GEN_OPFVF_TRANS_TH(th_vfsgnjx_vf, opfvf_check_th) =20 +/* Vector Floating-Point Compare Instructions */ +static bool opfvv_cmp_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + th_check_reg(s, a->rs1, false) && + (s->sew !=3D 0) && + ((th_check_overlap_group(a->rd, 1, a->rs1, 1 << s->lmul) && + th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)) || + (s->lmul =3D=3D 0))); +} + 
+GEN_OPFVV_TRANS_TH(th_vmfeq_vv, opfvv_cmp_check_th) +GEN_OPFVV_TRANS_TH(th_vmfne_vv, opfvv_cmp_check_th) +GEN_OPFVV_TRANS_TH(th_vmflt_vv, opfvv_cmp_check_th) +GEN_OPFVV_TRANS_TH(th_vmfle_vv, opfvv_cmp_check_th) +GEN_OPFVV_TRANS_TH(th_vmford_vv, opfvv_cmp_check_th) + +static bool opfvf_cmp_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rs2, false) && + (s->sew !=3D 0) && + (th_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul) || + (s->lmul =3D=3D 0))); +} + +GEN_OPFVF_TRANS_TH(th_vmfeq_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmfne_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmflt_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmfle_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmfgt_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmfge_vf, opfvf_cmp_check_th) +GEN_OPFVF_TRANS_TH(th_vmford_vf, opfvf_cmp_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmfeq_vv) -TH_TRANS_STUB(th_vmfeq_vf) -TH_TRANS_STUB(th_vmfne_vv) -TH_TRANS_STUB(th_vmfne_vf) -TH_TRANS_STUB(th_vmflt_vv) -TH_TRANS_STUB(th_vmflt_vf) -TH_TRANS_STUB(th_vmfle_vv) -TH_TRANS_STUB(th_vmfle_vf) -TH_TRANS_STUB(th_vmfgt_vf) -TH_TRANS_STUB(th_vmfge_vf) -TH_TRANS_STUB(th_vmford_vv) -TH_TRANS_STUB(th_vmford_vf) TH_TRANS_STUB(th_vfclass_v) TH_TRANS_STUB(th_vfmerge_vfm) TH_TRANS_STUB(th_vfmv_v_f) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index d0ebda5445..c966600d0c 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4050,19 +4050,19 @@ GEN_VEXT_CMP_VF(vmfeq_vf_h, uint16_t, H2, float16_e= q_quiet) GEN_VEXT_CMP_VF(vmfeq_vf_w, uint32_t, H4, float32_eq_quiet) GEN_VEXT_CMP_VF(vmfeq_vf_d, uint64_t, H8, float64_eq_quiet) =20 -static bool vmfne16(uint16_t a, uint16_t b, float_status *s) +bool vmfne16(uint16_t a, uint16_t b, float_status *s) { FloatRelation 
compare =3D float16_compare_quiet(a, b, s); return compare !=3D float_relation_equal; } =20 -static bool vmfne32(uint32_t a, uint32_t b, float_status *s) +bool vmfne32(uint32_t a, uint32_t b, float_status *s) { FloatRelation compare =3D float32_compare_quiet(a, b, s); return compare !=3D float_relation_equal; } =20 -static bool vmfne64(uint64_t a, uint64_t b, float_status *s) +bool vmfne64(uint64_t a, uint64_t b, float_status *s) { FloatRelation compare =3D float64_compare_quiet(a, b, s); return compare !=3D float_relation_equal; @@ -4089,19 +4089,19 @@ GEN_VEXT_CMP_VF(vmfle_vf_h, uint16_t, H2, float16_l= e) GEN_VEXT_CMP_VF(vmfle_vf_w, uint32_t, H4, float32_le) GEN_VEXT_CMP_VF(vmfle_vf_d, uint64_t, H8, float64_le) =20 -static bool vmfgt16(uint16_t a, uint16_t b, float_status *s) +bool vmfgt16(uint16_t a, uint16_t b, float_status *s) { FloatRelation compare =3D float16_compare(a, b, s); return compare =3D=3D float_relation_greater; } =20 -static bool vmfgt32(uint32_t a, uint32_t b, float_status *s) +bool vmfgt32(uint32_t a, uint32_t b, float_status *s) { FloatRelation compare =3D float32_compare(a, b, s); return compare =3D=3D float_relation_greater; } =20 -static bool vmfgt64(uint64_t a, uint64_t b, float_status *s) +bool vmfgt64(uint64_t a, uint64_t b, float_status *s) { FloatRelation compare =3D float64_compare(a, b, s); return compare =3D=3D float_relation_greater; @@ -4111,21 +4111,21 @@ GEN_VEXT_CMP_VF(vmfgt_vf_h, uint16_t, H2, vmfgt16) GEN_VEXT_CMP_VF(vmfgt_vf_w, uint32_t, H4, vmfgt32) GEN_VEXT_CMP_VF(vmfgt_vf_d, uint64_t, H8, vmfgt64) =20 -static bool vmfge16(uint16_t a, uint16_t b, float_status *s) +bool vmfge16(uint16_t a, uint16_t b, float_status *s) { FloatRelation compare =3D float16_compare(a, b, s); return compare =3D=3D float_relation_greater || compare =3D=3D float_relation_equal; } =20 -static bool vmfge32(uint32_t a, uint32_t b, float_status *s) +bool vmfge32(uint32_t a, uint32_t b, float_status *s) { FloatRelation compare =3D float32_compare(a, 
b, s); return compare =3D=3D float_relation_greater || compare =3D=3D float_relation_equal; } =20 -static bool vmfge64(uint64_t a, uint64_t b, float_status *s) +bool vmfge64(uint64_t a, uint64_t b, float_status *s) { FloatRelation compare =3D float64_compare(a, b, s); return compare =3D=3D float_relation_greater || diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index bcc7d0edd6..b870e15392 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -404,4 +404,14 @@ uint16_t fsgnjx16(uint16_t a, uint16_t b, float_status= *s); uint32_t fsgnjx32(uint32_t a, uint32_t b, float_status *s); uint64_t fsgnjx64(uint64_t a, uint64_t b, float_status *s); =20 +bool vmfne16(uint16_t a, uint16_t b, float_status *s); +bool vmfne32(uint32_t a, uint32_t b, float_status *s); +bool vmfne64(uint64_t a, uint64_t b, float_status *s); +bool vmfgt16(uint16_t a, uint16_t b, float_status *s); +bool vmfgt32(uint32_t a, uint32_t b, float_status *s); +bool vmfgt64(uint64_t a, uint64_t b, float_status *s); +bool vmfge16(uint16_t a, uint16_t b, float_status *s); +bool vmfge32(uint32_t a, uint32_t b, float_status *s); +bool vmfge64(uint64_t a, uint64_t b, float_status *s); + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 38476900a6..603b34a094 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3051,3 +3051,99 @@ THCALL(TH_OPFVF2, th_vfsgnjx_vf_d, OP_UUU_D, H8, H8,= fsgnjx64) GEN_TH_VF(th_vfsgnjx_vf_h, 2, 2, clearh_th) GEN_TH_VF(th_vfsgnjx_vf_w, 4, 4, clearl_th) GEN_TH_VF(th_vfsgnjx_vf_d, 8, 8, clearq_th) + +/* Vector Floating-Point Compare Instructions */ +#define GEN_TH_CMP_VV_ENV(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; 
\ + uint32_t vlmax =3D th_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s1 =3D *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + th_set_elem_mask(vd, mlen, i, \ + DO_OP(s2, s1, &env->fp_status)); \ + } \ + env->vstart =3D 0; \ + for (; i < vlmax; i++) { \ + th_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_TH_CMP_VV_ENV(th_vmfeq_vv_h, uint16_t, H2, float16_eq_quiet) +GEN_TH_CMP_VV_ENV(th_vmfeq_vv_w, uint32_t, H4, float32_eq_quiet) +GEN_TH_CMP_VV_ENV(th_vmfeq_vv_d, uint64_t, H8, float64_eq_quiet) + +#define GEN_TH_CMP_VF(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t vlmax =3D th_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + th_set_elem_mask(vd, mlen, i, \ + DO_OP(s2, (ETYPE)s1, &env->fp_status)); \ + } \ + env->vstart =3D 0; \ + for (; i < vlmax; i++) { \ + th_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_TH_CMP_VF(th_vmfeq_vf_h, uint16_t, H2, float16_eq_quiet) +GEN_TH_CMP_VF(th_vmfeq_vf_w, uint32_t, H4, float32_eq_quiet) +GEN_TH_CMP_VF(th_vmfeq_vf_d, uint64_t, H8, float64_eq_quiet) + +GEN_TH_CMP_VV_ENV(th_vmfne_vv_h, uint16_t, H2, vmfne16) +GEN_TH_CMP_VV_ENV(th_vmfne_vv_w, uint32_t, H4, vmfne32) +GEN_TH_CMP_VV_ENV(th_vmfne_vv_d, uint64_t, H8, vmfne64) +GEN_TH_CMP_VF(th_vmfne_vf_h, uint16_t, H2, vmfne16) +GEN_TH_CMP_VF(th_vmfne_vf_w, uint32_t, H4, vmfne32) +GEN_TH_CMP_VF(th_vmfne_vf_d, uint64_t, H8, vmfne64) + +GEN_TH_CMP_VV_ENV(th_vmflt_vv_h, uint16_t, H2, float16_lt) +GEN_TH_CMP_VV_ENV(th_vmflt_vv_w, 
uint32_t, H4, float32_lt) +GEN_TH_CMP_VV_ENV(th_vmflt_vv_d, uint64_t, H8, float64_lt) +GEN_TH_CMP_VF(th_vmflt_vf_h, uint16_t, H2, float16_lt) +GEN_TH_CMP_VF(th_vmflt_vf_w, uint32_t, H4, float32_lt) +GEN_TH_CMP_VF(th_vmflt_vf_d, uint64_t, H8, float64_lt) + +GEN_TH_CMP_VV_ENV(th_vmfle_vv_h, uint16_t, H2, float16_le) +GEN_TH_CMP_VV_ENV(th_vmfle_vv_w, uint32_t, H4, float32_le) +GEN_TH_CMP_VV_ENV(th_vmfle_vv_d, uint64_t, H8, float64_le) +GEN_TH_CMP_VF(th_vmfle_vf_h, uint16_t, H2, float16_le) +GEN_TH_CMP_VF(th_vmfle_vf_w, uint32_t, H4, float32_le) +GEN_TH_CMP_VF(th_vmfle_vf_d, uint64_t, H8, float64_le) + +GEN_TH_CMP_VF(th_vmfgt_vf_h, uint16_t, H2, vmfgt16) +GEN_TH_CMP_VF(th_vmfgt_vf_w, uint32_t, H4, vmfgt32) +GEN_TH_CMP_VF(th_vmfgt_vf_d, uint64_t, H8, vmfgt64) + +GEN_TH_CMP_VF(th_vmfge_vf_h, uint16_t, H2, vmfge16) +GEN_TH_CMP_VF(th_vmfge_vf_w, uint32_t, H4, vmfge32) +GEN_TH_CMP_VF(th_vmfge_vf_d, uint64_t, H8, vmfge64) + +GEN_TH_CMP_VV_ENV(th_vmford_vv_h, uint16_t, H2, !float16_unordered_quiet) +GEN_TH_CMP_VV_ENV(th_vmford_vv_w, uint32_t, H4, !float32_unordered_quiet) +GEN_TH_CMP_VV_ENV(th_vmford_vv_d, uint64_t, H8, !float64_unordered_quiet) +GEN_TH_CMP_VF(th_vmford_vf_h, uint16_t, H2, !float16_unordered_quiet) +GEN_TH_CMP_VF(th_vmford_vf_w, uint32_t, H4, !float32_unordered_quiet) +GEN_TH_CMP_VF(th_vmford_vf_d, uint64_t, H8, !float64_unordered_quiet) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914115; cv=none; d=zohomail.com; s=zohoarc; b=mr/f45n9xQ9drzHol+pUw0BuDgoZ6kDPYJmhJ6iNxlfImJgf9DvWU4TL2ZSO4Cb9w7sEwO7lvJ0CabuOkqeQ1w9rFrIsLpeI0bt0tTDp48mKd6t8EpHo/0KjYDz4Nm02iGtpENHjymmyletvUL/D/qjbi3SSxWqESvFaVKuSiqQ= ARC-Message-Signature: i=1; 
a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914115; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=u18ajQw8BnyfQfbSYOWjFV4OF+CR1EdD+IwPlrADmSY=; b=NWrlrkkqeYoUxAOPI3owQuo4Bo8UchcNghuO3Dbg1rPLrRUWLDvt0d3Wz04UO1+XU7go0HdEb8Cbt1jHeSPFEe8awlrvhyU41xILfuXELNaQnsDz/pTRSvYk9Xc2FvobQMzmf/U9YC7bkcakFxLPOve2OAsBPyIiF1T1ICCX/Uo= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914115726793.5274276643137; Fri, 12 Apr 2024 02:28:35 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDCM-0001a6-00; Fri, 12 Apr 2024 05:27:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDCH-0001Xs-9k; Fri, 12 Apr 2024 05:27:49 -0400 Received: from out30-110.freemail.mail.aliyun.com ([115.124.30.110]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDCD-0003X4-3Y; Fri, 12 Apr 2024 05:27:48 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Novpj_1712914053) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:27:34 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914055; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=u18ajQw8BnyfQfbSYOWjFV4OF+CR1EdD+IwPlrADmSY=; 
b=qQGukJHBu0ps21+a/jPDEDmO0OBJ5sWDwXztUD5e96fI6xjt2iN23HvZBL3nRNQGjXM+ZZa7HuFxmI1RBwqUadn6eM3lnTfdn4Aiz9uRfsG5p2RFqOhytw4hSL8Xtz+HZkpH9OcYBaqmeVoyyPbneHjj5C8IxMitaHrq1TzYQSs= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R191e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046059; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Novpj_1712914053; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 46/65] target/riscv: Add floating-point classify and merge instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:16 +0800 Message-ID: <20240412073735.76413-47-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.110; envelope-from=eric.huang@linux.alibaba.com; helo=out30-110.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914117820100003 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. Signed-off-by: Huang Tao --- target/riscv/helper.h | 8 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 51 +++++++++++++++- target/riscv/xtheadvector_helper.c | 58 +++++++++++++++++++ 3 files changed, 114 insertions(+), 3 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 5771a4fa8a..886655899e 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2189,3 +2189,11 @@ DEF_HELPER_6(th_vmford_vv_d, void, ptr, ptr, ptr, pt= r, env, i32) DEF_HELPER_6(th_vmford_vf_h, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vmford_vf_w, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vmford_vf_d, void, ptr, ptr, i64, ptr, env, i32) + +DEF_HELPER_5(th_vfclass_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfclass_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfclass_v_d, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_6(th_vfmerge_vfm_h, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vfmerge_vfm_w, void, ptr, ptr, i64, ptr, env, i32) +DEF_HELPER_6(th_vfmerge_vfm_d, void, ptr, ptr, i64, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 1e773c673e..8e928febb7 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2143,15 +2143,60 @@ GEN_OPFVF_TRANS_TH(th_vmfgt_vf, opfvf_cmp_check_th) GEN_OPFVF_TRANS_TH(th_vmfge_vf, opfvf_cmp_check_th) GEN_OPFVF_TRANS_TH(th_vmford_vf, opfvf_cmp_check_th) =20 +/* Vector Floating-Point Classify Instruction */ +GEN_OPFV_TRANS_TH(th_vfclass_v, opfv_check_th) + +/* Vector 
Floating-Point Merge Instruction */ +GEN_OPFVF_TRANS_TH(th_vfmerge_vfm, opfvf_check_th) + +/* Apart from the check function, th_vfmv_v_f reuses helper_th_vmv_v_x */ +static bool trans_th_vfmv_v_f(DisasContext *s, arg_th_vfmv_v_f *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + (s->sew !=3D 0)) { + + TCGv_i64 t1; + + if (s->vl_eq_vlmax) { + t1 =3D tcg_temp_new_i64(); + /* NaN-box f[rs1] */ + do_nanbox(s, t1, cpu_fpr[a->rs1]); + tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd), + MAXSZ(s), MAXSZ(s), t1); + } else { + TCGv_ptr dest; + TCGv_i32 desc; + uint32_t data =3D FIELD_DP32(0, VDATA_TH, LMUL, s->lmul); + static gen_helper_vmv_vx_th * const fns[3] =3D { + gen_helper_th_vmv_v_x_h, + gen_helper_th_vmv_v_x_w, + gen_helper_th_vmv_v_x_d, + }; + + t1 =3D tcg_temp_new_i64(); + /* NaN-box f[rs1] */ + do_nanbox(s, t1, cpu_fpr[a->rs1]); + + dest =3D tcg_temp_new_ptr(); + desc =3D tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data)); + tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); + fns[s->sew - 1](dest, t1, tcg_env, desc); + } + finalize_rvv_inst(s); + return true; + } + return false; +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfclass_v) -TH_TRANS_STUB(th_vfmerge_vfm) -TH_TRANS_STUB(th_vfmv_v_f) TH_TRANS_STUB(th_vfcvt_xu_f_v) TH_TRANS_STUB(th_vfcvt_x_f_v) TH_TRANS_STUB(th_vfcvt_f_xu_v) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 603b34a094..e31e13dff3 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3147,3 +3147,61 @@ GEN_TH_CMP_VV_ENV(th_vmford_vv_d, uint64_t, H8, !flo= at64_unordered_quiet) GEN_TH_CMP_VF(th_vmford_vf_h, uint16_t, H2, !float16_unordered_quiet) GEN_TH_CMP_VF(th_vmford_vf_w, uint32_t, H4, !float32_unordered_quiet) GEN_TH_CMP_VF(th_vmford_vf_d, uint64_t, H8, 
!float64_unordered_quiet) + +/* Vector Floating-Point Classify Instruction */ +#define TH_OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ + OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) + +#define GEN_TH_V(NAME, ESZ, DSZ, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t vlmax =3D th_maxsz(desc) / ESZ; \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + do_##NAME(vd, vs2, i); \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * DSZ, vlmax * DSZ); \ +} + +THCALL(TH_OPIVV1, th_vfclass_v_h, OP_UU_H, H2, H2, fclass_h) +THCALL(TH_OPIVV1, th_vfclass_v_w, OP_UU_W, H4, H4, fclass_s) +THCALL(TH_OPIVV1, th_vfclass_v_d, OP_UU_D, H8, H8, fclass_d) +GEN_TH_V(th_vfclass_v_h, 2, 2, clearh_th) +GEN_TH_V(th_vfclass_v_w, 4, 4, clearl_th) +GEN_TH_V(th_vfclass_v_d, 8, 8, clearq_th) + +/* Vector Floating-Point Merge Instruction */ +#define GEN_VFMERGE_VF_TH(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t esz =3D sizeof(ETYPE); \ + uint32_t vlmax =3D th_maxsz(desc) / esz; \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { \ + ETYPE s2 =3D *((ETYPE *)vs2 + H(i)); \ + *((ETYPE *)vd + H(i)) \ + =3D (!vm && !th_elem_mask(v0, mlen, i) ? 
s2 : s1); \ + } \ + env->vstart =3D 0; \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_VFMERGE_VF_TH(th_vfmerge_vfm_h, int16_t, H2, clearh_th) +GEN_VFMERGE_VF_TH(th_vfmerge_vfm_w, int32_t, H4, clearl_th) +GEN_VFMERGE_VF_TH(th_vfmerge_vfm_d, int64_t, H8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914225; cv=none; d=zohomail.com; s=zohoarc; b=N4UkTGZscA4efTobm85KAnUsaTUW7FkWXCHC2i0/7N1rTDhHteMU3n9kPwVeOKUrnmKk7WDCnDWV9o1NLAfnMfWH8zL4a+qcijHQtMtpdnuja7VmNqAUOcZ/nnAhIwImmp6XCGH0qocrTJF8mxCsJZWlrEofN8QOopp7hofYD+0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914225; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=8umZ3OYSUXMZOWPRFyta6L/0drZPcl2rI0y+4M2ZquQ=; b=P8xRBEssnWbuJn2r1DnMRyLPQKHKjtfrllbM07YOOUS7GwPtNjaAPYIyGGegNSSw/XUTEML72YPDJVtHgsJeojbArhC6e0G5IUbkBfwoThHHPQmItw2XjlikJkluV4TM4mbRjtIsF7PjZ59RqXqV81FlyNoBek4WjWuN7Q50Nlw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914225063544.9929990836393; Fri, 12 Apr 2024 02:30:25 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDEH-0002hM-HA; Fri, 
12 Apr 2024 05:29:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDEB-0002ew-Ab; Fri, 12 Apr 2024 05:29:49 -0400 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDE4-0003oK-SS; Fri, 12 Apr 2024 05:29:46 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nsagg_1712914174) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:29:35 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914176; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=8umZ3OYSUXMZOWPRFyta6L/0drZPcl2rI0y+4M2ZquQ=; b=cObjeej2wjWDy8t5YnUYG8eeaEXdIcWpJnvynup2ocp3WvzK+HEEaGkCWdkmE8ZZiK32Sqvh9MEMAYqA07x2brR22+NCjEN0vo7t41Y/3Zyx1m6RaglG9EK6C2KNREYpfUQ22PCNmm9ZK+lwcmzRky913fpvHt6a9d79oPg89AA= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R201e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046050; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nsagg_1712914174; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 47/65] target/riscv: Add single-width floating-point/integer type-convert instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:17 +0800 Message-ID: <20240412073735.76413-48-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.131; envelope-from=eric.huang@linux.alibaba.com; helo=out30-131.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914226079100002 Content-Type: text/plain; charset="utf-8"

Compared to RVV1.0, XTheadVector lacks the .rtz instruction variants, which force rounding towards zero. Apart from these missing variants, the instructions behave the same as in RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0.
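For readers unfamiliar with the rounding-mode suffixes, here is a minimal stand-alone sketch (not part of this patch, and not QEMU code) of how the result of a float-to-integer conversion depends on the rounding mode. The enum and function names are made up for illustration; only the numeric values follow the RISC-V frm encoding. Without a .rtz variant the conversion follows whatever mode is currently selected in frm, so guest code that wants truncation presumably has to select RTZ there first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values match the RISC-V frm field encoding. */
enum rm { RM_RNE = 0, RM_RTZ = 1, RM_RDN = 2, RM_RUP = 3, RM_RMM = 4 };

/*
 * Convert a float to int32_t under the given rounding mode.
 * Illustration only: restricted to |x| < 2^24 so that the
 * fractional part below is computed exactly.
 */
int32_t cvt_f32_to_i32(float x, enum rm mode)
{
    assert(x > -16777216.0f && x < 16777216.0f);

    int32_t t = (int32_t)x;        /* a C cast truncates toward zero */
    float frac = x - (float)t;     /* same sign as x, |frac| < 1 */

    if (frac == 0.0f) {
        return t;                  /* exact, the mode is irrelevant */
    }

    int32_t away = (frac > 0.0f) ? t + 1 : t - 1;
    float mag = (frac > 0.0f) ? frac : -frac;

    switch (mode) {
    case RM_RTZ:                   /* toward zero */
        return t;
    case RM_RDN:                   /* toward minus infinity */
        return (frac < 0.0f) ? away : t;
    case RM_RUP:                   /* toward plus infinity */
        return (frac > 0.0f) ? away : t;
    case RM_RMM:                   /* nearest, ties away from zero */
        return (mag >= 0.5f) ? away : t;
    case RM_RNE:                   /* nearest, ties to even */
    default:
        if (mag > 0.5f) {
            return away;
        }
        if (mag < 0.5f) {
            return t;
        }
        return (t % 2 == 0) ? t : away;
    }
}
```

For example, cvt_f32_to_i32(2.7f, RM_RTZ) yields 2, while cvt_f32_to_i32(2.7f, RM_RNE) yields 3.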
Signed-off-by: Huang Tao --- target/riscv/helper.h | 13 ++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 10 +++--- target/riscv/xtheadvector_helper.c | 33 +++++++++++++++++++ 3 files changed, 52 insertions(+), 4 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 886655899e..18640c4a1e 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2197,3 +2197,16 @@ DEF_HELPER_5(th_vfclass_v_d, void, ptr, ptr, ptr, en= v, i32) DEF_HELPER_6(th_vfmerge_vfm_h, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vfmerge_vfm_w, void, ptr, ptr, i64, ptr, env, i32) DEF_HELPER_6(th_vfmerge_vfm_d, void, ptr, ptr, i64, ptr, env, i32) + +DEF_HELPER_5(th_vfcvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_xu_f_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_x_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_x_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_x_f_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_xu_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_xu_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_xu_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_x_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfcvt_f_x_v_d, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 8e928febb7..27a06c2cac 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2191,16 +2191,18 @@ static bool trans_th_vfmv_v_f(DisasContext *s, arg_= th_vfmv_v_f *a) return false; } =20 +/* Single-Width Floating-Point/Integer Type-Convert Instructions */ +GEN_OPFV_TRANS_TH(th_vfcvt_xu_f_v, opfv_check_th) +GEN_OPFV_TRANS_TH(th_vfcvt_x_f_v, opfv_check_th) 
+GEN_OPFV_TRANS_TH(th_vfcvt_f_xu_v, opfv_check_th) +GEN_OPFV_TRANS_TH(th_vfcvt_f_x_v, opfv_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfcvt_xu_f_v) -TH_TRANS_STUB(th_vfcvt_x_f_v) -TH_TRANS_STUB(th_vfcvt_f_xu_v) -TH_TRANS_STUB(th_vfcvt_f_x_v) TH_TRANS_STUB(th_vfwcvt_xu_f_v) TH_TRANS_STUB(th_vfwcvt_x_f_v) TH_TRANS_STUB(th_vfwcvt_f_xu_v) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index e31e13dff3..7e98c1ead2 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3205,3 +3205,36 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, v= oid *vs2, \ GEN_VFMERGE_VF_TH(th_vfmerge_vfm_h, int16_t, H2, clearh_th) GEN_VFMERGE_VF_TH(th_vfmerge_vfm_w, int32_t, H4, clearl_th) GEN_VFMERGE_VF_TH(th_vfmerge_vfm_d, int64_t, H8, clearq_th) + +/* Single-Width Floating-Point/Integer Type-Convert Instructions */ +/* vfcvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. */ +THCALL(TH_OPFVV1, th_vfcvt_xu_f_v_h, OP_UU_H, H2, H2, float16_to_uint16) +THCALL(TH_OPFVV1, th_vfcvt_xu_f_v_w, OP_UU_W, H4, H4, float32_to_uint32) +THCALL(TH_OPFVV1, th_vfcvt_xu_f_v_d, OP_UU_D, H8, H8, float64_to_uint64) +GEN_TH_V_ENV(th_vfcvt_xu_f_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfcvt_xu_f_v_w, 4, 4, clearl_th) +GEN_TH_V_ENV(th_vfcvt_xu_f_v_d, 8, 8, clearq_th) + +/* vfcvt.x.f.v vd, vs2, vm # Convert float to signed integer. */ +THCALL(TH_OPFVV1, th_vfcvt_x_f_v_h, OP_UU_H, H2, H2, float16_to_int16) +THCALL(TH_OPFVV1, th_vfcvt_x_f_v_w, OP_UU_W, H4, H4, float32_to_int32) +THCALL(TH_OPFVV1, th_vfcvt_x_f_v_d, OP_UU_D, H8, H8, float64_to_int64) +GEN_TH_V_ENV(th_vfcvt_x_f_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfcvt_x_f_v_w, 4, 4, clearl_th) +GEN_TH_V_ENV(th_vfcvt_x_f_v_d, 8, 8, clearq_th) + +/* vfcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to float. 
*/ +THCALL(TH_OPFVV1, th_vfcvt_f_xu_v_h, OP_UU_H, H2, H2, uint16_to_float16) +THCALL(TH_OPFVV1, th_vfcvt_f_xu_v_w, OP_UU_W, H4, H4, uint32_to_float32) +THCALL(TH_OPFVV1, th_vfcvt_f_xu_v_d, OP_UU_D, H8, H8, uint64_to_float64) +GEN_TH_V_ENV(th_vfcvt_f_xu_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfcvt_f_xu_v_w, 4, 4, clearl_th) +GEN_TH_V_ENV(th_vfcvt_f_xu_v_d, 8, 8, clearq_th) + +/* vfcvt.f.x.v vd, vs2, vm # Convert integer to float. */ +THCALL(TH_OPFVV1, th_vfcvt_f_x_v_h, OP_UU_H, H2, H2, int16_to_float16) +THCALL(TH_OPFVV1, th_vfcvt_f_x_v_w, OP_UU_W, H4, H4, int32_to_float32) +THCALL(TH_OPFVV1, th_vfcvt_f_x_v_d, OP_UU_D, H8, H8, int64_to_float64) +GEN_TH_V_ENV(th_vfcvt_f_x_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfcvt_f_x_v_w, 4, 4, clearl_th) +GEN_TH_V_ENV(th_vfcvt_f_x_v_d, 8, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914321; cv=none; d=zohomail.com; s=zohoarc; b=HCWo0o4hsdeFk0TXdQzWQsiwftc/QggyPErKzf+dc/9I9sah7mjKllLAM66z2DV9SzNQgGv6ocYarvNRlJzduMDBvpI1B8Tll46jZ46Wb3iCYQZF4gg5S8R/h6OI3ijZzBwdunERxcjMF0qeMpy9NTFXHyTXwf3IBARQ2mrC8KQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914321; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=OQkwqlQyAGecjgu36JEhPI+x+SwzdeavdQ8Z7kWrmk8=; b=eWVqPBp3HMWyGxEJK3r8rmBq5nVmarZ1KJvh/4Gbx4jQL+7TFsFGmwx19x5byepUUm8XXM3R+DFFeWRQ0Z67IjVF3Qy0x9+AAR/doH+qpnpvV2ndiT9+Q+gPSQgPCVZL3BMsXpaycdei0q3mdUDdA5AnRtWpBFdSM9tppBE0ewE= ARC-Authentication-Results: i=1; 
mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 171291432175529.37798405118258; Fri, 12 Apr 2024 02:32:01 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDG5-0003fa-LN; Fri, 12 Apr 2024 05:31:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDG4-0003f9-AR; Fri, 12 Apr 2024 05:31:44 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDG2-0004FG-2m; Fri, 12 Apr 2024 05:31:44 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NqQ8L_1712914296) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:31:36 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914297; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=OQkwqlQyAGecjgu36JEhPI+x+SwzdeavdQ8Z7kWrmk8=; b=N/xasp9Fa5+6OGnwkM6lf87zZ6gdkszNCWr69MDatzkuX+1HUXAZsJg3kYa2PvdMXUY+TOg8V0znkPwYR3Qq+QmtVdy0I+rUKOn24Nr/huweVzZFNL7i4ptelE3EzwH0p5geDPS0kxnNv3Z51Y9VvESG6bMfFJ/f02WT608DCJU= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R111e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046060; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NqQ8L_1712914296; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: 
[PATCH 48/65] target/riscv: Add widening floating-point/integer type-convert instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:18 +0800 Message-ID: <20240412073735.76413-49-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914322341100001 Content-Type: text/plain; charset="utf-8"

Compared to RVV1.0, XTheadVector lacks the .rtz instruction variants, which force rounding towards zero. Apart from these missing variants, the instructions behave the same as in RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0.
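The SEW restriction enforced by the widening checks in this patch can be sketched as follows. This is a simplified stand-alone model, not the QEMU code: the function names are hypothetical, the LMUL and register-overlap checks are left out, and the helper suffixes are returned as strings purely for illustration. It assumes the usual encoding SEW = 8 << sew.

```c
#include <stddef.h>

/*
 * Float-source widening converts (vfwcvt.xu.f.v, vfwcvt.x.f.v,
 * vfwcvt.f.f.v): the source must be a 16-bit or 32-bit float, so
 * only sew == 1 and sew == 2 are legal and the table is indexed
 * with sew - 1.  Returns NULL where the real code would reject
 * the instruction.
 */
const char *widen_from_float_suffix(unsigned sew)
{
    static const char *const suffix[2] = { "_h", "_w" };

    if (sew == 0 || sew >= 3) {
        return NULL;
    }
    return suffix[sew - 1];
}

/*
 * Integer-source widening converts (vfwcvt.f.xu.v, vfwcvt.f.x.v):
 * an 8-bit integer source is allowed as well, so the table has
 * three entries and is indexed with sew directly.
 */
const char *widen_from_int_suffix(unsigned sew)
{
    static const char *const suffix[3] = { "_b", "_h", "_w" };

    if (sew >= 3) {
        return NULL;
    }
    return suffix[sew];
}
```

The only difference between the two cases is whether sew == 0 is accepted, which is why the patch carries two check functions and two translation macros.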
Signed-off-by: Huang Tao --- target/riscv/helper.h | 13 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 93 ++++++++++++++++++- target/riscv/vector_helper.c | 5 +- target/riscv/vector_internals.h | 3 + target/riscv/xtheadvector_helper.c | 44 +++++++++ 5 files changed, 149 insertions(+), 9 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 18640c4a1e..e2d737c9c4 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2210,3 +2210,16 @@ DEF_HELPER_5(th_vfcvt_f_xu_v_d, void, ptr, ptr, ptr,= env, i32) DEF_HELPER_5(th_vfcvt_f_x_v_h, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfcvt_f_x_v_d, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_5(th_vfwcvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_x_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_x_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_xu_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_xu_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_xu_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_x_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_x_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfwcvt_f_f_v_w, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 27a06c2cac..72643facb1 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2197,17 +2197,100 @@ GEN_OPFV_TRANS_TH(th_vfcvt_x_f_v, opfv_check_th) GEN_OPFV_TRANS_TH(th_vfcvt_f_xu_v, opfv_check_th) GEN_OPFV_TRANS_TH(th_vfcvt_f_x_v, opfv_check_th) =20 +/* Widening Floating-Point/Integer Type-Convert 
Instructions */ + +/* + * If the current SEW does not correspond to a supported IEEE floating-poi= nt + * type, an illegal instruction exception is raised + */ +static bool opfv_widen_check_th(DisasContext *s, arg_rmr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, false) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2, + 1 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew !=3D 0)); +} + +static bool opfxv_widen_check_th(DisasContext *s, arg_rmr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, true) && + th_check_reg(s, a->rs2, false) && + th_check_overlap_group(a->rd, 2 << s->lmul, a->rs2, + 1 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + +#define GEN_OPFXV_WIDEN_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ +{ \ + if (opfxv_widen_check_th(s, a)) { \ + uint32_t data =3D 0; \ + static gen_helper_gvec_3_ptr * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + }; \ + gen_set_rm(s, RISCV_FRM_DYN); \ + \ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \ + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); \ + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ + vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), tcg_env, \ + s->cfg_ptr->vlenb, \ + s->cfg_ptr->vlenb, data, \ + fns[s->sew]); \ + finalize_rvv_inst(s); \ + return true; \ + } \ + return false; \ +} +GEN_OPFXV_WIDEN_TRANS_TH(th_vfwcvt_f_xu_v) +GEN_OPFXV_WIDEN_TRANS_TH(th_vfwcvt_f_x_v) + +#define GEN_OPFV_WIDEN_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ +{ \ + if (opfv_widen_check_th(s, a)) { \ + uint32_t data =3D 0; \ + static gen_helper_gvec_3_ptr * const fns[2] =3D { \ + gen_helper_##NAME##_h, \ + 
gen_helper_##NAME##_w, \ + }; \ + gen_set_rm(s, RISCV_FRM_DYN); \ + \ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \ + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); \ + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ + vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), tcg_env, \ + s->cfg_ptr->vlenb, \ + s->cfg_ptr->vlenb, data, \ + fns[s->sew - 1]); \ + finalize_rvv_inst(s); \ + return true; \ + } \ + return false; \ +} +GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_xu_f_v) +GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_x_f_v) +GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_f_f_v) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfwcvt_xu_f_v) -TH_TRANS_STUB(th_vfwcvt_x_f_v) -TH_TRANS_STUB(th_vfwcvt_f_xu_v) -TH_TRANS_STUB(th_vfwcvt_f_x_v) -TH_TRANS_STUB(th_vfwcvt_f_f_v) TH_TRANS_STUB(th_vfncvt_xu_f_v) TH_TRANS_STUB(th_vfncvt_x_f_v) TH_TRANS_STUB(th_vfncvt_f_xu_v) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index c966600d0c..105c2eb00a 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4265,10 +4265,7 @@ GEN_VEXT_V_ENV(vfcvt_f_x_v_w, 4) GEN_VEXT_V_ENV(vfcvt_f_x_v_d, 8) =20 /* Widening Floating-Point/Integer Type-Convert Instructions */ -/* (TD, T2, TX2) */ -#define WOP_UU_B uint16_t, uint8_t, uint8_t -#define WOP_UU_H uint32_t, uint16_t, uint16_t -#define WOP_UU_W uint64_t, uint32_t, uint32_t + /* * vfwcvt.xu.f.v vd, vs2, vm # Convert float to double-width unsigned inte= ger. 
*/ diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index b870e15392..aac96f830c 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -132,6 +132,9 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic= , uint32_t cnt, #define OP_UU_H uint16_t, uint16_t, uint16_t #define OP_UU_W uint32_t, uint32_t, uint32_t #define OP_UU_D uint64_t, uint64_t, uint64_t +#define WOP_UU_B uint16_t, uint8_t, uint8_t +#define WOP_UU_H uint32_t, uint16_t, uint16_t +#define WOP_UU_W uint64_t, uint32_t, uint32_t =20 /* (TD, T1, T2, TX1, TX2) */ #define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 7e98c1ead2..42328a8a58 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3238,3 +3238,47 @@ THCALL(TH_OPFVV1, th_vfcvt_f_x_v_d, OP_UU_D, H8, H8,= int64_to_float64) GEN_TH_V_ENV(th_vfcvt_f_x_v_h, 2, 2, clearh_th) GEN_TH_V_ENV(th_vfcvt_f_x_v_w, 4, 4, clearl_th) GEN_TH_V_ENV(th_vfcvt_f_x_v_d, 8, 8, clearq_th) + +/* Widening Floating-Point/Integer Type-Convert Instructions */ + +/* vfwcvt.xu.f.v vd, vs2, vm # Convert float to double-width unsigned inte= ger.*/ +THCALL(TH_OPFVV1, th_vfwcvt_xu_f_v_h, WOP_UU_H, H4, H2, float16_to_uint32) +THCALL(TH_OPFVV1, th_vfwcvt_xu_f_v_w, WOP_UU_W, H8, H4, float32_to_uint64) +GEN_TH_V_ENV(th_vfwcvt_xu_f_v_h, 2, 4, clearl_th) +GEN_TH_V_ENV(th_vfwcvt_xu_f_v_w, 4, 8, clearq_th) + +/* vfwcvt.x.f.v vd, vs2, vm # Convert float to double-width signed integer= . 
*/ +THCALL(TH_OPFVV1, th_vfwcvt_x_f_v_h, WOP_UU_H, H4, H2, float16_to_int32) +THCALL(TH_OPFVV1, th_vfwcvt_x_f_v_w, WOP_UU_W, H8, H4, float32_to_int64) +GEN_TH_V_ENV(th_vfwcvt_x_f_v_h, 2, 4, clearl_th) +GEN_TH_V_ENV(th_vfwcvt_x_f_v_w, 4, 8, clearq_th) + +/* vfwcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to double-width fl= oat */ +THCALL(TH_OPFVV1, th_vfwcvt_f_xu_v_b, WOP_UU_B, H2, H1, uint8_to_float16) +THCALL(TH_OPFVV1, th_vfwcvt_f_xu_v_h, WOP_UU_H, H4, H2, uint16_to_float32) +THCALL(TH_OPFVV1, th_vfwcvt_f_xu_v_w, WOP_UU_W, H8, H4, uint32_to_float64) +GEN_TH_V_ENV(th_vfwcvt_f_xu_v_b, 1, 2, clearh_th) +GEN_TH_V_ENV(th_vfwcvt_f_xu_v_h, 2, 4, clearl_th) +GEN_TH_V_ENV(th_vfwcvt_f_xu_v_w, 4, 8, clearq_th) + +/* vfwcvt.f.x.v vd, vs2, vm # Convert integer to double-width float. */ +THCALL(TH_OPFVV1, th_vfwcvt_f_x_v_b, WOP_UU_B, H2, H1, int8_to_float16) +THCALL(TH_OPFVV1, th_vfwcvt_f_x_v_h, WOP_UU_H, H4, H2, int16_to_float32) +THCALL(TH_OPFVV1, th_vfwcvt_f_x_v_w, WOP_UU_W, H8, H4, int32_to_float64) +GEN_TH_V_ENV(th_vfwcvt_f_x_v_b, 1, 2, clearh_th) +GEN_TH_V_ENV(th_vfwcvt_f_x_v_h, 2, 4, clearl_th) +GEN_TH_V_ENV(th_vfwcvt_f_x_v_w, 4, 8, clearq_th) + +/* + * vfwcvt.f.f.v vd, vs2, vm # + * Convert single-width float to double-width float. 
+ */ +static uint32_t vfwcvtffv16(uint16_t a, float_status *s) +{ + return float16_to_float32(a, true, s); +} + +THCALL(TH_OPFVV1, th_vfwcvt_f_f_v_h, WOP_UU_H, H4, H2, vfwcvtffv16) +THCALL(TH_OPFVV1, th_vfwcvt_f_f_v_w, WOP_UU_W, H8, H4, float32_to_float64) +GEN_TH_V_ENV(th_vfwcvt_f_f_v_h, 2, 4, clearl_th) +GEN_TH_V_ENV(th_vfwcvt_f_f_v_w, 4, 8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914460; cv=none; d=zohomail.com; s=zohoarc; b=dVS+3jvVms3u0wDXk0sE3laD24q0zvJz04mTIpqf9W7HH0Knzi9ieHuwf1llpPnBNfc4w8QHEaoSJXdho1R3JivQOygM0jWM4fCkqVmGx5zWxEaL759cu+ZvG+yjFlkao6eYhIOmHrXNyg4FpMijY/o+AxOChAnF7/TsX7+hUZA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914460; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=rC93BLhcxBARDE3kNCwShP8uX2l3KK8J6hT0P2jkSUg=; b=AaxypO6hfjHj+rWsghudH7vW+knmb/TRP/fDGRlpBjJXJ/MTgY1ic6QntntRNE3HK5AGk1b4J9MXm2DYuPW4bYWKcpNQmvnP0ONmaTAw6VBDAdWUsXPbNF5Gm2hJri5tGcvc7FTlb4py/6am9z2rYG+U8PUYXYP8czjvoDww39A= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914460885104.77125869330166; Fri, 12 Apr 2024 02:34:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) 
by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDIE-0004TW-9E; Fri, 12 Apr 2024 05:33:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDI8-0004T5-8N; Fri, 12 Apr 2024 05:33:52 -0400 Received: from out30-112.freemail.mail.aliyun.com ([115.124.30.112]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDI4-0004qE-JD; Fri, 12 Apr 2024 05:33:51 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NqQpB_1712914417) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:33:38 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914419; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=rC93BLhcxBARDE3kNCwShP8uX2l3KK8J6hT0P2jkSUg=; b=glHiHUEktGvcFtJrzTzT2rA9mX4nzwrjsa1MF60ZVck0r/VhxaPyRwOYwCfJJ/JucjUcVx+pQgopjf1p/oJM3Fz6FaICO7I0RPXR0eBCFuvCdfj95zA67RAyVCjTJPex9lZYav+VGjU8a2hfuhxFs7Mt81zBzITMgpP4QgKP3uA= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R671e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NqQpB_1712914417; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 49/65] target/riscv: Add narrowing floating-point/integer type-convert instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:19 +0800 Message-ID: <20240412073735.76413-50-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.112; envelope-from=eric.huang@linux.alibaba.com; helo=out30-112.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914463059100001 Content-Type: text/plain; charset="utf-8"

Compared to RVV1.0, XTheadVector lacks the .rtz and .rod instruction variants, which specify the rounding mode. Apart from these missing variants, the instructions behave the same as in RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0.
Signed-off-by: Huang Tao --- target/riscv/helper.h | 13 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 94 ++++++++++++++++++- target/riscv/vector_helper.c | 5 +- target/riscv/vector_internals.h | 3 + target/riscv/xtheadvector_helper.c | 41 ++++++++ 5 files changed, 147 insertions(+), 9 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index e2d737c9c4..c666a5a020 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2223,3 +2223,16 @@ DEF_HELPER_5(th_vfwcvt_f_x_v_h, void, ptr, ptr, ptr,= env, i32) DEF_HELPER_5(th_vfwcvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfwcvt_f_f_v_h, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfwcvt_f_f_v_w, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_5(th_vfncvt_xu_f_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_xu_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_xu_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_x_f_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_x_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_x_f_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_xu_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_xu_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_x_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_f_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vfncvt_f_f_v_w, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 72643facb1..d2734c007a 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2285,17 +2285,101 @@ GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_xu_f_v) GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_x_f_v) GEN_OPFV_WIDEN_TRANS_TH(th_vfwcvt_f_f_v) =20 +/* Narrowing Floating-Point/Integer Type-Convert Instructions */ + +/* + * If 
the current SEW does not correspond to a supported IEEE floating-poi= nt + * type, an illegal instruction exception is raised + */ +static bool opfv_narrow_check_th(DisasContext *s, arg_rmr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, true) && + th_check_overlap_group(a->rd, 1 << s->lmul, a->rs2, + 2 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3) && (s->sew !=3D 0)); +} + +static bool opxfv_narrow_check_th(DisasContext *s, arg_rmr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, false) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, true) && + th_check_overlap_group(a->rd, 1 << s->lmul, a->rs2, + 2 << s->lmul) && + (s->lmul < 0x3) && (s->sew < 0x3)); +} + +#define GEN_OPXFV_NARROW_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ +{ \ + if (opxfv_narrow_check_th(s, a)) { \ + uint32_t data =3D 0; \ + static gen_helper_gvec_3_ptr * const fns[3] =3D { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + }; \ + gen_set_rm(s, RISCV_FRM_DYN); \ + \ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \ + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); \ + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ + vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), tcg_env, \ + s->cfg_ptr->vlenb, \ + s->cfg_ptr->vlenb, data, \ + fns[s->sew]); \ + finalize_rvv_inst(s); \ + return true; \ + } \ + return false; \ +} + +GEN_OPXFV_NARROW_TRANS_TH(th_vfncvt_xu_f_v) +GEN_OPXFV_NARROW_TRANS_TH(th_vfncvt_x_f_v) + +#define GEN_OPFV_NARROW_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ +{ \ + if (opfv_narrow_check_th(s, a)) { \ + uint32_t data =3D 0; \ + static gen_helper_gvec_3_ptr * const fns[2] =3D { \ + gen_helper_##NAME##_h, \ + 
gen_helper_##NAME##_w, \ + }; \ + gen_set_rm(s, RISCV_FRM_DYN); \ + \ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \ + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); \ + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ + vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), tcg_env, \ + s->cfg_ptr->vlenb, \ + s->cfg_ptr->vlenb, data, \ + fns[s->sew - 1]); \ + finalize_rvv_inst(s); \ + return true; \ + } \ + return false; \ +} +GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_xu_v) +GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_x_v) +GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_f_v) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfncvt_xu_f_v) -TH_TRANS_STUB(th_vfncvt_x_f_v) -TH_TRANS_STUB(th_vfncvt_f_xu_v) -TH_TRANS_STUB(th_vfncvt_f_x_v) -TH_TRANS_STUB(th_vfncvt_f_f_v) TH_TRANS_STUB(th_vredsum_vs) TH_TRANS_STUB(th_vredand_vs) TH_TRANS_STUB(th_vredor_vs) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 105c2eb00a..baa0f47da6 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4315,10 +4315,7 @@ RVVCALL(OPFVV1, vfwcvtbf16_f_f_v, WOP_UU_H, H4, H2, = bfloat16_to_float32) GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v, 4) =20 /* Narrowing Floating-Point/Integer Type-Convert Instructions */ -/* (TD, T2, TX2) */ -#define NOP_UU_B uint8_t, uint16_t, uint32_t -#define NOP_UU_H uint16_t, uint32_t, uint32_t -#define NOP_UU_W uint32_t, uint64_t, uint64_t + /* vfncvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. 
*/ RVVCALL(OPFVV1, vfncvt_xu_f_w_b, NOP_UU_B, H1, H2, float16_to_uint8) RVVCALL(OPFVV1, vfncvt_xu_f_w_h, NOP_UU_H, H2, H4, float32_to_uint16) diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h index aac96f830c..9033eae1cf 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -135,6 +135,9 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic= , uint32_t cnt, #define WOP_UU_B uint16_t, uint8_t, uint8_t #define WOP_UU_H uint32_t, uint16_t, uint16_t #define WOP_UU_W uint64_t, uint32_t, uint32_t +#define NOP_UU_B uint8_t, uint16_t, uint32_t +#define NOP_UU_H uint16_t, uint32_t, uint32_t +#define NOP_UU_W uint32_t, uint64_t, uint64_t =20 /* (TD, T1, T2, TX1, TX2) */ #define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 42328a8a58..3a7512ecd8 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3282,3 +3282,44 @@ THCALL(TH_OPFVV1, th_vfwcvt_f_f_v_h, WOP_UU_H, H4, H= 2, vfwcvtffv16) THCALL(TH_OPFVV1, th_vfwcvt_f_f_v_w, WOP_UU_W, H8, H4, float32_to_float64) GEN_TH_V_ENV(th_vfwcvt_f_f_v_h, 2, 4, clearl_th) GEN_TH_V_ENV(th_vfwcvt_f_f_v_w, 4, 8, clearq_th) + +/* Narrowing Floating-Point/Integer Type-Convert Instructions */ + +/* vfncvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. */ +THCALL(TH_OPFVV1, th_vfncvt_xu_f_v_b, NOP_UU_B, H1, H2, float16_to_uint8) +THCALL(TH_OPFVV1, th_vfncvt_xu_f_v_h, NOP_UU_H, H2, H4, float32_to_uint16) +THCALL(TH_OPFVV1, th_vfncvt_xu_f_v_w, NOP_UU_W, H4, H8, float64_to_uint32) +GEN_TH_V_ENV(th_vfncvt_xu_f_v_b, 1, 1, clearb_th) +GEN_TH_V_ENV(th_vfncvt_xu_f_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfncvt_xu_f_v_w, 4, 4, clearl_th) + +/* vfncvt.x.f.v vd, vs2, vm # Convert double-width float to signed integer= . 
*/ +THCALL(TH_OPFVV1, th_vfncvt_x_f_v_b, NOP_UU_B, H1, H2, float16_to_int8) +THCALL(TH_OPFVV1, th_vfncvt_x_f_v_h, NOP_UU_H, H2, H4, float32_to_int16) +THCALL(TH_OPFVV1, th_vfncvt_x_f_v_w, NOP_UU_W, H4, H8, float64_to_int32) +GEN_TH_V_ENV(th_vfncvt_x_f_v_b, 1, 1, clearb_th) +GEN_TH_V_ENV(th_vfncvt_x_f_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfncvt_x_f_v_w, 4, 4, clearl_th) + +/* vfncvt.f.xu.v vd, vs2, vm # Convert double-width unsigned integer to fl= oat */ +THCALL(TH_OPFVV1, th_vfncvt_f_xu_v_h, NOP_UU_H, H2, H4, uint32_to_float16) +THCALL(TH_OPFVV1, th_vfncvt_f_xu_v_w, NOP_UU_W, H4, H8, uint64_to_float32) +GEN_TH_V_ENV(th_vfncvt_f_xu_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfncvt_f_xu_v_w, 4, 4, clearl_th) + +/* vfncvt.f.x.v vd, vs2, vm # Convert double-width integer to float. */ +THCALL(TH_OPFVV1, th_vfncvt_f_x_v_h, NOP_UU_H, H2, H4, int32_to_float16) +THCALL(TH_OPFVV1, th_vfncvt_f_x_v_w, NOP_UU_W, H4, H8, int64_to_float32) +GEN_TH_V_ENV(th_vfncvt_f_x_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfncvt_f_x_v_w, 4, 4, clearl_th) + +/* vfncvt.f.f.v vd, vs2, vm # Convert double float to single-width float. 
= */ +static uint16_t vfncvtffv16(uint32_t a, float_status *s) +{ + return float32_to_float16(a, true, s); +} + +THCALL(TH_OPFVV1, th_vfncvt_f_f_v_h, NOP_UU_H, H2, H4, vfncvtffv16) +THCALL(TH_OPFVV1, th_vfncvt_f_f_v_w, NOP_UU_W, H4, H8, float64_to_float32) +GEN_TH_V_ENV(th_vfncvt_f_f_v_h, 2, 2, clearh_th) +GEN_TH_V_ENV(th_vfncvt_f_f_v_w, 4, 4, clearl_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914587; cv=none; d=zohomail.com; s=zohoarc; b=Xvmv8vOZAHyEDUW5W5GXXOWge1hyuXYXBrUzT8L0jBgqv8DNVYuvcIj73s4FS3xgtR/8e+UU77U5WTD4FsLqXdQSmF+Xd4gxUTh00RhDoJ9N6fh1JOwVbk9SNWgeiiH5NGUO6lqR1qGfvVjCIHzhlP8lF43bwqwCaqcJv6WpkkQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914587; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=wV3RFD3fpfbJlIBfRcdxx9Z+2PDzu0q2P7HqEp2U/PI=; b=SXkURANAAarEiyr/k8sYt8j+RsX3BKLWI7rRE4AlxPcegkH885ZRa/+fgXqXV57NsKTsppSU311r5fIwoKQCNnc1fqFrwgoJ3Gp51KVzQKeovqQZLg07U9QElXpQFS6oUc8YQYELTcb7FemghdArDMOGSn2NXERLuBQ0S112pWU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914587425139.04739420661213; Fri, 12 Apr 2024 02:36:27 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) 
by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDK6-0005Qv-Hz; Fri, 12 Apr 2024 05:35:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDK2-0005Ob-8n; Fri, 12 Apr 2024 05:35:50 -0400 Received: from out30-119.freemail.mail.aliyun.com ([115.124.30.119]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDJz-0005AP-5I; Fri, 12 Apr 2024 05:35:49 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NoyXp_1712914538) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:35:39 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914540; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=wV3RFD3fpfbJlIBfRcdxx9Z+2PDzu0q2P7HqEp2U/PI=; b=xkEV1eX+krrI+SYnlULYr1sQbz42gf8p0qaRuDEYzrX6Bc4WVojTLouq4K0HsNsLDuECgTqUyDSxcElSkcHoiCQSxLtIVfhabiHjH429YBIAGQHEyb32GdjhncpVr0/ASnymIQgZupQcj+/CfJyuCTZ1zHadIbLzJ0ikLH/J7eU= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R131e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045192; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NoyXp_1712914538; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 50/65] target/riscv: Add single-width integer reduction instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:20 +0800 Message-ID: <20240412073735.76413-51-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: 
domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.119; envelope-from=eric.huang@linux.alibaba.com; helo=out30-119.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914589587100002 Content-Type: text/plain; charset="utf-8"

This patch adds the single-width integer reduction instructions and shows
how the XTheadVector reduction instructions are implemented.

The XTheadVector single-width integer reduction instructions differ from
RVV1.0 in the following points:
1. Different mask register layout. For the mask bit of element i,
   XTheadVector locates it at bit[mlen * i], while RVV1.0 locates it at
   bit[i].
2. Different tail-element policy. XTheadVector clears the tail elements,
   while RVV1.0 uses vta to select the policy: keep the old value or
   overwrite it with 1s.
3. Different check policy. XTheadVector does not have fractional lmul, so
   we can use a simpler check function.

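For readers following points 1 and 2, here is a minimal standalone sketch.
It is not the code from this series: the names sketch_th_mask_bit,
sketch_rvv_mask_bit and sketch_clear_tail are hypothetical, and it assumes
that the mask bit of element i sits at bit mlen * i (the RVV 0.7.1 layout
that XTheadVector is based on) and that "clearing" the tail means zeroing
the remaining bytes of the destination register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed XTheadVector layout: every element owns an mlen-bit field in v0
 * and only the lowest bit of that field is the mask bit.
 */
static inline int sketch_th_mask_bit(const uint64_t *v0, uint32_t mlen,
                                     uint32_t i)
{
    uint32_t bit = i * mlen;

    return (int)((v0[bit / 64] >> (bit % 64)) & 1);
}

/* RVV1.0 layout: the mask bit of element i is simply bit i of v0. */
static inline int sketch_rvv_mask_bit(const uint64_t *v0, uint32_t i)
{
    return (int)((v0[i / 64] >> (i % 64)) & 1);
}

/*
 * Tail handling as described for XTheadVector: zero everything after the
 * bytes that were actually written ("used") up to the register size.
 */
static inline void sketch_clear_tail(void *vd, size_t used, size_t total)
{
    assert(used <= total);
    memset((uint8_t *)vd + used, 0, total - used);
}
```

With mlen = 8, a mask word of 0x10001 marks elements 0 and 2 as active in
the assumed XTheadVector layout, whereas RVV1.0 would read the same word as
elements 0 and 16.
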
Signed-off-by: Huang Tao --- target/riscv/helper.h | 33 ++++++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 27 +++++-- target/riscv/xtheadvector_helper.c | 76 +++++++++++++++++++ 3 files changed, 128 insertions(+), 8 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index c666a5a020..84d2921945 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2236,3 +2236,36 @@ DEF_HELPER_5(th_vfncvt_f_x_v_h, void, ptr, ptr, ptr,= env, i32) DEF_HELPER_5(th_vfncvt_f_x_v_w, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfncvt_f_f_v_h, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vfncvt_f_f_v_w, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_6(th_vredsum_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredsum_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmaxu_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmaxu_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmaxu_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmaxu_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmax_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmax_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmax_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmax_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredminu_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredminu_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredminu_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredminu_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmin_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmin_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmin_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredmin_vs_d, void, ptr, ptr, ptr, 
ptr, env, i32) +DEF_HELPER_6(th_vredand_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredand_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredand_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredand_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredor_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredor_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredor_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredor_vs_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredxor_vs_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredxor_vs_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredxor_vs_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(th_vredxor_vs_d, void, ptr, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index d2734c007a..1fd66353ed 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2374,20 +2374,31 @@ GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_xu_v) GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_x_v) GEN_OPFV_NARROW_TRANS_TH(th_vfncvt_f_f_v) =20 +/* + * Vector Reduction Operations + */ + +/* Vector Single-Width Integer Reduction Instructions */ +static bool reduction_check_th(DisasContext *s, arg_rmrr *a) +{ + return vext_check_isa_ill(s) && th_check_reg(s, a->rs2, false); +} + +GEN_OPIVV_TRANS_TH(th_vredsum_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredmaxu_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredmax_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredminu_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredmin_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredand_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredor_vs, reduction_check_th) +GEN_OPIVV_TRANS_TH(th_vredxor_vs, reduction_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ 
{ \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vredsum_vs) -TH_TRANS_STUB(th_vredand_vs) -TH_TRANS_STUB(th_vredor_vs) -TH_TRANS_STUB(th_vredxor_vs) -TH_TRANS_STUB(th_vredminu_vs) -TH_TRANS_STUB(th_vredmin_vs) -TH_TRANS_STUB(th_vredmaxu_vs) -TH_TRANS_STUB(th_vredmax_vs) TH_TRANS_STUB(th_vwredsumu_vs) TH_TRANS_STUB(th_vwredsum_vs) TH_TRANS_STUB(th_vfredsum_vs) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 3a7512ecd8..d041a81150 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3323,3 +3323,79 @@ THCALL(TH_OPFVV1, th_vfncvt_f_f_v_h, NOP_UU_H, H2, H= 4, vfncvtffv16) THCALL(TH_OPFVV1, th_vfncvt_f_f_v_w, NOP_UU_W, H4, H8, float64_to_float32) GEN_TH_V_ENV(th_vfncvt_f_f_v_h, 2, 2, clearh_th) GEN_TH_V_ENV(th_vfncvt_f_f_v_w, 4, 4, clearl_th) + +/* + * Vector Reduction Operations + */ + +/* Vector Single-Width Integer Reduction Instructions */ +#define GEN_TH_RED(NAME, TD, TS2, HD, HS2, OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, \ + void *vs2, CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t i; \ + uint32_t tot =3D env_archcpu(env)->cfg.vlenb; \ + TD s1 =3D *((TD *)vs1 + HD(0)); \ + \ + for (i =3D env->vstart; i < vl; i++) { \ + TS2 s2 =3D *((TS2 *)vs2 + HS2(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + s1 =3D OP(s1, (TD)s2); \ + } \ + *((TD *)vd + HD(0)) =3D s1; \ + env->vstart =3D 0; \ + CLEAR_FN(vd, 1, sizeof(TD), tot); \ +} + +/* vd[0] =3D sum(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredsum_vs_b, int8_t, int8_t, H1, H1, TH_ADD, clearb_th) +GEN_TH_RED(th_vredsum_vs_h, int16_t, int16_t, H2, H2, TH_ADD, clearh_th) +GEN_TH_RED(th_vredsum_vs_w, int32_t, int32_t, H4, H4, TH_ADD, clearl_th) +GEN_TH_RED(th_vredsum_vs_d, int64_t, int64_t, H8, H8, TH_ADD, clearq_th) + +/* vd[0] =3D maxu(vs1[0], vs2[*]) */ 
+GEN_TH_RED(th_vredmaxu_vs_b, uint8_t, uint8_t, H1, H1, TH_MAX, clearb_th) +GEN_TH_RED(th_vredmaxu_vs_h, uint16_t, uint16_t, H2, H2, TH_MAX, clearh_th) +GEN_TH_RED(th_vredmaxu_vs_w, uint32_t, uint32_t, H4, H4, TH_MAX, clearl_th) +GEN_TH_RED(th_vredmaxu_vs_d, uint64_t, uint64_t, H8, H8, TH_MAX, clearq_th) + +/* vd[0] =3D max(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredmax_vs_b, int8_t, int8_t, H1, H1, TH_MAX, clearb_th) +GEN_TH_RED(th_vredmax_vs_h, int16_t, int16_t, H2, H2, TH_MAX, clearh_th) +GEN_TH_RED(th_vredmax_vs_w, int32_t, int32_t, H4, H4, TH_MAX, clearl_th) +GEN_TH_RED(th_vredmax_vs_d, int64_t, int64_t, H8, H8, TH_MAX, clearq_th) + +/* vd[0] =3D minu(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredminu_vs_b, uint8_t, uint8_t, H1, H1, TH_MIN, clearb_th) +GEN_TH_RED(th_vredminu_vs_h, uint16_t, uint16_t, H2, H2, TH_MIN, clearh_th) +GEN_TH_RED(th_vredminu_vs_w, uint32_t, uint32_t, H4, H4, TH_MIN, clearl_th) +GEN_TH_RED(th_vredminu_vs_d, uint64_t, uint64_t, H8, H8, TH_MIN, clearq_th) + +/* vd[0] =3D min(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredmin_vs_b, int8_t, int8_t, H1, H1, TH_MIN, clearb_th) +GEN_TH_RED(th_vredmin_vs_h, int16_t, int16_t, H2, H2, TH_MIN, clearh_th) +GEN_TH_RED(th_vredmin_vs_w, int32_t, int32_t, H4, H4, TH_MIN, clearl_th) +GEN_TH_RED(th_vredmin_vs_d, int64_t, int64_t, H8, H8, TH_MIN, clearq_th) + +/* vd[0] =3D and(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredand_vs_b, int8_t, int8_t, H1, H1, TH_AND, clearb_th) +GEN_TH_RED(th_vredand_vs_h, int16_t, int16_t, H2, H2, TH_AND, clearh_th) +GEN_TH_RED(th_vredand_vs_w, int32_t, int32_t, H4, H4, TH_AND, clearl_th) +GEN_TH_RED(th_vredand_vs_d, int64_t, int64_t, H8, H8, TH_AND, clearq_th) + +/* vd[0] =3D or(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredor_vs_b, int8_t, int8_t, H1, H1, TH_OR, clearb_th) +GEN_TH_RED(th_vredor_vs_h, int16_t, int16_t, H2, H2, TH_OR, clearh_th) +GEN_TH_RED(th_vredor_vs_w, int32_t, int32_t, H4, H4, TH_OR, clearl_th) +GEN_TH_RED(th_vredor_vs_d, int64_t, int64_t, H8, H8, TH_OR, clearq_th) + +/* vd[0] =3D 
xor(vs1[0], vs2[*]) */ +GEN_TH_RED(th_vredxor_vs_b, int8_t, int8_t, H1, H1, TH_XOR, clearb_th) +GEN_TH_RED(th_vredxor_vs_h, int16_t, int16_t, H2, H2, TH_XOR, clearh_th) +GEN_TH_RED(th_vredxor_vs_w, int32_t, int32_t, H4, H4, TH_XOR, clearl_th) +GEN_TH_RED(th_vredxor_vs_d, int64_t, int64_t, H8, H8, TH_XOR, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914712; cv=none; d=zohomail.com; s=zohoarc; b=AB+sCYZcpsKAFwQ2sN5RrkFCjb6SXCE7XGp2DShw8QjzdwyrTe6HZk6RPyI4oa0hmPn11hKzwgQbEgfSUpTkt4oovblwbuBTjgd43ErrIQylIrJlPU5pNXi/faMAsU1Tqv37YsqcCuQ5OUNhB1vbZs2e67R3eu+9wXPGIlWdW30= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914712; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=rvFeLBFtyFCY3re8IsCpZb6TNRzQPebm4CWct6JZwjo=; b=kpQd/VvgU82hQklaOIIgnqpKB2cvgb48jSfGZ2JM3jMUR1mEkNy/bLo2k4WKdCScmH8/c05UBXz17OjcwitPY7R8WGzeiGOIMy9mZvAPgLd6/wiDYCuPBlZ0cOFdsH6uawn5HtWYA8S4RxjN4RRSwBWzpBYZzVtkTtGjsVYBAW0= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914712339304.7025471476811; Fri, 12 Apr 2024 02:38:32 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1rvDM9-0006Kz-43; Fri, 12 Apr 2024 05:38:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDM5-0006KG-FI; Fri, 12 Apr 2024 05:37:59 -0400 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDLv-0005Mq-RJ; Fri, 12 Apr 2024 05:37:53 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NsdXC_1712914660) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:37:41 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914662; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=rvFeLBFtyFCY3re8IsCpZb6TNRzQPebm4CWct6JZwjo=; b=JMEVtv/WeuhtWNlZX6Dens2NTZOKs0rZ68gvo/FJjucFYDC/dpcpUN0B9fNXYdu+7863qhKWqL7+2XdqwxrBy/hq61tOdBDCd6U+kn1xC3s+pdwetd6IiO0TQ7g3JFVAiqIMDeLuv8KYr8XLP246duD+N+BDVWgbEQcgwNMLWQE= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R101e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045176; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NsdXC_1712914660; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 51/65] target/riscv: Add widening integer reduction instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:21 +0800 Message-ID: <20240412073735.76413-52-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as 
permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.132; envelope-from=eric.huang@linux.alibaba.com; helo=out30-132.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914714115100003 Content-Type: text/plain; charset="utf-8"

These instructions behave the same as their RVV1.0 counterparts. The only
differences are the general ones between XTheadVector and RVV1.0 that
apply throughout this series, such as the mask register layout and the
tail-element policy.

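As a rough illustration of what a widening sum reduction computes, here is
a standalone sketch for SEW = 8. It is not the helper generated in this
series: the names sketch_mask_bit, sketch_wredsum_i8 and sketch_wredsumu_u8
are hypothetical, and the mask lookup assumes the XTheadVector layout with
the mask bit of element i at bit mlen * i.

```c
#include <stdint.h>

/* Assumed XTheadVector mask layout: element i uses bit mlen * i of v0. */
static inline int sketch_mask_bit(const uint64_t *v0, uint32_t mlen,
                                  uint32_t i)
{
    uint32_t bit = i * mlen;

    return (int)((v0[bit / 64] >> (bit % 64)) & 1);
}

/*
 * Signed widening sum: result = s1 + sum of the active vs2[i], where s1 and
 * the result are 16 bits wide and the source elements are 8 bits wide.
 * vm != 0 means the instruction is unmasked.
 */
static int16_t sketch_wredsum_i8(const uint64_t *v0, uint32_t mlen, int vm,
                                 int16_t s1, const int8_t *vs2, uint32_t vl)
{
    uint16_t acc = (uint16_t)s1;    /* unsigned math wraps like hardware */

    for (uint32_t i = 0; i < vl; i++) {
        if (!vm && !sketch_mask_bit(v0, mlen, i)) {
            continue;               /* inactive element: skip it */
        }
        /* sign-extend the narrow element before accumulating */
        acc = (uint16_t)(acc + (uint16_t)(int16_t)vs2[i]);
    }
    return (int16_t)acc;
}

/* Unsigned variant: the narrow elements are zero-extended instead. */
static uint16_t sketch_wredsumu_u8(const uint64_t *v0, uint32_t mlen, int vm,
                                   uint16_t s1, const uint8_t *vs2,
                                   uint32_t vl)
{
    uint16_t acc = s1;

    for (uint32_t i = 0; i < vl; i++) {
        if (!vm && !sketch_mask_bit(v0, mlen, i)) {
            continue;
        }
        acc = (uint16_t)(acc + vs2[i]);
    }
    return acc;
}
```

The point of the double-width accumulator is that a sum such as
100 + 100 + 100 - 50 fits in 16 bits even though it would overflow an
8-bit accumulator.
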
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                            |  7 +++++++
 target/riscv/insn_trans/trans_xtheadvector.c.inc |  6 ++++--
 target/riscv/xtheadvector_helper.c               | 11 +++++++++++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 84d2921945..2cd4a7401f 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2269,3 +2269,10 @@ DEF_HELPER_6(th_vredxor_vs_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vredxor_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vredxor_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vredxor_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(th_vwredsumu_vs_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwredsumu_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwredsumu_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwredsum_vs_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vwredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 1fd66353ed..8a1f0e1e74 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2393,14 +2393,16 @@ GEN_OPIVV_TRANS_TH(th_vredand_vs, reduction_check_th)
 GEN_OPIVV_TRANS_TH(th_vredor_vs, reduction_check_th)
 GEN_OPIVV_TRANS_TH(th_vredxor_vs, reduction_check_th)
=20
+/* Vector Widening Integer Reduction Instructions */
+GEN_OPIVV_WIDEN_TRANS_TH(th_vwredsum_vs, reduction_check_th)
+GEN_OPIVV_WIDEN_TRANS_TH(th_vwredsumu_vs, reduction_check_th)
+
 #define TH_TRANS_STUB(NAME) \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \
 { \
     return require_xtheadvector(s); \
 }
=20
-TH_TRANS_STUB(th_vwredsumu_vs)
-TH_TRANS_STUB(th_vwredsum_vs)
 TH_TRANS_STUB(th_vfredsum_vs)
 TH_TRANS_STUB(th_vfredmin_vs)
 TH_TRANS_STUB(th_vfredmax_vs)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index d041a81150..f802b2c5ac 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3399,3 +3399,14 @@ GEN_TH_RED(th_vredxor_vs_b, int8_t, int8_t, H1, H1, TH_XOR, clearb_th)
 GEN_TH_RED(th_vredxor_vs_h, int16_t, int16_t, H2, H2, TH_XOR, clearh_th)
 GEN_TH_RED(th_vredxor_vs_w, int32_t, int32_t, H4, H4, TH_XOR, clearl_th)
 GEN_TH_RED(th_vredxor_vs_d, int64_t, int64_t, H8, H8, TH_XOR, clearq_th)
+
+/* Vector Widening Integer Reduction Instructions */
+/* signed sum reduction into double-width accumulator */
+GEN_TH_RED(th_vwredsum_vs_b, int16_t, int8_t, H2, H1, TH_ADD, clearh_th)
+GEN_TH_RED(th_vwredsum_vs_h, int32_t, int16_t, H4, H2, TH_ADD, clearl_th)
+GEN_TH_RED(th_vwredsum_vs_w, int64_t, int32_t, H8, H4, TH_ADD, clearq_th)
+
+/* Unsigned sum reduction into double-width accumulator */
+GEN_TH_RED(th_vwredsumu_vs_b, uint16_t, uint8_t, H2, H1, TH_ADD, clearh_th)
+GEN_TH_RED(th_vwredsumu_vs_h, uint32_t, uint16_t, H4, H2, TH_ADD, clearl_th)
+GEN_TH_RED(th_vwredsumu_vs_w, uint64_t, uint32_t, H8, H4, TH_ADD, clearq_th)
--=20
2.44.0

From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914843; cv=none; d=zohomail.com; s=zohoarc; b=VPIRphf2heEMF6kIGdpcNUe3pvAPsgKIKxUXkzad3pStVy2FNJlb09lqDbzm5gtiOgYYhpNmTlwpLige4vF7/UQhYM3Osykmf6f5c1+Y6GB4UeRon5jmz/y1cUU2OozgJ73tQ2kBTTvVqCqYCKV2rp3thPJaWqduyGEz7ocoH6I= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914843;
h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=JwaMqiHOkzXwc6HLXGIISXAQVZZqsuqU7z1zj4QhC1U=; b=AwlU6RCWi+SWnoGrIMi/BFq9GF7+vOX98yK7STjA1ZSXnINp3xMjXjR0d8k+pj5STgV3HylAKS6pHqosYNo+37r2DKXrut16y38pBWGoLi2m7zrufcE1j1JHSKf59zNK9bP7Vt5A88wiC0dM5el/mclEwqvxEYC1Sq0UgMz7b1s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712914843832903.16941074808; Fri, 12 Apr 2024 02:40:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDO1-0007HI-JE; Fri, 12 Apr 2024 05:39:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDNz-0007GX-AC; Fri, 12 Apr 2024 05:39:55 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDNw-0005Wb-6E; Fri, 12 Apr 2024 05:39:55 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NtKZ-_1712914781) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:39:42 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712914783; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=JwaMqiHOkzXwc6HLXGIISXAQVZZqsuqU7z1zj4QhC1U=; 
b=HyYQ4XQKpdmB3lAbgAGZxmfswsraichiQ5UH6SAbD0/ffoVHilzoWwpodN8h4XLDio94AgRgz88O4b6XAmQu0suJPW2xvia3c//d8CyrtHze3TqSk1xVG2nLqMLK4WrYl5eZ26lq2j3eNd1l27JZSyHLTrxumfWzDx8+yBjiU8E= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R101e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046059; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NtKZ-_1712914781; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 52/65] target/riscv: Add single-width floating-point reduction instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:22 +0800 Message-ID: <20240412073735.76413-53-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712914844523100001 Content-Type: text/plain; charset="utf-8"

These instructions behave the same as their RVV1.0 counterparts. The only
differences are the general ones between XTheadVector and RVV1.0 that
apply throughout this series, such as the mask register layout and the
tail-element policy.

Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         | 10 ++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc |  8 +--
 target/riscv/xtheadvector_helper.c            | 49 +++++++++++++++++++
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 2cd4a7401f..24bb8479a4 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2276,3 +2276,13 @@ DEF_HELPER_6(th_vwredsumu_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vwredsum_vs_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vwredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vwredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(th_vfredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredsum_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmax_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmax_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmax_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmin_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmin_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfredmin_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 8a1f0e1e74..f77d76dc5e 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2397,15 +2397,17 @@ GEN_OPIVV_TRANS_TH(th_vredxor_vs, reduction_check_th)
GEN_OPIVV_WIDEN_TRANS_TH(th_vwredsum_vs, reduction_check_th) GEN_OPIVV_WIDEN_TRANS_TH(th_vwredsumu_vs, reduction_check_th) =20 +/* Vector Single-Width Floating-Point Reduction Instructions */ +GEN_OPFVV_TRANS_TH(th_vfredsum_vs, reduction_check_th) +GEN_OPFVV_TRANS_TH(th_vfredmax_vs, reduction_check_th) +GEN_OPFVV_TRANS_TH(th_vfredmin_vs, reduction_check_th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfredsum_vs) -TH_TRANS_STUB(th_vfredmin_vs) -TH_TRANS_STUB(th_vfredmax_vs) TH_TRANS_STUB(th_vfwredsum_vs) TH_TRANS_STUB(th_vmand_mm) TH_TRANS_STUB(th_vmnand_mm) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index f802b2c5ac..2a241aed65 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3410,3 +3410,52 @@ GEN_TH_RED(th_vwredsum_vs_w, int64_t, int32_t, H8, H= 4, TH_ADD, clearq_th) GEN_TH_RED(th_vwredsumu_vs_b, uint16_t, uint8_t, H2, H1, TH_ADD, clearh_th) GEN_TH_RED(th_vwredsumu_vs_h, uint32_t, uint16_t, H4, H2, TH_ADD, clearl_t= h) GEN_TH_RED(th_vwredsumu_vs_w, uint64_t, uint32_t, H8, H4, TH_ADD, clearq_t= h) + +/* Vector Single-Width Floating-Point Reduction Instructions */ +#define GEN_TH_FRED(NAME, TD, TS2, HD, HS2, OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); \ + uint32_t vm =3D th_vm(desc); \ + uint32_t vl =3D env->vl; \ + uint32_t i; \ + uint32_t tot =3D env_archcpu(env)->cfg.vlenb; \ + TD s1 =3D *((TD *)vs1 + HD(0)); \ + \ + for (i =3D env->vstart; i < vl; i++) { \ + TS2 s2 =3D *((TS2 *)vs2 + HS2(i)); \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + s1 =3D OP(s1, (TD)s2, &env->fp_status); \ + } \ + *((TD *)vd + HD(0)) =3D s1; \ + env->vstart =3D 0; \ + CLEAR_FN(vd, 1, sizeof(TD), tot); \ +} + +/* Unordered sum */ 
+GEN_TH_FRED(th_vfredsum_vs_h, uint16_t, uint16_t, H2, H2, + float16_add, clearh_th) +GEN_TH_FRED(th_vfredsum_vs_w, uint32_t, uint32_t, H4, H4, + float32_add, clearl_th) +GEN_TH_FRED(th_vfredsum_vs_d, uint64_t, uint64_t, H8, H8, + float64_add, clearq_th) + +/* Maximum value */ +GEN_TH_FRED(th_vfredmax_vs_h, uint16_t, uint16_t, H2, H2, + float16_maxnum, clearh_th) +GEN_TH_FRED(th_vfredmax_vs_w, uint32_t, uint32_t, H4, H4, + float32_maxnum, clearl_th) +GEN_TH_FRED(th_vfredmax_vs_d, uint64_t, uint64_t, H8, H8, + float64_maxnum, clearq_th) + +/* Minimum value */ +GEN_TH_FRED(th_vfredmin_vs_h, uint16_t, uint16_t, H2, H2, + float16_minnum, clearh_th) +GEN_TH_FRED(th_vfredmin_vs_w, uint32_t, uint32_t, H4, H4, + float32_minnum, clearl_th) +GEN_TH_FRED(th_vfredmin_vs_d, uint64_t, uint64_t, H8, H8, + float64_minnum, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712914935; cv=none; d=zohomail.com; s=zohoarc; b=j6bmSPOwwrOopH3HSNZMwGr8XbmfdYRuA+RxW+GH/hD17RAF/ws7VRX3qhCzWJsaiADnC29bjoPqqeLfQpzAENbwzD5CWUzHkN2eOe+jbXR/ytcOthI48nhmXFodF5LyYFnuLWNu2SFwP/npt2ntcke1oWvhjmHbTcg+E2OXHbQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712914935; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=DvS7ZVmPsoC/yhBHMBdWOLfrIecMJsQ4hrpstrB7mYg=; b=naoi5hVM3jaZPPiJ7C20S+f2MpS6Mp56VNdcNIotTun/3tfeI/fQRvr6rymBc3+REVdmuJGgtDBc+Njqh0VKHWhBvrmjhAyCvoxVRZGVn3+HcwEBIF5GPx62JDUXB+Cd3LZ/qTLMG/YscPRiBxxp659RtfpNNJTBecxTF9axqVI= 
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 53/65] target/riscv: Add widening floating-point reduction
 instructions for XTheadVector
Date: Fri, 12 Apr 2024 15:37:23 +0800
Message-ID: <20240412073735.76413-54-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

These instructions behave the same as their RVV1.0 counterparts. The only
differences are the general ones between XTheadVector and RVV1.0.
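The widening behaviour can be illustrated with a small stand-alone sketch. It is not the QEMU helper: it assumes host float/double arithmetic in place of softfloat, a plain bool array in place of the v0 mask register, and the hypothetical name sketch_fwredsum. Each active SEW-wide element is promoted to 2*SEW before it is added to the 2*SEW-wide accumulator:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch of a widening floating-point sum reduction (hypothetical name,
 * host arithmetic instead of softfloat).
 * acc:    2*SEW-wide scalar, standing in for vs1[0]
 * vs2:    SEW-wide source elements
 * active: per-element mask flags, or NULL for an unmasked operation
 * Elements in [vstart, vl) are visited; inactive ones are skipped.
 */
static double sketch_fwredsum(double acc, const float *vs2,
                              const bool *active, size_t vstart, size_t vl)
{
    assert(vs2 != NULL);
    for (size_t i = vstart; i < vl; i++) {
        if (active != NULL && !active[i]) {
            continue;               /* masked-off element */
        }
        acc += (double)vs2[i];      /* promote to 2*SEW, then accumulate */
    }
    return acc;
}
```

Starting from an accumulator of 16777216.0 and adding two elements of 1.0f, the double accumulator ends at 16777218.0. A single-width float accumulator would stay at 16777216.0f, since each addition rounds back to the same value.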
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                            |  3 +++
 target/riscv/insn_trans/trans_xtheadvector.c.inc |  4 +++-
 target/riscv/xtheadvector_helper.c               | 16 ++++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 24bb8479a4..c39ee9a8e8 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2286,3 +2286,6 @@ DEF_HELPER_6(th_vfredmax_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfredmin_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfredmin_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfredmin_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(th_vfwredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vfwredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index f77d76dc5e..b71875700b 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2402,13 +2402,15 @@ GEN_OPFVV_TRANS_TH(th_vfredsum_vs, reduction_check_th)
 GEN_OPFVV_TRANS_TH(th_vfredmax_vs, reduction_check_th)
 GEN_OPFVV_TRANS_TH(th_vfredmin_vs, reduction_check_th)
 
+/* Vector Widening Floating-Point Reduction Instructions */
+GEN_OPFVV_WIDEN_TRANS_TH(th_vfwredsum_vs, reduction_check_th)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vfwredsum_vs)
 TH_TRANS_STUB(th_vmand_mm)
 TH_TRANS_STUB(th_vmnand_mm)
 TH_TRANS_STUB(th_vmandnot_mm)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 2a241aed65..8953207630 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3459,3 +3459,19 @@ GEN_TH_FRED(th_vfredmin_vs_w, uint32_t, uint32_t, H4, H4,
             float32_minnum, clearl_th)
 GEN_TH_FRED(th_vfredmin_vs_d, uint64_t, uint64_t, H8, H8,
             float64_minnum, clearq_th)
+
+/* Vector Widening Floating-Point Add functions */
+static uint32_t fwadd16(uint32_t a, uint16_t b, float_status *s)
+{
+    return float32_add(a, float16_to_float32(b, true, s), s);
+}
+
+static uint64_t fwadd32(uint64_t a, uint32_t b, float_status *s)
+{
+    return float64_add(a, float32_to_float64(b, s), s);
+}
+
+/* Vector Widening Floating-Point Reduction Instructions */
+/* Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) */
+GEN_TH_FRED(th_vfwredsum_vs_h, uint32_t, uint16_t, H4, H2, fwadd16, clearl_th)
+GEN_TH_FRED(th_vfwredsum_vs_w, uint64_t, uint32_t, H8, H4, fwadd32, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 54/65] target/riscv: Add mask-register logical instructions
 for XTheadVector
Date: Fri, 12 Apr 2024 15:37:24 +0800
Message-ID: <20240412073735.76413-55-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

This patch adds the mask-register logical instructions and uses them to
show how XTheadVector mask instructions are implemented.

The XTheadVector mask-register logical instructions differ from RVV1.0 in
the following point:
1. Different mask register layout. For the mask bit of element i,
   XTheadVector locates it in bit[mlen * i], while RVV1.0 locates it in
   bit[i].
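The layout difference can be sketched in a few lines of stand-alone C, reading the description above as "the mask bit of element i is bit i * mlen". The definitions of th_elem_mask() and th_set_elem_mask() are not part of this patch, so the functions below are hypothetical stand-ins (sketch_ prefix) built on that assumption, including the assumption that a write covers the element's whole mlen-bit field:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed XTheadVector-style layout: element i owns an mlen-bit field and
 * its mask bit is the lowest bit of that field, i.e. bit i * mlen. */
static bool sketch_th_mask_get(const uint64_t *v, unsigned mlen, unsigned i)
{
    unsigned bit = i * mlen;
    return (v[bit / 64] >> (bit % 64)) & 1;
}

/* Write the whole mlen-bit field of element i: value in the lowest bit,
 * zero elsewhere. mlen is assumed to be a power of two in 1..64, so a
 * field never straddles two 64-bit words. */
static void sketch_th_mask_set(uint64_t *v, unsigned mlen, unsigned i,
                               bool value)
{
    assert(mlen >= 1 && mlen <= 64 && (mlen & (mlen - 1)) == 0);

    unsigned bit = i * mlen;
    uint64_t field = (mlen == 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << mlen) - 1);

    v[bit / 64] &= ~(field << (bit % 64));
    v[bit / 64] |= (uint64_t)value << (bit % 64);
}

/* RVV1.0-style layout: the mask bit of element i is simply bit i. */
static bool sketch_rvv10_mask_get(const uint64_t *v, unsigned i)
{
    return (v[i / 64] >> (i % 64)) & 1;
}
```

With mlen = 8, setting element 3 sets bit 24 of the register. A reader using the RVV1.0 layout would see that same bit as element 24, which is why the mask helpers cannot be shared between the two extensions.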
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  9 ++++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 44 +++++++++++++++----
 target/riscv/xtheadvector_helper.c            | 42 ++++++++++++++++++
 3 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c39ee9a8e8..7d992ac3b1 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2289,3 +2289,12 @@ DEF_HELPER_6(th_vfredmin_vs_d, void, ptr, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_6(th_vfwredsum_vs_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vfwredsum_vs_w, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(th_vmand_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmnand_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmandnot_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmxor_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmor_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmornot_mm, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vmxnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index b71875700b..e9fa7f1ae2 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2405,20 +2405,48 @@ GEN_OPFVV_TRANS_TH(th_vfredmin_vs, reduction_check_th)
 /* Vector Widening Floating-Point Reduction Instructions */
 GEN_OPFVV_WIDEN_TRANS_TH(th_vfwredsum_vs, reduction_check_th)
 
+/*
+ * Vector Mask Operations
+ */
+
+/* Vector Mask-Register Logical Instructions */
+#define GEN_MM_TRANS_TH(NAME)                                      \
+static bool trans_##NAME(DisasContext *s, arg_r *a)                \
+{                                                                  \
+    if (require_xtheadvector(s) &&                                 \
+        vext_check_isa_ill(s)) {                                   \
+        uint32_t data = 0;                                         \
+        gen_helper_gvec_4_ptr *fn = gen_helper_##NAME;             \
+                                                                   \
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);          \
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);          \
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd),                     \
+                           vreg_ofs(s, 0),                         \
+                           vreg_ofs(s, a->rs1),                    \
+                           vreg_ofs(s, a->rs2), tcg_env,           \
+                           s->cfg_ptr->vlenb,                      \
+                           s->cfg_ptr->vlenb, data, fn);           \
+        finalize_rvv_inst(s);                                      \
+        return true;                                               \
+    }                                                              \
+    return false;                                                  \
+}
+
+GEN_MM_TRANS_TH(th_vmand_mm)
+GEN_MM_TRANS_TH(th_vmnand_mm)
+GEN_MM_TRANS_TH(th_vmandnot_mm)
+GEN_MM_TRANS_TH(th_vmxor_mm)
+GEN_MM_TRANS_TH(th_vmor_mm)
+GEN_MM_TRANS_TH(th_vmnor_mm)
+GEN_MM_TRANS_TH(th_vmornot_mm)
+GEN_MM_TRANS_TH(th_vmxnor_mm)
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vmand_mm)
-TH_TRANS_STUB(th_vmnand_mm)
-TH_TRANS_STUB(th_vmandnot_mm)
-TH_TRANS_STUB(th_vmxor_mm)
-TH_TRANS_STUB(th_vmor_mm)
-TH_TRANS_STUB(th_vmnor_mm)
-TH_TRANS_STUB(th_vmornot_mm)
-TH_TRANS_STUB(th_vmxnor_mm)
 TH_TRANS_STUB(th_vmpopc_m)
 TH_TRANS_STUB(th_vmfirst_m)
 TH_TRANS_STUB(th_vmsbf_m)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 8953207630..b3f445eeb5 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3475,3 +3475,45 @@ static uint64_t fwadd32(uint64_t a, uint32_t b, float_status *s)
 /* Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) */
 GEN_TH_FRED(th_vfwredsum_vs_h, uint32_t, uint16_t, H4, H2, fwadd16, clearl_th)
 GEN_TH_FRED(th_vfwredsum_vs_w, uint64_t, uint32_t, H8, H4, fwadd32, clearq_th)
+
+/*
+ * Vector Mask Operations
+ */
+/* Vector Mask-Register Logical Instructions */
+#define GEN_TH_MASK_VV(NAME, OP)                                  \
+void HELPER(NAME)(void *vd, void *v0, void *vs1,                  \
+                  void *vs2, CPURISCVState *env,                  \
+                  uint32_t desc)                                  \
+{                                                                 \
+    uint32_t mlen = th_mlen(desc);                                \
+    uint32_t vlmax = (env_archcpu(env)->cfg.vlenb << 3) / mlen;   \
+    uint32_t vl = env->vl;                                        \
+    uint32_t i;                                                   \
+    int a, b;                                                     \
+                                                                  \
+    VSTART_CHECK_EARLY_EXIT(env);                                 \
+    for (i = env->vstart; i < vl; i++) {                          \
+        a = th_elem_mask(vs1, mlen, i);                           \
+        b = th_elem_mask(vs2, mlen, i);                           \
+        th_set_elem_mask(vd, mlen, i, OP(b, a));                  \
+    }                                                             \
+    env->vstart = 0;                                              \
+    for (; i < vlmax; i++) {                                      \
+        th_set_elem_mask(vd, mlen, i, 0);                         \
+    }                                                             \
+}
+
+#define TH_NAND(N, M)    (!(N & M))
+#define TH_ANDNOT(N, M)  (N & !M)
+#define TH_NOR(N, M)     (!(N | M))
+#define TH_ORNOT(N, M)   (N | !M)
+#define TH_XNOR(N, M)    (!(N ^ M))
+
+GEN_TH_MASK_VV(th_vmand_mm, TH_AND)
+GEN_TH_MASK_VV(th_vmnand_mm, TH_NAND)
+GEN_TH_MASK_VV(th_vmandnot_mm, TH_ANDNOT)
+GEN_TH_MASK_VV(th_vmxor_mm, TH_XOR)
+GEN_TH_MASK_VV(th_vmor_mm, TH_OR)
+GEN_TH_MASK_VV(th_vmnor_mm, TH_NOR)
+GEN_TH_MASK_VV(th_vmornot_mm, TH_ORNOT)
+GEN_TH_MASK_VV(th_vmxnor_mm, TH_XNOR)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 55/65] target/riscv: Add vector mask population count vmpopc
 for XTheadVector
Date: Fri, 12 Apr 2024 15:37:25 +0800
Message-ID: <20240412073735.76413-56-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

This instruction behaves the same as its RVV1.0 counterpart. The only
differences are the general ones between XTheadVector and RVV1.0.
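As a rough illustration of the counting rule, the sketch below counts the elements that are both active and set in the source mask. It is an assumption-laden stand-in rather than the helper itself: plain bool arrays replace the mask registers (so the mask bit layout is out of scope), and sketch_vmpopc is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch of a mask population count (hypothetical name).
 * vs2: source mask, one flag per element
 * v0:  governing mask, one flag per element (ignored when vm is true)
 * vm:  true for an unmasked operation
 * Counts the elements in [vstart, vl) that are active and set in vs2.
 */
static size_t sketch_vmpopc(const bool *vs2, const bool *v0, bool vm,
                            size_t vstart, size_t vl)
{
    size_t cnt = 0;

    assert(vs2 != NULL && (vm || v0 != NULL));
    for (size_t i = vstart; i < vl; i++) {
        if ((vm || v0[i]) && vs2[i]) {
            cnt++;
        }
    }
    return cnt;
}
```

Elements at or beyond vl never contribute, so a vl of 0 always yields a count of 0.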
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  2 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 31 ++++++++++++++++++-
 target/riscv/xtheadvector_helper.c            | 21 +++++++++++++
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 7d992ac3b1..6ddecbbe65 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2298,3 +2298,5 @@ DEF_HELPER_6(th_vmor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vmnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vmornot_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vmxnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_4(th_vmpopc_m, tl, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index e9fa7f1ae2..f8e8b321e4 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2441,13 +2441,42 @@ GEN_MM_TRANS_TH(th_vmnor_mm)
 GEN_MM_TRANS_TH(th_vmornot_mm)
 GEN_MM_TRANS_TH(th_vmxnor_mm)
 
+/* Vector mask population count vmpopc */
+static bool trans_th_vmpopc_m(DisasContext *s, arg_rmr *a)
+{
+    if (require_xtheadvector(s) &&
+        vext_check_isa_ill(s) &&
+        s->vstart_eq_zero) {
+        TCGv_ptr src2, mask;
+        TCGv dst;
+        TCGv_i32 desc;
+        uint32_t data = 0;
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm);
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);
+
+        mask = tcg_temp_new_ptr();
+        src2 = tcg_temp_new_ptr();
+        dst = dest_gpr(s, a->rd);
+        desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+                                          s->cfg_ptr->vlenb, data));
+
+        tcg_gen_addi_ptr(src2, tcg_env, vreg_ofs(s, a->rs2));
+        tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0));
+
+        gen_helper_th_vmpopc_m(dst, mask, src2, tcg_env, desc);
+        gen_set_gpr(s, a->rd, dst);
+        return true;
+    }
+    return false;
+}
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vmpopc_m)
 TH_TRANS_STUB(th_vmfirst_m)
 TH_TRANS_STUB(th_vmsbf_m)
 TH_TRANS_STUB(th_vmsif_m)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index b3f445eeb5..ba1ab0435d 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3517,3 +3517,24 @@ GEN_TH_MASK_VV(th_vmor_mm, TH_OR)
 GEN_TH_MASK_VV(th_vmnor_mm, TH_NOR)
 GEN_TH_MASK_VV(th_vmornot_mm, TH_ORNOT)
 GEN_TH_MASK_VV(th_vmxnor_mm, TH_XNOR)
+
+/* Vector mask population count vmpopc */
+target_ulong HELPER(th_vmpopc_m)(void *v0, void *vs2, CPURISCVState *env,
+                                 uint32_t desc)
+{
+    target_ulong cnt = 0;
+    uint32_t mlen = th_mlen(desc);
+    uint32_t vm = th_vm(desc);
+    uint32_t vl = env->vl;
+    int i;
+
+    for (i = env->vstart; i < vl; i++) {
+        if (vm || th_elem_mask(v0, mlen, i)) {
+            if (th_elem_mask(vs2, mlen, i)) {
+                cnt++;
+            }
+        }
+    }
+    env->vstart = 0;
+    return cnt;
+}
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com,
 dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com,
 alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 56/65] target/riscv: Add th.vmfirst.m for XTheadVector
Date: Fri, 12 Apr 2024 15:37:26 +0800
Message-ID: <20240412073735.76413-57-eric.huang@linux.alibaba.com>
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>

This instruction behaves the same as its RVV1.0 counterpart. The only
differences are the general ones between XTheadVector and RVV1.0.
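The find-first-set rule can be sketched as follows. This is a stand-alone illustration under stated assumptions, not the helper itself: bool arrays replace the mask registers, the result type is a 64-bit unsigned integer standing in for target_ulong on a 64-bit target, and sketch_vmfirst is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of find-first-set over a mask (hypothetical name).
 * Returns the index of the first element in [vstart, vl) that is active
 * and set in vs2, or all-ones (-1 as a signed value) if there is none.
 * v0 is ignored when vm is true (unmasked operation).
 */
static uint64_t sketch_vmfirst(const bool *vs2, const bool *v0, bool vm,
                               size_t vstart, size_t vl)
{
    assert(vs2 != NULL && (vm || v0 != NULL));
    for (size_t i = vstart; i < vl; i++) {
        if ((vm || v0[i]) && vs2[i]) {
            return i;               /* first active set bit */
        }
    }
    return UINT64_MAX;              /* no active set bit found */
}
```

A set bit that is masked off by v0 is skipped, so the masked and unmasked forms can return different indices for the same source mask.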
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  2 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 31 ++++++++++++++++++-
 target/riscv/xtheadvector_helper.c            | 20 ++++++++++++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6ddecbbe65..2379a3431d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2300,3 +2300,5 @@ DEF_HELPER_6(th_vmornot_mm, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(th_vmxnor_mm, void, ptr, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_4(th_vmpopc_m, tl, ptr, ptr, env, i32)
+
+DEF_HELPER_4(th_vmfirst_m, tl, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index f8e8b321e4..45554c38fb 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2471,13 +2471,42 @@ static bool trans_th_vmpopc_m(DisasContext *s, arg_rmr *a)
     return false;
 }
 
+/* vmfirst find-first-set mask bit */
+static bool trans_th_vmfirst_m(DisasContext *s, arg_rmr *a)
+{
+    if (require_xtheadvector(s) &&
+        vext_check_isa_ill(s)) {
+        TCGv_ptr src2, mask;
+        TCGv dst;
+        TCGv_i32 desc;
+        uint32_t data = 0;
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);
+        data = FIELD_DP32(data, VDATA_TH, VM, a->vm);
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);
+
+        mask = tcg_temp_new_ptr();
+        src2 = tcg_temp_new_ptr();
+        dst = dest_gpr(s, a->rd);
+        desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+                                          s->cfg_ptr->vlenb, data));
+
+        tcg_gen_addi_ptr(src2, tcg_env, vreg_ofs(s, a->rs2));
+        tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0));
+
+        gen_helper_th_vmfirst_m(dst, mask, src2, tcg_env, desc);
+        gen_set_gpr(s, a->rd, dst);
+
+        return true;
+    }
+    return false;
+}
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vmfirst_m)
 TH_TRANS_STUB(th_vmsbf_m)
 TH_TRANS_STUB(th_vmsif_m)
 TH_TRANS_STUB(th_vmsof_m)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index ba1ab0435d..1860e47f4f 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3538,3 +3538,23 @@ target_ulong HELPER(th_vmpopc_m)(void *v0, void *vs2, CPURISCVState *env,
     env->vstart = 0;
     return cnt;
 }
+
+/* vmfirst find-first-set mask bit*/
+target_ulong HELPER(th_vmfirst_m)(void *v0, void *vs2, CPURISCVState *env,
+                                  uint32_t desc)
+{
+    uint32_t mlen = th_mlen(desc);
+    uint32_t vm = th_vm(desc);
+    uint32_t vl = env->vl;
+    int i;
+
+    for (i = env->vstart; i < vl; i++) {
+        if (vm || th_elem_mask(v0, mlen, i)) {
+            if (th_elem_mask(vs2, mlen, i)) {
+                return i;
+            }
+        }
+    }
+    env->vstart = 0;
+    return -1LL;
+}
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712915424424100.40758385997742; Fri, 12 Apr 2024 02:50:24 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDXq-0003yI-SI; Fri, 12 Apr 2024 05:50:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDXh-0003vm-Vd; Fri, 12 Apr 2024 05:50:00 -0400 Received: from out30-98.freemail.mail.aliyun.com ([115.124.30.98]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDXf-0006xW-RV; Fri, 12 Apr 2024 05:49:57 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nshnm_1712915388) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:49:49 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915390; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=imXaAcxAeJEThFU0GM/qjKnSob7ed7kgIB3usibiBv8=; b=hwZKwu264V2ta7yl8JLorBB7tEh7KMKhlHRUhX459hBX9HowmyA1FyUYG0QM+RmW+mJq/UCAuCCpvSJEB0QGYOpsKJvvXX2WpWE1MzPQzKsB+7FSyW6RxbrOnNc2pvR30G0slVnRDfAD5bOvIGnGB05mkPSENOp4DT6nA2DY6Ms= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R461e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046050; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nshnm_1712915388; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, 
palmer@dabbelt.com, Huang Tao Subject: [PATCH 57/65] target/riscv: Add set-X-first mask bit instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:27 +0800 Message-ID: <20240412073735.76413-58-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.98; envelope-from=eric.huang@linux.alibaba.com; helo=out30-98.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712915426006100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0.
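As a reading aid, the three set-X-first variants (set-before-first, set-including-first, set-only-first) can be modelled with one loop and a variant selector. The sketch below assumes the unmasked case, one byte per mask element and vstart equal to zero; the names model_vmset and enum model_set_mask are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector; the values mirror the three instruction variants. */
enum model_set_mask {
    MODEL_BEFORE_FIRST,     /* th.vmsbf.m */
    MODEL_INCLUDE_FIRST,    /* th.vmsif.m */
    MODEL_ONLY_FIRST,       /* th.vmsof.m */
};

/*
 * Unmasked reference model: scan vs2 for its first set element and fill
 * vd according to the chosen variant.
 */
static void model_vmset(uint8_t *vd, const uint8_t *vs2, size_t vl,
                        enum model_set_mask type)
{
    bool seen_first = false;

    assert(vd != NULL && vs2 != NULL);
    for (size_t i = 0; i < vl; i++) {
        if (seen_first) {
            /* Every element after the first set bit is cleared. */
            vd[i] = 0;
        } else if (vs2[i]) {
            /* The first set bit itself: cleared only by "before first". */
            seen_first = true;
            vd[i] = (type != MODEL_BEFORE_FIRST);
        } else {
            /* Elements before the first set bit: cleared only by "only first". */
            vd[i] = (type != MODEL_ONLY_FIRST);
        }
    }
}
```

For a source mask of 0 0 1 0 1 (element 0 first), this yields 1 1 0 0 0 for set-before-first, 1 1 1 0 0 for set-including-first and 0 0 1 0 0 for set-only-first.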
Signed-off-by: Huang Tao --- target/riscv/helper.h | 4 ++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 36 ++++++++++- target/riscv/xtheadvector_helper.c | 64 +++++++++++++++++++ 3 files changed, 101 insertions(+), 3 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 2379a3431d..90a1ff2601 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2302,3 +2302,7 @@ DEF_HELPER_6(th_vmxnor_mm, void, ptr, ptr, ptr, ptr, = env, i32) DEF_HELPER_4(th_vmpopc_m, tl, ptr, ptr, env, i32) =20 DEF_HELPER_4(th_vmfirst_m, tl, ptr, ptr, env, i32) + +DEF_HELPER_5(th_vmsbf_m, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vmsif_m, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_vmsof_m, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 45554c38fb..d41c691c31 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2500,6 +2500,39 @@ static bool trans_th_vmfirst_m(DisasContext *s, arg_= rmr *a) } return false; } +/* + * th.vmsbf.m set-before-first mask bit + * th.vmsif.m set-including-first mask bit + * th.vmsof.m set-only-first mask bit + */ +#define GEN_M_TRANS_TH(NAME) \ +static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ +{ \ + if (require_xtheadvector(s) && \ + vext_check_isa_ill(s) && \ + (a->rd !=3D a->rs2) && \ + s->vstart_eq_zero) { \ + uint32_t data =3D 0; \ + gen_helper_gvec_3_ptr *fn =3D gen_helper_##NAME; \ + \ + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); \ + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); \ + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), \ + vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), \ + tcg_env, s->cfg_ptr->vlenb, \ + s->cfg_ptr->vlenb, \ + data, fn); \ + finalize_rvv_inst(s); \ + return true; \ + } \ + return false; \ +} + +GEN_M_TRANS_TH(th_vmsbf_m) +GEN_M_TRANS_TH(th_vmsif_m) 
+GEN_M_TRANS_TH(th_vmsof_m) =20 #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ @@ -2507,9 +2540,6 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME = *a) \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vmsbf_m) -TH_TRANS_STUB(th_vmsif_m) -TH_TRANS_STUB(th_vmsof_m) TH_TRANS_STUB(th_viota_m) TH_TRANS_STUB(th_vid_v) TH_TRANS_STUB(th_vext_x_v) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 1860e47f4f..d4f1665bf3 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3558,3 +3558,67 @@ target_ulong HELPER(th_vmfirst_m)(void *v0, void *vs= 2, CPURISCVState *env, env->vstart =3D 0; return -1LL; } + +enum set_mask_type_th { + ONLY_FIRST =3D 1, + INCLUDE_FIRST, + BEFORE_FIRST, +}; + +static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env, + uint32_t desc, enum set_mask_type_th type) +{ + uint32_t mlen =3D th_mlen(desc); + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; + uint32_t vm =3D th_vm(desc); + uint32_t vl =3D env->vl; + int i; + bool first_mask_bit =3D false; + + for (i =3D env->vstart; i < vl; i++) { + if (!vm && !th_elem_mask(v0, mlen, i)) { + continue; + } + /* write a zero to all following active elements */ + if (first_mask_bit) { + th_set_elem_mask(vd, mlen, i, 0); + continue; + } + if (th_elem_mask(vs2, mlen, i)) { + first_mask_bit =3D true; + if (type =3D=3D BEFORE_FIRST) { + th_set_elem_mask(vd, mlen, i, 0); + } else { + th_set_elem_mask(vd, mlen, i, 1); + } + } else { + if (type =3D=3D ONLY_FIRST) { + th_set_elem_mask(vd, mlen, i, 0); + } else { + th_set_elem_mask(vd, mlen, i, 1); + } + } + } + env->vstart =3D 0; + for (; i < vlmax; i++) { + th_set_elem_mask(vd, mlen, i, 0); + } +} + +void HELPER(th_vmsbf_m)(void *vd, void *v0, void *vs2, CPURISCVState *env, + uint32_t desc) +{ + vmsetm(vd, v0, vs2, env, desc, BEFORE_FIRST); +} + +void HELPER(th_vmsif_m)(void *vd, void *v0, void *vs2, 
CPURISCVState *env, + uint32_t desc) +{ + vmsetm(vd, v0, vs2, env, desc, INCLUDE_FIRST); +} + +void HELPER(th_vmsof_m)(void *vd, void *v0, void *vs2, CPURISCVState *env, + uint32_t desc) +{ + vmsetm(vd, v0, vs2, env, desc, ONLY_FIRST); +} --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712915549; cv=none; d=zohomail.com; s=zohoarc; b=NneLZRKVzJP5qtj41O7uDf1Kj1BXDku6+0cqxrAHHQuLZ3EA5ie8dDXIUWmGKYZX+yDAU7uNPlrp/1mB9qOG8+cSH7758CQ7bJ5xHeZbbbxD+Xn1kr9ToWlYIsotQ9henrv5/K2Xxxnd1rAdktsDJxTC8iITwlK9pjUDxtoz6v8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712915549; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=O9JEdbM/55FFdsPIYnErCakBqaPzk+l9txOOWl3u26E=; b=YHcqo4wOo/U8+QnDA5fVayZWqXVU1Y1XSAtIWiovyWHnOcukSD95lzhRn3UNFQVSmk9OS0+XX1KHANix9mfUjPMdOak2xziQoryEgNSijBhetmagqL1Q82LPMlT4HkHZnUazq1yGmAfrc/UPXkUqhGQDZ4X+xnIt2+UV7fkuyfA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712915549284518.6359796414649; Fri, 12 Apr 2024 02:52:29 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDZm-0005W6-AO; Fri, 12 Apr 2024 05:52:06 -0400 
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDZj-0005SA-QH; Fri, 12 Apr 2024 05:52:03 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDZg-0007jk-CJ; Fri, 12 Apr 2024 05:52:03 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4O-Ryn_1712915510) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:51:50 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915511; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=O9JEdbM/55FFdsPIYnErCakBqaPzk+l9txOOWl3u26E=; b=QHSrE6e2G7xmW5yPZHmUAkR247KC3itp19M6J7Wjf1wAC2L4hKwRdI59RBCMl/dptDG+gw+9Iz2jdozpXyG6BmXhSLW2e63E0Vu+hFy9mmUzXsLmuxhhd+fGtfk6wC5+kWW76Q8XC8wSEID59lgR79Dx2xHm9z0kSU77xF1bOzw= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R311e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045168; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4O-Ryn_1712915510; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 58/65] target/riscv: Add vector iota instruction for XTheadVector Date: Fri, 12 Apr 2024 15:37:28 +0800 Message-ID: <20240412073735.76413-59-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.133; envelope-from=eric.huang@linux.alibaba.com; helo=out30-133.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712915550625100001 Content-Type: text/plain; charset="utf-8" The instruction has the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
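For readers unfamiliar with iota, the operation is an exclusive prefix count of the set bits in the source mask: element i of the destination receives the number of set mask bits at indices below i. A minimal C model follows, assuming the unmasked case, 32-bit destination elements, one byte per mask element and vstart equal to zero; model_viota is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical unmasked reference model of the vector iota operation:
 * vd[i] = number of set elements in vs2[0 .. i-1].
 */
static void model_viota(uint32_t *vd, const uint8_t *vs2, size_t vl)
{
    uint32_t sum = 0;

    assert(vd != NULL && vs2 != NULL);
    for (size_t i = 0; i < vl; i++) {
        vd[i] = sum;        /* count of set bits seen so far */
        if (vs2[i]) {
            sum++;
        }
    }
}
```

In the masked form, inactive elements are skipped entirely: they neither receive a value nor contribute to the running count.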
Signed-off-by: Huang Tao --- target/riscv/helper.h | 5 ++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 29 +++++++++++++++++- target/riscv/xtheadvector_helper.c | 30 +++++++++++++++++++ 3 files changed, 63 insertions(+), 1 deletion(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 90a1ff2601..a1c85e5254 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2306,3 +2306,8 @@ DEF_HELPER_4(th_vmfirst_m, tl, ptr, ptr, env, i32) DEF_HELPER_5(th_vmsbf_m, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vmsif_m, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_vmsof_m, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_5(th_viota_m_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_viota_m_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_viota_m_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(th_viota_m_d, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index d41c691c31..93f4ee4a12 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2534,13 +2534,40 @@ GEN_M_TRANS_TH(th_vmsbf_m) GEN_M_TRANS_TH(th_vmsif_m) GEN_M_TRANS_TH(th_vmsof_m) =20 +/* Vector Iota Instruction */ +static bool trans_th_viota_m(DisasContext *s, arg_th_viota_m *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_overlap_group(a->rd, 1 << s->lmul, a->rs2, 1) && + (a->vm !=3D 0 || a->rd !=3D 0) && + s->vstart_eq_zero) { + uint32_t data =3D 0; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + static gen_helper_gvec_3_ptr * const fns[4] =3D { + gen_helper_th_viota_m_b, gen_helper_th_viota_m_h, + gen_helper_th_viota_m_w, gen_helper_th_viota_m_d, + }; + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), + vreg_ofs(s, a->rs2), tcg_env, + 
s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, data, fns[s->sew]); + finalize_rvv_inst(s); + return true; + } + return false; +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_viota_m) TH_TRANS_STUB(th_vid_v) TH_TRANS_STUB(th_vext_x_v) TH_TRANS_STUB(th_vmv_s_x) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index d4f1665bf3..b0ddb3b307 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3622,3 +3622,33 @@ void HELPER(th_vmsof_m)(void *vd, void *v0, void *vs= 2, CPURISCVState *env, { vmsetm(vd, v0, vs2, env, desc, ONLY_FIRST); } + +/* Vector Iota Instruction */ +#define GEN_TH_VIOTA_M(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + uint32_t sum =3D 0; = \ + int i; \ + \ + for (i =3D env->vstart; i < vl; i++) { = \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + *((ETYPE *)vd + H(i)) =3D sum; = \ + if (th_elem_mask(vs2, mlen, i)) { \ + sum++; \ + } \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +GEN_TH_VIOTA_M(th_viota_m_b, uint8_t, H1, clearb_th) +GEN_TH_VIOTA_M(th_viota_m_h, uint16_t, H2, clearh_th) +GEN_TH_VIOTA_M(th_viota_m_w, uint32_t, H4, clearl_th) +GEN_TH_VIOTA_M(th_viota_m_d, uint64_t, H8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; 
a=rsa-sha256; t=1712915660; cv=none; d=zohomail.com; s=zohoarc; b=PbOPaJiSisEZGXtgM1SNpCu75/q5G5uXDhuZhz6YIx59xmn/VwglDlKcLK15WKFNdo7wMaUfV8UMS6ImPtWoPixPeKIJoH2HxEConB2PnCbpJ4yUt9vR+1RQos0mdcEHGhTDIgxYbbMbhUJSK059wtjc3h+r40C8dZnGbBeNyv4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712915660; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=hV0nePvd0MXT222s9i5ll/4ZZNClJxJPhYIvyLOz17E=; b=QO+NcEMeVmvcKzBOSRQx8MmbM6BfkqaNcBPVStDYLMvkFaYgZot5kCT/FhKD0peHDnZyDoIuZgrhr6WsglBLtIII5JdW/5ltbAxAYb+kaqLuIYcav81XAsIiHjjpiwiSiSiL7vzXERZ6EK4VBqrT2j+c2mJD4EaAGfyvxv7lSUU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712915660575768.7802875052522; Fri, 12 Apr 2024 02:54:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDbg-0006kS-4x; Fri, 12 Apr 2024 05:54:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDbc-0006jt-Ma; Fri, 12 Apr 2024 05:54:00 -0400 Received: from out30-101.freemail.mail.aliyun.com ([115.124.30.101]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDba-0007wA-GV; Fri, 12 Apr 2024 05:54:00 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4O-Shf_1712915631) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:53:52 +0800 DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915633; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=hV0nePvd0MXT222s9i5ll/4ZZNClJxJPhYIvyLOz17E=; b=UiAuZhCF9XuDzUEIMQMVTvLGcV5B4vQ/f5WPmrZyC44CNYUTg/icFK9ruIx2sbQN9x8gQEx7UNCN0CtEW0x2+EkMoNBVboMFD3XqnlxgztPhNNfo+43hV9bFl3sWabc1FrCNpJPzaOV/tBgIybmAu0gCPscSNzbZimj3i/cedCs= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R911e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046060; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4O-Shf_1712915631; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 59/65] target/riscv: Add vector element index instruction for XTheadVector Date: Fri, 12 Apr 2024 15:37:29 +0800 Message-ID: <20240412073735.76413-60-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.101; envelope-from=eric.huang@linux.alibaba.com; helo=out30-101.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: 
qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712915662849100003 Content-Type: text/plain; charset="utf-8" The instruction has the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. Signed-off-by: Huang Tao --- target/riscv/helper.h | 5 ++++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 27 ++++++++++++++++++- target/riscv/xtheadvector_helper.c | 26 ++++++++++++++++++ 3 files changed, 57 insertions(+), 1 deletion(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index a1c85e5254..fe264621ff 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2311,3 +2311,8 @@ DEF_HELPER_5(th_viota_m_b, void, ptr, ptr, ptr, env, = i32) DEF_HELPER_5(th_viota_m_h, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_viota_m_w, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(th_viota_m_d, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_4(th_vid_v_b, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vid_v_h, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vid_v_w, void, ptr, ptr, env, i32) +DEF_HELPER_4(th_vid_v_d, void, ptr, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 93f4ee4a12..9a0ea606ab 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2562,13 +2562,38 @@ static bool trans_th_viota_m(DisasContext *s, arg_t= h_viota_m *a) return false; } =20 +/* Vector Element Index Instruction */ +static bool trans_th_vid_v(DisasContext *s, arg_th_vid_v *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_reg(s, a->rd, false) && + th_check_overlap_mask(s, a->rd, a->vm, 
false)) { + uint32_t data =3D 0; + + data =3D FIELD_DP32(data, VDATA_TH, MLEN, s->mlen); + data =3D FIELD_DP32(data, VDATA_TH, VM, a->vm); + data =3D FIELD_DP32(data, VDATA_TH, LMUL, s->lmul); + static gen_helper_gvec_2_ptr * const fns[4] =3D { + gen_helper_th_vid_v_b, gen_helper_th_vid_v_h, + gen_helper_th_vid_v_w, gen_helper_th_vid_v_d, + }; + tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), + tcg_env, s->cfg_ptr->vlenb, + s->cfg_ptr->vlenb, + data, fns[s->sew]); + finalize_rvv_inst(s); + return true; + } + return false; +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vid_v) TH_TRANS_STUB(th_vext_x_v) TH_TRANS_STUB(th_vmv_s_x) TH_TRANS_STUB(th_vfmv_f_s) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index b0ddb3b307..0743d57b12 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3652,3 +3652,29 @@ GEN_TH_VIOTA_M(th_viota_m_b, uint8_t, H1, clearb_th) GEN_TH_VIOTA_M(th_viota_m_h, uint16_t, H2, clearh_th) GEN_TH_VIOTA_M(th_viota_m_w, uint32_t, H4, clearl_th) GEN_TH_VIOTA_M(th_viota_m_d, uint64_t, H8, clearq_th) + +/* Vector Element Index Instruction */ +#define GEN_TH_VID_V(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + int i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { = \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + *((ETYPE *)vd + H(i)) =3D i; = \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +GEN_TH_VID_V(th_vid_v_b, uint8_t, H1, clearb_th) +GEN_TH_VID_V(th_vid_v_h, uint16_t, H2, clearh_th) +GEN_TH_VID_V(th_vid_v_w, uint32_t, 
H4, clearl_th) +GEN_TH_VID_V(th_vid_v_d, uint64_t, H8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712915782; cv=none; d=zohomail.com; s=zohoarc; b=T8q7ozDgHIgXEjSbxygqNGKJF9FrP1+ccFlh6ryjFVtDiubA4Ou160dTSTOrgQrsxLo1viwMJpxu2hs+MtyQiscla4hJv72w3YWnYmPLDgYNRRKkhFgl59EPiOwovX3fA9gRpTx32pU61DMJLcSc3lo2HxJSYk+/PLu2+6PE6FI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712915782; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=PtcCdLJMSL6qhMP3DJl3nheiurGwNC2IL8KiBFsJE8o=; b=Skz6B9vnpYgQtVFwzTBqn0bwYZN9AnlmG7T4771PxJuYAG/OriHbL56A6207gMmck3k32ICc0R/nFn0FL7m7k0YdLw2JvDT0CHVVI8otxUy5FuljSB9sGM/VxjBHZA9DH/QVziN+Dl0QvM6R6X+KfStrKLoKsjw7aBf1QFzSKvY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712915782649859.4987936104737; Fri, 12 Apr 2024 02:56:22 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDdg-0007p1-Ah; Fri, 12 Apr 2024 05:56:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDdd-0007om-S5; Fri, 
12 Apr 2024 05:56:05 -0400 Received: from out30-119.freemail.mail.aliyun.com ([115.124.30.119]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDdZ-0008OR-33; Fri, 12 Apr 2024 05:56:04 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nu53m_1712915753) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:55:54 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915755; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=PtcCdLJMSL6qhMP3DJl3nheiurGwNC2IL8KiBFsJE8o=; b=SdK+XZzAq+/fDny8P/h73sRH5KSwpDpbiVwvqP6cmd9cLrWQYXaTpQwzAbhLsXD8Qps6fLD2YaQJNDIlBtWVvu1TMBWDqn800OUbBw2EyYOzKQq1hIsy0vGoNaHx7TwpVHa5w0ppKYjujSConf+oS9/uAzCAB0ohkyREa5M0NYg= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R181e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045168; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nu53m_1712915753; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 60/65] target/riscv: Add integer extract and scalar move instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:30 +0800 Message-ID: <20240412073735.76413-61-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.119; envelope-from=eric.huang@linux.alibaba.com; 
helo=out30-119.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712915783207100001 Content-Type: text/plain; charset="utf-8"
This patch adds the integer extract and scalar move instructions to show how the XTheadVector permutation instructions are implemented.
The XTheadVector integer scalar move instructions differ from RVV1.0 in the following points:
1. th.vext.x.v can transfer any element of a vector register to a general register, while vmv.x.s can only transfer the first element.
2. When SEW < XLEN, XTheadVector zero-extends the value, while RVV1.0 sign-extends it.
3. The tail-element handling policy is different.
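Points 1 and 2 can be illustrated with a small C model. It is a sketch under stated assumptions, not the TCG code from the patch: elements are 8 bits wide (SEW = 8), the general register is 64 bits wide, and the function names are hypothetical. The out-of-range rule (an index at or beyond VLMAX yields 0) follows the comment on th_element_loadx in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of th.vext.x.v for SEW = 8:
 * any element may be selected, the value is zero-extended,
 * and an index >= vlmax yields 0.
 */
static uint64_t model_th_vext_x_v(const uint8_t *vreg, size_t vlmax,
                                  uint64_t idx)
{
    assert(vreg != NULL);
    if (idx >= vlmax) {
        return 0;
    }
    return (uint64_t)vreg[idx];                 /* zero-extension */
}

/*
 * Hypothetical model of RVV1.0 vmv.x.s for SEW = 8:
 * only element 0 can be read, and the value is sign-extended.
 */
static uint64_t model_vmv_x_s(const uint8_t *vreg)
{
    assert(vreg != NULL);
    return (uint64_t)(int64_t)(int8_t)vreg[0];  /* sign-extension */
}
```

With element 0 equal to 0x80, the first model returns 0x80 while the second returns 0xFFFFFFFFFFFFFF80, which is the difference described in point 2.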
Signed-off-by: Huang Tao --- .../riscv/insn_trans/trans_xtheadvector.c.inc | 154 +++++++++++++++++- 1 file changed, 152 insertions(+), 2 deletions(-) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 9a0ea606ab..a8a1ec7b3f 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2588,14 +2588,164 @@ static bool trans_th_vid_v(DisasContext *s, arg_th= _vid_v *a) return false; } =20 +/* + * Vector Permutation Instructions + */ + +/* Integer Extract Instruction */ + +/* + * This function is almost the copy of load_element, except: + * 1) When SEW < XLEN, XTheadVector zero-extend the value, while + * RVV1.0 sign-extend the value. + */ +static void load_element_th(TCGv_i64 dest, TCGv_ptr base, + int ofs, int sew) +{ + switch (sew) { + case MO_8: + tcg_gen_ld8u_i64(dest, base, ofs); + break; + case MO_16: + tcg_gen_ld16u_i64(dest, base, ofs); + break; + case MO_32: + tcg_gen_ld32u_i64(dest, base, ofs); + break; + case MO_64: + tcg_gen_ld_i64(dest, base, ofs); + break; + default: + g_assert_not_reached(); + break; + } +} + +/* Load idx >=3D VLMAX ? 0 : vreg[idx] */ +static void th_element_loadx(DisasContext *s, TCGv_i64 dest, + int vreg, TCGv idx, int vlmax) +{ + TCGv_i32 ofs =3D tcg_temp_new_i32(); + TCGv_ptr base =3D tcg_temp_new_ptr(); + TCGv_i64 t_idx =3D tcg_temp_new_i64(); + TCGv_i64 t_vlmax, t_zero; + + /* + * Mask the index to the length so that we do + * not produce an out-of-range load. + */ + tcg_gen_trunc_tl_i32(ofs, idx); + tcg_gen_andi_i32(ofs, ofs, vlmax - 1); + + /* Convert the index to an offset. */ + endian_adjust(ofs, s->sew); + tcg_gen_shli_i32(ofs, ofs, s->sew); + + /* Convert the index to a pointer. */ + tcg_gen_ext_i32_ptr(base, ofs); + tcg_gen_add_ptr(base, base, tcg_env); + + /* Perform the load. */ + load_element_th(dest, base, + vreg_ofs(s, vreg), s->sew); + + /* Flush out-of-range indexing to zero. 
*/ + t_vlmax =3D tcg_constant_i64(vlmax); + t_zero =3D tcg_constant_i64(0); + tcg_gen_extu_tl_i64(t_idx, idx); + + tcg_gen_movcond_i64(TCG_COND_LTU, dest, t_idx, + t_vlmax, dest, t_zero); + +} +/* + * This function is almost the copy of vec_element_loadi, except + * we just change the function name to decouple and delete the + * unused parameter. + * We delete the arg "bool sign", because XTheadVector always + * zero-extend the value. + */ +static void th_element_loadi(DisasContext *s, TCGv_i64 dest, + int vreg, int idx) +{ + load_element_th(dest, tcg_env, endian_ofs(s, vreg, idx), s->sew); +} + +/* + * Compared to trans_vmv_x_s, th.vext.x.v can transfer any element + * in a vector register to a general register, while vmv.x.s can only + * transfer the first element in a vector register to a general register. + * + * So we use th_element_loadx to load the element. And we use th_element_l= oadi + * to deal with the special case when rs1 =3D=3D 0, to accelerate. + */ +static bool trans_th_vext_x_v(DisasContext *s, arg_r *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s)) { + TCGv_i64 tmp =3D tcg_temp_new_i64(); + TCGv dest =3D dest_gpr(s, a->rd); + + if (a->rs1 =3D=3D 0) { + /* Special case vmv.x.s rd, vs2. 
*/ + th_element_loadi(s, tmp, a->rs2, 0); + } else { + /* This instruction ignores LMUL and vector register groups */ + int vlmax =3D s->cfg_ptr->vlenb >> s->sew; + th_element_loadx(s, tmp, a->rs2, cpu_gpr[a->rs1], vlmax); + } + + tcg_gen_trunc_i64_tl(dest, tmp); + gen_set_gpr(s, a->rd, dest); + tcg_gen_movi_tl(cpu_vstart, 0); + finalize_rvv_inst(s); + return true; + } + return false; +} + +/* Integer Scalar Move Instruction */ + +static void th_element_storei(DisasContext *s, int vreg, + int idx, TCGv_i64 val) +{ + vec_element_storei(s, vreg, idx, val); +} +/* vmv.s.x vd, rs1 # vd[0] =3D rs1 */ +static bool trans_th_vmv_s_x(DisasContext *s, arg_th_vmv_s_x *a) +{ + if (require_xtheadvector(s) && + vext_check_isa_ill(s)) { + /* This instruction ignores LMUL and vector register groups */ + int maxsz =3D s->cfg_ptr->vlenb; + TCGv_i64 t1; + TCGLabel *over =3D gen_new_label(); + TCGv src1 =3D get_gpr(s, a->rs1, EXT_ZERO); + + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); + tcg_gen_gvec_dup_imm(MO_64, vreg_ofs(s, a->rd), maxsz, maxsz, 0); + if (a->rs1 =3D=3D 0) { + goto done; + } + + t1 =3D tcg_temp_new_i64(); + tcg_gen_extu_tl_i64(t1, src1); + th_element_storei(s, a->rd, 0, t1); + done: + gen_set_label(over); + tcg_gen_movi_tl(cpu_vstart, 0); + finalize_rvv_inst(s); + return true; + } + return false; +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vext_x_v) -TH_TRANS_STUB(th_vmv_s_x) TH_TRANS_STUB(th_vfmv_f_s) TH_TRANS_STUB(th_vfmv_s_f) TH_TRANS_STUB(th_vslideup_vx) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712915906; 
cv=none; d=zohomail.com; s=zohoarc; b=RJLS51GF8JkGVmxVDsKmnjG3ggsR8OFCOXSedMiFIXQMHuv9BwxUVxT7bUIOgNQhK5SELhmJfExUaWqNpblfgvXDULqv0CrMFGx1SA7RKOUjjZYwoDT0N6GU5oXLso4LLqcXHNAFHaNGGy6atFIoOEqkjdJ1m8oeRvOuKI00HvM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712915906; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=1DFEJpDLU3j4QtHeAqrFTUn3BuZZzQrxuet5/p11QiA=; b=Hi/HsQs6pwxeWK8bxEpOouwQybL7cLnaDFsc/aK4NsAZS90GAH/hSEtklcbI0fey3if1GlC8FM4VqaFshKlb5q7ZEHb/H2qM+kQb/JDNLI1FtFHTb9G6mxnYolGs0t0LpkIjCkdtwtv845zW015jEFptDjInSlcinId68UR407I= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712915906701580.4037213821597; Fri, 12 Apr 2024 02:58:26 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDfe-0000Wx-61; Fri, 12 Apr 2024 05:58:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDfd-0000Wa-1Y; Fri, 12 Apr 2024 05:58:09 -0400 Received: from out30-110.freemail.mail.aliyun.com ([115.124.30.110]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDfa-0000Y5-1p; Fri, 12 Apr 2024 05:58:08 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NtPyw_1712915874) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:57:55 +0800 DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915876; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=1DFEJpDLU3j4QtHeAqrFTUn3BuZZzQrxuet5/p11QiA=; b=xCvrunsBzI3xYTaky0AdiKEgzT4a5nstcsqPzGhPOW/V9bjAWIATtJxuxEzUFT64Z1oE7tE0hJApmw+DlN9DZODnbXE3gehflie9Ha023M49nyX53rzd9Zhoh2ww8/BwP9tFCJCyyzqVLt5CBiIew8TjCvlmoZKn33761uNt54M= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R121e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NtPyw_1712915874; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 61/65] target/riscv: Add floating-point scalar move instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:31 +0800 Message-ID: <20240412073735.76413-62-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.110; envelope-from=eric.huang@linux.alibaba.com; helo=out30-110.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712915907718100001 Content-Type: text/plain; charset="utf-8"

The XTheadVector floating-point scalar move instructions differ from RVV1.0 in
the following points:
1. When src width < dst width, RVV1.0 checks whether the input value is a
   valid NaN-boxed value, in which case the least-significant dst-width bits
   are used; otherwise the canonical NaN value is used. XTheadVector always
   uses the least-significant bits.
2. The tail elements are processed under a different policy.

Signed-off-by: Huang Tao --- .../riscv/insn_trans/trans_xtheadvector.c.inc | 59 ++++++++++++++++++- 1 file changed, 57 insertions(+), 2 deletions(-) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc index a8a1ec7b3f..54ccd933c0 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2740,14 +2740,69 @@ static bool trans_th_vmv_s_x(DisasContext *s, arg_th_vmv_s_x *a) return false; } +/* Floating-Point Scalar Move Instructions */ +static bool trans_th_vfmv_f_s(DisasContext *s, arg_th_vfmv_f_s *a) +{ + if (require_xtheadvector(s) && + !s->vill && has_ext(s, RVF) && + (s->mstatus_fs != 0) && + (s->sew != 0)) { + unsigned int len = 8 << s->sew; + + th_element_loadi(s, cpu_fpr[a->rd], a->rs2, 0); + if (len < 64) { + tcg_gen_ori_i64(cpu_fpr[a->rd], cpu_fpr[a->rd], + MAKE_64BIT_MASK(len, 64 - len)); + } + + mark_fs_dirty(s); + tcg_gen_movi_tl(cpu_vstart, 0); + finalize_rvv_inst(s); + return true; + } + return false; +} + +/* vfmv.s.f vd, rs1 # vd[0] = rs1 (vs2=0) */ +static bool trans_th_vfmv_s_f(DisasContext *s, arg_th_vfmv_s_f *a) +{ + if (require_xtheadvector(s) && + !s->vill
&& has_ext(s, RVF) && + (s->sew !=3D 0)) { + TCGv_i64 t1; + /* The instructions ignore LMUL and vector register group. */ + uint32_t vlmax =3D s->cfg_ptr->vlenb; + + /* if vl =3D=3D 0, skip vector register write back */ + TCGLabel *over =3D gen_new_label(); + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); + + /* zeroed all elements */ + tcg_gen_gvec_dup_imm(MO_64, vreg_ofs(s, a->rd), vlmax, vlmax, 0); + + /* NaN-box f[rs1] as necessary for SEW */ + t1 =3D tcg_temp_new_i64(); + if (s->sew =3D=3D MO_64 && !has_ext(s, RVD)) { + tcg_gen_ori_i64(t1, cpu_fpr[a->rs1], MAKE_64BIT_MASK(32, 32)); + } else { + tcg_gen_mov_i64(t1, cpu_fpr[a->rs1]); + } + th_element_storei(s, a->rd, 0, t1); + + gen_set_label(over); + tcg_gen_movi_tl(cpu_vstart, 0); + finalize_rvv_inst(s); + return true; + } + return false; +} + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vfmv_f_s) -TH_TRANS_STUB(th_vfmv_s_f) TH_TRANS_STUB(th_vslideup_vx) TH_TRANS_STUB(th_vslideup_vi) TH_TRANS_STUB(th_vslide1up_vx) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712916041; cv=none; d=zohomail.com; s=zohoarc; b=i/rf6JiIrSqg7STosNw3B7qYlh5KbtIFsZz1iO3zD7cd3ORnRxJAKQN8hXPtrY9XkFMviy7H2hzlbeLBAmgjN+K2p99F8crS+jgl+qkP3Re5AUzF5dgpj8ZxIU0PpAvS+JMyonz+UT3ryTLn4gVws0PVdtBPMLoQDP4YVsC1aJQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712916041; 
h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=xqsZ9O48LJGEVsAnRIqoz+CNI7V67QtC0dj8fIpqUF0=; b=mvCr/cD9gA2SD/c+hwE/LsudwfCOQCjwVn6rLnzwKxos6QKJnSY2Kat7hUDM1WSCy/3MIAQUI840uQZwkd7L/IJLht/oAiBQ+WjCxi3il+J1g86+wc0U0fd1XEzxIF0NXVw4i8jMMzr3yoAB42VxRPJ2N+4UQHPCx77JO5tVzJ4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712916041070827.541338284803; Fri, 12 Apr 2024 03:00:41 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDhn-0001lb-JH; Fri, 12 Apr 2024 06:00:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDhX-0001l7-RB; Fri, 12 Apr 2024 06:00:08 -0400 Received: from out30-98.freemail.mail.aliyun.com ([115.124.30.98]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDhU-0000pj-NC; Fri, 12 Apr 2024 06:00:07 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4Nu6N3_1712915996) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 17:59:57 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712915998; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=xqsZ9O48LJGEVsAnRIqoz+CNI7V67QtC0dj8fIpqUF0=; 
b=sob//O4OcEs1bClfwXcZv0UoOs7X3mfQeC2oYHJfP4MO3zLzF1VHWFIzH1ry82rZz+4UYFUokJeT10hQv96y3me+CVljxbhPa6uQ8GjkIIDgzOG/B0F9l3wJ+lq/0D/2/XRqVIkTFT/ikWJWnOA89smmO8jlDVbzvBd48apfSSw= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R541e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046059; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4Nu6N3_1712915996; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 62/65] target/riscv: Add vector slide instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:32 +0800 Message-ID: <20240412073735.76413-63-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.98; envelope-from=eric.huang@linux.alibaba.com; helo=out30-98.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712916042265100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. Signed-off-by: Huang Tao --- target/riscv/helper.h | 17 +++ .../riscv/insn_trans/trans_xtheadvector.c.inc | 25 +++- target/riscv/xtheadvector_helper.c | 123 ++++++++++++++++++ 3 files changed, 159 insertions(+), 6 deletions(-) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index fe264621ff..6ce0bcbba7 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -2316,3 +2316,20 @@ DEF_HELPER_4(th_vid_v_b, void, ptr, ptr, env, i32) DEF_HELPER_4(th_vid_v_h, void, ptr, ptr, env, i32) DEF_HELPER_4(th_vid_v_w, void, ptr, ptr, env, i32) DEF_HELPER_4(th_vid_v_d, void, ptr, ptr, env, i32) + +DEF_HELPER_6(th_vslideup_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslideup_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslideup_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslideup_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslidedown_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslidedown_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslidedown_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslidedown_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1up_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1up_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1up_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1up_vx_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1down_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1down_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(th_vslide1down_vx_w, void, ptr, ptr, tl, ptr, env, i32) 
+DEF_HELPER_6(th_vslide1down_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/risc= v/insn_trans/trans_xtheadvector.c.inc index 54ccd933c0..46cfc51690 100644 --- a/target/riscv/insn_trans/trans_xtheadvector.c.inc +++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc @@ -2797,18 +2797,31 @@ static bool trans_th_vfmv_s_f(DisasContext *s, arg_= th_vfmv_s_f *a) return false; } =20 +/* Vector Slide Instructions */ +static bool slideup_check_th(DisasContext *s, arg_rmrr *a) +{ + return (require_xtheadvector(s) && + vext_check_isa_ill(s) && + th_check_overlap_mask(s, a->rd, a->vm, true) && + th_check_reg(s, a->rd, false) && + th_check_reg(s, a->rs2, false) && + (a->rd !=3D a->rs2)); +} + +GEN_OPIVX_TRANS_TH(th_vslideup_vx, slideup_check_th) +GEN_OPIVX_TRANS_TH(th_vslide1up_vx, slideup_check_th) +GEN_OPIVI_TRANS_TH(th_vslideup_vi, IMM_ZX, th_vslideup_vx, slideup_check_t= h) + +GEN_OPIVX_TRANS_TH(th_vslidedown_vx, opivx_check_th) +GEN_OPIVX_TRANS_TH(th_vslide1down_vx, opivx_check_th) +GEN_OPIVI_TRANS_TH(th_vslidedown_vi, IMM_ZX, th_vslidedown_vx, opivx_check= _th) + #define TH_TRANS_STUB(NAME) \ static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ { \ return require_xtheadvector(s); \ } =20 -TH_TRANS_STUB(th_vslideup_vx) -TH_TRANS_STUB(th_vslideup_vi) -TH_TRANS_STUB(th_vslide1up_vx) -TH_TRANS_STUB(th_vslidedown_vx) -TH_TRANS_STUB(th_vslidedown_vi) -TH_TRANS_STUB(th_vslide1down_vx) TH_TRANS_STUB(th_vrgather_vv) TH_TRANS_STUB(th_vrgather_vx) TH_TRANS_STUB(th_vrgather_vi) diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector= _helper.c index 0743d57b12..73a15eb070 100644 --- a/target/riscv/xtheadvector_helper.c +++ b/target/riscv/xtheadvector_helper.c @@ -3678,3 +3678,126 @@ GEN_TH_VID_V(th_vid_v_b, uint8_t, H1, clearb_th) GEN_TH_VID_V(th_vid_v_h, uint16_t, H2, clearh_th) GEN_TH_VID_V(th_vid_v_w, uint32_t, H4, clearl_th) GEN_TH_VID_V(th_vid_v_d, uint64_t, H8, clearq_th) + +/* + * Vector 
Permutation Instructions + */ + +/* Vector Slide Instructions */ +#define GEN_TH_VSLIDEUP_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + target_ulong offset =3D s1, i_min, i; = \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + i_min =3D MAX(env->vstart, offset); = \ + for (i =3D i_min; i < vl; i++) { = \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + *((ETYPE *)vd + H(i)) =3D *((ETYPE *)vs2 + H(i - offset)); = \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +/* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] =3D vs2[i] */ +GEN_TH_VSLIDEUP_VX(th_vslideup_vx_b, uint8_t, H1, clearb_th) +GEN_TH_VSLIDEUP_VX(th_vslideup_vx_h, uint16_t, H2, clearh_th) +GEN_TH_VSLIDEUP_VX(th_vslideup_vx_w, uint32_t, H4, clearl_th) +GEN_TH_VSLIDEUP_VX(th_vslideup_vx_d, uint64_t, H8, clearq_th) + +#define GEN_TH_VSLIDEDOWN_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + target_ulong offset =3D s1, i; = \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; ++i) { = \ + target_ulong j =3D i + offset; = \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + *((ETYPE *)vd + H(i)) =3D j >=3D vlmax ? 
0 : *((ETYPE *)vs2 + H(j)= ); \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +/* vslidedown.vx vd, vs2, rs1, vm # vd[i] =3D vs2[i+rs1] */ +GEN_TH_VSLIDEDOWN_VX(th_vslidedown_vx_b, uint8_t, H1, clearb_th) +GEN_TH_VSLIDEDOWN_VX(th_vslidedown_vx_h, uint16_t, H2, clearh_th) +GEN_TH_VSLIDEDOWN_VX(th_vslidedown_vx_w, uint32_t, H4, clearl_th) +GEN_TH_VSLIDEDOWN_VX(th_vslidedown_vx_d, uint64_t, H8, clearq_th) + +#define GEN_TH_VSLIDE1UP_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { = \ + if (!vm && !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + if (i =3D=3D 0) { = \ + *((ETYPE *)vd + H(i)) =3D s1; = \ + } else { \ + *((ETYPE *)vd + H(i)) =3D *((ETYPE *)vs2 + H(i - 1)); = \ + } \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +/* vslide1up.vx vd, vs2, rs1, vm # vd[0]=3Dx[rs1], vd[i+1] =3D vs2[i] */ +GEN_TH_VSLIDE1UP_VX(th_vslide1up_vx_b, uint8_t, H1, clearb_th) +GEN_TH_VSLIDE1UP_VX(th_vslide1up_vx_h, uint16_t, H2, clearh_th) +GEN_TH_VSLIDE1UP_VX(th_vslide1up_vx_w, uint32_t, H4, clearl_th) +GEN_TH_VSLIDE1UP_VX(th_vslide1up_vx_d, uint64_t, H8, clearq_th) + +#define GEN_TH_VSLIDE1DOWN_VX(NAME, ETYPE, H, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen =3D th_mlen(desc); = \ + uint32_t vlmax =3D (env_archcpu(env)->cfg.vlenb << 3) / mlen; = \ + uint32_t vm =3D th_vm(desc); = \ + uint32_t vl =3D env->vl; = \ + uint32_t i; \ + \ + VSTART_CHECK_EARLY_EXIT(env); \ + for (i =3D env->vstart; i < vl; i++) { = \ + if (!vm 
&& !th_elem_mask(v0, mlen, i)) { \ + continue; \ + } \ + if (i =3D=3D vl - 1) { = \ + *((ETYPE *)vd + H(i)) =3D s1; = \ + } else { \ + *((ETYPE *)vd + H(i)) =3D *((ETYPE *)vs2 + H(i + 1)); = \ + } \ + } \ + env->vstart =3D 0; = \ + CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \ +} + +/* vslide1down.vx vd, vs2, rs1, vm # vd[i] =3D vs2[i+1], vd[vl-1]=3Dx[rs1]= */ +GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_b, uint8_t, H1, clearb_th) +GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_h, uint16_t, H2, clearh_th) +GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_w, uint32_t, H4, clearl_th) +GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_d, uint64_t, H8, clearq_th) --=20 2.44.0 From nobody Thu May 16 11:28:54 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com ARC-Seal: i=1; a=rsa-sha256; t=1712916151; cv=none; d=zohomail.com; s=zohoarc; b=jf4fLexZz+agWMpx1hJK/QoEXeBDI5PYH6a3oLRZ7bBGxUDNZb9n7XhYH6CwKEoV1WftIpDK1MVllN9PrP8ly75d8B8NABdXAEKLqSKVNeocLXUaxICucFdMaezedfb4dx9pQXGpu7s6vo2mxRwQ278gHmyGyaBdfAx+mNLsNc8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712916151; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=rKnpcCoeh0YvzwdWQhd/twJaWjtVRJRykEO5548t2OA=; b=LDQSR1q9QIorKhFh/yrMSvlL0sJpdyMAdnPifPdYHAyo8gFusP7kguZUiTwqCCj7jrOqqp7pzzZQbRq1AgB5YSmyympDnWtevt25aIDCmkwOIb2GOkDInivzuhU6P5WgHU0d5HBMLkEd7YysED5xcVt4L7+X/VqFwUNxjD0bpnM= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712916151593344.82597734876515; Fri, 12 Apr 2024 03:02:31 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDjZ-00037w-Ug; Fri, 12 Apr 2024 06:02:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDjY-00037U-3K; Fri, 12 Apr 2024 06:02:12 -0400 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDjR-0001DV-Ox; Fri, 12 Apr 2024 06:02:10 -0400 Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NtRJZ_1712916118) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 18:01:59 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712916120; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=rKnpcCoeh0YvzwdWQhd/twJaWjtVRJRykEO5548t2OA=; b=v19+I8YBi84yIYzaofvSN4NEZxoKqWjLkOTxzuJeQv9c2LmQjW//LxNkmnwFDook/JXQpRT2mZPFa+glAdGRTvfvhBnY+2sJfJO2NVsB3Va//asr+a4HQKjtAYImFCzncO+h0tCH0EOm2ddn6R5IIYJU2ewDKM5WHjCb00C/nsM= X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R181e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046049; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NtRJZ_1712916118; From: Huang Tao To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao Subject: [PATCH 63/65] target/riscv: Add vector register gather instructions for XTheadVector Date: Fri, 12 Apr 2024 15:37:33 
+0800 Message-ID: <20240412073735.76413-64-eric.huang@linux.alibaba.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com> References: <20240412073735.76413-1-eric.huang@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=115.124.30.130; envelope-from=eric.huang@linux.alibaba.com; helo=out30-130.freemail.mail.aliyun.com X-Spam_score_int: -174 X-Spam_score: -17.5 X-Spam_bar: ----------------- X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @linux.alibaba.com) X-ZM-MESSAGEID: 1712916152473100001 Content-Type: text/plain; charset="utf-8" The instructions have the same function as RVV1.0. Overall there are only general differences between XTheadVector and RVV1.0. 
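Note (illustration only, not part of the patch): the gather semantics shared
with RVV1.0, vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]], can be sketched as a
plain C reference model. vrgather_vv_u8_sketch is a hypothetical name, not a
QEMU helper; the sketch assumes 8-bit elements and deliberately leaves out
masking, vstart and tail handling.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model, not QEMU code.
 * For each i in [0, vl): take the index from vs1[i]; an index at or
 * beyond vlmax produces 0, otherwise the indexed element of vs2 is copied.
 */
static void vrgather_vv_u8_sketch(uint8_t *vd, const uint8_t *vs1,
                                  const uint8_t *vs2, size_t vl, size_t vlmax)
{
    for (size_t i = 0; i < vl; i++) {
        size_t index = vs1[i];

        vd[i] = (index >= vlmax) ? 0 : vs2[index];
    }
}
```

The destination must not alias either source here, which mirrors the
rd != rs1 and rd != rs2 checks in the translation code of this patch.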
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  9 ++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 85 ++++++++++++++++++-
 target/riscv/xtheadvector_helper.c            | 64 ++++++++++++++
 3 files changed, 155 insertions(+), 3 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 6ce0bcbba7..b650e299cf 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2333,3 +2333,12 @@ DEF_HELPER_6(th_vslide1down_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vslide1down_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vslide1down_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vslide1down_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vrgather_vv_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vv_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vv_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vv_d, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vx_b, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vx_h, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vx_w, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(th_vrgather_vx_d, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index 46cfc51690..f6da1ff384 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2816,13 +2816,92 @@ GEN_OPIVX_TRANS_TH(th_vslidedown_vx, opivx_check_th)
 GEN_OPIVX_TRANS_TH(th_vslide1down_vx, opivx_check_th)
 GEN_OPIVI_TRANS_TH(th_vslidedown_vi, IMM_ZX, th_vslidedown_vx, opivx_check_th)
 
+/* Vector Register Gather Instruction */
+static bool vrgather_vv_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_reg(s, a->rs1, false) &&
+            th_check_reg(s, a->rs2, false) &&
+            (a->rd != a->rs2) && (a->rd != a->rs1));
+}
+
+GEN_OPIVV_TRANS_TH(th_vrgather_vv, vrgather_vv_check_th)
+
+static bool vrgather_vx_check_th(DisasContext *s, arg_rmrr *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_overlap_mask(s, a->rd, a->vm, true) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_reg(s, a->rs2, false) &&
+            (a->rd != a->rs2));
+}
+
+/* vrgather.vx vd, vs2, rs1, vm # vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */
+static bool trans_th_vrgather_vx(DisasContext *s, arg_rmrr *a)
+{
+    if (!vrgather_vx_check_th(s, a)) {
+        return false;
+    }
+
+    if (a->vm && s->vl_eq_vlmax) {
+        int vlmax = (s->cfg_ptr->vlenb << 3) / s->mlen;
+        TCGv_i64 dest = tcg_temp_new_i64();
+
+        if (a->rs1 == 0) {
+            th_element_loadi(s, dest, a->rs2, 0);
+        } else {
+            th_element_loadx(s, dest, a->rs2, cpu_gpr[a->rs1], vlmax);
+        }
+
+        tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
+                             MAXSZ(s), MAXSZ(s), dest);
+        finalize_rvv_inst(s);
+    } else {
+        static gen_helper_opivx * const fns[4] = {
+            gen_helper_th_vrgather_vx_b, gen_helper_th_vrgather_vx_h,
+            gen_helper_th_vrgather_vx_w, gen_helper_th_vrgather_vx_d
+        };
+        return opivx_trans_th(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s);
+    }
+    return true;
+}
+
+/* vrgather.vi vd, vs2, imm, vm # vd[i] = (imm >= VLMAX) ? 0 : vs2[imm] */
+static bool trans_th_vrgather_vi(DisasContext *s, arg_rmrr *a)
+{
+    if (!vrgather_vx_check_th(s, a)) {
+        return false;
+    }
+
+    if (a->vm && s->vl_eq_vlmax) {
+        if (a->rs1 >= (s->cfg_ptr->vlenb << 3) / s->mlen) {
+            tcg_gen_gvec_dup_imm(MO_64, vreg_ofs(s, a->rd),
+                                 MAXSZ(s), MAXSZ(s), 0);
+        } else {
+            tcg_gen_gvec_dup_mem(s->sew, vreg_ofs(s, a->rd),
+                                 endian_ofs(s, a->rs2, a->rs1),
+                                 MAXSZ(s), MAXSZ(s));
+        }
+        finalize_rvv_inst(s);
+    } else {
+        static gen_helper_opivx * const fns[4] = {
+            gen_helper_th_vrgather_vx_b, gen_helper_th_vrgather_vx_h,
+            gen_helper_th_vrgather_vx_w, gen_helper_th_vrgather_vx_d
+        };
+        return opivi_trans_th(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew],
+                              s, IMM_ZX);
+    }
+    return true;
+}
+
 #define TH_TRANS_STUB(NAME)                                \
 static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
 {                                                          \
     return require_xtheadvector(s);                        \
 }
 
-TH_TRANS_STUB(th_vrgather_vv)
-TH_TRANS_STUB(th_vrgather_vx)
-TH_TRANS_STUB(th_vrgather_vi)
 TH_TRANS_STUB(th_vcompress_vm)
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 73a15eb070..2598824bb3 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3801,3 +3801,67 @@ GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_b, uint8_t, H1, clearb_th)
 GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_h, uint16_t, H2, clearh_th)
 GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_w, uint32_t, H4, clearl_th)
 GEN_TH_VSLIDE1DOWN_VX(th_vslide1down_vx_d, uint64_t, H8, clearq_th)
+
+/* Vector Register Gather Instruction */
+#define GEN_TH_VRGATHER_VV(NAME, ETYPE, H, CLEAR_FN)                  \
+void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,           \
+                  CPURISCVState *env, uint32_t desc)                  \
+{                                                                     \
+    uint32_t mlen = th_mlen(desc);                                    \
+    uint32_t vlmax = (env_archcpu(env)->cfg.vlenb << 3) / mlen;       \
+    uint32_t vm = th_vm(desc);                                        \
+    uint32_t vl = env->vl;                                            \
+    uint32_t index, i;                                                \
+                                                                      \
+    VSTART_CHECK_EARLY_EXIT(env);                                     \
+    for (i = env->vstart; i < vl; i++) {                              \
+        if (!vm && !th_elem_mask(v0, mlen, i)) {                      \
+            continue;                                                 \
+        }                                                             \
+        index = *((ETYPE *)vs1 + H(i));                               \
+        if (index >= vlmax) {                                         \
+            *((ETYPE *)vd + H(i)) = 0;                                \
+        } else {                                                      \
+            *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(index));       \
+        }                                                             \
+    }                                                                 \
+    env->vstart = 0;                                                  \
+    CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE));      \
+}
+
+/* vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]]; */
+GEN_TH_VRGATHER_VV(th_vrgather_vv_b, uint8_t, H1, clearb_th)
+GEN_TH_VRGATHER_VV(th_vrgather_vv_h, uint16_t, H2, clearh_th)
+GEN_TH_VRGATHER_VV(th_vrgather_vv_w, uint32_t, H4, clearl_th)
+GEN_TH_VRGATHER_VV(th_vrgather_vv_d, uint64_t, H8, clearq_th)
+
+#define GEN_TH_VRGATHER_VX(NAME, ETYPE, H, CLEAR_FN)                  \
+void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,     \
+                  CPURISCVState *env, uint32_t desc)                  \
+{                                                                     \
+    uint32_t mlen = th_mlen(desc);                                    \
+    uint32_t vlmax = (env_archcpu(env)->cfg.vlenb << 3) / mlen;       \
+    uint32_t vm = th_vm(desc);                                        \
+    uint32_t vl = env->vl;                                            \
+    uint32_t index = s1, i;                                           \
+                                                                      \
+    VSTART_CHECK_EARLY_EXIT(env);                                     \
+    for (i = env->vstart; i < vl; i++) {                              \
+        if (!vm && !th_elem_mask(v0, mlen, i)) {                      \
+            continue;                                                 \
+        }                                                             \
+        if (index >= vlmax) {                                         \
+            *((ETYPE *)vd + H(i)) = 0;                                \
+        } else {                                                      \
+            *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(index));       \
+        }                                                             \
+    }                                                                 \
+    env->vstart = 0;                                                  \
+    CLEAR_FN(vd, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE));      \
+}
+
+/* vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */
+GEN_TH_VRGATHER_VX(th_vrgather_vx_b, uint8_t, H1, clearb_th)
+GEN_TH_VRGATHER_VX(th_vrgather_vx_h, uint16_t, H2, clearh_th)
+GEN_TH_VRGATHER_VX(th_vrgather_vx_w, uint32_t, H4, clearl_th)
+GEN_TH_VRGATHER_VX(th_vrgather_vx_d, uint64_t, H8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com
ARC-Seal: i=1; a=rsa-sha256; t=1712916270; cv=none; d=zohomail.com; s=zohoarc; b=Fjnpv87dXTBzk4ByHHSOw2O9/cEm/6Z86M2HADk3JxKWO647aGsj+WIV32KCUTYbk1/jbWq7zXM1ie0QSUNtazVAPX9PyyvsmNQc+JqMs6jhVvCKv85Y18A/xjGus4TUzhoqy4EpTd5TPLzrmbV7trOew8uLdA/Zdb88KEmvVhI=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712916270; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=GhcV0+pMQbzlkWNZvNgh0j5WFf6ey4/Uuhz0/c1jvMk=; b=ZI0JbNmZOuHdO4wYgIxGLjBet/u9EuVQnCQHQ9tE1KWoFDZ6j/aFoMYWN02cAMUWwPhqniCUAb9j66IczTRTQvK+/+2ikYeUo2rRasU2bUZuprTqg/YgQJVC8NmWJ7gPXkMKyymZVbXa3ZfVeZU6fFM1WOuEZNl3sMX/hXvxpno=
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none)
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1712916270005193.25972067285886; Fri, 12 Apr 2024 03:04:30 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDlb-0005CJ-HB; Fri, 12 Apr 2024 06:04:19 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDlY-0005BE-EJ; Fri, 12 Apr 2024 06:04:17 -0400
Received: from out30-113.freemail.mail.aliyun.com ([115.124.30.113]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDlT-0001Xq-7H; Fri, 12 Apr 2024 06:04:16 -0400
Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4O-WLF_1712916239) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 18:04:00 +0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712916241; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=GhcV0+pMQbzlkWNZvNgh0j5WFf6ey4/Uuhz0/c1jvMk=; b=kRAufgWd2JYRCNtsGgXsoM38zml8DN0xIFhDUrPmky8U/Zpnbefu4Z0jPyA5gzbu/XGzICaoG30elkWHQfwbWBheWN0Iji7RCpQpYTkps6wgzxaruU8ikJD+sVCLXNqTqDdAjpk+xk/R1KfvqizuZOZXqsXCFgzYiS9NRQFbb1E=
X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R181e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018046056; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4O-WLF_1712916239;
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 64/65] target/riscv: Add vector compress instruction for XTheadVector
Date: Fri, 12 Apr 2024 15:37:34 +0800
Message-ID: <20240412073735.76413-65-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=115.124.30.113; envelope-from=eric.huang@linux.alibaba.com; helo=out30-113.freemail.mail.aliyun.com
X-Spam_score_int: -174
X-Spam_score: -17.5
X-Spam_bar: -----------------
X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @linux.alibaba.com)
X-ZM-MESSAGEID: 1712916270852100003
Content-Type: text/plain; charset="utf-8"

The instruction has the same function as RVV1.0. Overall there are only
general differences between XTheadVector and RVV1.0.
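
For reviewers comparing against RVV1.0, the behaviour of the compress helper
can be sketched as a plain C reference model. This is only an illustration
under stated assumptions, not the helper added by this patch: the name
vcompress_ref is hypothetical, elements are fixed at 8 bits, and the mask is
passed as one byte per element rather than in the packed MLEN layout read by
th_elem_mask(). Like the CLEAR_FN call in the helper, it zeroes everything
after the packed elements up to VLMAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model of th.vcompress.vm for 8-bit elements.
 * Assumptions (not taken from the patch): the mask holds one byte per
 * element with bit 0 significant, and vd does not alias vs2 or mask,
 * in line with the overlap checks done at translation time.
 * Returns the number of elements packed into vd.
 */
size_t vcompress_ref(uint8_t *vd, const uint8_t *vs2,
                     const uint8_t *mask, size_t vl, size_t vlmax)
{
    size_t num = 0;

    assert(vl <= vlmax);

    /* Pack the enabled elements of vs2 contiguously at the start of vd. */
    for (size_t i = 0; i < vl; i++) {
        if (mask[i] & 1) {
            vd[num++] = vs2[i];
        }
    }

    /* Zero the remaining destination elements, up to VLMAX. */
    for (size_t i = num; i < vlmax; i++) {
        vd[i] = 0;
    }
    return num;
}
```

With vs2 = {10, ..., 17}, a mask enabling elements 0, 3 and 4, vl = 6 and
vlmax = 8, the model packs 10, 13, 14 into vd[0..2], returns 3 and zeroes
vd[3..7].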
Signed-off-by: Huang Tao
---
 target/riscv/helper.h                         |  5 +++
 .../riscv/insn_trans/trans_xtheadvector.c.inc | 36 ++++++++++++++++---
 target/riscv/xtheadvector_helper.c            | 27 ++++++++++++++
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index b650e299cf..b46f9fc2c3 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -2342,3 +2342,8 @@ DEF_HELPER_6(th_vrgather_vx_b, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vrgather_vx_h, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vrgather_vx_w, void, ptr, ptr, tl, ptr, env, i32)
 DEF_HELPER_6(th_vrgather_vx_d, void, ptr, ptr, tl, ptr, env, i32)
+
+DEF_HELPER_6(th_vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(th_vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_xtheadvector.c.inc b/target/riscv/insn_trans/trans_xtheadvector.c.inc
index f6da1ff384..65b595d699 100644
--- a/target/riscv/insn_trans/trans_xtheadvector.c.inc
+++ b/target/riscv/insn_trans/trans_xtheadvector.c.inc
@@ -2898,10 +2898,36 @@ static bool trans_th_vrgather_vi(DisasContext *s, arg_rmrr *a)
     return true;
 }
 
-#define TH_TRANS_STUB(NAME)                                \
-static bool trans_##NAME(DisasContext *s, arg_##NAME *a)   \
-{                                                          \
-    return require_xtheadvector(s);                        \
+/* Vector Compress Instruction */
+static bool vcompress_vm_check_th(DisasContext *s, arg_r *a)
+{
+    return (require_xtheadvector(s) &&
+            vext_check_isa_ill(s) &&
+            th_check_reg(s, a->rd, false) &&
+            th_check_reg(s, a->rs2, false) &&
+            th_check_overlap_group(a->rd, 1 << s->lmul, a->rs1, 1) &&
+            (a->rd != a->rs2)) &&
+            s->vstart_eq_zero;
 }
 
-TH_TRANS_STUB(th_vcompress_vm)
+static bool trans_th_vcompress_vm(DisasContext *s, arg_r *a)
+{
+    if (vcompress_vm_check_th(s, a)) {
+        uint32_t data = 0;
+        static gen_helper_gvec_4_ptr * const fns[4] = {
+            gen_helper_th_vcompress_vm_b, gen_helper_th_vcompress_vm_h,
+            gen_helper_th_vcompress_vm_w, gen_helper_th_vcompress_vm_d,
+        };
+
+        data = FIELD_DP32(data, VDATA_TH, MLEN, s->mlen);
+        data = FIELD_DP32(data, VDATA_TH, LMUL, s->lmul);
+        tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
+                           vreg_ofs(s, a->rs1), vreg_ofs(s, a->rs2),
+                           tcg_env, s->cfg_ptr->vlenb,
+                           s->cfg_ptr->vlenb, data,
+                           fns[s->sew]);
+        finalize_rvv_inst(s);
+        return true;
+    }
+    return false;
+}
diff --git a/target/riscv/xtheadvector_helper.c b/target/riscv/xtheadvector_helper.c
index 2598824bb3..656f83f408 100644
--- a/target/riscv/xtheadvector_helper.c
+++ b/target/riscv/xtheadvector_helper.c
@@ -3865,3 +3865,30 @@ GEN_TH_VRGATHER_VX(th_vrgather_vx_b, uint8_t, H1, clearb_th)
 GEN_TH_VRGATHER_VX(th_vrgather_vx_h, uint16_t, H2, clearh_th)
 GEN_TH_VRGATHER_VX(th_vrgather_vx_w, uint32_t, H4, clearl_th)
 GEN_TH_VRGATHER_VX(th_vrgather_vx_d, uint64_t, H8, clearq_th)
+
+/* Vector Compress Instruction */
+#define GEN_TH_VCOMPRESS_VM(NAME, ETYPE, H, CLEAR_FN)                 \
+void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,           \
+                  CPURISCVState *env, uint32_t desc)                  \
+{                                                                     \
+    uint32_t mlen = th_mlen(desc);                                    \
+    uint32_t vlmax = (env_archcpu(env)->cfg.vlenb << 3) / mlen;       \
+    uint32_t vl = env->vl;                                            \
+    uint32_t num = 0, i;                                              \
+                                                                      \
+    for (i = env->vstart; i < vl; i++) {                              \
+        if (!th_elem_mask(vs1, mlen, i)) {                            \
+            continue;                                                 \
+        }                                                             \
+        *((ETYPE *)vd + H(num)) = *((ETYPE *)vs2 + H(i));             \
+        num++;                                                        \
+    }                                                                 \
+    env->vstart = 0;                                                  \
+    CLEAR_FN(vd, num, num * sizeof(ETYPE), vlmax * sizeof(ETYPE));    \
+}
+
+/* Compress into vd elements of vs2 where vs1 is enabled */
+GEN_TH_VCOMPRESS_VM(th_vcompress_vm_b, uint8_t, H1, clearb_th)
+GEN_TH_VCOMPRESS_VM(th_vcompress_vm_h, uint16_t, H2, clearh_th)
+GEN_TH_VCOMPRESS_VM(th_vcompress_vm_w, uint32_t, H4, clearl_th)
+GEN_TH_VCOMPRESS_VM(th_vcompress_vm_d, uint64_t, H8, clearq_th)
-- 
2.44.0

From nobody Thu May 16 11:28:54 2024
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linux.alibaba.com
ARC-Seal: i=1; a=rsa-sha256; t=1712916408; cv=none; d=zohomail.com; s=zohoarc; b=crW1orrJmXdNiN4XxiLO790K5gST09m7i2/v3uV3R7hlYbhDuH4jE7s5SXtNnlH1K6MhnVSEBoCD7W2fOtVvsMyE5trsI3TkCtn+/yQAasZCr+T7FAsQh6ZqvQQk2AAUssRZHwPsCniPcXPu99tVXCkCOSQYt3srj6arJ66D+sc=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1712916408; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=S6PxTgtORypWKOerrWqfq+DLXyveNYf73gc+A01znRY=; b=ZbL8WwqSPb52CV8AbNirNDaGHHuct7CMY+lc2z7kAYOKFyOnKXyfhk63Oi0So0EHmHMjQ8qMtitQ2kjTovDrAue0VZ4jA4IhH+HBxAx9YixKb/7PwB8HkY23ZPgNTcJPBmMFmxHvgcEM4vqA2Yq5i3P0YFxnrBWrFFlUWJLm1dM=
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none)
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 17129164085801015.4659395531477; Fri, 12 Apr 2024 03:06:48 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rvDna-000807-5L; Fri, 12 Apr 2024 06:06:22 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDnW-0007wD-SP; Fri, 12 Apr 2024 06:06:18 -0400
Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rvDnP-00022L-6H; Fri, 12 Apr 2024 06:06:18 -0400
Received: from localhost.localdomain(mailfrom:eric.huang@linux.alibaba.com fp:SMTPD_---0W4NtSbs_1712916361) by smtp.aliyun-inc.com; Fri, 12 Apr 2024 18:06:02 +0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1712916362; h=From:To:Subject:Date:Message-ID:MIME-Version; bh=S6PxTgtORypWKOerrWqfq+DLXyveNYf73gc+A01znRY=; b=Z6MQcW5NZoLm+vQ8e2OsUcOhKKSd9tQd4+KHGYuGyu84cpt0ljCIQj1R9C+e4hIy7JmtjpiBlCV/LQePp8A4vwbUTsd7QpyMo4PIlSTruAun/irMw9sf+JCypEU9KQNKZugP4qY7lvtrPnguudh1c2TFmo2zuyuClgiYgk+AU4s=
X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R161e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=ay29a033018045176; MF=eric.huang@linux.alibaba.com; NM=1; PH=DS; RN=9; SR=0; TI=SMTPD_---0W4NtSbs_1712916361;
From: Huang Tao
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, zhiwei_liu@linux.alibaba.com, dbarboza@ventanamicro.com, liwei1518@gmail.com, bin.meng@windriver.com, alistair.francis@wdc.com, palmer@dabbelt.com, Huang Tao
Subject: [PATCH 65/65] target/riscv: Enable XTheadVector extension for c906
Date: Fri, 12 Apr 2024 15:37:35 +0800
Message-ID: <20240412073735.76413-66-eric.huang@linux.alibaba.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
References: <20240412073735.76413-1-eric.huang@linux.alibaba.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=115.124.30.132; envelope-from=eric.huang@linux.alibaba.com; helo=out30-132.freemail.mail.aliyun.com
X-Spam_score_int: -174
X-Spam_score: -17.5
X-Spam_bar: -----------------
X-Spam_report: (-17.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @linux.alibaba.com)
X-ZM-MESSAGEID: 1712916409242100001
Content-Type: text/plain; charset="utf-8"

This patch enables XTheadVector for the c906.

Signed-off-by: Huang Tao
---
 target/riscv/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 05652e8c87..e85aa51237 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -542,7 +542,7 @@ static void rv64_thead_c906_cpu_init(Object *obj)
     cpu->cfg.ext_xtheadmemidx = true;
     cpu->cfg.ext_xtheadmempair = true;
     cpu->cfg.ext_xtheadsync = true;
-    cpu->cfg.ext_xtheadvector = false;
+    cpu->cfg.ext_xtheadvector = true;
 
     cpu->cfg.mvendorid = THEAD_VENDOR_ID;
 #ifndef CONFIG_USER_ONLY
-- 
2.44.0