From nobody Tue Feb 10 11:14:27 2026
From: liuzhiwei <zhiwei_liu@c-sky.com>
To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com
Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei <zhiwei_liu@c-sky.com>
Date: Wed, 11 Sep 2019 14:25:37 +0800
Message-Id: <1568183141-67641-14-git-send-email-zhiwei_liu@c-sky.com>
In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com>
References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com>
Subject: [Qemu-devel] [PATCH v2 13/17] RISC-V: add vector extension float instruction part1, add/sub/mul/div

From: LIU Zhiwei <zhiwei_liu@c-sky.com>
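
Add the first part of the vector extension floating-point instructions:
single-width and widening add/subtract/multiply/divide, plus the fused
multiply-add family (vfmacc/vfnmacc/vfmsac/vfnmsac and
vfmadd/vfnmadd/vfmsub/vfnmsub), each in vector-vector and vector-scalar
form.

All helpers follow the same per-element pattern: elements below vstart
are skipped, masked-off elements are left unchanged, and tail elements
beyond vl are zeroed. A rough standalone model of that pattern for
vfadd.vv at SEW=32 (names here are illustrative, not the QEMU API; the
helper itself uses softfloat's float32_add()):

    #include <stdint.h>

    static void model_vfadd_vv_f32(float *vd, const float *vs1,
                                   const float *vs2, const uint8_t *mask,
                                   int vm, int vstart, int vl, int vlmax)
    {
        for (int i = vstart; i < vl; i++) {
            if (vm || mask[i]) {        /* unmasked, or element enabled */
                vd[i] = vs1[i] + vs2[i];
            }
        }
        for (int i = vl; i < vlmax; i++) {
            vd[i] = 0;                  /* tail zeroing, cf. vector_tail_fcommon() */
        }
    }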

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
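A note on the index arithmetic the helpers share: with LMUL > 1 a vector
value occupies a group of registers, so element i lives in architectural
register rd + i / (VLEN / SEW) at slot i % (VLEN / SEW). This is what the
dest/src computations in every loop below evaluate. A small standalone
sketch of the mapping (the VLEN value is only an example; QEMU's is
configuration dependent):

    #include <stdio.h>

    #define VLEN 128                    /* example vector register width, bits */

    int main(void)
    {
        int width = 32;                 /* SEW in bits */
        int elems_per_reg = VLEN / width;
        for (int i = 0; i < 8; i++) {
            /* same arithmetic as dest = rd + (i / (VLEN / width)) and
             * j = i % (VLEN / width) in the helpers */
            printf("element %d -> vreg +%d, slot %d\n",
                   i, i / elems_per_reg, i % elems_per_reg);
        }
        return 0;
    }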

 target/riscv/helper.h                   |   37 +
 target/riscv/insn32.decode              |   37 +
 target/riscv/insn_trans/trans_rvv.inc.c |   37 +
 target/riscv/vector_helper.c            | 2645 ++++++++++++++++++++++++++++++++
 4 files changed, 2756 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index ff6002e..d2c8684 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -307,5 +307,42 @@ DEF_HELPER_5(vector_vnclip_vv, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(vector_vnclip_vx, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(vector_vnclip_vi, void, env, i32, i32, i32, i32)
 
+DEF_HELPER_5(vector_vfadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfrsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_wf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_wf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmul_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfdiv_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfdiv_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfrdiv_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwmul_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmacc_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmacc_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsac_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsac_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsac_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsac_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsub_vf, void, env, i32, i32, i32, i32)
+
 DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
 DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index a82e53e..31868ab 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -447,5 +447,42 @@ vnclip_vv    101111 . ..... ..... 000 ..... 1010111 @r_vm
 vnclip_vx    101111 . ..... ..... 100 ..... 1010111 @r_vm
 vnclip_vi    101111 . ..... ..... 011 ..... 1010111 @r_vm
 
+vfadd_vv     000000 . ..... ..... 001 ..... 1010111 @r_vm
+vfadd_vf     000000 . ..... ..... 101 ..... 1010111 @r_vm
+vfsub_vv     000010 . ..... ..... 001 ..... 1010111 @r_vm
+vfsub_vf     000010 . ..... ..... 101 ..... 1010111 @r_vm
+vfrsub_vf    100111 . ..... ..... 101 ..... 1010111 @r_vm
+vfwadd_vv    110000 . ..... ..... 001 ..... 1010111 @r_vm
+vfwadd_vf    110000 . ..... ..... 101 ..... 1010111 @r_vm
+vfwadd_wv    110100 . ..... ..... 001 ..... 1010111 @r_vm
+vfwadd_wf    110100 . ..... ..... 101 ..... 1010111 @r_vm
+vfwsub_vv    110010 . ..... ..... 001 ..... 1010111 @r_vm
+vfwsub_vf    110010 . ..... ..... 101 ..... 1010111 @r_vm
+vfwsub_wv    110110 . ..... ..... 001 ..... 1010111 @r_vm
+vfwsub_wf    110110 . ..... ..... 101 ..... 1010111 @r_vm
+vfmul_vv     100100 . ..... ..... 001 ..... 1010111 @r_vm
+vfmul_vf     100100 . ..... ..... 101 ..... 1010111 @r_vm
+vfdiv_vv     100000 . ..... ..... 001 ..... 1010111 @r_vm
+vfdiv_vf     100000 . ..... ..... 101 ..... 1010111 @r_vm
+vfrdiv_vf    100001 . ..... ..... 101 ..... 1010111 @r_vm
+vfwmul_vv    111000 . ..... ..... 001 ..... 1010111 @r_vm
+vfwmul_vf    111000 . ..... ..... 101 ..... 1010111 @r_vm
+vfmacc_vf    101100 . ..... ..... 101 ..... 1010111 @r_vm
+vfmacc_vv    101100 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmacc_vv   101101 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmacc_vf   101101 . ..... ..... 101 ..... 1010111 @r_vm
+vfmsac_vv    101110 . ..... ..... 001 ..... 1010111 @r_vm
+vfmsac_vf    101110 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmsac_vv   101111 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmsac_vf   101111 . ..... ..... 101 ..... 1010111 @r_vm
+vfmadd_vv    101000 . ..... ..... 001 ..... 1010111 @r_vm
+vfmadd_vf    101000 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmadd_vv   101001 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmadd_vf   101001 . ..... ..... 101 ..... 1010111 @r_vm
+vfmsub_vv    101010 . ..... ..... 001 ..... 1010111 @r_vm
+vfmsub_vf    101010 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmsub_vv   101011 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmsub_vf   101011 . ..... ..... 101 ..... 1010111 @r_vm
+
 vsetvli      0 ........... ..... 111 ..... 1010111 @r2_zimm
 vsetvl       1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index d650e8c..ff23bc2 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -336,5 +336,42 @@ GEN_VECTOR_R_VM(vnclip_vv)
 GEN_VECTOR_R_VM(vnclip_vx)
 GEN_VECTOR_R_VM(vnclip_vi)
 
+GEN_VECTOR_R_VM(vfadd_vv)
+GEN_VECTOR_R_VM(vfadd_vf)
+GEN_VECTOR_R_VM(vfsub_vv)
+GEN_VECTOR_R_VM(vfsub_vf)
+GEN_VECTOR_R_VM(vfrsub_vf)
+GEN_VECTOR_R_VM(vfwadd_vv)
+GEN_VECTOR_R_VM(vfwadd_vf)
+GEN_VECTOR_R_VM(vfwadd_wv)
+GEN_VECTOR_R_VM(vfwadd_wf)
+GEN_VECTOR_R_VM(vfwsub_wv)
+GEN_VECTOR_R_VM(vfwsub_wf)
+GEN_VECTOR_R_VM(vfwsub_vv)
+GEN_VECTOR_R_VM(vfwsub_vf)
+GEN_VECTOR_R_VM(vfmul_vv)
+GEN_VECTOR_R_VM(vfmul_vf)
+GEN_VECTOR_R_VM(vfdiv_vv)
+GEN_VECTOR_R_VM(vfdiv_vf)
+GEN_VECTOR_R_VM(vfrdiv_vf)
+GEN_VECTOR_R_VM(vfwmul_vv)
+GEN_VECTOR_R_VM(vfwmul_vf)
+GEN_VECTOR_R_VM(vfmacc_vv)
+GEN_VECTOR_R_VM(vfmacc_vf)
+GEN_VECTOR_R_VM(vfnmacc_vv)
+GEN_VECTOR_R_VM(vfnmacc_vf)
+GEN_VECTOR_R_VM(vfmsac_vv)
+GEN_VECTOR_R_VM(vfmsac_vf)
+GEN_VECTOR_R_VM(vfnmsac_vv)
+GEN_VECTOR_R_VM(vfnmsac_vf)
+GEN_VECTOR_R_VM(vfmadd_vv)
+GEN_VECTOR_R_VM(vfmadd_vf)
+GEN_VECTOR_R_VM(vfnmadd_vv)
+GEN_VECTOR_R_VM(vfnmadd_vf)
+GEN_VECTOR_R_VM(vfmsub_vv)
+GEN_VECTOR_R_VM(vfmsub_vf)
+GEN_VECTOR_R_VM(vfnmsub_vv)
+GEN_VECTOR_R_VM(vfnmsub_vf)
+
 GEN_VECTOR_R2_ZIMM(vsetvli)
 GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 2292fa5..e16543b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -21,6 +21,7 @@
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
 #include "exec/cpu_ldst.h"
+#include "fpu/softfloat.h"
 #include 
 
 #define VECTOR_HELPER(name) HELPER(glue(vector_, name))
@@ -1125,6 +1126,41 @@ static void vector_tail_narrow(CPURISCVState *env, int vreg, int index,
     }
 }
 
+static void vector_tail_fcommon(CPURISCVState *env, int vreg, int index,
+                                int width)
+{
+    switch (width) {
+    case 16:
+        env->vfp.vreg[vreg].u16[index] = 0;
+        break;
+    case 32:
+        env->vfp.vreg[vreg].u32[index] = 0;
+        break;
+    case 64:
+        env->vfp.vreg[vreg].u64[index] = 0;
+        break;
+    default:
+        helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+        return;
+    }
+}
+
+static void vector_tail_fwiden(CPURISCVState *env, int vreg, int index,
+                               int width)
+{
+    switch (width) {
+    case 16:
+        env->vfp.vreg[vreg].u32[index] = 0;
+        break;
+    case 32:
+        env->vfp.vreg[vreg].u64[index] = 0;
+        break;
+    default:
+        helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+        return;
+    }
+}
+
 static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
                                    int index)
 {
@@ -17114,3 +17150,2612 @@ void VECTOR_HELPER(vnclip_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
     env->vfp.vstart = 0;
     return;
 }
+
+/* vfadd.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_add(
+                        env->vfp.vreg[src1].f16[j], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_add(
+                        env->vfp.vreg[src1].f32[j], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_add(
+                        env->vfp.vreg[src1].f64[j], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfadd.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_add(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_add(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_add(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfsub.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_sub(
+                        env->vfp.vreg[src2].f16[j], env->vfp.vreg[src1].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_sub(
+                        env->vfp.vreg[src2].f32[j], env->vfp.vreg[src1].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_sub(
+                        env->vfp.vreg[src2].f64[j], env->vfp.vreg[src1].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfsub.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] - f[rs1] */
+void VECTOR_HELPER(vfsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_sub(
+                        env->vfp.vreg[src2].f16[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_sub(
+                        env->vfp.vreg[src2].f32[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_sub(
+                        env->vfp.vreg[src2].f64[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfrsub.vf vd, vs2, rs1, vm # Scalar-vector vd[i] = f[rs1] - vs2[i] */
+void VECTOR_HELPER(vfrsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_sub(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_sub(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_sub(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwadd.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_add(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_add(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->vfp.vreg[src1].f32[j],
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwadd.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_add(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->fpr[rs1], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_add(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->fpr[rs1], &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwadd.wv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwadd_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, true);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / (2 * width)));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_add(
+                        env->vfp.vreg[src2].f32[k],
+                        float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_add(
+                        env->vfp.vreg[src2].f64[k],
+                        float32_to_float64(env->vfp.vreg[src1].f32[j],
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwadd.wf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwadd_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, k, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, true);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / (2 * width)));
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_add(
+                        env->vfp.vreg[src2].f32[k],
+                        float16_to_float32(env->fpr[rs1], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_add(
+                        env->vfp.vreg[src2].f64[k],
+                        float32_to_float64(env->fpr[rs1], &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwsub.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_sub(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_sub(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->vfp.vreg[src1].f32[j],
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwsub.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_sub(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->fpr[rs1], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_sub(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->fpr[rs1], &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwsub.wv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwsub_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, true);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / (2 * width)));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_sub(
+                        env->vfp.vreg[src2].f32[k],
+                        float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_sub(
+                        env->vfp.vreg[src2].f64[k],
+                        float32_to_float64(env->vfp.vreg[src1].f32[j],
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwsub.wf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwsub_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, k, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, true);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / (2 * width)));
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_sub(
+                        env->vfp.vreg[src2].f32[k],
+                        float16_to_float32(env->fpr[rs1], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_sub(
+                        env->vfp.vreg[src2].f64[k],
+                        float32_to_float64(env->fpr[rs1], &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmul.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_mul(
+                        env->vfp.vreg[src1].f16[j], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_mul(
+                        env->vfp.vreg[src1].f32[j], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_mul(
+                        env->vfp.vreg[src1].f64[j], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmul.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_mul(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_mul(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_mul(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfdiv.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfdiv_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_div(
+                        env->vfp.vreg[src2].f16[j], env->vfp.vreg[src1].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_div(
+                        env->vfp.vreg[src2].f32[j], env->vfp.vreg[src1].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_div(
+                        env->vfp.vreg[src2].f64[j], env->vfp.vreg[src1].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfdiv.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                             uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_div(
+                        env->vfp.vreg[src2].f16[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_div(
+                        env->vfp.vreg[src2].f32[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_div(
+                        env->vfp.vreg[src2].f64[j], env->fpr[rs1],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfrdiv.vf vd, vs2, rs1, vm # scalar-vector, vd[i] = f[rs1]/vs2[i] */
+void VECTOR_HELPER(vfrdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_div(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_div(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_div(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwmul.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_mul(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_mul(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->vfp.vreg[src1].f32[j],
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfwmul.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, k, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env)
+        || vector_overlap_vm_force(vm, rd)
+        || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, true);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / (2 * width)));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        k = i % (VLEN / (2 * width));
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[k] = float32_mul(
+                        float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+                                           &env->fp_status),
+                        float16_to_float32(env->fpr[rs1], true,
+                                           &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[k] = float64_mul(
+                        float32_to_float64(env->vfp.vreg[src2].f32[j],
+                                           &env->fp_status),
+                        float32_to_float64(env->fpr[rs1], &env->fp_status),
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fwiden(env, dest, k, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j], 0, &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j], 0, &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j], 0, &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j], 0, &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j], 0, &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j], 0, &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfnmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                               uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfnmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                               uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        float_muladd_negate_c | float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                              uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1], env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j], float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+                               uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+/* vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfnmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[src2].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[src2].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[src2].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
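+/*
+ * Operand-order note: the *macc/*sac forms above multiply the two source
+ * operands and accumulate into vd, i.e. muladd(vs1, vs2, vd, flags); the
+ * *madd/*sub forms below instead multiply vs1 (or the scalar f[rs1]) by
+ * vd and add/subtract vs2, i.e. muladd(vs1, vd, vs2, flags).  Only the
+ * argument order passed to float*_muladd() differs.
+ */
+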
+/* vfmadd.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfmadd.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        0,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
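+/*
+ * Predication note: every element loop here splits the index space three
+ * ways.  Elements below env->vfp.vstart are skipped unchanged; active
+ * elements (vstart <= i < vl) are updated only if vector_elem_mask()
+ * reports the mask bit set, and are otherwise left untouched; elements at
+ * or past vl go through vector_tail_fcommon(), which applies the tail
+ * policy to the destination.
+ */
+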
+/* vfnmadd.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfnmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmadd.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfnmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_c |
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
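+/*
+ * Scalar-operand note: the .vf variants broadcast f[rs1] to every active
+ * element.  env->fpr[rs1] is a 64-bit register, so passing it directly to
+ * float16_muladd()/float32_muladd() implicitly truncates it to the low
+ * 16/32 bits, i.e. the scalar is reinterpreted at the current element
+ * width rather than converted.
+ */
+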
+/* vfmsub.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
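+/*
+ * Rounding note: because float*_muladd() rounds only once, these helpers
+ * give the fused (single-rounding) behaviour the instruction semantics
+ * above call for.  For example, vfmsub.vv computes vs1[i] * vd[i] - vs2[i]
+ * without rounding the intermediate product, which can differ in the last
+ * bit from a separate vfmul.vv followed by vfsub.vv.
+ */
+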
+/* vfmsub.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_c,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
+/* vfnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfnmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src1, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs1, false);
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        src1 = rs1 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->vfp.vreg[src1].f16[j],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->vfp.vreg[src1].f32[j],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->vfp.vreg[src1].f64[j],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
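+/*
+ * vstart note: each helper clears env->vfp.vstart on normal completion,
+ * after the element loop.  A helper re-entered after a trap therefore
+ * resumes from the recorded element, while subsequent vector instructions
+ * start from element 0.  (In vfmsac_vf and vfnmsub_vv the reset was
+ * unreachable after the final return and has been moved before it.)
+ */
+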
+/* vfnmsub.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfnmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+    uint32_t rs2, uint32_t rd)
+{
+    int width, lmul, vl, vlmax;
+    int i, j, dest, src2;
+
+    lmul = vector_get_lmul(env);
+    vl = env->vfp.vl;
+
+    if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+        riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+        return;
+    }
+
+    vector_lmul_check_reg(env, lmul, rs2, false);
+    vector_lmul_check_reg(env, lmul, rd, false);
+
+    if (env->vfp.vstart >= vl) {
+        return;
+    }
+
+    width = vector_get_width(env);
+    vlmax = vector_get_vlmax(env);
+
+    for (i = 0; i < vlmax; i++) {
+        dest = rd + (i / (VLEN / width));
+        src2 = rs2 + (i / (VLEN / width));
+        j = i % (VLEN / width);
+        if (i < env->vfp.vstart) {
+            continue;
+        } else if (i < vl) {
+            switch (width) {
+            case 16:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f16[j] = float16_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f16[j],
+                        env->vfp.vreg[src2].f16[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 32:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f32[j] = float32_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f32[j],
+                        env->vfp.vreg[src2].f32[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            case 64:
+                if (vector_elem_mask(env, vm, width, lmul, i)) {
+                    env->vfp.vreg[dest].f64[j] = float64_muladd(
+                        env->fpr[rs1],
+                        env->vfp.vreg[dest].f64[j],
+                        env->vfp.vreg[src2].f64[j],
+                        float_muladd_negate_product,
+                        &env->fp_status);
+                }
+                break;
+            default:
+                riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+                return;
+            }
+        } else {
+            vector_tail_fcommon(env, dest, j, width);
+        }
+    }
+    env->vfp.vstart = 0;
+    return;
+}
+
-- 
2.7.4