From nobody Thu Nov 28 11:01:56 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=gmail.com ARC-Seal: i=1; a=rsa-sha256; t=1694153487; cv=none; d=zohomail.com; s=zohoarc; b=nIJwjVyR2CS2mszGoEEA6zMzmVl8NLpdN95zKvG9W5IpxgaS08l7iamOw+1OXxWMDem6WYunE1pzjKhs3t00RqV0AvmKBc4SRht3DzdNItNFi81PrAnHj7M9KYungSBKV12VjC+Lc5BOIyR5DPWJXTgKSUkmpcFaYOiBxw7zGpY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1694153487; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=sSeyTQyuBwnxsxZ+/qUq8xyLuzir65M8yR4fVevxw4Y=; b=nRoysf7K5IdZ/X147UlQ7sno/bjaZI4cCL6oM9dLj28LXWOKMWS4RSkrZ1Vj5c5m1PLKdbdJ1pbYS2qgXUYAhlMHeE1D9h8NSuZlDCFtc0PakkBBOMQLEOp274/cUPOfEAhZfa1mhe3Sz0vVcfIiNd9d3X8JDxE/k/b8mnOUjq4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1694153487689951.2319590876401; Thu, 7 Sep 2023 23:11:27 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeUcX-00007G-Px; Fri, 08 Sep 2023 02:05:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeUcW-00006g-Cu for qemu-devel@nongnu.org; Fri, 08 Sep 2023 02:05:32 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeUcR-0005Ir-Fb for qemu-devel@nongnu.org; Fri, 08 Sep 2023 02:05:32 -0400 Received: by mail-pj1-x102c.google.com with SMTP id 98e67ed59e1d1-273ca7ab3f5so1127575a91.2 for ; Thu, 07 Sep 2023 23:05:27 -0700 (PDT) Received: from toolbox.alistair23.me (2403-580b-97e8-0-321-6fb2-58f1-a1b1.ip6.aussiebb.net. [2403:580b:97e8:0:321:6fb2:58f1:a1b1]) by smtp.gmail.com with ESMTPSA id q1-20020a170902dac100b001c3267ae31bsm715231plx.301.2023.09.07.23.05.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 23:05:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1694153125; x=1694757925; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=sSeyTQyuBwnxsxZ+/qUq8xyLuzir65M8yR4fVevxw4Y=; b=gJOp3ZEHfnlPxdImzWOcpt1Eh9ZesDKNnWB+P+rWB8Y2RU16XYkVVPHAzAhcdFP6rA QgJE4L0VNx9JFbiLyrHFocPCe8S5jgfH0oXGtIlaT0RyAUhPmVGp/lSPEwJy/XeIa8Fw 1ejRAVsJBhONLzrkrLSHox2EW4lGxvhWko68BcZ1fgVxlLILQVHtAWNJfjDm+Hsq7WB2 9JCUWrdnibu69r9qYpH7U/rGqqR8Xk94OEmWPPprUJBQsQd1GvQYFy0434+AqeOgwyI5 8ZKJcaSR2/JIFjefMjl3PSqv/74qVAua41OxeYkXAFw6nhWpXcQUwLjyFdIOA7oipfMa BUgQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694153125; x=1694757925; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sSeyTQyuBwnxsxZ+/qUq8xyLuzir65M8yR4fVevxw4Y=; b=Knnm4VyexxzB8lPTbGy7Ns5s5FjLBGTlZuK4rjnQOTI0dHwqJ3War5IY3GyF/W52Lp NJjrpHfV7HKRT+MmC+dHAcxMb5gtN+mjn5P64S7eIES1hMb7jZ5Oy/ffGpRsMbdfvfF9 o1GGGzIv5+03JAdu3uXhewiRi2xPOzeymTBFJnJP9F1OrMEMTptbn6oB2BqiN3/196AI A4HknEQDJvy6WbZEtNH37RRP1kHAzxysX5QLkKtRzScJOlY60aH8RDMel0w13U/irMMz sPhwUPWjpfnFoCE10v4Ug8fwjFMM52zFtFjGUVLm68DiREB98pV+AAobZ+5wzWgxI7qF bDZQ== X-Gm-Message-State: AOJu0Yy4bPFK3tKNB4SuzCi7SP+HiYcBpp13zbzNlD+o2E2yLplPKPyh t7cUHyGSi7a/97KWVjTwL/llu4y6XXD/7w2S X-Google-Smtp-Source: AGHT+IHGSrN+g6Vg5N71z2hEcu+yHasS9BVWyXULEX6UVgKZuM0M4xe8e8W1Rq4gpYk1LrQ4UTJcFA== X-Received: by 2002:a17:90a:8815:b0:26d:287f:a545 with SMTP id s21-20020a17090a881500b0026d287fa545mr1617339pjn.29.1694153125360; Thu, 07 Sep 2023 23:05:25 -0700 (PDT) From: Alistair Francis X-Google-Original-From: Alistair Francis To: qemu-devel@nongnu.org Cc: alistair23@gmail.com, Kiran Ostrolenk , Weiwei Li , Max Chou , Alistair Francis Subject: [PULL 08/65] target/riscv: Refactor some of the generic vector functionality Date: Fri, 8 Sep 2023 16:03:34 +1000 Message-ID: <20230908060431.1903919-9-alistair.francis@wdc.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230908060431.1903919-1-alistair.francis@wdc.com> References: <20230908060431.1903919-1-alistair.francis@wdc.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=alistair23@gmail.com; helo=mail-pj1-x102c.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @gmail.com) X-ZM-MESSAGEID: 1694153488047100019 Content-Type: text/plain; charset="utf-8" From: Kiran Ostrolenk Take some functions/macros out of `vector_helper` and put them in a new module called `vector_internals`. This ensures they can be used by both vector and vector-crypto helpers (latter implemented in proceeding commits). Signed-off-by: Kiran Ostrolenk Reviewed-by: Weiwei Li Signed-off-by: Max Chou Acked-by: Alistair Francis Message-ID: <20230711165917.2629866-2-max.chou@sifive.com> Signed-off-by: Alistair Francis --- target/riscv/vector_internals.h | 182 +++++++++++++++++++++++++++++ target/riscv/vector_helper.c | 201 +------------------------------- target/riscv/vector_internals.c | 81 +++++++++++++ target/riscv/meson.build | 1 + 4 files changed, 265 insertions(+), 200 deletions(-) create mode 100644 target/riscv/vector_internals.h create mode 100644 target/riscv/vector_internals.c diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internal= s.h new file mode 100644 index 0000000000..749d138beb --- /dev/null +++ b/target/riscv/vector_internals.h @@ -0,0 +1,182 @@ +/* + * RISC-V Vector Extension Internals + * + * Copyright (c) 2020 T-Head Semiconductor Co., Ltd. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License f= or + * more details. + * + * You should have received a copy of the GNU General Public License along= with + * this program. If not, see . + */ + +#ifndef TARGET_RISCV_VECTOR_INTERNALS_H +#define TARGET_RISCV_VECTOR_INTERNALS_H + +#include "qemu/osdep.h" +#include "qemu/bitops.h" +#include "cpu.h" +#include "tcg/tcg-gvec-desc.h" +#include "internals.h" + +static inline uint32_t vext_nf(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, NF); +} + +/* + * Note that vector data is stored in host-endian 64-bit chunks, + * so addressing units smaller than that needs a host-endian fixup. + */ +#if HOST_BIG_ENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#define H8(x) ((x)) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#define H8(x) (x) +#endif + +/* + * Encode LMUL to lmul as following: + * LMUL vlmul lmul + * 1 000 0 + * 2 001 1 + * 4 010 2 + * 8 011 3 + * - 100 - + * 1/8 101 -3 + * 1/4 110 -2 + * 1/2 111 -1 + */ +static inline int32_t vext_lmul(uint32_t desc) +{ + return sextract32(FIELD_EX32(simd_data(desc), VDATA, LMUL), 0, 3); +} + +static inline uint32_t vext_vm(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VM); +} + +static inline uint32_t vext_vma(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VMA); +} + +static inline uint32_t vext_vta(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VTA); +} + +static inline uint32_t vext_vta_all_1s(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VTA_ALL_1S); +} + +/* + * Earlier designs (pre-0.9) had a varying number of bits + * per mask value (MLEN). In the 0.9 design, MLEN=3D1. + * (Section 4.5) + */ +static inline int vext_elem_mask(void *v0, int index) +{ + int idx =3D index / 64; + int pos =3D index % 64; + return (((uint64_t *)v0)[idx] >> pos) & 1; +} + +/* + * Get number of total elements, including prestart, body and tail element= s. + * Note that when LMUL < 1, the tail includes the elements past VLMAX that + * are held in the same vector register. + */ +static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t d= esc, + uint32_t esz) +{ + uint32_t vlenb =3D simd_maxsz(desc); + uint32_t sew =3D 1 << FIELD_EX64(env->vtype, VTYPE, VSEW); + int8_t emul =3D ctzl(esz) - ctzl(sew) + vext_lmul(desc) < 0 ? 0 : + ctzl(esz) - ctzl(sew) + vext_lmul(desc); + return (vlenb << emul) / esz; +} + +/* set agnostic elements to 1s */ +void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, + uint32_t tot); + +/* expand macro args before macro */ +#define RVVCALL(macro, ...) macro(__VA_ARGS__) + +/* (TD, T1, T2, TX1, TX2) */ +#define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t +#define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t +#define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t +#define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t + +/* operation of two vector elements */ +typedef void opivv2_fn(void *vd, void *vs1, void *vs2, int i); + +#define OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ +static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \ +{ \ + TX1 s1 =3D *((T1 *)vs1 + HS1(i)); \ + TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ + *((TD *)vd + HD(i)) =3D OP(s2, s1); \ +} + +void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivv2_fn *fn, uint32_t esz); + +/* generate the helpers for OPIVV */ +#define GEN_VEXT_VV(NAME, ESZ) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_vext_vv(vd, v0, vs1, vs2, env, desc, \ + do_##NAME, ESZ); \ +} + +typedef void opivx2_fn(void *vd, target_long s1, void *vs2, int i); + +/* + * (T1)s1 gives the real operator type. + * (TX1)(T1)s1 expands the operator type of widen or narrow operations. + */ +#define OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ +static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \ +{ \ + TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ + *((TD *)vd + HD(i)) =3D OP(s2, (TX1)(T1)s1); \ +} + +void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivx2_fn fn, uint32_t esz); + +/* generate the helpers for OPIVX */ +#define GEN_VEXT_VX(NAME, ESZ) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_vext_vx(vd, v0, s1, vs2, env, desc, \ + do_##NAME, ESZ); \ +} + +#endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 379f03df06..1f29236a63 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -27,6 +27,7 @@ #include "fpu/softfloat.h" #include "tcg/tcg-gvec-desc.h" #include "internals.h" +#include "vector_internals.h" #include =20 target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, @@ -73,68 +74,6 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_u= long s1, return vl; } =20 -/* - * Note that vector data is stored in host-endian 64-bit chunks, - * so addressing units smaller than that needs a host-endian fixup. - */ -#if HOST_BIG_ENDIAN -#define H1(x) ((x) ^ 7) -#define H1_2(x) ((x) ^ 6) -#define H1_4(x) ((x) ^ 4) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) -#define H8(x) ((x)) -#else -#define H1(x) (x) -#define H1_2(x) (x) -#define H1_4(x) (x) -#define H2(x) (x) -#define H4(x) (x) -#define H8(x) (x) -#endif - -static inline uint32_t vext_nf(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, NF); -} - -static inline uint32_t vext_vm(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VM); -} - -/* - * Encode LMUL to lmul as following: - * LMUL vlmul lmul - * 1 000 0 - * 2 001 1 - * 4 010 2 - * 8 011 3 - * - 100 - - * 1/8 101 -3 - * 1/4 110 -2 - * 1/2 111 -1 - */ -static inline int32_t vext_lmul(uint32_t desc) -{ - return sextract32(FIELD_EX32(simd_data(desc), VDATA, LMUL), 0, 3); -} - -static inline uint32_t vext_vta(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VTA); -} - -static inline uint32_t vext_vma(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VMA); -} - -static inline uint32_t vext_vta_all_1s(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VTA_ALL_1S); -} - /* * Get the maximum number of elements can be operated. * @@ -153,21 +92,6 @@ static inline uint32_t vext_max_elems(uint32_t desc, ui= nt32_t log2_esz) return scale < 0 ? vlenb >> -scale : vlenb << scale; } =20 -/* - * Get number of total elements, including prestart, body and tail element= s. - * Note that when LMUL < 1, the tail includes the elements past VLMAX that - * are held in the same vector register. - */ -static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t d= esc, - uint32_t esz) -{ - uint32_t vlenb =3D simd_maxsz(desc); - uint32_t sew =3D 1 << FIELD_EX64(env->vtype, VTYPE, VSEW); - int8_t emul =3D ctzl(esz) - ctzl(sew) + vext_lmul(desc) < 0 ? 0 : - ctzl(esz) - ctzl(sew) + vext_lmul(desc); - return (vlenb << emul) / esz; -} - static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong ad= dr) { return (addr & ~env->cur_pmmask) | env->cur_pmbase; @@ -200,20 +124,6 @@ static void probe_pages(CPURISCVState *env, target_ulo= ng addr, } } =20 -/* set agnostic elements to 1s */ -static void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t c= nt, - uint32_t tot) -{ - if (is_agnostic =3D=3D 0) { - /* policy undisturbed */ - return; - } - if (tot - cnt =3D=3D 0) { - return; - } - memset(base + cnt, -1, tot - cnt); -} - static inline void vext_set_elem_mask(void *v0, int index, uint8_t value) { @@ -223,18 +133,6 @@ static inline void vext_set_elem_mask(void *v0, int in= dex, ((uint64_t *)v0)[idx] =3D deposit64(old, pos, 1, value); } =20 -/* - * Earlier designs (pre-0.9) had a varying number of bits - * per mask value (MLEN). In the 0.9 design, MLEN=3D1. - * (Section 4.5) - */ -static inline int vext_elem_mask(void *v0, int index) -{ - int idx =3D index / 64; - int pos =3D index % 64; - return (((uint64_t *)v0)[idx] >> pos) & 1; -} - /* elements operations for load and store */ typedef void vext_ldst_elem_fn(CPURISCVState *env, abi_ptr addr, uint32_t idx, void *vd, uintptr_t retaddr); @@ -729,18 +627,11 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) * Vector Integer Arithmetic Instructions */ =20 -/* expand macro args before macro */ -#define RVVCALL(macro, ...) macro(__VA_ARGS__) - /* (TD, T1, T2, TX1, TX2) */ #define OP_SSS_B int8_t, int8_t, int8_t, int8_t, int8_t #define OP_SSS_H int16_t, int16_t, int16_t, int16_t, int16_t #define OP_SSS_W int32_t, int32_t, int32_t, int32_t, int32_t #define OP_SSS_D int64_t, int64_t, int64_t, int64_t, int64_t -#define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t -#define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t -#define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t -#define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t #define OP_SUS_B int8_t, uint8_t, int8_t, uint8_t, int8_t #define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t #define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t @@ -764,16 +655,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) #define NOP_UUU_H uint16_t, uint16_t, uint32_t, uint16_t, uint32_t #define NOP_UUU_W uint32_t, uint32_t, uint64_t, uint32_t, uint64_t =20 -/* operation of two vector elements */ -typedef void opivv2_fn(void *vd, void *vs1, void *vs2, int i); - -#define OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ -static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \ -{ \ - TX1 s1 =3D *((T1 *)vs1 + HS1(i)); \ - TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ - *((TD *)vd + HD(i)) =3D OP(s2, s1); \ -} #define DO_SUB(N, M) (N - M) #define DO_RSUB(N, M) (M - N) =20 @@ -786,40 +667,6 @@ RVVCALL(OPIVV2, vsub_vv_h, OP_SSS_H, H2, H2, H2, DO_SU= B) RVVCALL(OPIVV2, vsub_vv_w, OP_SSS_W, H4, H4, H4, DO_SUB) RVVCALL(OPIVV2, vsub_vv_d, OP_SSS_D, H8, H8, H8, DO_SUB) =20 -static void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, - CPURISCVState *env, uint32_t desc, - opivv2_fn *fn, uint32_t esz) -{ - uint32_t vm =3D vext_vm(desc); - uint32_t vl =3D env->vl; - uint32_t total_elems =3D vext_get_total_elems(env, desc, esz); - uint32_t vta =3D vext_vta(desc); - uint32_t vma =3D vext_vma(desc); - uint32_t i; - - for (i =3D env->vstart; i < vl; i++) { - if (!vm && !vext_elem_mask(v0, i)) { - /* set masked-off elements to 1s */ - vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); - continue; - } - fn(vd, vs1, vs2, i); - } - env->vstart =3D 0; - /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); -} - -/* generate the helpers for OPIVV */ -#define GEN_VEXT_VV(NAME, ESZ) \ -void HELPER(NAME)(void *vd, void *v0, void *vs1, \ - void *vs2, CPURISCVState *env, \ - uint32_t desc) \ -{ \ - do_vext_vv(vd, v0, vs1, vs2, env, desc, \ - do_##NAME, ESZ); \ -} - GEN_VEXT_VV(vadd_vv_b, 1) GEN_VEXT_VV(vadd_vv_h, 2) GEN_VEXT_VV(vadd_vv_w, 4) @@ -829,18 +676,6 @@ GEN_VEXT_VV(vsub_vv_h, 2) GEN_VEXT_VV(vsub_vv_w, 4) GEN_VEXT_VV(vsub_vv_d, 8) =20 -typedef void opivx2_fn(void *vd, target_long s1, void *vs2, int i); - -/* - * (T1)s1 gives the real operator type. - * (TX1)(T1)s1 expands the operator type of widen or narrow operations. - */ -#define OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ -static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \ -{ \ - TX2 s2 =3D *((T2 *)vs2 + HS2(i)); \ - *((TD *)vd + HD(i)) =3D OP(s2, (TX1)(T1)s1); \ -} =20 RVVCALL(OPIVX2, vadd_vx_b, OP_SSS_B, H1, H1, DO_ADD) RVVCALL(OPIVX2, vadd_vx_h, OP_SSS_H, H2, H2, DO_ADD) @@ -855,40 +690,6 @@ RVVCALL(OPIVX2, vrsub_vx_h, OP_SSS_H, H2, H2, DO_RSUB) RVVCALL(OPIVX2, vrsub_vx_w, OP_SSS_W, H4, H4, DO_RSUB) RVVCALL(OPIVX2, vrsub_vx_d, OP_SSS_D, H8, H8, DO_RSUB) =20 -static void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, - CPURISCVState *env, uint32_t desc, - opivx2_fn fn, uint32_t esz) -{ - uint32_t vm =3D vext_vm(desc); - uint32_t vl =3D env->vl; - uint32_t total_elems =3D vext_get_total_elems(env, desc, esz); - uint32_t vta =3D vext_vta(desc); - uint32_t vma =3D vext_vma(desc); - uint32_t i; - - for (i =3D env->vstart; i < vl; i++) { - if (!vm && !vext_elem_mask(v0, i)) { - /* set masked-off elements to 1s */ - vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); - continue; - } - fn(vd, s1, vs2, i); - } - env->vstart =3D 0; - /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); -} - -/* generate the helpers for OPIVX */ -#define GEN_VEXT_VX(NAME, ESZ) \ -void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ - void *vs2, CPURISCVState *env, \ - uint32_t desc) \ -{ \ - do_vext_vx(vd, v0, s1, vs2, env, desc, \ - do_##NAME, ESZ); \ -} - GEN_VEXT_VX(vadd_vx_b, 1) GEN_VEXT_VX(vadd_vx_h, 2) GEN_VEXT_VX(vadd_vx_w, 4) diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internal= s.c new file mode 100644 index 0000000000..9cf5c17cde --- /dev/null +++ b/target/riscv/vector_internals.c @@ -0,0 +1,81 @@ +/* + * RISC-V Vector Extension Internals + * + * Copyright (c) 2020 T-Head Semiconductor Co., Ltd. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License f= or + * more details. + * + * You should have received a copy of the GNU General Public License along= with + * this program. If not, see . + */ + +#include "vector_internals.h" + +/* set agnostic elements to 1s */ +void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, + uint32_t tot) +{ + if (is_agnostic =3D=3D 0) { + /* policy undisturbed */ + return; + } + if (tot - cnt =3D=3D 0) { + return ; + } + memset(base + cnt, -1, tot - cnt); +} + +void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivv2_fn *fn, uint32_t esz) +{ + uint32_t vm =3D vext_vm(desc); + uint32_t vl =3D env->vl; + uint32_t total_elems =3D vext_get_total_elems(env, desc, esz); + uint32_t vta =3D vext_vta(desc); + uint32_t vma =3D vext_vma(desc); + uint32_t i; + + for (i =3D env->vstart; i < vl; i++) { + if (!vm && !vext_elem_mask(v0, i)) { + /* set masked-off elements to 1s */ + vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); + continue; + } + fn(vd, vs1, vs2, i); + } + env->vstart =3D 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); +} + +void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivx2_fn fn, uint32_t esz) +{ + uint32_t vm =3D vext_vm(desc); + uint32_t vl =3D env->vl; + uint32_t total_elems =3D vext_get_total_elems(env, desc, esz); + uint32_t vta =3D vext_vta(desc); + uint32_t vma =3D vext_vma(desc); + uint32_t i; + + for (i =3D env->vstart; i < vl; i++) { + if (!vm && !vext_elem_mask(v0, i)) { + /* set masked-off elements to 1s */ + vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); + continue; + } + fn(vd, s1, vs2, i); + } + env->vstart =3D 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); +} diff --git a/target/riscv/meson.build b/target/riscv/meson.build index 7f56c5f88d..c3801ee5e0 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -16,6 +16,7 @@ riscv_ss.add(files( 'gdbstub.c', 'op_helper.c', 'vector_helper.c', + 'vector_internals.c', 'bitmanip_helper.c', 'translate.c', 'm128_helper.c', --=20 2.41.0