From: Daniel Henrique Barboza
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org,
    liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com,
    richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza
Subject: [PATCH v10 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns
Date: Sun, 10 Mar 2024 08:53:09 -0300
Message-ID: <20240310115315.187283-6-dbarboza@ventanamicro.com>
In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com>
References: <20240310115315.187283-1-dbarboza@ventanamicro.com>

Vcrypto insns should also use the same helper the regular vector insns
use to update the tail elements.

Move vext_set_tail_elems_1s() to vector_internals.c and make it public.
Use it in vcrypto_helper.c to set tail elements instead of
vext_set_elems_1s(). Helpers must set env->vstart = 0 after setting the
tail.

Signed-off-by: Daniel Henrique Barboza
Reviewed-by: Richard Henderson
---
 target/riscv/vcrypto_helper.c   | 63 ++++++++++++---------------------
 target/riscv/vector_helper.c    | 30 ----------------
 target/riscv/vector_internals.c | 29 +++++++++++++++
 target/riscv/vector_internals.h |  4 +++
 4 files changed, 56 insertions(+), 70 deletions(-)

diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c
index e2d719b13b..66d449c274 100644
--- a/target/riscv/vcrypto_helper.c
+++ b/target/riscv/vcrypto_helper.c
@@ -218,9 +218,7 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
     void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env,            \
                       uint32_t desc)                                      \
     {                                                                     \
-        uint32_t vl = env->vl;                                            \
         uint32_t total_elems = vext_get_total_elems(env, desc, 4);        \
-        uint32_t vta = vext_vta(desc);                                    \
                                                                           \
         for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {        \
             AESState round_key;                                           \
@@ -233,18 +231,16 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
             *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0];         \
             *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1];         \
         }                                                                 \
-        env->vstart = 0;                                                  \
         /* set tail elements to 1s */                                     \
-        vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);              \
+        vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);            \
+        env->vstart = 0;                                                  \
     }
 
 #define GEN_ZVKNED_HELPER_VS(NAME, ...)                                   \
     void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env,            \
                       uint32_t desc)                                      \
     {                                                                     \
-        uint32_t vl = env->vl;                                            \
         uint32_t total_elems = vext_get_total_elems(env, desc, 4);        \
-        uint32_t vta = vext_vta(desc);                                    \
                                                                           \
         for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {        \
             AESState round_key;                                           \
@@ -257,9 +253,9 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
             *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0];         \
             *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1];         \
         }                                                                 \
-        env->vstart = 0;                                                  \
         /* set tail elements to 1s */                                     \
-        vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);              \
+        vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);            \
+        env->vstart = 0;                                                  \
     }
 
 GEN_ZVKNED_HELPER_VV(vaesef_vv, aesenc_SB_SR_AK(&round_state,
@@ -301,9 +297,7 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
-    uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
-    uint32_t vta = vext_vta(desc);
 
     uimm &= 0b1111;
     if (uimm > 10 || uimm == 0) {
@@ -337,9 +331,9 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
         vd[i * 4 + H4(2)] = rk[6];
         vd[i * 4 + H4(3)] = rk[7];
     }
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
+    env->vstart = 0;
 }
 
 void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
@@ -347,9 +341,7 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
-    uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
-    uint32_t vta = vext_vta(desc);
 
     uimm &= 0b1111;
     if (uimm > 14 || uimm < 2) {
@@ -394,9 +386,9 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
         vd[i * 4 + H4(2)] = rk[10];
         vd[i * 4 + H4(3)] = rk[11];
     }
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
+    env->vstart = 0;
 }
 
 static inline uint32_t sig0_sha256(uint32_t x)
@@ -455,7 +447,6 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW);
     uint32_t esz = sew == MO_32 ? 4 : 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         if (sew == MO_32) {
@@ -469,7 +460,7 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     }
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -570,7 +561,6 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 4;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
@@ -579,7 +569,7 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -588,7 +578,6 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
@@ -597,7 +586,7 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -606,7 +595,6 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 4;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
@@ -615,7 +603,7 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -624,7 +612,6 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     uint32_t esz = 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
@@ -633,7 +620,7 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -653,7 +640,6 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
 {
     uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW));
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t *vd = vd_vptr;
     uint32_t *vs1 = vs1_vptr;
     uint32_t *vs2 = vs2_vptr;
@@ -672,7 +658,7 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
             vd[(i * 8) + j] = bswap32(w[H4(j + 16)]);
         }
     }
-    vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -752,7 +738,6 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW));
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
     uint32_t v1[8], v2[8], v3[8];
@@ -767,7 +752,7 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
             vd[i * 8 + k] = bswap32(v1[H4(k)]);
         }
     }
-    vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -777,7 +762,6 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
     uint64_t *vd = vd_vptr;
     uint64_t *vs1 = vs1_vptr;
     uint64_t *vs2 = vs2_vptr;
-    uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
@@ -805,7 +789,7 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
         vd[i * 2 + 1] = brev8(Z[1]);
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
     env->vstart = 0;
 }
 
@@ -814,7 +798,6 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env,
 {
     uint64_t *vd = vd_vptr;
     uint64_t *vs2 = vs2_vptr;
-    uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
@@ -839,7 +822,7 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env,
         vd[i * 2 + 1] = brev8(Z[1]);
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
     env->vstart = 0;
 }
 
@@ -881,9 +864,9 @@ void HELPER(vsm4k_vi)(void *vd, void *vs2, uint32_t uimm5, CPURISCVState *env,
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
 
 static void do_sm4_round(uint32_t *rk, uint32_t *buf)
@@ -930,9 +913,9 @@ void HELPER(vsm4r_vv)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
 
 void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
@@ -964,7 +947,7 @@ void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index b174ddeae8..4fe8752eea 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -174,36 +174,6 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
 GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
 GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
 
-/*
- * This function is sensitive to env->vstart changes since
- * it'll be a no-op if vstart >= vl. Do not clear env->vstart
- * before calling it unless you're certain that vstart < vl.
- */
-static void vext_set_tail_elems_1s(CPURISCVState *env, void *vd,
-                                   uint32_t desc, uint32_t esz,
-                                   uint32_t max_elems)
-{
-    uint32_t vta = vext_vta(desc);
-    uint32_t nf = vext_nf(desc);
-    int k;
-
-    /*
-     * Section 5.4 of the RVV spec mentions:
-     * "When vstart ≥ vl, there are no body elements, and no
-     * elements are updated in any destination vector register
-     * group, including that no tail elements are updated
-     * with agnostic values."
-     */
-    if (vta == 0 || env->vstart >= env->vl) {
-        return;
-    }
-
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-}
-
 /*
  * stride: access vector element from strided memory
  */
diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internals.c
index 12f5964fbb..bf3e9e2370 100644
--- a/target/riscv/vector_internals.c
+++ b/target/riscv/vector_internals.c
@@ -33,6 +33,35 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt,
     memset(base + cnt, -1, tot - cnt);
 }
 
+/*
+ * This function is sensitive to env->vstart changes since
+ * it'll be a no-op if vstart >= vl. Do not clear env->vstart
+ * before calling it unless you're certain that vstart < vl.
+ */
+void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, uint32_t desc,
+                            uint32_t esz, uint32_t max_elems)
+{
+    uint32_t vta = vext_vta(desc);
+    uint32_t nf = vext_nf(desc);
+    int k;
+
+    /*
+     * Section 5.4 of the RVV spec mentions:
+     * "When vstart ≥ vl, there are no body elements, and no
+     * elements are updated in any destination vector register
+     * group, including that no tail elements are updated
+     * with agnostic values."
+     */
+    if (vta == 0 || env->vstart >= env->vl) {
+        return;
+    }
+
+    for (k = 0; k < nf; ++k) {
+        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
+                          (k * max_elems + max_elems) * esz);
+    }
+}
+
 void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2,
                 CPURISCVState *env, uint32_t desc,
                 opivv2_fn *fn, uint32_t esz)
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 842765f6c1..c5a2bc4bf3 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -117,6 +117,10 @@ static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t desc,
 void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt,
                        uint32_t tot);
 
+void vext_set_tail_elems_1s(CPURISCVState *env, void *vd,
+                            uint32_t desc, uint32_t esz,
+                            uint32_t max_elems);
+
 /* expand macro args before macro */
 #define RVVCALL(macro, ...)  macro(__VA_ARGS__)
 
-- 
2.43.2
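
Illustration only, not part of the patch: a minimal standalone sketch of why
the helpers must clear env->vstart after setting the tail. MockEnv,
mock_set_elems_1s(), mock_set_tail_elems_1s() and tail_bytes_written() are
hypothetical stand-ins, not QEMU code; the sketch assumes nf == 1 and passes
vta directly instead of decoding it from desc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CPURISCVState: only the two fields used here. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} MockEnv;

/* Same idea as vext_set_elems_1s(): fill bytes [cnt, tot) with 1s. */
static void mock_set_elems_1s(void *base, uint32_t is_agnostic,
                              uint32_t cnt, uint32_t tot)
{
    if (is_agnostic == 0 || tot == cnt) {
        return;
    }
    memset((uint8_t *)base + cnt, -1, tot - cnt);
}

/*
 * Simplified vext_set_tail_elems_1s(): vta is passed directly rather than
 * decoded from desc, and nf is assumed to be 1 (assumptions of this sketch).
 */
static void mock_set_tail_elems_1s(MockEnv *env, void *vd, uint32_t vta,
                                   uint32_t esz, uint32_t max_elems)
{
    assert(env->vl <= max_elems);
    if (vta == 0 || env->vstart >= env->vl) {
        return; /* no body elements, so the tail must stay untouched */
    }
    mock_set_elems_1s(vd, vta, env->vl * esz, max_elems * esz);
}

/*
 * Toy helper epilogue with vl = 4 and vstart = 4 (so vstart >= vl), using
 * 8 elements of 4 bytes each. Returns how many bytes of vd ended up 0xff.
 */
static int tail_bytes_written(int clear_vstart_first)
{
    uint8_t vd[32] = {0};
    MockEnv env = { .vstart = 4, .vl = 4 };
    int written = 0;

    if (clear_vstart_first) {
        env.vstart = 0;                             /* cleared too early */
        mock_set_tail_elems_1s(&env, vd, 1, 4, 8);
    } else {
        mock_set_tail_elems_1s(&env, vd, 1, 4, 8);  /* order used by the patch */
        env.vstart = 0;
    }

    for (size_t i = 0; i < sizeof(vd); i++) {
        written += (vd[i] == 0xff);
    }
    return written;
}
```

With the order used by the patch, tail_bytes_written(0) leaves vd untouched,
which is what the quoted spec text asks for when vstart >= vl. Clearing
vstart first, as in tail_bytes_written(1), hides that condition from the
helper and the tail is overwritten anyway.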