From: Daniel Henrique Barboza
To: qemu-devel@nongnu.org
Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org,
 liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com,
 philmd@linaro.org, richard.henderson@linaro.org, Daniel Henrique Barboza
Subject: [PATCH v12 3/7] target/riscv/vector_helpers: do early exit when vstart >= vl
Date: Mon, 11 Mar 2024 15:08:17 -0300
Message-ID: <20240311180821.250469-4-dbarboza@ventanamicro.com>
In-Reply-To: <20240311180821.250469-1-dbarboza@ventanamicro.com>
References: <20240311180821.250469-1-dbarboza@ventanamicro.com>

We're going to make changes that will require each helper to be
responsible for the 'vstart' management, i.e. we will drop the
'vstart < vl' assumption that helpers rely on today.

Helpers are usually able to deal with vstart >= vl, i.e. doing nothing
aside from setting vstart = 0 at the end, but the tail update functions
will update the tail regardless of vstart being valid or not.

Unifying the tail update process in a single function that would handle
the vstart >= vl case isn't trivial. We have 2 functions that are used
to update the tail: vext_set_tail_elems_1s() and vext_set_elems_1s().
The latter is a more generic function that is also used to mask
elements. There's no easy way of making all callers use
vext_set_tail_elems_1s() because we're not encoding NF properly in all
cases [1].

This patch takes a blunt approach: do an early exit in every single
vector helper if vstart >= vl. We can worry about unifying the tail
update process later.

[1] https://lore.kernel.org/qemu-riscv/1590234b-0291-432a-a0fa-c5a6876097bc@linux.alibaba.com/

Signed-off-by: Daniel Henrique Barboza
Reviewed-by: Richard Henderson
---
 target/riscv/vcrypto_helper.c   | 32 ++++++++++++
 target/riscv/vector_helper.c    | 90 +++++++++++++++++++++++++++++++++
 target/riscv/vector_internals.c |  4 ++
 target/riscv/vector_internals.h |  9 ++++
 4 files changed, 135 insertions(+)

diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c
index e2d719b13b..f7423df226 100644
--- a/target/riscv/vcrypto_helper.c
+++ b/target/riscv/vcrypto_helper.c
@@ -222,6 +222,8 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
     uint32_t total_elems = vext_get_total_elems(env, desc, 4); \
     uint32_t vta = vext_vta(desc); \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \
         AESState round_key; \
         round_key.d[0] = *((uint64_t *)vs2 + H8(i * 2 + 0)); \
@@ -246,6 +248,8 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
     uint32_t total_elems = vext_get_total_elems(env, desc, 4); \
     uint32_t vta = vext_vta(desc); \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \
         AESState round_key; \
         round_key.d[0] = *((uint64_t *)vs2 + H8(0)); \
@@ -305,6 +309,8 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     uimm &= 0b1111;
     if (uimm > 10 || uimm == 0) {
         uimm ^= 0b1000;
@@ -351,6 +357,8 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     uimm &= 0b1111;
     if (uimm > 14 || uimm < 2) {
         uimm ^= 0b1000;
@@ -457,6 +465,8 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t total_elems;
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         if (sew == MO_32) {
             vsha2ms_e32(((uint32_t *)vd) + i * 4, ((uint32_t *)vs1) + i * 4,
@@ -572,6 +582,8 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t total_elems;
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
                   ((uint32_t *)vs1) + 4 * i + 2);
@@ -590,6 +602,8 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t total_elems;
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
                   ((uint64_t *)vs1) + 4 * i + 2);
@@ -608,6 +622,8 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t total_elems;
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
                   (((uint32_t *)vs1) + 4 * i));
@@ -626,6 +642,8 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t total_elems;
     uint32_t vta = vext_vta(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
                   (((uint64_t *)vs1) + 4 * i));
@@ -658,6 +676,8 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
     uint32_t *vs1 = vs1_vptr;
     uint32_t *vs2 = vs2_vptr;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (int i = env->vstart / 8; i < env->vl / 8; i++) {
         uint32_t w[24];
         for (int j = 0; j < 8; j++) {
@@ -757,6 +777,8 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
     uint32_t *vs2 = vs2_vptr;
     uint32_t v1[8], v2[8], v3[8];
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (int i = env->vstart / 8; i < env->vl / 8; i++) {
         for (int k = 0; k < 8; k++) {
             v2[k] = bswap32(vd[H4(i * 8 + k)]);
@@ -780,6 +802,8 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
     uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         uint64_t Y[2] = {vd[i * 2 + 0], vd[i * 2 + 1]};
         uint64_t H[2] = {brev8(vs2[i * 2 + 0]), brev8(vs2[i * 2 + 1])};
@@ -817,6 +841,8 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env,
     uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         uint64_t Y[2] = {brev8(vd[i * 2 + 0]), brev8(vd[i * 2 + 1])};
         uint64_t H[2] = {brev8(vs2[i * 2 + 0]), brev8(vs2[i * 2 + 1])};
@@ -853,6 +879,8 @@ void HELPER(vsm4k_vi)(void *vd, void *vs2, uint32_t uimm5, CPURISCVState *env,
     uint32_t esz = sizeof(uint32_t);
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = group_start; i < group_end; ++i) {
         uint32_t vstart = i * egs;
         uint32_t vend = (i + 1) * egs;
@@ -909,6 +937,8 @@ void HELPER(vsm4r_vv)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
     uint32_t esz = sizeof(uint32_t);
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = group_start; i < group_end; ++i) {
         uint32_t vstart = i * egs;
         uint32_t vend = (i + 1) * egs;
@@ -943,6 +973,8 @@ void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
     uint32_t esz = sizeof(uint32_t);
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = group_start; i < group_end; ++i) {
         uint32_t vstart = i * egs;
         uint32_t vend = (i + 1) * egs;
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ca79571ae2..b4360dbd52 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -207,6 +207,8 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
     uint32_t esz = 1 << log2_esz;
     uint32_t vma = vext_vma(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
         k = 0;
         while (k < nf) {
@@ -272,6 +274,8 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     /* load bytes from guest memory */
     for (i = env->vstart; i < evl; i++, env->vstart++) {
         k = 0;
@@ -386,6 +390,8 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
     uint32_t esz = 1 << log2_esz;
     uint32_t vma = vext_vma(desc);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     /* load bytes from guest memory */
     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
         k = 0;
@@ -477,6 +483,8 @@ vext_ldff(void *vd, void *v0, target_ulong base,
     target_ulong addr, offset, remain;
     int mmu_index = riscv_env_mmu_index(env, false);
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     /* probe every access */
     for (i = env->vstart; i < env->vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
@@ -572,6 +580,8 @@ vext_ldst_whole(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     uint32_t vlenb = riscv_cpu_cfg(env)->vlenb;
     uint32_t max_elems = vlenb >> log2_esz;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     k = env->vstart / max_elems;
     off = env->vstart % max_elems;
 
@@ -877,6 +887,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s1 = *((ETYPE *)vs1 + H(i)); \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
@@ -909,6 +921,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         ETYPE carry = vext_elem_mask(v0, i); \
@@ -944,6 +958,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vta_all_1s = vext_vta_all_1s(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s1 = *((ETYPE *)vs1 + H(i)); \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
@@ -982,6 +998,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
     uint32_t vta_all_1s = vext_vta_all_1s(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         ETYPE carry = !vm && vext_elem_mask(v0, i); \
@@ -1078,6 +1096,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -1125,6 +1145,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -1187,6 +1209,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s1 = *((ETYPE *)vs1 + H(i)); \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
@@ -1252,6 +1276,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         if (!vm && !vext_elem_mask(v0, i)) { \
@@ -1799,6 +1825,8 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s1 = *((ETYPE *)vs1 + H(i)); \
         *((ETYPE *)vd + H(i)) = s1; \
@@ -1823,6 +1851,8 @@ void HELPER(NAME)(void *vd, uint64_t s1, CPURISCVState *env, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         *((ETYPE *)vd + H(i)) = (ETYPE)s1; \
     } \
@@ -1846,6 +1876,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE *vt = (!vext_elem_mask(v0, i) ? vs2 : vs1); \
         *((ETYPE *)vd + H(i)) = *(vt + H(i)); \
@@ -1870,6 +1902,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         ETYPE d = (!vext_elem_mask(v0, i) ? s2 : \
@@ -1915,6 +1949,8 @@ vext_vv_rm_1(void *vd, void *v0, void *vs1, void *vs2,
              uint32_t vl, uint32_t vm, int vxrm,
              opivv2_rm_fn *fn, uint32_t vma, uint32_t esz)
 {
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart; i < vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
             /* set masked-off elements to 1s */
@@ -2040,6 +2076,8 @@ vext_vx_rm_1(void *vd, void *v0, target_long s1, void *vs2,
              uint32_t vl, uint32_t vm, int vxrm,
              opivx2_rm_fn *fn, uint32_t vma, uint32_t esz)
 {
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (uint32_t i = env->vstart; i < vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
             /* set masked-off elements to 1s */
@@ -2837,6 +2875,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -2880,6 +2920,8 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -3466,6 +3508,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     if (vl == 0) { \
         return; \
     } \
@@ -3987,6 +4031,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s1 = *((ETYPE *)vs1 + H(i)); \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
@@ -4027,6 +4073,8 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         if (!vm && !vext_elem_mask(v0, i)) { \
@@ -4220,6 +4268,8 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
     uint32_t vta = vext_vta(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         ETYPE s2 = *((ETYPE *)vs2 + H(i)); \
         *((ETYPE *)vd + H(i)) = \
@@ -4386,6 +4436,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \
     uint32_t i; \
     TD s1 = *((TD *)vs1 + HD(0)); \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         TS2 s2 = *((TS2 *)vs2 + HS2(i)); \
         if (!vm && !vext_elem_mask(v0, i)) { \
@@ -4472,6 +4524,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \
     uint32_t i; \
     TD s1 = *((TD *)vs1 + HD(0)); \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         TS2 s2 = *((TS2 *)vs2 + HS2(i)); \
         if (!vm && !vext_elem_mask(v0, i)) { \
@@ -4544,6 +4598,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \
     uint32_t i; \
     int a, b; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         a = vext_elem_mask(vs1, i); \
         b = vext_elem_mask(vs2, i); \
@@ -4585,6 +4641,11 @@ target_ulong HELPER(vcpop_m)(void *v0, void *vs2, CPURISCVState *env,
     uint32_t vl = env->vl;
     int i;
 
+    if (env->vstart >= env->vl) {
+        env->vstart = 0;
+        return 0;
+    }
+
     for (i = env->vstart; i < vl; i++) {
         if (vm || vext_elem_mask(v0, i)) {
             if (vext_elem_mask(vs2, i)) {
@@ -4604,6 +4665,11 @@ target_ulong HELPER(vfirst_m)(void *v0, void *vs2, CPURISCVState *env,
     uint32_t vl = env->vl;
     int i;
 
+    if (env->vstart >= env->vl) {
+        env->vstart = 0;
+        return 0;
+    }
+
     for (i = env->vstart; i < vl; i++) {
         if (vm || vext_elem_mask(v0, i)) {
             if (vext_elem_mask(vs2, i)) {
@@ -4632,6 +4698,8 @@ static void vmsetm(void *vd, void *v0, void *vs2, CPURISCVState *env,
     int i;
     bool first_mask_bit = false;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (i = env->vstart; i < vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
             /* set masked-off elements to 1s */
@@ -4704,6 +4772,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env, \
     uint32_t sum = 0; \
     int i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -4737,6 +4807,8 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc) \
     uint32_t vma = vext_vma(desc); \
     int i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -4772,6 +4844,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     target_ulong offset = s1, i_min, i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     i_min = MAX(env->vstart, offset); \
     for (i = i_min; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
@@ -4805,6 +4879,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     target_ulong i_max, i_min, i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     i_min = MIN(s1 < vlmax ? vlmax - s1 : 0, vl); \
     i_max = MAX(i_min, env->vstart); \
     for (i = env->vstart; i < i_max; ++i) { \
@@ -4847,6 +4923,8 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -4896,6 +4974,8 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -4971,6 +5051,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint64_t index; \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -5014,6 +5096,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
     uint64_t index = s1; \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
@@ -5048,6 +5132,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \
     uint32_t vta = vext_vta(desc); \
     uint32_t num = 0, i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vext_elem_mask(vs1, i)) { \
             continue; \
@@ -5075,6 +5161,8 @@ void HELPER(vmvr_v)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
     uint32_t startb = env->vstart * sewb;
     uint32_t i = startb;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     memcpy((uint8_t *)vd + H1(i),
            (uint8_t *)vs2 + H1(i),
            maxsz - startb);
@@ -5095,6 +5183,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internals.c
index 12f5964fbb..996c21eb31 100644
--- a/target/riscv/vector_internals.c
+++ b/target/riscv/vector_internals.c
@@ -44,6 +44,8 @@ void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2,
     uint32_t vma = vext_vma(desc);
     uint32_t i;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (i = env->vstart; i < vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
             /* set masked-off elements to 1s */
@@ -68,6 +70,8 @@ void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2,
     uint32_t vma = vext_vma(desc);
     uint32_t i;
 
+    VSTART_CHECK_EARLY_EXIT(env);
+
     for (i = env->vstart; i < vl; i++) {
         if (!vm && !vext_elem_mask(v0, i)) {
             /* set masked-off elements to 1s */
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 842765f6c1..9e1e15b575 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -24,6 +24,13 @@
 #include "tcg/tcg-gvec-desc.h"
 #include "internals.h"
 
+#define VSTART_CHECK_EARLY_EXIT(env) do { \
+    if (env->vstart >= env->vl) { \
+        env->vstart = 0; \
+        return; \
+    } \
+} while (0)
+
 static inline uint32_t vext_nf(uint32_t desc)
 {
     return FIELD_EX32(simd_data(desc), VDATA, NF);
@@ -151,6 +158,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \
     uint32_t vma = vext_vma(desc); \
     uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env); \
+ \
     for (i = env->vstart; i < vl; i++) { \
         if (!vm && !vext_elem_mask(v0, i)) { \
             /* set masked-off elements to 1s */ \
-- 
2.43.2
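
Illustration only, not part of the patch: a minimal standalone sketch of the
early-exit pattern, assuming a cut-down environment. DemoEnv, DEMO_VLMAX,
demo_add_one() and demo_count_set() are hypothetical names invented here;
only the shape of VSTART_CHECK_EARLY_EXIT follows the macro added above
(with extra parentheses around the argument). It shows why the check has to
sit before the loop: the tail update after the loop runs unconditionally, so
without the early exit it would still write the tail when vstart >= vl. It
also shows why value-returning helpers such as vcpop_m and vfirst_m open-code
the check, since the macro expands to a bare "return;".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CPURISCVState: only the fields used here. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} DemoEnv;

/* Same shape as the macro in the patch; only usable in void functions. */
#define VSTART_CHECK_EARLY_EXIT(env) do { \
    if ((env)->vstart >= (env)->vl) {     \
        (env)->vstart = 0;                \
        return;                           \
    }                                     \
} while (0)

/* Hypothetical vector length limit; callers must keep vl <= DEMO_VLMAX. */
enum { DEMO_VLMAX = 8 };

/* Hypothetical void helper: body loop, then an unconditional tail update. */
static void demo_add_one(uint8_t *vd, const uint8_t *vs2, DemoEnv *env)
{
    VSTART_CHECK_EARLY_EXIT(env);

    for (uint32_t i = env->vstart; i < env->vl; i++) {
        vd[i] = (uint8_t)(vs2[i] + 1);
    }
    env->vstart = 0;
    /* tail update: set elements [vl, DEMO_VLMAX) to all 1s */
    memset(vd + env->vl, 0xff, DEMO_VLMAX - env->vl);
}

/* Hypothetical value-returning helper: the check is open-coded. */
static uint32_t demo_count_set(const uint8_t *vs2, DemoEnv *env)
{
    uint32_t cnt = 0;

    if (env->vstart >= env->vl) {
        env->vstart = 0;
        return 0;
    }

    for (uint32_t i = env->vstart; i < env->vl; i++) {
        if (vs2[i]) {
            cnt++;
        }
    }
    env->vstart = 0;
    return cnt;
}
```

With vstart = 5 and vl = 4, demo_add_one() only resets vstart and leaves vd,
including its tail, untouched; with vstart = 1 and vl = 4 it writes elements
1..3 and then fills the tail.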