From: Xiao Wang
To: qemu-devel@nongnu.org
Cc: Xiao Wang, Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
    Daniel Henrique Barboza, Liu Zhiwei,
    qemu-riscv@nongnu.org (open list:RISC-V TCG CPUs)
Subject: [PATCH v2] target/riscv/vector_helper.c: Remove the check for extra tail elements
Date: Wed, 7 Jun 2023 17:16:46 +0800
Message-Id: <20230607091646.4049428-1-xiao.w.wang@intel.com>

Commit 752614cab8e6 ("target/riscv: rvv: Add tail agnostic for vector
load / store instructions") added an extra check for LMUL fragmentation,
intended for setting the "rest tail elements" in the last register for a
segment load insn.

Actually, the max_elems derived in vext_ld*() won't be a fraction of the
vector register size, since the lmul encoded in desc is emul, which has
already been adjusted to 1 for the LMUL fragmentation case by
vext_get_emul() in trans_rvv.c.inc, for ld_stride(), ld_us(), ld_index()
and ldff().

Besides, vext_get_emul() has also taken EEW/SEW into consideration, so
there is no need to call vext_get_total_elems(), which would derive
another emul based on the emul; the second emul would be incorrect when
esz differs from sew.

Thus this patch removes the check for extra tail elements.

Fixes: 752614cab8e6 ("target/riscv: rvv: Add tail agnostic for vector load / store instructions")
Signed-off-by: Xiao Wang
Reviewed-by: Daniel Henrique Barboza
Reviewed-by: Weiwei Li
---
v2:
* Rebased on top of Alistair's riscv-to-apply.next branch.
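Illustration of the reasoning above (not part of the patch): a minimal
sketch, assuming EEW, SEW and element size are passed as log2 of bytes and
LMUL as a signed log2. The model_* names are hypothetical stand-ins written
for this note; they are not the QEMU functions, only a model of the
behaviour the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model only, not the QEMU code.
 * EMUL = EEW / SEW * LMUL in log2 form; a fractional result is
 * clamped to 0, i.e. one whole register.
 */
static int model_emul_log2(int eew_log2, int sew_log2, int lmul_log2)
{
    int emul_log2 = eew_log2 - sew_log2 + lmul_log2;
    return emul_log2 < 0 ? 0 : emul_log2;
}

/* Elements per field: (vlenb * EMUL) / esz, with esz = 1 << esz_log2. */
static uint32_t model_max_elems(uint32_t vlenb, int emul_log2, int esz_log2)
{
    int scale = emul_log2 - esz_log2;
    return scale < 0 ? vlenb >> -scale : vlenb << scale;
}

/* Bytes left past the last whole register after nf fields. */
static uint32_t model_leftover_bytes(uint32_t vlenb, uint32_t nf,
                                     int eew_log2, int sew_log2,
                                     int lmul_log2)
{
    int emul_log2 = model_emul_log2(eew_log2, sew_log2, lmul_log2);
    uint32_t max_elems = model_max_elems(vlenb, emul_log2, eew_log2);

    /* With EMUL clamped to >= 1, every field spans whole registers. */
    assert((max_elems << eew_log2) % vlenb == 0);
    return ((nf * max_elems) << eew_log2) % vlenb;
}

/*
 * Models the removed path: applying the EEW/SEW adjustment a second
 * time to a value that is already EMUL.
 */
static uint32_t model_total_elems_rederived(uint32_t vlenb, int emul_log2,
                                            int esz_log2, int sew_log2)
{
    int again_log2 = model_emul_log2(esz_log2, sew_log2, emul_log2);
    return (vlenb << again_log2) >> esz_log2;
}
```

With vlenb = 16, an 8-bit EEW, a 32-bit SEW and LMUL = 1/2, the clamped
EMUL is 1, so three fields occupy exactly three registers and nothing is
left over. With an 8-bit EEW, a 16-bit SEW and LMUL = 8, the first
derivation gives EMUL = 4 (64 elements), while re-deriving from that EMUL
gives 32 elements, which is the mismatch described above for the case
where esz differs from sew.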
---
 target/riscv/vector_helper.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 7505f9470a..f261e726c2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -264,11 +264,10 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
 GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
 GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
 
-static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl,
-                                   void *vd, uint32_t desc, uint32_t nf,
+static void vext_set_tail_elems_1s(target_ulong vl, void *vd,
+                                   uint32_t desc, uint32_t nf,
                                    uint32_t esz, uint32_t max_elems)
 {
-    uint32_t total_elems, vlenb, registers_used;
     uint32_t vta = vext_vta(desc);
     int k;
 
@@ -276,19 +275,10 @@ static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl,
         return;
     }
 
-    total_elems = vext_get_total_elems(env, desc, esz);
-    vlenb = riscv_cpu_cfg(env)->vlen >> 3;
-
     for (k = 0; k < nf; ++k) {
         vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
                           (k * max_elems + max_elems) * esz);
     }
-
-    if (nf * max_elems % total_elems != 0) {
-        registers_used = ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
-        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
-                          registers_used * vlenb);
-    }
 }
 
 /*
@@ -324,7 +314,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
     }
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)                        \
@@ -383,7 +373,7 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     }
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env, evl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(evl, vd, desc, nf, esz, max_elems);
 }
 
 /*
@@ -504,7 +494,7 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
     }
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_INDEX(NAME, ETYPE, INDEX_FN, LOAD_FN)                  \
@@ -634,7 +624,7 @@ ProbeSuccess:
     }
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LDFF(NAME, ETYPE, LOAD_FN)               \
-- 
2.25.1
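
Supplementary illustration (not part of the patch): a minimal sketch of
what the helper does once the extra check is gone, assuming the destination
register group can be modelled as a plain byte buffer. The function name
model_set_tail_elems_1s and the vta flag argument are hypothetical; the
assert on vl is a defensive addition for this model and is not in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative model only, not the QEMU code.
 * For each of the nf fields, fill the elements from index vl up to
 * max_elems with all-ones bytes when the tail policy is agnostic.
 */
static void model_set_tail_elems_1s(uint8_t *vd, int vta, uint32_t vl,
                                    uint32_t nf, uint32_t esz,
                                    uint32_t max_elems)
{
    uint32_t k;

    if (vta == 0) {
        /* Tail undisturbed: leave the buffer as it is. */
        return;
    }

    /* Defensive check for this model: the tail must not be negative. */
    assert(vl <= max_elems);

    for (k = 0; k < nf; ++k) {
        uint32_t start = (k * max_elems + vl) * esz;
        uint32_t end = (k * max_elems + max_elems) * esz;

        memset(vd + start, 0xff, end - start);
    }
}
```

For nf = 2, esz = 2, max_elems = 8 and vl = 5, this sets bytes 10..15 and
26..31 of a 32-byte buffer and leaves the body elements untouched. Because
each field already covers whole registers, no further fill step is needed
after the loop.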