From nobody Tue Feb 10 09:10:31 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linaro.org ARC-Seal: i=1; a=rsa-sha256; t=1619816293; cv=none; d=zohomail.com; s=zohoarc; b=iiCjMCBfbPZlOcpMo5Ra+ljIyETCEvqze5Nj+FDWLzSKDgecxN9RfH9kZFXOys2JbJvhhhR9QOXoe4T41qbfKjFcGCvMOLTXs6Qno4huTHrK+7r0KbSW9ktYvPh24gxUzPDzq+z48JGw96kZYPvobLSInPMt88n4j2ktW+qroOA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1619816293; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=9wtSJ1TR4daGwiyeVd5xHZ/xhp0+r1Sm5he1wDTEIDQ=; b=ZjeTW0TRIwiINi/yAckoEg1TIt3EsrYP5bu3JooaRQkosU4pq4EQObv9Eg6Qfpp3vqwhdEK9g2bcn4GoG+4srII7J38fCEFIkevrUAaBoRfh9Wu1ycnSK21oRS/LwUuGq1UaU20324OwtGaypM4u2lgRbZt91Jm12moDd3vwDzk= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1619816293884804.0550606720008; Fri, 30 Apr 2021 13:58:13 -0700 (PDT) Received: from localhost ([::1]:33520 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lcaDE-0005Jt-Pk for importer@patchew.org; Fri, 30 Apr 2021 16:58:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53092) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lcZlx-0004Bz-6a for qemu-devel@nongnu.org; Fri, 30 Apr 2021 16:30:01 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:37431) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lcZlb-0007gM-Cm for qemu-devel@nongnu.org; Fri, 30 Apr 2021 16:30:00 -0400 Received: by mail-pj1-x1033.google.com with SMTP id k3-20020a17090ad083b0290155b934a295so2366606pju.2 for ; Fri, 30 Apr 2021 13:29:38 -0700 (PDT) Received: from localhost.localdomain ([71.212.144.24]) by smtp.gmail.com with ESMTPSA id q23sm3788781pgt.42.2021.04.30.13.29.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 30 Apr 2021 13:29:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9wtSJ1TR4daGwiyeVd5xHZ/xhp0+r1Sm5he1wDTEIDQ=; b=JOTOOwNzVdoVhcSw6cmD0R20zlbbcPXiWp+epHoDTQExobI+4wMcsIIcddjbsYo5HD UzU1Z9NAjIP7nwoCMhDk3hDALpo4+ep1OvfBc+vP84/5R//3ZcY8P3Jpo8Q3b40X7ug/ NLz5KoLbwSCzSMaOUpeRKQQxYXUiEBje7y0ACmx77DaAEdW//DlWKeZwJcE1Ll4QT2OK QxR0OYBCuoFPT0EQzbAiX02KwfCpxaSoVbOBvsKsx3CiNyIB1AZYALpBVe5GM1cYgrxh 5uo5ZLdPxMvAw3W+uoEviQuaA8BpaApBlKtVxtguT7+IKOlpY9s+uw9e8Lpv6eM+qH0A iYtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9wtSJ1TR4daGwiyeVd5xHZ/xhp0+r1Sm5he1wDTEIDQ=; b=rQvsH8s/RrGkMKT4b0XIX2aiBzf1mMydjpRkVkLmw5qi9tpnVsV42e6SPOx4t4GH+K 6C1osPz/OCyMO/TYqX2UMjXTByW6MXyAY98YQ/IV58a76SmB36WalRk22bXy5SJEvPBZ orM1bY1/f/KegRRcPba5RNPHLalJrCVyopeKUXkZ2FhicmyueIozVOXygDPwkZyGerQs Jsrc+0bOnCzuumYJnkXo3xXJiXCXEPN9O4d4eeFXGe2zZrxxt6QfeuSKnqURZKEkTTS1 Kx64NfjtFy7tG7inU6uJ35zNW8oCuTLPactUR9rR8LFh2GvpEdwhD3eDLnPE02/qFsES RO0w== X-Gm-Message-State: AOAM533vaz+FWfTtPAq77DV5VVYEmrnO1UrVXy0s70vypn6ptWrJ2z6n YJ98SOLPosacgSKEd86/1ukXyTbJHFKP9g== X-Google-Smtp-Source: ABdhPJxaKImhAHCHrkmE8BuCrcnVBQwOxfhg/EGyyH7coAVjrnz0NJeBYdMN0OK5eeDxnvR3P4UYcg== X-Received: by 2002:a17:90b:17c7:: with SMTP id me7mr13273510pjb.96.1619814578171; Fri, 30 Apr 2021 13:29:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v6 43/82] target/arm: Implement SVE2 XAR Date: Fri, 30 Apr 2021 13:25:31 -0700 Message-Id: <20210430202610.1136687-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210430202610.1136687-1-richard.henderson@linaro.org> References: <20210430202610.1136687-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::1033; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1033.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @linaro.org) Content-Type: text/plain; charset="utf-8" In addition, use the same vector generator interface for AdvSIMD. This fixes a bug in which the AdvSIMD insn failed to clear the high bits of the SVE register. Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 4 ++ target/arm/helper.h | 2 + target/arm/translate-a64.h | 3 ++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 39 ++++++++++++++ target/arm/translate-a64.c | 25 ++------- target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 12 +++++ 8 files changed, 172 insertions(+), 21 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 507a2fea8e..28b8f00201 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2558,6 +2558,10 @@ DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, =20 DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve2_xar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index 6bb0b0ddc0..23a7ec5638 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -953,6 +953,8 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 89437276e7..58f50abca4 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -120,5 +120,8 @@ bool disas_sve(DisasContext *, uint32_t); =20 void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz); =20 #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8f501a083c..7645587469 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -65,6 +65,7 @@ &rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz +&rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s @@ -384,6 +385,9 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... .....= @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_= e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_= e0 =20 +XAR 00000100 .. 1 ..... 001 101 rm:5 rd:5 &rrri_esz \ + rn=3D%reg_movprfx esz=3D%tszimm16_esz imm=3D%tszimm16_shr + # SVE2 bitwise ternary operations EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm= _e0 BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm= _e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8d002fdb65..c77003217e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7193,3 +7193,42 @@ void HELPER(sve2_histseg)(void *vd, void *vn, void *= vm, uint32_t desc) *(uint64_t *)(vd + i + 8) =3D out1; } } + +void HELPER(sve2_xar_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + int shr =3D simd_data(desc); + int shl =3D 8 - shr; + uint64_t mask =3D dup_const(MO_8, 0xff >> shr); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + + for (i =3D 0; i < opr_sz; ++i) { + uint64_t t =3D n[i] ^ m[i]; + d[i] =3D ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + int shr =3D simd_data(desc); + int shl =3D 16 - shr; + uint64_t mask =3D dup_const(MO_16, 0xffff >> shr); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + + for (i =3D 0; i < opr_sz; ++i) { + uint64_t t =3D n[i] ^ m[i]; + d[i] =3D ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + int shr =3D simd_data(desc); + uint32_t *d =3D vd, *n =3D vn, *m =3D vm; + + for (i =3D 0; i < opr_sz; ++i) { + d[i] =3D ror32(n[i] ^ m[i], shr); + } +} diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 0c80d0b505..bdd47208b1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -14349,8 +14349,6 @@ static void disas_crypto_xar(DisasContext *s, uint3= 2_t insn) int imm6 =3D extract32(insn, 10, 6); int rn =3D extract32(insn, 5, 5); int rd =3D extract32(insn, 0, 5); - TCGv_i64 tcg_op1, tcg_op2, tcg_res[2]; - int pass; =20 if (!dc_isar_feature(aa64_sha3, s)) { unallocated_encoding(s); @@ -14361,25 +14359,10 @@ static void disas_crypto_xar(DisasContext *s, uin= t32_t insn) return; } =20 - tcg_op1 =3D tcg_temp_new_i64(); - tcg_op2 =3D tcg_temp_new_i64(); - tcg_res[0] =3D tcg_temp_new_i64(); - tcg_res[1] =3D tcg_temp_new_i64(); - - for (pass =3D 0; pass < 2; pass++) { - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - - tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2); - tcg_gen_rotri_i64(tcg_res[pass], tcg_res[pass], imm6); - } - write_vec_element(s, tcg_res[0], rd, 0, MO_64); - write_vec_element(s, tcg_res[1], rd, 1, MO_64); - - tcg_temp_free_i64(tcg_op1); - tcg_temp_free_i64(tcg_op2); - tcg_temp_free_i64(tcg_res[0]); - tcg_temp_free_i64(tcg_res[1]); + gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), imm6, 16, + vec_full_reg_size(s)); } =20 /* Crypto three-reg imm2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0cbca8d715..eea8b6f1d0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -340,6 +340,110 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_es= z *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } =20 +static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + uint64_t mask =3D dup_const(MO_8, 0xff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 8 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t =3D tcg_temp_new_i64(); + uint64_t mask =3D dup_const(MO_16, 0xffff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 16 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) +{ + tcg_gen_xor_i32(d, n, m); + tcg_gen_rotri_i32(d, d, sh); +} + +static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_rotri_i64(d, d, sh); +} + +static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, int64_t sh) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_rotri_vec(vece, d, d, sh); +} + +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop[] =3D { INDEX_op_rotli_vec, 0 }; + static const GVecGen3i ops[4] =3D { + { .fni8 =3D gen_xar8_i64, + .fniv =3D gen_xar_vec, + .fno =3D gen_helper_sve2_xar_b, + .opt_opc =3D vecop, + .vece =3D MO_8 }, + { .fni8 =3D gen_xar16_i64, + .fniv =3D gen_xar_vec, + .fno =3D gen_helper_sve2_xar_h, + .opt_opc =3D vecop, + .vece =3D MO_16 }, + { .fni4 =3D gen_xar_i32, + .fniv =3D gen_xar_vec, + .fno =3D gen_helper_sve2_xar_s, + .opt_opc =3D vecop, + .vece =3D MO_32 }, + { .fni8 =3D gen_xar_i64, + .fniv =3D gen_xar_vec, + .fno =3D gen_helper_gvec_xar_d, + .opt_opc =3D vecop, + .vece =3D MO_64 } + }; + int esize =3D 8 << vece; + + /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ + tcg_debug_assert(shift >=3D 0); + tcg_debug_assert(shift <=3D esize); + shift &=3D esize - 1; + + if (shift =3D=3D 0) { + /* xar with no rotate devolves to xor. */ + tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, + shift, &ops[vece]); + } +} + +static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz =3D vec_full_reg_size(s); + gen_gvec_xar(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), a->imm, vsz, vsz); + } + return true; +} + static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *= fn) { if (!dc_isar_feature(aa64_sve2, s)) { diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 19006f50f7..a3d80ecad0 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2253,3 +2253,15 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *= vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_xar_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + int shr =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + + for (i =3D 0; i < opr_sz; ++i) { + d[i] =3D ror64(n[i] ^ m[i], shr); + } + clear_tail(d, opr_sz * 8, simd_maxsz(desc)); +} --=20 2.25.1