In the process, convert more code to gvec as well -- I will need
the gvec code for implementing SME2. This is roughly 1/3 of the
job done, but there's no reason to wait until the patch set
becomes completely unwieldy.

Changes for v2:
* Fix existing RISU failures vs neoverse-n1.
* Introduce vfp_load_reg16, fixing a regression wrt VNEG (scalar, hp).
* Fix typo in SUQADD vectorization.
* Two more conversions.
r~
Richard Henderson (67):
target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE)
target/arm: Use PLD, PLDW, PLI not NOP for t32
target/arm: Reject incorrect operands to PLD, PLDW, PLI
target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
target/arm: Fix decode of FMOV (hp) vs MOVI
target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
target/arm: Split out gengvec.c
target/arm: Split out gengvec64.c
target/arm: Convert Cryptographic AES to decodetree
target/arm: Convert Cryptographic 3-register SHA to decodetree
target/arm: Convert Cryptographic 2-register SHA to decodetree
target/arm: Convert Cryptographic 3-register SHA512 to decodetree
target/arm: Convert Cryptographic 2-register SHA512 to decodetree
target/arm: Convert Cryptographic 4-register to decodetree
target/arm: Convert Cryptographic 3-register, imm2 to decodetree
target/arm: Convert XAR to decodetree
target/arm: Convert Advanced SIMD copy to decodetree
target/arm: Convert FMULX to decodetree
target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
target/arm: Introduce vfp_load_reg16
target/arm: Expand vfp neg and abs inline
target/arm: Convert FNMUL to decodetree
target/arm: Convert FMLA, FMLS to decodetree
target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
target/arm: Convert FABD to decodetree
target/arm: Convert FRECPS, FRSQRTS to decodetree
target/arm: Convert FADDP to decodetree
target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
target/arm: Use gvec for neon faddp, fmaxp, fminp
target/arm: Convert ADDP to decodetree
target/arm: Use gvec for neon padd
target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
target/arm: Use gvec for neon pmax, pmin
target/arm: Convert FMLAL, FMLSL to decodetree
target/arm: Convert disas_simd_3same_logic to decodetree
target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB
target/arm: Convert SUQADD and USQADD to gvec
target/arm: Inline scalar SUQADD and USQADD
target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB
target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree
target/arm: Convert SUQADD, USQADD to decodetree
target/arm: Convert SSHL, USHL to decodetree
target/arm: Convert SRSHL and URSHL (register) to gvec
target/arm: Convert SRSHL, URSHL to decodetree
target/arm: Convert SQSHL and UQSHL (register) to gvec
target/arm: Convert SQSHL, UQSHL to decodetree
target/arm: Convert SQRSHL and UQRSHL (register) to gvec
target/arm: Convert SQRSHL, UQRSHL to decodetree
target/arm: Convert ADD, SUB (vector) to decodetree
target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree
target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32,i64}
target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec
target/arm: Convert SHADD, UHADD to gvec
target/arm: Convert SHADD, UHADD to decodetree
target/arm: Convert SHSUB, UHSUB to gvec
target/arm: Convert SHSUB, UHSUB to decodetree
target/arm: Convert SRHADD, URHADD to gvec
target/arm: Convert SRHADD, URHADD to decodetree
target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree
target/arm: Convert SABA, SABD, UABA, UABD to decodetree
target/arm: Convert MUL, PMUL to decodetree
target/arm: Convert MLA, MLS to decodetree
target/arm: Tidy SQDMULH, SQRDMULH (vector)
target/arm: Convert SQDMULH, SQRDMULH to decodetree
target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree
target/arm: Convert FCSEL to decodetree
target/arm/helper.h | 164 +-
target/arm/tcg/helper-a64.h | 12 +
target/arm/tcg/translate-a64.h | 18 +
target/arm/tcg/translate.h | 95 +
target/arm/tcg/a32-uncond.decode | 8 +-
target/arm/tcg/a64.decode | 430 ++-
target/arm/tcg/neon-dp.decode | 37 +-
target/arm/tcg/t32.decode | 26 +-
target/arm/tcg/cpu32.c | 73 +
target/arm/tcg/gengvec.c | 2306 ++++++++++++++++
target/arm/tcg/gengvec64.c | 367 +++
target/arm/tcg/neon_helper.c | 511 +---
target/arm/tcg/translate-a64.c | 4440 ++++++++++--------------------
target/arm/tcg/translate-neon.c | 254 +-
target/arm/tcg/translate-sve.c | 145 +-
target/arm/tcg/translate-vfp.c | 93 +-
target/arm/tcg/translate.c | 1649 +----------
target/arm/tcg/vec_helper.c | 349 ++-
target/arm/vfp_helper.c | 30 -
target/arm/tcg/meson.build | 2 +
20 files changed, 5446 insertions(+), 5563 deletions(-)
create mode 100644 target/arm/tcg/gengvec.c
create mode 100644 target/arm/tcg/gengvec64.c
--
2.34.1