[v2] target/arm: Convert a64 advsimd to decodetree (part 1)

[PATCH v2 00/67] target/arm: Convert a64 advsimd to decodetree (part 1)

Posted by Richard Henderson 6 months ago

In the process, convert more code to gvec as well -- I will need
the gvec code for implementing SME2.  I guess this is about 1/3
of the job done, but there's no reason to wait until the patch
set is completely unwieldy.

Changes for v2:
  * Fix existing RISU failures vs neoverse-n1.
  * Introduce vfp_load_reg16, fixing a regression wrt VNEG (scalar, hp).
  * Fix typo in SUQADD vectorization.
  * Two more conversions.


r~


Richard Henderson (67):
  target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE)
  target/arm: Use PLD, PLDW, PLI not NOP for t32
  target/arm: Reject incorrect operands to PLD, PLDW, PLI
  target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
  target/arm: Fix decode of FMOV (hp) vs MOVI
  target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16)
  target/arm: Split out gengvec.c
  target/arm: Split out gengvec64.c
  target/arm: Convert Cryptographic AES to decodetree
  target/arm: Convert Cryptographic 3-register SHA to decodetree
  target/arm: Convert Cryptographic 2-register SHA to decodetree
  target/arm: Convert Cryptographic 3-register SHA512 to decodetree
  target/arm: Convert Cryptographic 2-register SHA512 to decodetree
  target/arm: Convert Cryptographic 4-register to decodetree
  target/arm: Convert Cryptographic 3-register, imm2 to decodetree
  target/arm: Convert XAR to decodetree
  target/arm: Convert Advanced SIMD copy to decodetree
  target/arm: Convert FMULX to decodetree
  target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
  target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
  target/arm: Introduce vfp_load_reg16
  target/arm: Expand vfp neg and abs inline
  target/arm: Convert FNMUL to decodetree
  target/arm: Convert FMLA, FMLS to decodetree
  target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree
  target/arm: Convert FABD to decodetree
  target/arm: Convert FRECPS, FRSQRTS to decodetree
  target/arm: Convert FADDP to decodetree
  target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree
  target/arm: Use gvec for neon faddp, fmaxp, fminp
  target/arm: Convert ADDP to decodetree
  target/arm: Use gvec for neon padd
  target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree
  target/arm: Use gvec for neon pmax, pmin
  target/arm: Convert FMLAL, FMLSL to decodetree
  target/arm: Convert disas_simd_3same_logic to decodetree
  target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB
  target/arm: Convert SUQADD and USQADD to gvec
  target/arm: Inline scalar SUQADD and USQADD
  target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB
  target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree
  target/arm: Convert SUQADD, USQADD to decodetree
  target/arm: Convert SSHL, USHL to decodetree
  target/arm: Convert SRSHL and URSHL (register) to gvec
  target/arm: Convert SRSHL, URSHL to decodetree
  target/arm: Convert SQSHL and UQSHL (register) to gvec
  target/arm: Convert SQSHL, UQSHL to decodetree
  target/arm: Convert SQRSHL and UQRSHL (register) to gvec
  target/arm: Convert SQRSHL, UQRSHL to decodetree
  target/arm: Convert ADD, SUB (vector) to decodetree
  target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree
  target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32,i64}
  target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec
  target/arm: Convert SHADD, UHADD to gvec
  target/arm: Convert SHADD, UHADD to decodetree
  target/arm: Convert SHSUB, UHSUB to gvec
  target/arm: Convert SHSUB, UHSUB to decodetree
  target/arm: Convert SRHADD, URHADD to gvec
  target/arm: Convert SRHADD, URHADD to decodetree
  target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree
  target/arm: Convert SABA, SABD, UABA, UABD to decodetree
  target/arm: Convert MUL, PMUL to decodetree
  target/arm: Convert MLA, MLS to decodetree
  target/arm: Tidy SQDMULH, SQRDMULH (vector)
  target/arm: Convert SQDMULH, SQRDMULH to decodetree
  target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree
  target/arm: Convert FCSEL to decodetree

 target/arm/helper.h              |  164 +-
 target/arm/tcg/helper-a64.h      |   12 +
 target/arm/tcg/translate-a64.h   |   18 +
 target/arm/tcg/translate.h       |   95 +
 target/arm/tcg/a32-uncond.decode |    8 +-
 target/arm/tcg/a64.decode        |  430 ++-
 target/arm/tcg/neon-dp.decode    |   37 +-
 target/arm/tcg/t32.decode        |   26 +-
 target/arm/tcg/cpu32.c           |   73 +
 target/arm/tcg/gengvec.c         | 2306 ++++++++++++++++
 target/arm/tcg/gengvec64.c       |  367 +++
 target/arm/tcg/neon_helper.c     |  511 +---
 target/arm/tcg/translate-a64.c   | 4440 ++++++++++--------------------
 target/arm/tcg/translate-neon.c  |  254 +-
 target/arm/tcg/translate-sve.c   |  145 +-
 target/arm/tcg/translate-vfp.c   |   93 +-
 target/arm/tcg/translate.c       | 1649 +----------
 target/arm/tcg/vec_helper.c      |  349 ++-
 target/arm/vfp_helper.c          |   30 -
 target/arm/tcg/meson.build       |    2 +
 20 files changed, 5446 insertions(+), 5563 deletions(-)
 create mode 100644 target/arm/tcg/gengvec.c
 create mode 100644 target/arm/tcg/gengvec64.c

-- 
2.34.1

Re: [PATCH v2 00/67] target/arm: Convert a64 advsimd to decodetree (part 1)

Posted by Peter Maydell 6 months ago

On Sat, 25 May 2024 at 00:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> In the process, convert more code to gvec as well -- I will need
> the gvec code for implementing SME2.  I guess this is about 1/3
> of the job done, but there's no reason to wait until the patch
> set is completely unwieldy.
>
> Changes for v2:
>   * Fix existing RISU failures vs neoverse-n1.
>   * Introduce vfp_load_reg16, fixing a regression wrt VNEG (scalar, hp).
>   * Fix typo in SUQADD vectorization.
>   * Two more conversions.

I've now finished review for this. I pulled 2 and 4-36 into my
target-arm pullreq currently on list. To save you having to
go through a lot of replies that are just me giving r-by tags,
the patches I had comments/questions on are:
  3, 38, 44, 46, 50, 65, 66.

thanks
-- PMM