Thanks for the reviews.
Queued in gitlab.com/danielhb/qemu/tree/ppc-next.
Daniel
On 10/19/22 09:50, Lucas Mateus Castro(alqotel) wrote:
> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>
> Patches missing review: 12
>
> v2 -> v3:
> - Used ctpop in i32 and i64 vprtyb
> - Changed gvec set up in xvtstdc[ds]p
>
> v1 -> v2:
> - Implemented instructions with fni4/fni8 and dropped the helper:
> * VSUBCUW
> * VADDCUW
> * VPRTYBW
> * VPRTYBD
> - Reworked patch12 to only use gvec implementation with a few
> immediates.
> - Used bitsel_vec on patch9
> - Changed vec variables to tcg_constant_vec when possible
>
> This patch series moves some instructions from the legacy decoder to
> decodetree and translates said instructions with gvec. In some cases the
> gvec implementation ended up bigger, more complex and slower, so those
> instructions were only moved to decodetree.
>
> Each patch includes a comparison of the execution time before and after
> the patch is applied. Each result is the sum of 10 executions.
>
> The program used to time the execution worked like this:
>
>     clock_t start = clock();
>     for (int i = 0; i < LOOP; i++) {
>         asm (
>             load values in registers, between 2 and 3 instructions
>             ".rept REPT\n\t"
>             "INSTRUCTION registers\n\t"
>             ".endr\n\t"
>             save result from register, 1 instruction
>         );
>     }
>     clock_t end = clock();
>     printf("INSTRUCTION rept=REPT loop=LOOP, time taken: %.12lf\n",
>            ((double)(end - start))/ CLOCKS_PER_SEC);
>
> Where the rept column is the value used in .rept in the inline assembly
> and the loop column is the value used for the for loop. All of those tests
> were executed on a Power9. When comparing the TCG ops, the data used was
> gathered using '-d op' and '-d op_opt'.
>
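
For reference, the timing method described above could be sketched in
portable C roughly as follows. This is only an illustration, not the
program used for the series: the LOOP and REPT values, the function names
and the arithmetic in workload() are all assumptions, and workload() merely
stands in for the inline-asm block so the sketch builds on any host.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical values; the cover letter varies these per test. */
#define LOOP 1000
#define REPT 8

/*
 * Portable placeholder for the asm block. A real test would instead emit
 * the instruction under test REPT times with ".rept REPT" / ".endr".
 */
static uint64_t workload(uint64_t x)
{
    for (int r = 0; r < REPT; r++) {
        x = x * UINT64_C(6364136223846793005) + UINT64_C(1442695040888963407);
    }
    return x;
}

/* Time one run of LOOP iterations, in seconds of CPU time. */
static double time_one_run(uint64_t *sink)
{
    uint64_t acc = *sink;
    clock_t start = clock();
    for (int i = 0; i < LOOP; i++) {
        acc = workload(acc);
    }
    clock_t end = clock();
    *sink = acc; /* keep the result live so the loop is not optimized away */
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Sum the time of several runs; the per-patch numbers use 10 runs. */
static double time_runs(int runs, uint64_t *sink)
{
    double total = 0.0;
    for (int n = 0; n < runs; n++) {
        total += time_one_run(sink);
    }
    return total;
}

/* Print a result line in the same shape as the printf quoted above. */
static void report(const char *insn, double seconds)
{
    printf("%s rept=%d loop=%d, time taken: %.12lf\n",
           insn, REPT, LOOP, seconds);
}
```

Calling time_runs(10, &sink) and passing the result to report() gives one
summed figure per instruction, which is the kind of number that can be
compared before and after a patch.
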
> Lucas Mateus Castro (alqotel) (12):
> target/ppc: Moved VMLADDUHM to decodetree and use gvec
> target/ppc: Move VMH[R]ADDSHS instruction to decodetree
> target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec
> target/ppc: Move VNEG[WD] to decodtree and use gvec
> target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
> target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec
> target/ppc: Move VABSDU[BHW] to decodetree and use gvec
> target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P
> target/ppc: Use gvec to decode XVCPSGN[SD]P
> target/ppc: Moved XVTSTDC[DS]P to decodetree
> target/ppc: Moved XSTSTDC[QDS]P to decodetree
> target/ppc: Use gvec to decode XVTSTDC[DS]P
>
> target/ppc/fpu_helper.c | 137 +++++-----
> target/ppc/helper.h | 42 ++--
> target/ppc/insn32.decode | 50 ++++
> target/ppc/int_helper.c | 107 ++------
> target/ppc/translate.c | 1 -
> target/ppc/translate/vmx-impl.c.inc | 352 ++++++++++++++++++++++----
> target/ppc/translate/vmx-ops.c.inc | 15 +-
> target/ppc/translate/vsx-impl.c.inc | 372 +++++++++++++++++++++++-----
> target/ppc/translate/vsx-ops.c.inc | 21 --
> 9 files changed, 771 insertions(+), 326 deletions(-)
>