Hi James,
Here's a reorg of your multiplier instructions, expanding inline.
The tcg multiply optimization folds away the multiplies in the idiom
where a sequence of v3mulu instructions with 0 and 1 operands is used
to read the MPL and P registers:

    v3mulu  p0, $0, $0
    v3mulu  p1, $0, $0
    v3mulu  p2, $0, $0
    v3mulu  p3, $0, $0
    v3mulu  p4, $0, $0
    v3mulu  p5, $0, $0
    ori     $3, $0, 1
    v3mulu  mpl0, $3, $0
    v3mulu  mpl1, $0, $0
    v3mulu  mpl2, $0, $0
    v3mulu  mpl3, $0, $0
    v3mulu  mpl4, $0, $0
    v3mulu  mpl5, $0, $0

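
To illustrate the kind of folding meant here, a minimal value-level sketch
follows. It is not the code in tcg/optimize.c: the function name
fold_mul2_const and its signature are hypothetical, and it works on plain
integers rather than on TCG ops. It only shows why a double-word multiply
with a known 0 or 1 operand needs no multiply at all.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Compute the 128-bit product c * b as a (lo, hi) pair without
 * multiplying, when the known operand c is 0 or 1.  Returns false
 * if c is any other value, in which case a real multiply is needed.
 */
static bool fold_mul2_const(uint64_t c, uint64_t b, bool is_signed,
                            uint64_t *lo, uint64_t *hi)
{
    if (c == 0) {
        /* x * 0 == 0: both halves are zero. */
        *lo = 0;
        *hi = 0;
        return true;
    }
    if (c == 1) {
        /* x * 1 == x: low half is the other operand. */
        *lo = b;
        /*
         * High half is the extension of b to 128 bits: zero for an
         * unsigned multiply, copies of the sign bit for a signed one.
         */
        *hi = (is_signed && (b >> 63)) ? UINT64_MAX : 0;
        return true;
    }
    return false;
}
```

In terms of ops, the 0 case becomes two moves of constant zero, and the
1 case becomes a move for the low half plus either a zero or a sign
replication for the high half.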
r~

James Hilliard (7):
  target/mips: add Octeon multiplier state
  target/mips: add Octeon MTM instructions
  target/mips: add Octeon MTP instructions
  target/mips: add Octeon VMULU instruction
  target/mips: add Octeon VMM0 instruction
  target/mips: add Octeon V3MULU instruction
  target/mips: add Octeon QMAC instructions

Richard Henderson (2):
  tcg: Introduce tcg_gen_addN_i64
  tcg: Optimize INDEX_op_mul[us]2 for 0 and 1

 include/tcg/tcg-op-common.h        |   1 +
 target/mips/cpu.h                  |  12 ++
 target/mips/tcg/translate.h        |   2 +
 target/mips/system/machine.c       |  33 +++++
 target/mips/tcg/octeon_translate.c | 208 +++++++++++++++++++++++++++++
 target/mips/tcg/translate.c        |  18 +++
 tcg/optimize.c                     |  90 ++++++++-----
 tcg/tcg-op.c                       |  42 ++++++
 target/mips/tcg/octeon.decode      |  20 +++
 9 files changed, 395 insertions(+), 31 deletions(-)

--
2.43.0