[PATCH v5 00/11] Hexagon (target/hexagon) performance and bug fixes

Taylor Simpson posted 11 patches 1 year, 5 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20221108162906.3166-1-tsimpson@quicinc.com
Maintainers: Taylor Simpson <tsimpson@quicinc.com>
target/hexagon/cpu.h                |  14 +-
target/hexagon/gen_tcg.h            | 414 +++++++++++++++++++++++++++-
target/hexagon/gen_tcg_hvx.h        |   6 +-
target/hexagon/insn.h               |   9 +-
target/hexagon/macros.h             |  16 +-
target/hexagon/mmvec/macros.h       |   4 +-
target/hexagon/translate.h          |  20 +-
target/hexagon/decode.c             |  15 +-
target/hexagon/genptr.c             | 392 +++++++++++++++++++++++++-
target/hexagon/op_helper.c          |  28 +-
target/hexagon/translate.c          | 231 +++++++++++-----
tests/tcg/hexagon/hvx_misc.c        |  72 +++++
tests/tcg/hexagon/usr.c             |  34 ++-
target/hexagon/gen_helper_funcs.py  |  13 +-
target/hexagon/gen_helper_protos.py |  14 +-
target/hexagon/gen_tcg_funcs.py     |  38 ++-
target/hexagon/hex_common.py        |  29 +-
17 files changed, 1211 insertions(+), 138 deletions(-)
1)
Performance improvement
Add pkt and insn to DisasContext
Many functions need information from all three structures (the
DisasContext, the packet, and the instruction), so merge them together.

2)
Bug fix
Fix predicated assignment to .tmp and .cur


3)
Performance improvement
Add overrides for S2_asr_r_r_sat/S2_asl_r_r_sat
These instructions will not be handled by idef-parser

4-11)
The final 8 patches improve change-of-flow handling.

Currently, we set the PC to a new address before exiting a TB.  The
ultimate goal is to use direct block chaining.  However, several steps
are needed along the way.

4)
When a packet has more than one change-of-flow (COF) instruction, only
the first taken one takes effect.  The runtime bookkeeping that tracks
this is needed only for such packets, so we skip it when a packet has a
single COF instruction.
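
As a rough illustration of the first-taken rule, here is a toy model in
plain C.  It is not code from the series, and every name in it is
hypothetical.  It only shows why the "already taken" flag matters for
multi-COF packets and can be dropped otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    bool branch_taken;  /* runtime flag, needed only for multi-COF packets */
    uint32_t next_pc;   /* where execution continues after the packet */
} ToyCofCpu;

/* Multi-COF packet: a later taken branch must not override an earlier one. */
void toy_branch_multi(ToyCofCpu *cpu, bool pred, uint32_t target)
{
    if (pred && !cpu->branch_taken) {
        cpu->branch_taken = true;
        cpu->next_pc = target;
    }
}

/* Single-COF packet: nothing can have been taken before, so no flag. */
void toy_branch_single(ToyCofCpu *cpu, bool pred, uint32_t target)
{
    if (pred) {
        cpu->next_pc = target;
    }
}

/* Run a packet with two COF instructions and return the resulting PC. */
uint32_t toy_run_two_cof(uint32_t fallthrough,
                         bool pred0, uint32_t target0,
                         bool pred1, uint32_t target1)
{
    ToyCofCpu cpu = { .branch_taken = false, .next_pc = fallthrough };
    toy_branch_multi(&cpu, pred0, target0);
    toy_branch_multi(&cpu, pred1, target1);
    return cpu.next_pc;
}

/* Run a packet with one COF instruction and return the resulting PC. */
uint32_t toy_run_one_cof(uint32_t fallthrough, bool pred, uint32_t target)
{
    ToyCofCpu cpu = { .branch_taken = false, .next_pc = fallthrough };
    toy_branch_single(&cpu, pred, target);
    return cpu.next_pc;
}
```

With both predicates true, toy_run_two_cof() returns the first target;
the single-COF path never reads or writes the flag.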

5, 6)
Remove PC and next_PC from the runtime state and always use a
translation-time constant.  Note that next_PC is used by call instructions
to set LR and by conditional COF instructions to set the fall-through
address.
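
The idea can be sketched in plain C as follows.  This is a toy model,
not code from the series; the structure and function names are
hypothetical.  The point is that both the call's LR value and the
fall-through address are computed from data the translator already has
(packet address and size), not read from CPU state at run time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    uint32_t pkt_pc;     /* address of the packet, known while translating */
    uint32_t pkt_bytes;  /* encoded size of the packet, also known then */
} ToyPktInfo;

typedef struct {
    uint32_t lr;         /* value written to the link register */
    uint32_t dest;       /* address the packet transfers control to */
} ToyCallResult;

/* Fall-through address: derived from translation-time data only. */
uint32_t toy_next_pc(const ToyPktInfo *pkt)
{
    return pkt->pkt_pc + pkt->pkt_bytes;
}

/* Direct call: LR receives the constant next_PC. */
ToyCallResult toy_call(const ToyPktInfo *pkt, int32_t pc_rel)
{
    ToyCallResult r;
    r.lr = toy_next_pc(pkt);
    r.dest = pkt->pkt_pc + (uint32_t)pc_rel;
    return r;
}

/* Conditional jump: the not-taken path uses the constant next_PC. */
uint32_t toy_cond_jump(const ToyPktInfo *pkt, bool pred, int32_t pc_rel)
{
    return pred ? pkt->pkt_pc + (uint32_t)pc_rel : toy_next_pc(pkt);
}
```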

7, 8, 9)
Add helper overrides for COF instructions.  In particular, we must
distinguish those that use a PC-relative address for the destination.
These are candidates for direct block chaining later.

10)
Use direct block chaining for packets that have a single PC-relative
COF instruction.  Instead of generating the code while processing the
instruction, we record the effect in DisasContext and generate the code
during gen_end_tb.
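
A minimal sketch of this record-then-emit split, again as a toy model in
plain C.  The types, the op names, and the two-slot layout are
assumptions made for illustration; they are not the TCG calls the patch
uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef enum {
    TOY_OP_GOTO_TB,   /* chainable exit to a constant address */
    TOY_OP_LOOKUP_PC  /* non-chainable exit: look up the runtime PC */
} ToyOpKind;

typedef struct {
    ToyOpKind kind;
    int slot;          /* which chaining slot the exit uses, -1 if none */
    uint32_t dest;     /* constant destination (unused for LOOKUP_PC) */
} ToyOp;

typedef struct {
    bool has_direct_cof;  /* recorded while processing the instruction */
    bool conditional;
    uint32_t target;      /* PC-relative destination, already resolved */
    uint32_t next_pc;     /* fall-through address */
} ToyChainCtx;

/* Instruction processing: record the effect, emit nothing yet. */
void toy_record_direct_jump(ToyChainCtx *ctx, bool conditional,
                            uint32_t target)
{
    ctx->has_direct_cof = true;
    ctx->conditional = conditional;
    ctx->target = target;
}

/* End of TB: emit the exits from what was recorded.  Returns op count. */
int toy_end_tb(const ToyChainCtx *ctx, ToyOp out[2])
{
    int n = 0;
    if (!ctx->has_direct_cof) {
        out[n++] = (ToyOp){ TOY_OP_LOOKUP_PC, -1, 0 };
        return n;
    }
    out[n++] = (ToyOp){ TOY_OP_GOTO_TB, 0, ctx->target };
    if (ctx->conditional) {
        /* The not-taken path chains to the fall-through address. */
        out[n++] = (ToyOp){ TOY_OP_GOTO_TB, 1, ctx->next_pc };
    }
    return n;
}
```

Deferring the emission means the end-of-TB code sees the whole packet
before it decides which kind of exit to generate.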

11)
Use direct block chaining for tight loops.  We look for TBs that end
with an endloop0 that will branch back to the TB start address.
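
The tight-loop check can be pictured with the toy model below.  It is
plain C with hypothetical names, and the endloop0 behavior it assumes
(loop back to the start address while more than one iteration remains,
otherwise fall through) is a simplification stated here as an
assumption, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    uint32_t lc0;  /* loop count */
    uint32_t sa0;  /* loop start address */
} ToyLoopRegs;

/* A TB is a tight loop if its endloop0 branches back to the TB start. */
bool toy_is_tight_loop(uint32_t tb_start, bool ends_with_endloop0,
                       uint32_t sa0)
{
    return ends_with_endloop0 && sa0 == tb_start;
}

/* Assumed endloop0 behavior: loop back while iterations remain. */
uint32_t toy_endloop0(ToyLoopRegs *regs, uint32_t next_pc)
{
    if (regs->lc0 > 1) {
        regs->lc0--;
        return regs->sa0;
    }
    return next_pc;
}

/* Count how many times a tight-loop TB runs before falling through. */
int toy_count_iterations(uint32_t tb_start, uint32_t next_pc, uint32_t lc0)
{
    ToyLoopRegs regs = { .lc0 = lc0, .sa0 = tb_start };
    uint32_t pc = tb_start;
    int runs = 0;
    while (pc == tb_start) {
        runs++;                              /* the TB body executes once */
        pc = toy_endloop0(&regs, next_pc);   /* back to start, or exit */
    }
    return runs;
}
```

Every loop-back in toy_count_iterations() targets the same constant
address, which is what makes such a TB a candidate for chaining to
itself.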


**** Changes in V5 ****
Address feedback from Richard Henderson <richard.henderson@linaro.org>
    Handle out-of-range shifts in gen_shl_sat

**** Changes in V4 ****
Address feedback from Richard Henderson <richard.henderson@linaro.org>
    Rewrite gen_sat_i64 to be branchless
    Rewrite gen_sar to be branchless
    Rewrite gen_shl_sat to be branchless
    Pass branch condition instead of doing xor when false
    Use hex_branch_taken to hold the predicate for direct branches
    Remove HexStateFlags and use "hw/registerfields.h" macros instead
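
For readers unfamiliar with the branchless style mentioned in the V4 and
V5 notes, here is one common way to write a saturating shift without
conditional jumps.  It is a plain C stand-in with hypothetical names,
not the TCG code from the series, and it does not model the full
S2_asl_r_r_sat semantics.  Clamping shift counts above 32 is an
assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: plain C stand-ins, not the TCG code from the series. */

/* Saturate a 64-bit value to the signed 32-bit range using masks. */
int32_t toy_sat32(int64_t v, bool *ovf)
{
    int64_t too_big = -(int64_t)(v > INT32_MAX);    /* all ones if above */
    int64_t too_small = -(int64_t)(v < INT32_MIN);  /* all ones if below */
    int64_t clipped = too_big | too_small;
    int64_t r = (v & ~clipped)
              | ((int64_t)INT32_MAX & too_big)
              | ((int64_t)INT32_MIN & too_small);
    *ovf = clipped != 0;
    return (int32_t)r;
}

/* Saturating left shift; counts above 32 are clamped (an assumption). */
int32_t toy_shl_sat(int32_t a, uint32_t count, bool *ovf)
{
    uint32_t big = -(uint32_t)(count > 32);     /* all ones if out of range */
    uint32_t k = (count & ~big) | (32u & big);  /* min(count, 32) */
    /* A 32-bit value shifted by at most 32 always fits in 64 bits. */
    int64_t wide = (int64_t)((uint64_t)(int64_t)a << k);
    return toy_sat32(wide, ovf);
}
```

Doing the shift in 64 bits keeps every intermediate value defined, so
the out-of-range case needs no separate code path.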

**** Changes in V3 ****
Merge previously emailed patches into single series
Merge functions that check if vreg is preloaded

**** Changes in V2 ****
Update test case to use both true and false predicates
Add fix for .cur
Simplify test in need_pkt_has_multi_cof
Address feedback from Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
    Rearrange new-value-jump overrides
    Simplify gen_write_new_pc_addr



Taylor Simpson (11):
  Hexagon (target/hexagon) Add pkt and insn to DisasContext
  Hexagon (target/hexagon) Fix predicated assignment to .tmp and .cur
  Hexagon (target/hexagon) Add overrides for
    S2_asr_r_r_sat/S2_asl_r_r_sat
  Hexagon (target/hexagon) Only use branch_taken when packet has multi
    cof
  Hexagon (target/hexagon) Remove PC from the runtime state
  Hexagon (target/hexagon) Remove next_PC from runtime state
  Hexagon (target/hexagon) Add overrides for direct call instructions
  Hexagon (target/hexagon) Add overrides for compound compare and jump
  Hexagon (target/hexagon) Add overrides for various forms of jump
  Hexagon (target/hexagon) Use direct block chaining for direct
    jump/branch
  Hexagon (target/hexagon) Use direct block chaining for tight loops

-- 
2.17.1