Currently we support optimizing redundant zero extensions, which I
think was done with x86 and aarch64 in mind, since those hosts
zero-extend all 32-bit operations into the 64-bit register. But
targets like Alpha, MIPS, and RISC-V perform sign extensions instead,
so this series teaches the optimizer about those as well.
But before getting to that, it splits up the quite massive
tcg_optimize function.
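To illustrate the idea with a standalone sketch (illustrative only:
the s_mask name and its exact semantics below are mine, not the
series'): track, for each value, a left-aligned mask of high bits
known to equal the sign bit, and fold ext32s_i64 to a plain mov
whenever that mask already covers bits 63 down to 31.

    /* Standalone sketch of the sign-bit tracking idea; names and
     * details are illustrative, not QEMU's actual implementation.
     * s_mask: a left-aligned mask of bits known to equal the
     * value's sign bit.
     */
    #include <stdint.h>
    #include <stdio.h>

    /* ext32s_i64 is a no-op iff bits 63..31 are already known to
     * be sign copies, i.e. bits 32..63 already replicate bit 31.
     */
    static int ext32s_is_redundant(uint64_t s_mask)
    {
        return (~s_mask & 0xffffffff80000000ull) == 0;
    }

    int main(void)
    {
        /* After an ext32s, bits 63..31 replicate bit 31. */
        uint64_t after_ext32s = 0xffffffff80000000ull;
        /* After a 64-bit load, no high bits are known. */
        uint64_t after_ld64 = 0;

        printf("second ext32s redundant: %d\n",
               ext32s_is_redundant(after_ext32s)); /* 1 -> fold to mov */
        printf("ext32s of ld64 redundant: %d\n",
               ext32s_is_redundant(after_ld64));   /* 0 -> keep */
        return 0;
    }

With that invariant tracked, extensions that sign-extending targets
have already performed can simply be dropped.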
r~
Changes for v3:
* Fix brcond2 bug (luis)
* Fix fold_count_zeros typo.
* Rebase and fix up int128.h conflict.
Changes for v2:
* Rebase, adjusting MemOpIdx renaming.
* Apply r-b and some feedback (f4bug).
Patches lacking review:
17-tcg-optimize-Split-out-fold_brcond2.patch
20-tcg-optimize-Split-out-fold_mulu2_i32.patch
21-tcg-optimize-Split-out-fold_addsub2_i32.patch
22-tcg-optimize-Split-out-fold_movcond.patch
23-tcg-optimize-Split-out-fold_extract2.patch
24-tcg-optimize-Split-out-fold_extract-fold_sextract.patch
25-tcg-optimize-Split-out-fold_deposit.patch
26-tcg-optimize-Split-out-fold_count_zeros.patch
28-tcg-optimize-Split-out-fold_dup-fold_dup2.patch
33-tcg-optimize-Add-type-to-OptContext.patch
34-tcg-optimize-Split-out-fold_to_not.patch
35-tcg-optimize-Split-out-fold_sub_to_neg.patch
36-tcg-optimize-Split-out-fold_xi_to_x.patch
37-tcg-optimize-Split-out-fold_ix_to_i.patch
38-tcg-optimize-Split-out-fold_masks.patch
39-tcg-optimize-Expand-fold_mulu2_i32-to-all-4-arg-m.patch
41-tcg-optimize-Sink-commutative-operand-swapping-in.patch
42-tcg-optimize-Add-more-simplifications-for-orc.patch
43-tcg-optimize-Stop-forcing-z_mask-to-garbage-for-3.patch
44-tcg-optimize-Optimize-sign-extensions.patch
46-tcg-optimize-Propagate-sign-info-for-setcond.patch
47-tcg-optimize-Propagate-sign-info-for-bit-counting.patch
48-tcg-optimize-Propagate-sign-info-for-shifting.patch
Richard Henderson (48):
tcg/optimize: Rename "mask" to "z_mask"
tcg/optimize: Split out OptContext
tcg/optimize: Remove do_default label
tcg/optimize: Change tcg_opt_gen_{mov,movi} interface
tcg/optimize: Move prev_mb into OptContext
tcg/optimize: Split out init_arguments
tcg/optimize: Split out copy_propagate
tcg/optimize: Split out fold_call
tcg/optimize: Drop nb_oargs, nb_iargs locals
tcg/optimize: Change fail return for do_constant_folding_cond*
tcg/optimize: Return true from tcg_opt_gen_{mov,movi}
tcg/optimize: Split out finish_folding
tcg/optimize: Use a boolean to avoid a mass of continues
tcg/optimize: Split out fold_mb, fold_qemu_{ld,st}
tcg/optimize: Split out fold_const{1,2}
tcg/optimize: Split out fold_setcond2
tcg/optimize: Split out fold_brcond2
tcg/optimize: Split out fold_brcond
tcg/optimize: Split out fold_setcond
tcg/optimize: Split out fold_mulu2_i32
tcg/optimize: Split out fold_addsub2_i32
tcg/optimize: Split out fold_movcond
tcg/optimize: Split out fold_extract2
tcg/optimize: Split out fold_extract, fold_sextract
tcg/optimize: Split out fold_deposit
tcg/optimize: Split out fold_count_zeros
tcg/optimize: Split out fold_bswap
tcg/optimize: Split out fold_dup, fold_dup2
tcg/optimize: Split out fold_mov
tcg/optimize: Split out fold_xx_to_i
tcg/optimize: Split out fold_xx_to_x
tcg/optimize: Split out fold_xi_to_i
tcg/optimize: Add type to OptContext
tcg/optimize: Split out fold_to_not
tcg/optimize: Split out fold_sub_to_neg
tcg/optimize: Split out fold_xi_to_x
tcg/optimize: Split out fold_ix_to_i
tcg/optimize: Split out fold_masks
tcg/optimize: Expand fold_mulu2_i32 to all 4-arg multiplies
tcg/optimize: Expand fold_addsub2_i32 to 64-bit ops
tcg/optimize: Sink commutative operand swapping into fold functions
tcg/optimize: Add more simplifications for orc
tcg/optimize: Stop forcing z_mask to "garbage" for 32-bit values
tcg/optimize: Optimize sign extensions
tcg/optimize: Propagate sign info for logical operations
tcg/optimize: Propagate sign info for setcond
tcg/optimize: Propagate sign info for bit counting
tcg/optimize: Propagate sign info for shifting
tcg/optimize.c | 2600 +++++++++++++++++++++++++++++-------------------
1 file changed, 1583 insertions(+), 1017 deletions(-)
--
2.25.1