This is working toward improving atomicity within TCG, especially
with respect to Arm FEAT_LSE2, which guarantees that any operation
that does not cross a 16-byte boundary is treated atomically.
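
To make the 16-byte rule concrete, here is a minimal sketch of the
boundary test it implies. The helper name within_16_byte_window is
hypothetical and is not part of this series; it only models the
"does not cross a 16-byte boundary" condition stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not from this series):
 * return true if the access [addr, addr + size) lies entirely within
 * one naturally aligned 16-byte window.
 */
static bool within_16_byte_window(uint64_t addr, size_t size)
{
    if (size == 0 || size > 16) {
        return false;
    }
    /* Offset within the window plus the size must not exceed 16. */
    return (addr & 15) + size <= 16;
}
```

Under this model an 8-byte access at offset 8 stays inside the window,
while an 8-byte access at offset 12 crosses into the next one.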

That goal is somewhat down the road.  This patch set contains two
items: paired register allocation and TCGv_i128 usage with helpers.
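
As a rough mental model of the second item, a 128-bit value can be
pictured as two 64-bit halves, which is what the names
tcg_gen_extr_i128_i64 and tcg_gen_concat_i64_i128 suggest. The sketch
below is plain C, not TCG code; Pair128, pair128_concat and
pair128_extr are hypothetical names, and the lo/hi argument order is
an assumption.

```c
#include <stdint.h>

/* Hypothetical two-word model of a 128-bit value; not QEMU's Int128. */
typedef struct {
    uint64_t lo;   /* bits 0..63 */
    uint64_t hi;   /* bits 64..127 */
} Pair128;

/* Model of "concat": build a 128-bit value from two 64-bit halves. */
static Pair128 pair128_concat(uint64_t lo, uint64_t hi)
{
    Pair128 v = { .lo = lo, .hi = hi };
    return v;
}

/* Model of "extr": split a 128-bit value back into its two halves. */
static void pair128_extr(uint64_t *lo, uint64_t *hi, Pair128 v)
{
    *lo = v.lo;
    *hi = v.hi;
}
```

Splitting a value with pair128_extr and rejoining it with
pair128_concat gives back the original halves.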

The next step will be putting these two together to provide atomic
128-bit load/store operations within TCG, via e.g. Arm LSE2 ldp,
Power lq, or S390x lpq, all of which require allocating a pair
of registers.  (Intel will require that I go through AVX, which is
a bit of a complication, but I'll figure that out.)  The same
operations will of course also be provided separately via the helpers
used by the slow path.
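
For readers unfamiliar with the constraint, here is a small sketch of
what "allocating a pair of registers" can mean. It assumes an
even/odd adjacency rule (as S390x uses for its register pairs) and a
32-bit free-register bitmask. Both the function name and the bitmask
representation are hypothetical; this is not the allocator added by
this series.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: find the lowest even register index r such that
 * both r and r + 1 are free in free_mask (bit n set = register n free).
 * Returns -1 if no even/odd pair is available.
 */
static int find_free_even_odd_pair(uint32_t free_mask)
{
    for (int r = 0; r < 32; r += 2) {
        uint32_t pair = UINT32_C(3) << r;   /* bits r and r + 1 */
        if ((free_mask & pair) == pair) {
            return r;
        }
    }
    return -1;
}
```

Note that two adjacent free registers are not enough under this rule:
registers 1 and 2 being free does not form a usable pair, because the
pair must start on an even index.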

I have additional patches to target/s390x to test Int128, but those
should be posted separately; this patch set is large enough already.

Changes for v2:
* Fixes and r-b (philmd).
* Include i386 atomic16 patch, which avoids minor conflicts later.
* Split a few larger patches.
* Bug fixes for TCI.

r~

Based-on: 20221017062445.563431-1-richard.henderson@linaro.org
("[PATCH 0/3] tcg/sparc: Remove support for sparc32plus")

Richard Henderson (36):
include/qemu/atomic128: Support 16-byte atomic read/write for Intel AVX
tcg: Tidy tcg_reg_alloc_op
tcg: Introduce paired register allocation
tcg/s390x: Use register pair allocation for div and mulu2
tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
meson: Move CONFIG_TCG_INTERPRETER to config_host
tcg: Remove TCG_TARGET_STACK_GROWSUP
accel/tcg: Set cflags_next_tb in cpu_common_initfn
target/sparc: Avoid TCGV_{LOW,HIGH}
tcg: Move TCG_{LOW,HIGH} to tcg-internal.h
tcg: Add temp_subindex to TCGTemp
tcg: Simplify calls to temp_sync vs mem_coherent
tcg: Allocate TCGTemp pairs in host memory order
tcg: Move TCG_TYPE_COUNT outside enum
tcg: Introduce tcg_type_size
tcg: Introduce TCGCallReturnKind and TCGCallArgumentKind
tcg: Replace TCG_TARGET_CALL_ALIGN_ARGS with TCG_TARGET_CALL_ARG_I64
tcg: Replace TCG_TARGET_EXTEND_ARGS with TCG_TARGET_CALL_ARG_I32
tcg: Use TCG_CALL_ARG_EVEN for TCI special case
tcg: Reorg function calls
tcg: Move ffi_cif pointer into TCGHelperInfo
tcg: Add TCGHelperInfo argument to tcg_out_call
tcg: Define TCG_TYPE_I128 and related helper macros
tcg: Add TCG_CALL_{RET,ARG}_NORMAL_4
tcg: Allocate objects contiguously in temp_allocate_frame
tcg: Introduce tcg_out_addi_ptr
tcg: Add TCG_CALL_{RET,ARG}_BY_REF
tcg: Introduce tcg_target_call_oarg_reg
tcg: Add TCG_CALL_RET_BY_VEC
include/qemu/int128: Use Int128 structure for TCI
tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg/tci: Fix big-endian return register ordering
tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg: Add temp allocation for TCGv_i128
tcg: Add tcg_gen_extr_i128_i64, tcg_gen_concat_i64_i128
meson.build | 4 +-
include/exec/helper-head.h | 7 +
include/qemu/atomic128.h | 74 +-
include/qemu/int128.h | 18 +-
include/tcg/tcg-op.h | 3 +
include/tcg/tcg-opc.h | 4 +
include/tcg/tcg.h | 97 +-
tcg/aarch64/tcg-target.h | 6 +-
tcg/arm/tcg-target-con-set.h | 7 +-
tcg/arm/tcg-target-con-str.h | 2 +
tcg/arm/tcg-target.h | 6 +-
tcg/i386/tcg-target.h | 12 +
tcg/loongarch64/tcg-target.h | 5 +-
tcg/mips/tcg-target.h | 6 +-
tcg/riscv/tcg-target.h | 10 +-
tcg/s390x/tcg-target-con-set.h | 4 +-
tcg/s390x/tcg-target-con-str.h | 8 +-
tcg/s390x/tcg-target.h | 5 +-
tcg/sparc64/tcg-target.h | 5 +-
tcg/tcg-internal.h | 72 +-
tcg/tci/tcg-target.h | 10 +
hw/core/cpu-common.c | 1 +
target/sparc/translate.c | 21 +-
tcg/tcg-op.c | 45 +-
tcg/tcg.c | 1755 ++++++++++++++++++++++--------
tcg/tci.c | 66 +-
util/atomic128.c | 67 ++
util/int128.c | 42 +
tcg/aarch64/tcg-target.c.inc | 36 +-
tcg/arm/tcg-target.c.inc | 67 +-
tcg/i386/tcg-target.c.inc | 65 +-
tcg/loongarch64/tcg-target.c.inc | 24 +-
tcg/mips/tcg-target.c.inc | 20 +-
tcg/ppc/tcg-target.c.inc | 56 +-
tcg/riscv/tcg-target.c.inc | 24 +-
tcg/s390x/tcg-target.c.inc | 71 +-
tcg/sparc64/tcg-target.c.inc | 22 +-
tcg/tci/tcg-target.c.inc | 31 +-
util/meson.build | 1 +
39 files changed, 2084 insertions(+), 695 deletions(-)
create mode 100644 util/atomic128.c
--
2.34.1