Old cover letter (tags/pull-tcg-20230511):

The following changes since commit d530697ca20e19f7a626f4c1c8b26fccd0dc4470:

  Merge tag 'pull-testing-updates-100523-1' of https://gitlab.com/stsquad/qemu into staging (2023-05-10 16:43:01 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230511

for you to fetch changes up to b2d4d6616c22325dff802e0a35092167f2dc2268:

  target/loongarch: Do not include tcg-ldst.h (2023-05-11 06:06:04 +0100)

----------------------------------------------------------------
target/m68k: Fix gen_load_fp regression
accel/tcg: Ensure fairness with icount
disas: Move disas.c into the target-independent source sets
tcg: Use common routines for calling slow path helpers
tcg/*: Cleanups to qemu_ld/st constraints
tcg: Remove TARGET_ALIGNED_ONLY
accel/tcg: Reorg system mode load/store helpers

----------------------------------------------------------------
Jamie Iles (2):
      cpu: expose qemu_cpu_list_lock for lock-guard use
      accel/tcg/tcg-accel-ops-rr: ensure fairness with icount

Richard Henderson (49):
      target/m68k: Fix gen_load_fp for OS_LONG
      accel/tcg: Fix atomic_mmu_lookup for reads
      disas: Fix tabs and braces in disas.c
      disas: Move disas.c to disas/
      disas: Remove target_ulong from the interface
      disas: Remove target-specific headers
      tcg/i386: Introduce prepare_host_addr
      tcg/i386: Use indexed addressing for softmmu fast path
      tcg/aarch64: Introduce prepare_host_addr
      tcg/arm: Introduce prepare_host_addr
      tcg/loongarch64: Introduce prepare_host_addr
      tcg/mips: Introduce prepare_host_addr
      tcg/ppc: Introduce prepare_host_addr
      tcg/riscv: Introduce prepare_host_addr
      tcg/s390x: Introduce prepare_host_addr
      tcg: Add routines for calling slow-path helpers
      tcg/i386: Convert tcg_out_qemu_ld_slow_path
      tcg/i386: Convert tcg_out_qemu_st_slow_path
      tcg/aarch64: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/arm: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/loongarch64: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/ppc: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/riscv: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/s390x: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/loongarch64: Simplify constraints on qemu_ld/st
      tcg/mips: Remove MO_BSWAP handling
      tcg/mips: Reorg tlb load within prepare_host_addr
      tcg/mips: Simplify constraints on qemu_ld/st
      tcg/ppc: Reorg tcg_out_tlb_read
      tcg/ppc: Adjust constraints on qemu_ld/st
      tcg/ppc: Remove unused constraints A, B, C, D
      tcg/ppc: Remove unused constraint J
      tcg/riscv: Simplify constraints on qemu_ld/st
      tcg/s390x: Use ALGFR in constructing softmmu host address
      tcg/s390x: Simplify constraints on qemu_ld/st
      target/mips: Add MO_ALIGN to gen_llwp, gen_scwp
      target/mips: Add missing default_tcg_memop_mask
      target/mips: Use MO_ALIGN instead of 0
      target/mips: Remove TARGET_ALIGNED_ONLY
      target/nios2: Remove TARGET_ALIGNED_ONLY
      target/sh4: Use MO_ALIGN where required
      target/sh4: Remove TARGET_ALIGNED_ONLY
      tcg: Remove TARGET_ALIGNED_ONLY
      accel/tcg: Add cpu_in_serial_context
      accel/tcg: Introduce tlb_read_idx
      accel/tcg: Reorg system mode load helpers
      accel/tcg: Reorg system mode store helpers
      target/loongarch: Do not include tcg-ldst.h

Thomas Huth (2):
      disas: Move softmmu specific code to separate file
      disas: Move disas.c into the target-independent source set

 configs/targets/mips-linux-user.mak | 1 -
 configs/targets/mips-softmmu.mak | 1 -
 configs/targets/mips64-linux-user.mak | 1 -
 configs/targets/mips64-softmmu.mak | 1 -
 configs/targets/mips64el-linux-user.mak | 1 -
 configs/targets/mips64el-softmmu.mak | 1 -
 configs/targets/mipsel-linux-user.mak | 1 -
 configs/targets/mipsel-softmmu.mak | 1 -
 configs/targets/mipsn32-linux-user.mak | 1 -
 configs/targets/mipsn32el-linux-user.mak | 1 -
 configs/targets/nios2-softmmu.mak | 1 -
 configs/targets/sh4-linux-user.mak | 1 -
 configs/targets/sh4-softmmu.mak | 1 -
 configs/targets/sh4eb-linux-user.mak | 1 -
 configs/targets/sh4eb-softmmu.mak | 1 -
 meson.build | 3 -
 accel/tcg/internal.h | 9 +
 accel/tcg/tcg-accel-ops-icount.h | 3 +-
 disas/disas-internal.h | 21 +
 include/disas/disas.h | 23 +-
 include/exec/cpu-common.h | 1 +
 include/exec/cpu-defs.h | 7 +-
 include/exec/cpu_ldst.h | 26 +-
 include/exec/memop.h | 13 +-
 include/exec/poison.h | 1 -
 tcg/loongarch64/tcg-target-con-set.h | 2 -
 tcg/loongarch64/tcg-target-con-str.h | 1 -
 tcg/mips/tcg-target-con-set.h | 13 +-
 tcg/mips/tcg-target-con-str.h | 2 -
 tcg/mips/tcg-target.h | 4 +-
 tcg/ppc/tcg-target-con-set.h | 11 +-
 tcg/ppc/tcg-target-con-str.h | 7 -
 tcg/riscv/tcg-target-con-set.h | 2 -
 tcg/riscv/tcg-target-con-str.h | 1 -
 tcg/s390x/tcg-target-con-set.h | 2 -
 tcg/s390x/tcg-target-con-str.h | 1 -
 accel/tcg/cpu-exec-common.c | 3 +
 accel/tcg/cputlb.c | 1113 ++++++++++++++++-------------
 accel/tcg/tb-maint.c | 2 +-
 accel/tcg/tcg-accel-ops-icount.c | 21 +-
 accel/tcg/tcg-accel-ops-rr.c | 37 +-
 bsd-user/elfload.c | 5 +-
 cpus-common.c | 2 +-
 disas/disas-mon.c | 65 ++
 disas.c => disas/disas.c | 109 +--
 linux-user/elfload.c | 18 +-
 migration/dirtyrate.c | 26 +-
 replay/replay.c | 3 +-
 target/loongarch/csr_helper.c | 1 -
 target/loongarch/iocsr_helper.c | 1 -
 target/m68k/translate.c | 1 +
 target/mips/tcg/mxu_translate.c | 3 +-
 target/nios2/translate.c | 10 +
 target/sh4/translate.c | 102 ++-
 tcg/tcg.c | 480 ++++++++++++-
 trace/control-target.c | 9 +-
 target/mips/tcg/micromips_translate.c.inc | 24 +-
 target/mips/tcg/mips16e_translate.c.inc | 18 +-
 target/mips/tcg/nanomips_translate.c.inc | 32 +-
 tcg/aarch64/tcg-target.c.inc | 347 ++++-----
 tcg/arm/tcg-target.c.inc | 455 +++++-------
 tcg/i386/tcg-target.c.inc | 453 +++++-------
 tcg/loongarch64/tcg-target.c.inc | 313 +++-----
 tcg/mips/tcg-target.c.inc | 870 +++++++---------------
 tcg/ppc/tcg-target.c.inc | 512 ++++++-------
 tcg/riscv/tcg-target.c.inc | 304 ++++----
 tcg/s390x/tcg-target.c.inc | 314 ++++----
 disas/meson.build | 6 +-
 68 files changed, 2788 insertions(+), 3039 deletions(-)
 create mode 100644 disas/disas-internal.h
 create mode 100644 disas/disas-mon.c
 rename disas.c => disas/disas.c (79%)

New cover letter (tags/pull-tcg-20230530):

The following changes since commit 7fe6cb68117ac856e03c93d18aca09de015392b0:

  Merge tag 'pull-target-arm-20230530-1' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2023-05-30 08:02:05 -0700)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230530

for you to fetch changes up to 276d77de503e8f5f5cbd3f7d94302ca12d1d982e:

  tests/decode: Add tests for various named-field cases (2023-05-30 10:55:39 -0700)

----------------------------------------------------------------
Improvements to 128-bit atomics:
  - Separate __int128_t type and arithmetic detection
  - Support 128-bit load/store in backend for i386, aarch64, ppc64, s390x
  - Accelerate atomics via host/include/
Decodetree:
  - Add named field syntax
  - Move tests to meson

----------------------------------------------------------------
Peter Maydell (5):
      docs: Document decodetree named field syntax
      scripts/decodetree: Pass lvalue-formatter function to str_extract()
      scripts/decodetree: Implement a topological sort
      scripts/decodetree: Implement named field support
      tests/decode: Add tests for various named-field cases

Richard Henderson (22):
      tcg: Fix register move type in tcg_out_ld_helper_ret
      accel/tcg: Fix check for page writeability in load_atomic16_or_exit
      meson: Split test for __int128_t type from __int128_t arithmetic
      qemu/atomic128: Add x86_64 atomic128-ldst.h
      tcg/i386: Support 128-bit load/store
      tcg/aarch64: Rename temporaries
      tcg/aarch64: Reserve TCG_REG_TMP1, TCG_REG_TMP2
      tcg/aarch64: Simplify constraints on qemu_ld/st
      tcg/aarch64: Support 128-bit load/store
      tcg/ppc: Support 128-bit load/store
      tcg/s390x: Support 128-bit load/store
      accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
      accel/tcg: Extract store_atom_insert_al16 to host header
      accel/tcg: Add x86_64 load_atom_extract_al16_or_al8
      accel/tcg: Add aarch64 lse2 load_atom_extract_al16_or_al8
      accel/tcg: Add aarch64 store_atom_insert_al16
      tcg: Remove TCG_TARGET_TLB_DISPLACEMENT_BITS
      decodetree: Add --test-for-error
      decodetree: Fix recursion in prop_format and build_tree
      decodetree: Diagnose empty pattern group
      decodetree: Do not remove output_file from /dev
      tests/decode: Convert tests to meson

 docs/devel/decodetree.rst | 33 ++-
 meson.build | 15 +-
 host/include/aarch64/host/load-extract-al16-al8.h | 40 ++++
 host/include/aarch64/host/store-insert-al16.h | 47 ++++
 host/include/generic/host/load-extract-al16-al8.h | 45 ++++
 host/include/generic/host/store-insert-al16.h | 50 ++++
 host/include/x86_64/host/atomic128-ldst.h | 68 ++++++
 host/include/x86_64/host/load-extract-al16-al8.h | 50 ++++
 include/qemu/int128.h | 4 +-
 tcg/aarch64/tcg-target-con-set.h | 4 +-
 tcg/aarch64/tcg-target-con-str.h | 1 -
 tcg/aarch64/tcg-target.h | 12 +-
 tcg/arm/tcg-target.h | 1 -
 tcg/i386/tcg-target.h | 5 +-
 tcg/mips/tcg-target.h | 1 -
 tcg/ppc/tcg-target-con-set.h | 2 +
 tcg/ppc/tcg-target-con-str.h | 1 +
 tcg/ppc/tcg-target.h | 4 +-
 tcg/riscv/tcg-target.h | 1 -
 tcg/s390x/tcg-target-con-set.h | 2 +
 tcg/s390x/tcg-target.h | 3 +-
 tcg/sparc64/tcg-target.h | 1 -
 tcg/tci/tcg-target.h | 1 -
 tests/decode/err_field10.decode | 7 +
 tests/decode/err_field7.decode | 7 +
 tests/decode/err_field8.decode | 8 +
 tests/decode/err_field9.decode | 14 ++
 tests/decode/succ_named_field.decode | 19 ++
 tcg/tcg.c | 4 +-
 accel/tcg/ldst_atomicity.c.inc | 80 +------
 tcg/aarch64/tcg-target.c.inc | 243 +++++++++++++++-----
 tcg/i386/tcg-target.c.inc | 191 +++++++++++++++-
 tcg/ppc/tcg-target.c.inc | 108 ++++++++-
 tcg/s390x/tcg-target.c.inc | 107 ++++++++-
 scripts/decodetree.py | 265 ++++++++++++++++++++--
 tests/decode/check.sh | 24 --
 tests/decode/meson.build | 64 ++++++
 tests/meson.build | 5 +-
 38 files changed, 1312 insertions(+), 225 deletions(-)
 create mode 100644 host/include/aarch64/host/load-extract-al16-al8.h
 create mode 100644 host/include/aarch64/host/store-insert-al16.h
 create mode 100644 host/include/generic/host/load-extract-al16-al8.h
 create mode 100644 host/include/generic/host/store-insert-al16.h
 create mode 100644 host/include/x86_64/host/atomic128-ldst.h
 create mode 100644 host/include/x86_64/host/load-extract-al16-al8.h
 create mode 100644 tests/decode/err_field10.decode
 create mode 100644 tests/decode/err_field7.decode
 create mode 100644 tests/decode/err_field8.decode
 create mode 100644 tests/decode/err_field9.decode
 create mode 100644 tests/decode/succ_named_field.decode
 delete mode 100755 tests/decode/check.sh
 create mode 100644 tests/decode/meson.build
Deleted patch

Case was accidentally dropped in b7a94da9550b.

Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/m68k/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_load_fp(DisasContext *s, int opsize, TCGv addr, TCGv_ptr fp,
     switch (opsize) {
     case OS_BYTE:
     case OS_WORD:
+    case OS_LONG:
         tcg_gen_qemu_ld_tl(tmp, addr, index, opsize | MO_SIGN | MO_TE);
         gen_helper_exts32(cpu_env, fp, tmp);
         break;
--
2.34.1
Deleted patch

A copy-paste bug had us looking at the victim cache for writes.

Cc: qemu-stable@nongnu.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     } else /* if (prot & PAGE_READ) */ {
         tlb_addr = tlbe->addr_read;
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!VICTIM_TLB_HIT(addr_read, addr)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_LOAD, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
--
2.34.1
1
All uses have now been expunged.
1
The first move was incorrectly using TCG_TYPE_I32 while the second
2
move was correctly using TCG_TYPE_REG. This prevents a 64-bit host
3
from moving all 128-bits of the return value.
2
4
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Fixes: ebebea53ef8 ("tcg: Support TCG_TYPE_I128 in tcg_out_{ld,st}_helper_{args,ret}")
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
---
8
---
6
include/exec/memop.h | 13 ++-----------
9
tcg/tcg.c | 4 ++--
7
include/exec/poison.h | 1 -
10
1 file changed, 2 insertions(+), 2 deletions(-)
8
tcg/tcg.c | 5 -----
9
3 files changed, 2 insertions(+), 17 deletions(-)
10
11
11
diff --git a/include/exec/memop.h b/include/exec/memop.h
12
index XXXXXXX..XXXXXXX 100644
13
--- a/include/exec/memop.h
14
+++ b/include/exec/memop.h
15
@@ -XXX,XX +XXX,XX @@ typedef enum MemOp {
16
* MO_UNALN accesses are never checked for alignment.
17
* MO_ALIGN accesses will result in a call to the CPU's
18
* do_unaligned_access hook if the guest address is not aligned.
19
- * The default depends on whether the target CPU defines
20
- * TARGET_ALIGNED_ONLY.
21
*
22
* Some architectures (e.g. ARMv8) need the address which is aligned
23
* to a size more than the size of the memory access.
24
@@ -XXX,XX +XXX,XX @@ typedef enum MemOp {
25
*/
26
MO_ASHIFT = 5,
27
MO_AMASK = 0x7 << MO_ASHIFT,
28
-#ifdef NEED_CPU_H
29
-#ifdef TARGET_ALIGNED_ONLY
30
- MO_ALIGN = 0,
31
- MO_UNALN = MO_AMASK,
32
-#else
33
- MO_ALIGN = MO_AMASK,
34
- MO_UNALN = 0,
35
-#endif
36
-#endif
37
+ MO_UNALN = 0,
38
MO_ALIGN_2 = 1 << MO_ASHIFT,
39
MO_ALIGN_4 = 2 << MO_ASHIFT,
40
MO_ALIGN_8 = 3 << MO_ASHIFT,
41
MO_ALIGN_16 = 4 << MO_ASHIFT,
42
MO_ALIGN_32 = 5 << MO_ASHIFT,
43
MO_ALIGN_64 = 6 << MO_ASHIFT,
44
+ MO_ALIGN = MO_AMASK,
45
46
/* Combinations of the above, for ease of use. */
47
MO_UB = MO_8,
48
diff --git a/include/exec/poison.h b/include/exec/poison.h
49
index XXXXXXX..XXXXXXX 100644
50
--- a/include/exec/poison.h
51
+++ b/include/exec/poison.h
52
@@ -XXX,XX +XXX,XX @@
53
#pragma GCC poison TARGET_TRICORE
54
#pragma GCC poison TARGET_XTENSA
55
56
-#pragma GCC poison TARGET_ALIGNED_ONLY
57
#pragma GCC poison TARGET_HAS_BFLT
58
#pragma GCC poison TARGET_NAME
59
#pragma GCC poison TARGET_SUPPORTS_MTTCG
60
diff --git a/tcg/tcg.c b/tcg/tcg.c
12
diff --git a/tcg/tcg.c b/tcg/tcg.c
61
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
62
--- a/tcg/tcg.c
14
--- a/tcg/tcg.c
63
+++ b/tcg/tcg.c
15
+++ b/tcg/tcg.c
64
@@ -XXX,XX +XXX,XX @@ static const char * const ldst_name[] =
16
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *ldst,
65
};
17
mov[0].dst = ldst->datalo_reg;
66
18
mov[0].src =
67
static const char * const alignment_name[(MO_AMASK >> MO_ASHIFT) + 1] = {
19
tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, HOST_BIG_ENDIAN);
68
-#ifdef TARGET_ALIGNED_ONLY
20
- mov[0].dst_type = TCG_TYPE_I32;
69
[MO_UNALN >> MO_ASHIFT] = "un+",
21
- mov[0].src_type = TCG_TYPE_I32;
70
- [MO_ALIGN >> MO_ASHIFT] = "",
22
+ mov[0].dst_type = TCG_TYPE_REG;
71
-#else
23
+ mov[0].src_type = TCG_TYPE_REG;
72
- [MO_UNALN >> MO_ASHIFT] = "",
24
mov[0].src_ext = TCG_TARGET_REG_BITS == 32 ? MO_32 : MO_64;
73
[MO_ALIGN >> MO_ASHIFT] = "al+",
25
74
-#endif
26
mov[1].dst = ldst->datahi_reg;
75
[MO_ALIGN_2 >> MO_ASHIFT] = "al2+",
76
[MO_ALIGN_4 >> MO_ASHIFT] = "al4+",
77
[MO_ALIGN_8 >> MO_ASHIFT] = "al8+",
78
--
27
--
79
2.34.1
28
2.34.1
80
81
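With TARGET_ALIGNED_ONLY gone (the include/exec/memop.h hunk above), MO_ALIGN no longer defaults per target: MO_UNALN is now 0 and a front end that needs alignment checking must request it explicitly in the memop. The following is a minimal sketch of the resulting call style, using the same tcg_gen_qemu_ld_tl() API seen in the m68k hunk earlier; the function and its arguments are placeholders, not code from this series.

    /*
     * Sketch only, assuming a target translator where 'dest', 'addr' and
     * 'mem_idx' are already in scope (TCG headers provide the types).
     */
    static void gen_aligned_load_example(TCGv dest, TCGv addr, int mem_idx)
    {
        /* Alignment checking is now requested per access... */
        tcg_gen_qemu_ld_tl(dest, addr, mem_idx, MO_TEUL | MO_ALIGN);

        /* ...and an unaligned access simply omits MO_ALIGN (MO_UNALN == 0). */
        tcg_gen_qemu_ld_tl(dest, addr, mem_idx, MO_TEUL);
    }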
1
This header is supposed to be private to tcg and in fact
1
PAGE_WRITE is current writability, as modified by TB protection;
2
does not need to be included here at all.
2
PAGE_WRITE_ORG is the original page writability.
3
3
4
Reviewed-by: Song Gao <gaosong@loongson.cn>
4
Fixes: cdfac37be0d ("accel/tcg: Honor atomicity of loads")
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
7
---
8
target/loongarch/csr_helper.c | 1 -
8
accel/tcg/ldst_atomicity.c.inc | 4 ++--
9
target/loongarch/iocsr_helper.c | 1 -
9
1 file changed, 2 insertions(+), 2 deletions(-)
10
2 files changed, 2 deletions(-)
11
10
12
diff --git a/target/loongarch/csr_helper.c b/target/loongarch/csr_helper.c
11
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
13
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
14
--- a/target/loongarch/csr_helper.c
13
--- a/accel/tcg/ldst_atomicity.c.inc
15
+++ b/target/loongarch/csr_helper.c
14
+++ b/accel/tcg/ldst_atomicity.c.inc
16
@@ -XXX,XX +XXX,XX @@
15
@@ -XXX,XX +XXX,XX @@ static uint64_t load_atomic8_or_exit(CPUArchState *env, uintptr_t ra, void *pv)
17
#include "exec/cpu_ldst.h"
16
* another process, because the fallback start_exclusive solution
18
#include "hw/irq.h"
17
* provides no protection across processes.
19
#include "cpu-csr.h"
18
*/
20
-#include "tcg/tcg-ldst.h"
19
- if (!page_check_range(h2g(pv), 8, PAGE_WRITE)) {
21
20
+ if (!page_check_range(h2g(pv), 8, PAGE_WRITE_ORG)) {
22
target_ulong helper_csrrd_pgd(CPULoongArchState *env)
21
uint64_t *p = __builtin_assume_aligned(pv, 8);
23
{
22
return *p;
24
diff --git a/target/loongarch/iocsr_helper.c b/target/loongarch/iocsr_helper.c
23
}
25
index XXXXXXX..XXXXXXX 100644
24
@@ -XXX,XX +XXX,XX @@ static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv)
26
--- a/target/loongarch/iocsr_helper.c
25
* another process, because the fallback start_exclusive solution
27
+++ b/target/loongarch/iocsr_helper.c
26
* provides no protection across processes.
28
@@ -XXX,XX +XXX,XX @@
27
*/
29
#include "exec/helper-proto.h"
28
- if (!page_check_range(h2g(p), 16, PAGE_WRITE)) {
30
#include "exec/exec-all.h"
29
+ if (!page_check_range(h2g(p), 16, PAGE_WRITE_ORG)) {
31
#include "exec/cpu_ldst.h"
30
return *p;
32
-#include "tcg/tcg-ldst.h"
31
}
33
32
#endif
34
#define GET_MEMTXATTRS(cas) \
35
((MemTxAttrs){.requester_id = env_cpu(cas)->cpu_index})
36
--
33
--
37
2.34.1
34
2.34.1
1
Reviewed-by: Thomas Huth <thuth@redhat.com>
1
Older versions of clang have missing runtime functions for arithmetic
2
with -fsanitize=undefined (see 464e3671f9d5c), so we cannot use
3
__int128_t for implementing Int128. But __int128_t is present,
4
data movement works, and it can be used for atomic128.
5
6
Probe for both CONFIG_INT128_TYPE and CONFIG_INT128, adjust
7
qemu/int128.h to define Int128Alias if CONFIG_INT128_TYPE,
8
and adjust the meson probe for atomics to use has_int128_type.
9
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Message-Id: <20230503072331.1747057-80-richard.henderson@linaro.org>
4
---
12
---
5
meson.build | 3 ---
13
meson.build | 15 ++++++++++-----
6
disas.c => disas/disas.c | 0
14
include/qemu/int128.h | 4 ++--
7
disas/meson.build | 4 +++-
15
2 files changed, 12 insertions(+), 7 deletions(-)
8
3 files changed, 3 insertions(+), 4 deletions(-)
9
rename disas.c => disas/disas.c (100%)
10
16
11
diff --git a/meson.build b/meson.build
17
diff --git a/meson.build b/meson.build
12
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
13
--- a/meson.build
19
--- a/meson.build
14
+++ b/meson.build
20
+++ b/meson.build
15
@@ -XXX,XX +XXX,XX @@ specific_ss.add(files('cpu.c'))
21
@@ -XXX,XX +XXX,XX @@ config_host_data.set('CONFIG_ATOMIC64', cc.links('''
16
22
return 0;
17
subdir('softmmu')
23
}'''))
18
24
19
-common_ss.add(capstone)
25
-has_int128 = cc.links('''
20
-specific_ss.add(files('disas.c'), capstone)
26
+has_int128_type = cc.compiles('''
27
+ __int128_t a;
28
+ __uint128_t b;
29
+ int main(void) { b = a; }''')
30
+config_host_data.set('CONFIG_INT128_TYPE', has_int128_type)
31
+
32
+has_int128 = has_int128_type and cc.links('''
33
__int128_t a;
34
__uint128_t b;
35
int main (void) {
36
@@ -XXX,XX +XXX,XX @@ has_int128 = cc.links('''
37
a = a * a;
38
return 0;
39
}''')
21
-
40
-
22
# Work around a gcc bug/misfeature wherein constant propagation looks
41
config_host_data.set('CONFIG_INT128', has_int128)
23
# through an alias:
42
24
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
43
-if has_int128
25
diff --git a/disas.c b/disas/disas.c
44
+if has_int128_type
26
similarity index 100%
45
# "do we have 128-bit atomics which are handled inline and specifically not
27
rename from disas.c
46
# via libatomic". The reason we can't use libatomic is documented in the
28
rename to disas/disas.c
47
# comment starting "GCC is a house divided" in include/qemu/atomic128.h.
29
diff --git a/disas/meson.build b/disas/meson.build
48
@@ -XXX,XX +XXX,XX @@ if has_int128
49
# __alignof(unsigned __int128) for the host.
50
atomic_test_128 = '''
51
int main(int ac, char **av) {
52
- unsigned __int128 *p = __builtin_assume_aligned(av[ac - 1], 16);
53
+ __uint128_t *p = __builtin_assume_aligned(av[ac - 1], 16);
54
p[1] = __atomic_load_n(&p[0], __ATOMIC_RELAXED);
55
__atomic_store_n(&p[2], p[3], __ATOMIC_RELAXED);
56
__atomic_compare_exchange_n(&p[4], &p[5], p[6], 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
57
@@ -XXX,XX +XXX,XX @@ if has_int128
58
config_host_data.set('CONFIG_CMPXCHG128', cc.links('''
59
int main(void)
60
{
61
- unsigned __int128 x = 0, y = 0;
62
+ __uint128_t x = 0, y = 0;
63
__sync_val_compare_and_swap_16(&x, y, x);
64
return 0;
65
}
66
diff --git a/include/qemu/int128.h b/include/qemu/int128.h
30
index XXXXXXX..XXXXXXX 100644
67
index XXXXXXX..XXXXXXX 100644
31
--- a/disas/meson.build
68
--- a/include/qemu/int128.h
32
+++ b/disas/meson.build
69
+++ b/include/qemu/int128.h
33
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_RISCV_DIS', if_true: files('riscv.c'))
70
@@ -XXX,XX +XXX,XX @@ static inline void bswap128s(Int128 *s)
34
common_ss.add(when: 'CONFIG_SH4_DIS', if_true: files('sh4.c'))
71
* a possible structure and the native types. Ease parameter passing
35
common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
72
* via use of the transparent union extension.
36
common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
73
*/
37
-common_ss.add(when: capstone, if_true: files('capstone.c'))
74
-#ifdef CONFIG_INT128
38
+common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
75
+#ifdef CONFIG_INT128_TYPE
39
+
76
typedef union {
40
+specific_ss.add(files('disas.c'), capstone)
77
__uint128_t u;
78
__int128_t i;
79
@@ -XXX,XX +XXX,XX @@ typedef union {
80
} Int128Alias __attribute__((transparent_union));
81
#else
82
typedef Int128 Int128Alias;
83
-#endif /* CONFIG_INT128 */
84
+#endif /* CONFIG_INT128_TYPE */
85
86
#endif /* INT128_H */
41
--
87
--
42
2.34.1
88
2.34.1
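The distinction the meson hunk above draws between CONFIG_INT128_TYPE and CONFIG_INT128 comes down to two separate compiler capabilities. Below is a stand-alone C sketch mirroring the two probe programs; it is illustrative only and not code from the tree.

    /*
     * CONFIG_INT128_TYPE: __int128_t/__uint128_t exist and data movement
     * works, which is all Int128Alias and the atomic128 helpers need.
     * CONFIG_INT128: full arithmetic also compiles and links, which older
     * clang with -fsanitize=undefined cannot guarantee (missing runtime
     * functions, see 464e3671f9d5c).
     */
    __int128_t a;
    __uint128_t b;

    int type_and_movement_only(void)    /* enough for CONFIG_INT128_TYPE */
    {
        b = a;
        return 0;
    }

    int full_arithmetic(void)           /* additionally needed for CONFIG_INT128 */
    {
        a = a * a;                      /* arithmetic may need runtime support */
        a -= b;
        return 0;
    }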
1
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
1
With CPUINFO_ATOMIC_VMOVDQA, we can perform proper atomic
2
and tcg_out_st_helper_args. These and their subroutines
2
load/store without cmpxchg16b.
3
use the existing knowledge of the host function call abi
4
to load the function call arguments and return results.
5
6
These will be used to simplify the backends in turn.
7
3
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
---
6
---
11
tcg/tcg.c | 475 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
7
host/include/x86_64/host/atomic128-ldst.h | 68 +++++++++++++++++++++++
12
1 file changed, 471 insertions(+), 4 deletions(-)
8
1 file changed, 68 insertions(+)
9
create mode 100644 host/include/x86_64/host/atomic128-ldst.h
13
10
14
diff --git a/tcg/tcg.c b/tcg/tcg.c
11
diff --git a/host/include/x86_64/host/atomic128-ldst.h b/host/include/x86_64/host/atomic128-ldst.h
15
index XXXXXXX..XXXXXXX 100644
12
new file mode 100644
16
--- a/tcg/tcg.c
13
index XXXXXXX..XXXXXXX
17
+++ b/tcg/tcg.c
14
--- /dev/null
18
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct);
15
+++ b/host/include/x86_64/host/atomic128-ldst.h
19
static int tcg_out_ldst_finalize(TCGContext *s);
16
@@ -XXX,XX +XXX,XX @@
20
#endif
21
22
+typedef struct TCGLdstHelperParam {
23
+ TCGReg (*ra_gen)(TCGContext *s, const TCGLabelQemuLdst *l, int arg_reg);
24
+ unsigned ntmp;
25
+ int tmp[3];
26
+} TCGLdstHelperParam;
27
+
28
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
29
+ const TCGLdstHelperParam *p)
30
+ __attribute__((unused));
31
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *l,
32
+ bool load_sign, const TCGLdstHelperParam *p)
33
+ __attribute__((unused));
34
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
35
+ const TCGLdstHelperParam *p)
36
+ __attribute__((unused));
37
+
38
TCGContext tcg_init_ctx;
39
__thread TCGContext *tcg_ctx;
40
41
@@ -XXX,XX +XXX,XX @@ void tcg_raise_tb_overflow(TCGContext *s)
42
siglongjmp(s->jmp_trans, -2);
43
}
44
45
+/*
17
+/*
46
+ * Used by tcg_out_movext{1,2} to hold the arguments for tcg_out_movext.
18
+ * SPDX-License-Identifier: GPL-2.0-or-later
47
+ * By the time we arrive at tcg_out_movext1, @dst is always a TCGReg.
19
+ * Load/store for 128-bit atomic operations, x86_64 version.
48
+ *
20
+ *
49
+ * However, tcg_out_helper_load_slots reuses this field to hold an
21
+ * Copyright (C) 2023 Linaro, Ltd.
50
+ * argument slot number (which may designate a argument register or an
22
+ *
51
+ * argument stack slot), converting to TCGReg once all arguments that
23
+ * See docs/devel/atomics.rst for discussion about the guarantees each
52
+ * are destined for the stack are processed.
24
+ * atomic primitive is meant to provide.
53
+ */
54
typedef struct TCGMovExtend {
55
- TCGReg dst;
56
+ unsigned dst;
57
TCGReg src;
58
TCGType dst_type;
59
TCGType src_type;
60
@@ -XXX,XX +XXX,XX @@ static void tcg_out_movext1(TCGContext *s, const TCGMovExtend *i)
61
* between the sources and destinations.
62
*/
63
64
-static void __attribute__((unused))
65
-tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
66
- const TCGMovExtend *i2, int scratch)
67
+static void tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
68
+ const TCGMovExtend *i2, int scratch)
69
{
70
TCGReg src1 = i1->src;
71
TCGReg src2 = i2->src;
72
@@ -XXX,XX +XXX,XX @@ static TCGHelperInfo all_helpers[] = {
73
};
74
static GHashTable *helper_table;
75
76
+/*
77
+ * Create TCGHelperInfo structures for "tcg/tcg-ldst.h" functions,
78
+ * akin to what "exec/helper-tcg.h" does with DEF_HELPER_FLAGS_N.
79
+ * We only use these for layout in tcg_out_ld_helper_ret and
80
+ * tcg_out_st_helper_args, and share them between several of
81
+ * the helpers, with the end result that it's easier to build manually.
82
+ */
25
+ */
83
+
26
+
84
+#if TCG_TARGET_REG_BITS == 32
27
+#ifndef AARCH64_ATOMIC128_LDST_H
85
+# define dh_typecode_ttl dh_typecode_i32
28
+#define AARCH64_ATOMIC128_LDST_H
29
+
30
+#ifdef CONFIG_INT128_TYPE
31
+#include "host/cpuinfo.h"
32
+#include "tcg/debug-assert.h"
33
+
34
+/*
35
+ * Through clang 16, with -mcx16, __atomic_load_n is incorrectly
36
+ * expanded to a read-write operation: lock cmpxchg16b.
37
+ */
38
+
39
+#define HAVE_ATOMIC128_RO likely(cpuinfo & CPUINFO_ATOMIC_VMOVDQA)
40
+#define HAVE_ATOMIC128_RW 1
41
+
42
+static inline Int128 atomic16_read_ro(const Int128 *ptr)
43
+{
44
+ Int128Alias r;
45
+
46
+ tcg_debug_assert(HAVE_ATOMIC128_RO);
47
+ asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr));
48
+
49
+ return r.s;
50
+}
51
+
52
+static inline Int128 atomic16_read_rw(Int128 *ptr)
53
+{
54
+ __int128_t *ptr_align = __builtin_assume_aligned(ptr, 16);
55
+ Int128Alias r;
56
+
57
+ if (HAVE_ATOMIC128_RO) {
58
+ asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr_align));
59
+ } else {
60
+ r.i = __sync_val_compare_and_swap_16(ptr_align, 0, 0);
61
+ }
62
+ return r.s;
63
+}
64
+
65
+static inline void atomic16_set(Int128 *ptr, Int128 val)
66
+{
67
+ __int128_t *ptr_align = __builtin_assume_aligned(ptr, 16);
68
+ Int128Alias new = { .s = val };
69
+
70
+ if (HAVE_ATOMIC128_RO) {
71
+ asm("vmovdqa %1, %0" : "=m"(*ptr_align) : "x" (new.i));
72
+ } else {
73
+ __int128_t old;
74
+ do {
75
+ old = *ptr_align;
76
+ } while (!__sync_bool_compare_and_swap_16(ptr_align, old, new.i));
77
+ }
78
+}
86
+#else
79
+#else
87
+# define dh_typecode_ttl dh_typecode_i64
80
+/* Provide QEMU_ERROR stubs. */
81
+#include "host/include/generic/host/atomic128-ldst.h"
88
+#endif
82
+#endif
89
+
83
+
90
+static TCGHelperInfo info_helper_ld32_mmu = {
84
+#endif /* AARCH64_ATOMIC128_LDST_H */
91
+ .flags = TCG_CALL_NO_WG,
92
+ .typemask = dh_typemask(ttl, 0) /* return tcg_target_ulong */
93
+ | dh_typemask(env, 1)
94
+ | dh_typemask(tl, 2) /* target_ulong addr */
95
+ | dh_typemask(i32, 3) /* unsigned oi */
96
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
97
+};
98
+
99
+static TCGHelperInfo info_helper_ld64_mmu = {
100
+ .flags = TCG_CALL_NO_WG,
101
+ .typemask = dh_typemask(i64, 0) /* return uint64_t */
102
+ | dh_typemask(env, 1)
103
+ | dh_typemask(tl, 2) /* target_ulong addr */
104
+ | dh_typemask(i32, 3) /* unsigned oi */
105
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
106
+};
107
+
108
+static TCGHelperInfo info_helper_st32_mmu = {
109
+ .flags = TCG_CALL_NO_WG,
110
+ .typemask = dh_typemask(void, 0)
111
+ | dh_typemask(env, 1)
112
+ | dh_typemask(tl, 2) /* target_ulong addr */
113
+ | dh_typemask(i32, 3) /* uint32_t data */
114
+ | dh_typemask(i32, 4) /* unsigned oi */
115
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
116
+};
117
+
118
+static TCGHelperInfo info_helper_st64_mmu = {
119
+ .flags = TCG_CALL_NO_WG,
120
+ .typemask = dh_typemask(void, 0)
121
+ | dh_typemask(env, 1)
122
+ | dh_typemask(tl, 2) /* target_ulong addr */
123
+ | dh_typemask(i64, 3) /* uint64_t data */
124
+ | dh_typemask(i32, 4) /* unsigned oi */
125
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
126
+};
127
+
128
#ifdef CONFIG_TCG_INTERPRETER
129
static ffi_type *typecode_to_ffi(int argmask)
130
{
131
@@ -XXX,XX +XXX,XX @@ static void tcg_context_init(unsigned max_cpus)
132
(gpointer)&all_helpers[i]);
133
}
134
135
+ init_call_layout(&info_helper_ld32_mmu);
136
+ init_call_layout(&info_helper_ld64_mmu);
137
+ init_call_layout(&info_helper_st32_mmu);
138
+ init_call_layout(&info_helper_st64_mmu);
139
+
140
#ifdef CONFIG_TCG_INTERPRETER
141
init_ffi_layouts();
142
#endif
143
@@ -XXX,XX +XXX,XX @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
144
}
145
}
146
147
+/*
148
+ * Similarly for qemu_ld/st slow path helpers.
149
+ * We must re-implement tcg_gen_callN and tcg_reg_alloc_call simultaneously,
150
+ * using only the provided backend tcg_out_* functions.
151
+ */
152
+
153
+static int tcg_out_helper_stk_ofs(TCGType type, unsigned slot)
154
+{
155
+ int ofs = arg_slot_stk_ofs(slot);
156
+
157
+ /*
158
+ * Each stack slot is TCG_TARGET_LONG_BITS. If the host does not
159
+ * require extension to uint64_t, adjust the address for uint32_t.
160
+ */
161
+ if (HOST_BIG_ENDIAN &&
162
+ TCG_TARGET_REG_BITS == 64 &&
163
+ type == TCG_TYPE_I32) {
164
+ ofs += 4;
165
+ }
166
+ return ofs;
167
+}
168
+
169
+static void tcg_out_helper_load_regs(TCGContext *s,
170
+ unsigned nmov, TCGMovExtend *mov,
171
+ unsigned ntmp, const int *tmp)
172
+{
173
+ switch (nmov) {
174
+ default:
175
+ /* The backend must have provided enough temps for the worst case. */
176
+ tcg_debug_assert(ntmp + 1 >= nmov);
177
+
178
+ for (unsigned i = nmov - 1; i >= 2; --i) {
179
+ TCGReg dst = mov[i].dst;
180
+
181
+ for (unsigned j = 0; j < i; ++j) {
182
+ if (dst == mov[j].src) {
183
+ /*
184
+ * Conflict.
185
+ * Copy the source to a temporary, recurse for the
186
+ * remaining moves, perform the extension from our
187
+ * scratch on the way out.
188
+ */
189
+ TCGReg scratch = tmp[--ntmp];
190
+ tcg_out_mov(s, mov[i].src_type, scratch, mov[i].src);
191
+ mov[i].src = scratch;
192
+
193
+ tcg_out_helper_load_regs(s, i, mov, ntmp, tmp);
194
+ tcg_out_movext1(s, &mov[i]);
195
+ return;
196
+ }
197
+ }
198
+
199
+ /* No conflicts: perform this move and continue. */
200
+ tcg_out_movext1(s, &mov[i]);
201
+ }
202
+ /* fall through for the final two moves */
203
+
204
+ case 2:
205
+ tcg_out_movext2(s, mov, mov + 1, ntmp ? tmp[0] : -1);
206
+ return;
207
+ case 1:
208
+ tcg_out_movext1(s, mov);
209
+ return;
210
+ case 0:
211
+ g_assert_not_reached();
212
+ }
213
+}
214
+
215
+static void tcg_out_helper_load_slots(TCGContext *s,
216
+ unsigned nmov, TCGMovExtend *mov,
217
+ const TCGLdstHelperParam *parm)
218
+{
219
+ unsigned i;
220
+
221
+ /*
222
+ * Start from the end, storing to the stack first.
223
+ * This frees those registers, so we need not consider overlap.
224
+ */
225
+ for (i = nmov; i-- > 0; ) {
226
+ unsigned slot = mov[i].dst;
227
+
228
+ if (arg_slot_reg_p(slot)) {
229
+ goto found_reg;
230
+ }
231
+
232
+ TCGReg src = mov[i].src;
233
+ TCGType dst_type = mov[i].dst_type;
234
+ MemOp dst_mo = dst_type == TCG_TYPE_I32 ? MO_32 : MO_64;
235
+
236
+ /* The argument is going onto the stack; extend into scratch. */
237
+ if ((mov[i].src_ext & MO_SIZE) != dst_mo) {
238
+ tcg_debug_assert(parm->ntmp != 0);
239
+ mov[i].dst = src = parm->tmp[0];
240
+ tcg_out_movext1(s, &mov[i]);
241
+ }
242
+
243
+ tcg_out_st(s, dst_type, src, TCG_REG_CALL_STACK,
244
+ tcg_out_helper_stk_ofs(dst_type, slot));
245
+ }
246
+ return;
247
+
248
+ found_reg:
249
+ /*
250
+ * The remaining arguments are in registers.
251
+ * Convert slot numbers to argument registers.
252
+ */
253
+ nmov = i + 1;
254
+ for (i = 0; i < nmov; ++i) {
255
+ mov[i].dst = tcg_target_call_iarg_regs[mov[i].dst];
256
+ }
257
+ tcg_out_helper_load_regs(s, nmov, mov, parm->ntmp, parm->tmp);
258
+}
259
+
260
+static void tcg_out_helper_load_imm(TCGContext *s, unsigned slot,
261
+ TCGType type, tcg_target_long imm,
262
+ const TCGLdstHelperParam *parm)
263
+{
264
+ if (arg_slot_reg_p(slot)) {
265
+ tcg_out_movi(s, type, tcg_target_call_iarg_regs[slot], imm);
266
+ } else {
267
+ int ofs = tcg_out_helper_stk_ofs(type, slot);
268
+ if (!tcg_out_sti(s, type, imm, TCG_REG_CALL_STACK, ofs)) {
269
+ tcg_debug_assert(parm->ntmp != 0);
270
+ tcg_out_movi(s, type, parm->tmp[0], imm);
271
+ tcg_out_st(s, type, parm->tmp[0], TCG_REG_CALL_STACK, ofs);
272
+ }
273
+ }
274
+}
275
+
276
+static void tcg_out_helper_load_common_args(TCGContext *s,
277
+ const TCGLabelQemuLdst *ldst,
278
+ const TCGLdstHelperParam *parm,
279
+ const TCGHelperInfo *info,
280
+ unsigned next_arg)
281
+{
282
+ TCGMovExtend ptr_mov = {
283
+ .dst_type = TCG_TYPE_PTR,
284
+ .src_type = TCG_TYPE_PTR,
285
+ .src_ext = sizeof(void *) == 4 ? MO_32 : MO_64
286
+ };
287
+ const TCGCallArgumentLoc *loc = &info->in[0];
288
+ TCGType type;
289
+ unsigned slot;
290
+ tcg_target_ulong imm;
291
+
292
+ /*
293
+ * Handle env, which is always first.
294
+ */
295
+ ptr_mov.dst = loc->arg_slot;
296
+ ptr_mov.src = TCG_AREG0;
297
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
298
+
299
+ /*
300
+ * Handle oi.
301
+ */
302
+ imm = ldst->oi;
303
+ loc = &info->in[next_arg];
304
+ type = TCG_TYPE_I32;
305
+ switch (loc->kind) {
306
+ case TCG_CALL_ARG_NORMAL:
307
+ break;
308
+ case TCG_CALL_ARG_EXTEND_U:
309
+ case TCG_CALL_ARG_EXTEND_S:
310
+ /* No extension required for MemOpIdx. */
311
+ tcg_debug_assert(imm <= INT32_MAX);
312
+ type = TCG_TYPE_REG;
313
+ break;
314
+ default:
315
+ g_assert_not_reached();
316
+ }
317
+ tcg_out_helper_load_imm(s, loc->arg_slot, type, imm, parm);
318
+ next_arg++;
319
+
320
+ /*
321
+ * Handle ra.
322
+ */
323
+ loc = &info->in[next_arg];
324
+ slot = loc->arg_slot;
325
+ if (parm->ra_gen) {
326
+ int arg_reg = -1;
327
+ TCGReg ra_reg;
328
+
329
+ if (arg_slot_reg_p(slot)) {
330
+ arg_reg = tcg_target_call_iarg_regs[slot];
331
+ }
332
+ ra_reg = parm->ra_gen(s, ldst, arg_reg);
333
+
334
+ ptr_mov.dst = slot;
335
+ ptr_mov.src = ra_reg;
336
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
337
+ } else {
338
+ imm = (uintptr_t)ldst->raddr;
339
+ tcg_out_helper_load_imm(s, slot, TCG_TYPE_PTR, imm, parm);
340
+ }
341
+}
342
+
343
+static unsigned tcg_out_helper_add_mov(TCGMovExtend *mov,
344
+ const TCGCallArgumentLoc *loc,
345
+ TCGType dst_type, TCGType src_type,
346
+ TCGReg lo, TCGReg hi)
347
+{
348
+ if (dst_type <= TCG_TYPE_REG) {
349
+ MemOp src_ext;
350
+
351
+ switch (loc->kind) {
352
+ case TCG_CALL_ARG_NORMAL:
353
+ src_ext = src_type == TCG_TYPE_I32 ? MO_32 : MO_64;
354
+ break;
355
+ case TCG_CALL_ARG_EXTEND_U:
356
+ dst_type = TCG_TYPE_REG;
357
+ src_ext = MO_UL;
358
+ break;
359
+ case TCG_CALL_ARG_EXTEND_S:
360
+ dst_type = TCG_TYPE_REG;
361
+ src_ext = MO_SL;
362
+ break;
363
+ default:
364
+ g_assert_not_reached();
365
+ }
366
+
367
+ mov[0].dst = loc->arg_slot;
368
+ mov[0].dst_type = dst_type;
369
+ mov[0].src = lo;
370
+ mov[0].src_type = src_type;
371
+ mov[0].src_ext = src_ext;
372
+ return 1;
373
+ }
374
+
375
+ assert(TCG_TARGET_REG_BITS == 32);
376
+
377
+ mov[0].dst = loc[HOST_BIG_ENDIAN].arg_slot;
378
+ mov[0].src = lo;
379
+ mov[0].dst_type = TCG_TYPE_I32;
380
+ mov[0].src_type = TCG_TYPE_I32;
381
+ mov[0].src_ext = MO_32;
382
+
383
+ mov[1].dst = loc[!HOST_BIG_ENDIAN].arg_slot;
384
+ mov[1].src = hi;
385
+ mov[1].dst_type = TCG_TYPE_I32;
386
+ mov[1].src_type = TCG_TYPE_I32;
387
+ mov[1].src_ext = MO_32;
388
+
389
+ return 2;
390
+}
391
+
392
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
393
+ const TCGLdstHelperParam *parm)
394
+{
395
+ const TCGHelperInfo *info;
396
+ const TCGCallArgumentLoc *loc;
397
+ TCGMovExtend mov[2];
398
+ unsigned next_arg, nmov;
399
+ MemOp mop = get_memop(ldst->oi);
400
+
401
+ switch (mop & MO_SIZE) {
402
+ case MO_8:
403
+ case MO_16:
404
+ case MO_32:
405
+ info = &info_helper_ld32_mmu;
406
+ break;
407
+ case MO_64:
408
+ info = &info_helper_ld64_mmu;
409
+ break;
410
+ default:
411
+ g_assert_not_reached();
412
+ }
413
+
414
+ /* Defer env argument. */
415
+ next_arg = 1;
416
+
417
+ loc = &info->in[next_arg];
418
+ nmov = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
419
+ ldst->addrlo_reg, ldst->addrhi_reg);
420
+ next_arg += nmov;
421
+
422
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
423
+
424
+ /* No special attention for 32 and 64-bit return values. */
425
+ tcg_debug_assert(info->out_kind == TCG_CALL_RET_NORMAL);
426
+
427
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
428
+}
429
+
430
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *ldst,
431
+ bool load_sign,
432
+ const TCGLdstHelperParam *parm)
433
+{
434
+ TCGMovExtend mov[2];
435
+
436
+ if (ldst->type <= TCG_TYPE_REG) {
437
+ MemOp mop = get_memop(ldst->oi);
438
+
439
+ mov[0].dst = ldst->datalo_reg;
440
+ mov[0].src = tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, 0);
441
+ mov[0].dst_type = ldst->type;
442
+ mov[0].src_type = TCG_TYPE_REG;
443
+
444
+ /*
445
+ * If load_sign, then we allowed the helper to perform the
446
+ * appropriate sign extension to tcg_target_ulong, and all
447
+ * we need now is a plain move.
448
+ *
449
+ * If they do not, then we expect the relevant extension
450
+ * instruction to be no more expensive than a move, and
451
+ * we thus save the icache etc by only using one of two
452
+ * helper functions.
453
+ */
454
+ if (load_sign || !(mop & MO_SIGN)) {
455
+ if (TCG_TARGET_REG_BITS == 32 || ldst->type == TCG_TYPE_I32) {
456
+ mov[0].src_ext = MO_32;
457
+ } else {
458
+ mov[0].src_ext = MO_64;
459
+ }
460
+ } else {
461
+ mov[0].src_ext = mop & MO_SSIZE;
462
+ }
463
+ tcg_out_movext1(s, mov);
464
+ } else {
465
+ assert(TCG_TARGET_REG_BITS == 32);
466
+
467
+ mov[0].dst = ldst->datalo_reg;
468
+ mov[0].src =
469
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, HOST_BIG_ENDIAN);
470
+ mov[0].dst_type = TCG_TYPE_I32;
471
+ mov[0].src_type = TCG_TYPE_I32;
472
+ mov[0].src_ext = MO_32;
473
+
474
+ mov[1].dst = ldst->datahi_reg;
475
+ mov[1].src =
476
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, !HOST_BIG_ENDIAN);
477
+ mov[1].dst_type = TCG_TYPE_REG;
478
+ mov[1].src_type = TCG_TYPE_REG;
479
+ mov[1].src_ext = MO_32;
480
+
481
+ tcg_out_movext2(s, mov, mov + 1, parm->ntmp ? parm->tmp[0] : -1);
482
+ }
483
+}
484
+
485
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
486
+ const TCGLdstHelperParam *parm)
487
+{
488
+ const TCGHelperInfo *info;
489
+ const TCGCallArgumentLoc *loc;
490
+ TCGMovExtend mov[4];
491
+ TCGType data_type;
492
+ unsigned next_arg, nmov, n;
493
+ MemOp mop = get_memop(ldst->oi);
494
+
495
+ switch (mop & MO_SIZE) {
496
+ case MO_8:
497
+ case MO_16:
498
+ case MO_32:
499
+ info = &info_helper_st32_mmu;
500
+ data_type = TCG_TYPE_I32;
501
+ break;
502
+ case MO_64:
503
+ info = &info_helper_st64_mmu;
504
+ data_type = TCG_TYPE_I64;
505
+ break;
506
+ default:
507
+ g_assert_not_reached();
508
+ }
509
+
510
+ /* Defer env argument. */
511
+ next_arg = 1;
512
+ nmov = 0;
513
+
514
+ /* Handle addr argument. */
515
+ loc = &info->in[next_arg];
516
+ n = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
517
+ ldst->addrlo_reg, ldst->addrhi_reg);
518
+ next_arg += n;
519
+ nmov += n;
520
+
521
+ /* Handle data argument. */
522
+ loc = &info->in[next_arg];
523
+ n = tcg_out_helper_add_mov(mov + nmov, loc, data_type, ldst->type,
524
+ ldst->datalo_reg, ldst->datahi_reg);
525
+ next_arg += n;
526
+ nmov += n;
527
+ tcg_debug_assert(nmov <= ARRAY_SIZE(mov));
528
+
529
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
530
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
531
+}
532
+
533
#ifdef CONFIG_PROFILER
534
535
/* avoid copy/paste errors */
536
--
85
--
537
2.34.1
86
2.34.1
538
87
539
88
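The atomic128-ldst.h header added above exposes atomic16_read_ro(), atomic16_read_rw() and atomic16_set(), selected at run time via HAVE_ATOMIC128_RO. A small usage sketch follows; the wrapper function is illustrative only, and the include of "qemu/atomic128.h" as the entry point is an assumption rather than something shown in this series.

    #include "qemu/osdep.h"
    #include "qemu/int128.h"
    #include "qemu/atomic128.h"

    /*
     * Illustrative helper, not part of the series: copy 16 bytes atomically,
     * preferring the read-only load (VMOVDQA) when the CPU provides it and
     * falling back to the read-write, cmpxchg16b-based form otherwise.
     */
    static Int128 copy16_atomic(Int128 *dst, Int128 *src)
    {
        Int128 val = HAVE_ATOMIC128_RO ? atomic16_read_ro(src)
                                       : atomic16_read_rw(src);
        atomic16_set(dst, val);
        return val;
    }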
1
Use tcg_out_ld_helper_args and tcg_out_ld_helper_ret.
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
3
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
---
3
---
6
tcg/i386/tcg-target.c.inc | 71 +++++++++++++++------------------------
4
tcg/i386/tcg-target.h | 4 +-
7
1 file changed, 28 insertions(+), 43 deletions(-)
5
tcg/i386/tcg-target.c.inc | 191 +++++++++++++++++++++++++++++++++++++-
6
2 files changed, 190 insertions(+), 5 deletions(-)
8
7
8
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
9
index XXXXXXX..XXXXXXX 100644
10
--- a/tcg/i386/tcg-target.h
11
+++ b/tcg/i386/tcg-target.h
12
@@ -XXX,XX +XXX,XX @@ typedef enum {
13
#define have_avx1 (cpuinfo & CPUINFO_AVX1)
14
#define have_avx2 (cpuinfo & CPUINFO_AVX2)
15
#define have_movbe (cpuinfo & CPUINFO_MOVBE)
16
-#define have_atomic16 (cpuinfo & CPUINFO_ATOMIC_VMOVDQA)
17
18
/*
19
* There are interesting instructions in AVX512, so long as we have AVX512VL,
20
@@ -XXX,XX +XXX,XX @@ typedef enum {
21
#define TCG_TARGET_HAS_qemu_st8_i32 1
22
#endif
23
24
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
25
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
26
+ (TCG_TARGET_REG_BITS == 64 && (cpuinfo & CPUINFO_ATOMIC_VMOVDQA))
27
28
/* We do not support older SSE systems, only beginning with AVX1. */
29
#define TCG_TARGET_HAS_v64 have_avx1
9
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
30
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
10
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
11
--- a/tcg/i386/tcg-target.c.inc
32
--- a/tcg/i386/tcg-target.c.inc
12
+++ b/tcg/i386/tcg-target.c.inc
33
+++ b/tcg/i386/tcg-target.c.inc
13
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
34
@@ -XXX,XX +XXX,XX @@ static const int tcg_target_reg_alloc_order[] = {
14
[MO_BEUQ] = helper_be_stq_mmu,
35
#endif
15
};
36
};
16
37
17
+/*
38
+#define TCG_TMP_VEC TCG_REG_XMM5
18
+ * Because i686 has no register parameters and because x86_64 has xchg
39
+
19
+ * to handle addr/data register overlap, we have placed all input arguments
40
static const int tcg_target_call_iarg_regs[] = {
20
+ * before we need might need a scratch reg.
41
#if TCG_TARGET_REG_BITS == 64
21
+ *
42
#if defined(_WIN64)
22
+ * Even then, a scratch is only needed for l->raddr. Rather than expose
43
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
23
+ * a general-purpose scratch when we don't actually know it's available,
44
#define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16)
24
+ * use the ra_gen hook to load into RAX if needed.
45
#define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16)
25
+ */
46
#define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16)
26
+#if TCG_TARGET_REG_BITS == 64
47
+#define OPC_PEXTRD (0x16 | P_EXT3A | P_DATA16)
27
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
48
+#define OPC_PINSRD (0x22 | P_EXT3A | P_DATA16)
49
#define OPC_PMAXSB (0x3c | P_EXT38 | P_DATA16)
50
#define OPC_PMAXSW (0xee | P_EXT | P_DATA16)
51
#define OPC_PMAXSD (0x3d | P_EXT38 | P_DATA16)
52
@@ -XXX,XX +XXX,XX @@ typedef struct {
53
54
bool tcg_target_has_memory_bswap(MemOp memop)
55
{
56
- return have_movbe;
57
+ TCGAtomAlign aa;
58
+
59
+ if (!have_movbe) {
60
+ return false;
61
+ }
62
+ if ((memop & MO_SIZE) < MO_128) {
63
+ return true;
64
+ }
65
+
66
+ /*
67
+ * Reject 16-byte memop with 16-byte atomicity, i.e. VMOVDQA,
68
+ * but do allow a pair of 64-bit operations, i.e. MOVBEQ.
69
+ */
70
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
71
+ return aa.atom < MO_128;
72
}
73
74
/*
75
@@ -XXX,XX +XXX,XX @@ static const TCGLdstHelperParam ldst_helper_param = {
76
static const TCGLdstHelperParam ldst_helper_param = { };
77
#endif
78
79
+static void tcg_out_vec_to_pair(TCGContext *s, TCGType type,
80
+ TCGReg l, TCGReg h, TCGReg v)
28
+{
81
+{
29
+ if (arg < 0) {
82
+ int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
30
+ arg = TCG_REG_RAX;
83
+
31
+ }
84
+ /* vpmov{d,q} %v, %l */
32
+ tcg_out_movi(s, TCG_TYPE_PTR, arg, (uintptr_t)l->raddr);
85
+ tcg_out_vex_modrm(s, OPC_MOVD_EyVy + rexw, v, 0, l);
33
+ return arg;
86
+ /* vpextr{d,q} $1, %v, %h */
87
+ tcg_out_vex_modrm(s, OPC_PEXTRD + rexw, v, 0, h);
88
+ tcg_out8(s, 1);
34
+}
89
+}
35
+static const TCGLdstHelperParam ldst_helper_param = {
90
+
36
+ .ra_gen = ldst_ra_gen
91
+static void tcg_out_pair_to_vec(TCGContext *s, TCGType type,
37
+};
92
+ TCGReg v, TCGReg l, TCGReg h)
38
+#else
93
+{
39
+static const TCGLdstHelperParam ldst_helper_param = { };
94
+ int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
40
+#endif
95
+
96
+ /* vmov{d,q} %l, %v */
97
+ tcg_out_vex_modrm(s, OPC_MOVD_VyEy + rexw, v, 0, l);
98
+ /* vpinsr{d,q} $1, %h, %v, %v */
99
+ tcg_out_vex_modrm(s, OPC_PINSRD + rexw, v, v, h);
100
+ tcg_out8(s, 1);
101
+}
41
+
102
+
42
/*
103
/*
43
* Generate code for the slow path for a load at the end of block
104
* Generate code for the slow path for a load at the end of block
44
*/
105
*/
45
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
106
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
46
{
107
{
47
- MemOpIdx oi = l->oi;
108
TCGLabelQemuLdst *ldst = NULL;
48
- MemOp opc = get_memop(oi);
109
MemOp opc = get_memop(oi);
49
+ MemOp opc = get_memop(l->oi);
110
+ MemOp s_bits = opc & MO_SIZE;
50
tcg_insn_unit **label_ptr = &l->label_ptr[0];
111
unsigned a_mask;
51
112
52
/* resolve label address */
113
#ifdef CONFIG_SOFTMMU
53
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
114
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
54
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
115
*h = x86_guest_base;
116
#endif
117
h->base = addrlo;
118
- h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
119
+ h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
120
a_mask = (1 << h->aa.align) - 1;
121
122
#ifdef CONFIG_SOFTMMU
123
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
124
TCGType tlbtype = TCG_TYPE_I32;
125
int trexw = 0, hrexw = 0, tlbrexw = 0;
126
unsigned mem_index = get_mmuidx(oi);
127
- unsigned s_bits = opc & MO_SIZE;
128
unsigned s_mask = (1 << s_bits) - 1;
129
int tlb_mask;
130
131
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
132
h.base, h.index, 0, h.ofs + 4);
133
}
134
break;
135
+
136
+ case MO_128:
137
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
138
+
139
+ /*
140
+ * Without 16-byte atomicity, use integer regs.
141
+ * That is where we want the data, and it allows bswaps.
142
+ */
143
+ if (h.aa.atom < MO_128) {
144
+ if (use_movbe) {
145
+ TCGReg t = datalo;
146
+ datalo = datahi;
147
+ datahi = t;
148
+ }
149
+ if (h.base == datalo || h.index == datalo) {
150
+ tcg_out_modrm_sib_offset(s, OPC_LEA + P_REXW, datahi,
151
+ h.base, h.index, 0, h.ofs);
152
+ tcg_out_modrm_offset(s, movop + P_REXW + h.seg,
153
+ datalo, datahi, 0);
154
+ tcg_out_modrm_offset(s, movop + P_REXW + h.seg,
155
+ datahi, datahi, 8);
156
+ } else {
157
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
158
+ h.base, h.index, 0, h.ofs);
159
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datahi,
160
+ h.base, h.index, 0, h.ofs + 8);
161
+ }
162
+ break;
163
+ }
164
+
165
+ /*
166
+ * With 16-byte atomicity, a vector load is required.
167
+ * If we already have 16-byte alignment, then VMOVDQA always works.
168
+ * Else if VMOVDQU has atomicity with dynamic alignment, use that.
169
+ * Else use we require a runtime test for alignment for VMOVDQA;
170
+ * use VMOVDQU on the unaligned nonatomic path for simplicity.
171
+ */
172
+ if (h.aa.align >= MO_128) {
173
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + h.seg,
174
+ TCG_TMP_VEC, 0,
175
+ h.base, h.index, 0, h.ofs);
176
+ } else if (cpuinfo & CPUINFO_ATOMIC_VMOVDQU) {
177
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_VxWx + h.seg,
178
+ TCG_TMP_VEC, 0,
179
+ h.base, h.index, 0, h.ofs);
180
+ } else {
181
+ TCGLabel *l1 = gen_new_label();
182
+ TCGLabel *l2 = gen_new_label();
183
+
184
+ tcg_out_testi(s, h.base, 15);
185
+ tcg_out_jxx(s, JCC_JNE, l1, true);
186
+
187
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + h.seg,
188
+ TCG_TMP_VEC, 0,
189
+ h.base, h.index, 0, h.ofs);
190
+ tcg_out_jxx(s, JCC_JMP, l2, true);
191
+
192
+ tcg_out_label(s, l1);
193
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_VxWx + h.seg,
194
+ TCG_TMP_VEC, 0,
195
+ h.base, h.index, 0, h.ofs);
196
+ tcg_out_label(s, l2);
197
+ }
198
+ tcg_out_vec_to_pair(s, TCG_TYPE_I64, datalo, datahi, TCG_TMP_VEC);
199
+ break;
200
+
201
default:
202
g_assert_not_reached();
55
}
203
}
56
204
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
57
- if (TCG_TARGET_REG_BITS == 32) {
205
h.base, h.index, 0, h.ofs + 4);
58
- int ofs = 0;
206
}
59
-
207
break;
60
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
208
+
61
- ofs += 4;
209
+ case MO_128:
62
-
210
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
63
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
211
+
64
- ofs += 4;
212
+ /*
65
-
213
+ * Without 16-byte atomicity, use integer regs.
66
- if (TARGET_LONG_BITS == 64) {
214
+ * That is where we have the data, and it allows bswaps.
67
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
215
+ */
68
- ofs += 4;
216
+ if (h.aa.atom < MO_128) {
69
- }
217
+ if (use_movbe) {
70
-
218
+ TCGReg t = datalo;
71
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
219
+ datalo = datahi;
72
- ofs += 4;
220
+ datahi = t;
73
-
221
+ }
74
- tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
222
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
75
- } else {
223
+ h.base, h.index, 0, h.ofs);
76
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
224
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datahi,
77
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
225
+ h.base, h.index, 0, h.ofs + 8);
78
- l->addrlo_reg);
226
+ break;
79
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
227
+ }
80
- tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
228
+
81
- (uintptr_t)l->raddr);
229
+ /*
82
- }
230
+ * With 16-byte atomicity, a vector store is required.
83
-
231
+ * If we already have 16-byte alignment, then VMOVDQA always works.
84
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
232
+ * Else if VMOVDQU has atomicity with dynamic alignment, use that.
85
tcg_out_branch(s, 1, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
233
+ * Else use we require a runtime test for alignment for VMOVDQA;
86
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
234
+ * use VMOVDQU on the unaligned nonatomic path for simplicity.
87
235
+ */
88
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
236
+ tcg_out_pair_to_vec(s, TCG_TYPE_I64, TCG_TMP_VEC, datalo, datahi);
89
- TCGMovExtend ext[2] = {
237
+ if (h.aa.align >= MO_128) {
90
- { .dst = l->datalo_reg, .dst_type = TCG_TYPE_I32,
238
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + h.seg,
91
- .src = TCG_REG_EAX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
239
+ TCG_TMP_VEC, 0,
92
- { .dst = l->datahi_reg, .dst_type = TCG_TYPE_I32,
240
+ h.base, h.index, 0, h.ofs);
93
- .src = TCG_REG_EDX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
241
+ } else if (cpuinfo & CPUINFO_ATOMIC_VMOVDQU) {
94
- };
242
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_WxVx + h.seg,
95
- tcg_out_movext2(s, &ext[0], &ext[1], -1);
243
+ TCG_TMP_VEC, 0,
96
- } else {
244
+ h.base, h.index, 0, h.ofs);
97
- tcg_out_movext(s, l->type, l->datalo_reg,
245
+ } else {
98
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_EAX);
246
+ TCGLabel *l1 = gen_new_label();
99
- }
247
+ TCGLabel *l2 = gen_new_label();
100
-
248
+
101
- /* Jump to the code corresponding to next IR of qemu_st */
249
+ tcg_out_testi(s, h.base, 15);
102
tcg_out_jmp(s, l->raddr);
250
+ tcg_out_jxx(s, JCC_JNE, l1, true);
103
return true;
251
+
104
}
252
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + h.seg,
253
+ TCG_TMP_VEC, 0,
254
+ h.base, h.index, 0, h.ofs);
255
+ tcg_out_jxx(s, JCC_JMP, l2, true);
256
+
257
+ tcg_out_label(s, l1);
258
+ tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQU_WxVx + h.seg,
259
+ TCG_TMP_VEC, 0,
260
+ h.base, h.index, 0, h.ofs);
261
+ tcg_out_label(s, l2);
262
+ }
263
+ break;
264
+
265
default:
266
g_assert_not_reached();
267
}
268
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
269
tcg_out_qemu_ld(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
270
}
271
break;
272
+ case INDEX_op_qemu_ld_a32_i128:
273
+ case INDEX_op_qemu_ld_a64_i128:
274
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
275
+ tcg_out_qemu_ld(s, a0, a1, a2, -1, args[3], TCG_TYPE_I128);
276
+ break;
277
278
case INDEX_op_qemu_st_a64_i32:
279
case INDEX_op_qemu_st8_a64_i32:
280
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
281
tcg_out_qemu_st(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
282
}
283
break;
284
+ case INDEX_op_qemu_st_a32_i128:
285
+ case INDEX_op_qemu_st_a64_i128:
286
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
287
+ tcg_out_qemu_st(s, a0, a1, a2, -1, args[3], TCG_TYPE_I128);
288
+ break;
289
290
OP_32_64(mulu2):
291
tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_MUL, args[3]);
292
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
293
case INDEX_op_qemu_st_a64_i64:
294
return TCG_TARGET_REG_BITS == 64 ? C_O0_I2(L, L) : C_O0_I4(L, L, L, L);
295
296
+ case INDEX_op_qemu_ld_a32_i128:
297
+ case INDEX_op_qemu_ld_a64_i128:
298
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
299
+ return C_O2_I1(r, r, L);
300
+ case INDEX_op_qemu_st_a32_i128:
301
+ case INDEX_op_qemu_st_a64_i128:
302
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
303
+ return C_O0_I3(L, L, L);
304
+
305
case INDEX_op_brcond2_i32:
306
return C_O0_I4(r, r, ri, ri);
307
308
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
309
310
s->reserved_regs = 0;
311
tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
312
+ tcg_regset_set_reg(s->reserved_regs, TCG_TMP_VEC);
313
#ifdef _WIN64
314
/* These are call saved, and we don't save them, so don't use them. */
315
tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6);
105
--
316
--
106
2.34.1
317
2.34.1
107
108
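To restate the store-selection logic in the i386 patch above outside of the generated code: the backend prefers VMOVDQA when 16-byte alignment is already known, VMOVDQU when the CPU makes it atomic regardless of alignment, and otherwise emits a runtime alignment test. A rough host-side C sketch of the same decision, assuming an x86-64 host with SSE2; the function name is made up and nothing at the C level guarantees 16-byte atomicity, it only shows the dispatch:

    #include <stdint.h>
    #include <emmintrin.h>

    /* Choose the aligned vector store when the address is 16-byte aligned,
       else fall back to the unaligned form -- the same test the backend
       emits around the MOVDQA/MOVDQU pair. */
    static void store16(void *dst, __m128i val)
    {
        if (((uintptr_t)dst & 15) == 0) {
            _mm_store_si128((__m128i *)dst, val);    /* MOVDQA path */
        } else {
            _mm_storeu_si128((__m128i *)dst, val);   /* MOVDQU path */
        }
    }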
1
Instead of trying to unify all operations on uint64_t, use
1
We will need to allocate a second general-purpose temporary.
2
mmu_lookup() to perform the basic tlb hit and resolution.
2
Rename the existing temps to add a distinguishing number.
3
Create individual functions to handle access by size.
4
3
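The size-specific store helpers introduced below share one slow path for accesses that cross a page boundary: the value is swapped to little-endian once, written byte by byte, and the unwritten remainder is handed on to the second page. A minimal standalone sketch of that loop (illustrative name, not the actual QEMU helper):

    #include <stddef.h>
    #include <stdint.h>

    /* Store the low 'size' bytes of val_le at haddr in little-endian order
       and return the bytes that remain for the following page. */
    static uint64_t store_bytes_le(uint8_t *haddr, size_t size, uint64_t val_le)
    {
        for (size_t i = 0; i < size; i++, val_le >>= 8) {
            haddr[i] = val_le;
        }
        return val_le;
    }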
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
6
---
8
accel/tcg/cputlb.c | 408 +++++++++++++++++++++------------------------
7
tcg/aarch64/tcg-target.c.inc | 50 ++++++++++++++++++------------------
9
1 file changed, 193 insertions(+), 215 deletions(-)
8
1 file changed, 25 insertions(+), 25 deletions(-)
10
9
11
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
10
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
13
--- a/accel/tcg/cputlb.c
12
--- a/tcg/aarch64/tcg-target.c.inc
14
+++ b/accel/tcg/cputlb.c
13
+++ b/tcg/aarch64/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ store_memop(void *haddr, uint64_t val, MemOp op)
14
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
15
return TCG_REG_X0 + slot;
16
}
17
18
-#define TCG_REG_TMP TCG_REG_X30
19
-#define TCG_VEC_TMP TCG_REG_V31
20
+#define TCG_REG_TMP0 TCG_REG_X30
21
+#define TCG_VEC_TMP0 TCG_REG_V31
22
23
#ifndef CONFIG_SOFTMMU
24
#define TCG_REG_GUEST_BASE TCG_REG_X28
25
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
26
static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
27
TCGReg r, TCGReg base, intptr_t offset)
28
{
29
- TCGReg temp = TCG_REG_TMP;
30
+ TCGReg temp = TCG_REG_TMP0;
31
32
if (offset < -0xffffff || offset > 0xffffff) {
33
tcg_out_movi(s, TCG_TYPE_PTR, temp, offset);
34
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ldst(TCGContext *s, AArch64Insn insn, TCGReg rd,
35
}
36
37
/* Worst-case scenario, move offset to temp register, use reg offset. */
38
- tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, offset);
39
- tcg_out_ldst_r(s, insn, rd, rn, TCG_TYPE_I64, TCG_REG_TMP);
40
+ tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, offset);
41
+ tcg_out_ldst_r(s, insn, rd, rn, TCG_TYPE_I64, TCG_REG_TMP0);
42
}
43
44
static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
45
@@ -XXX,XX +XXX,XX @@ static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *target)
46
if (offset == sextract64(offset, 0, 26)) {
47
tcg_out_insn(s, 3206, BL, offset);
48
} else {
49
- tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, (intptr_t)target);
50
- tcg_out_insn(s, 3207, BLR, TCG_REG_TMP);
51
+ tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP0, (intptr_t)target);
52
+ tcg_out_insn(s, 3207, BLR, TCG_REG_TMP0);
16
}
53
}
17
}
54
}
18
55
19
-static void full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
56
@@ -XXX,XX +XXX,XX @@ static void tcg_out_addsub2(TCGContext *s, TCGType ext, TCGReg rl,
20
- MemOpIdx oi, uintptr_t retaddr);
57
AArch64Insn insn;
21
-
58
22
-static void __attribute__((noinline))
59
if (rl == ah || (!const_bh && rl == bh)) {
23
-store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
60
- rl = TCG_REG_TMP;
24
- uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
61
+ rl = TCG_REG_TMP0;
25
- bool big_endian)
62
}
26
+/**
63
27
+ * do_st_mmio_leN:
64
if (const_bl) {
28
+ * @env: cpu context
65
@@ -XXX,XX +XXX,XX @@ static void tcg_out_addsub2(TCGContext *s, TCGType ext, TCGReg rl,
29
+ * @p: translation parameters
66
possibility of adding 0+const in the low part, and the
30
+ * @val_le: data to store
67
immediate add instructions encode XSP not XZR. Don't try
31
+ * @mmu_idx: virtual address context
68
anything more elaborate here than loading another zero. */
32
+ * @ra: return address into tcg generated code, or 0
69
- al = TCG_REG_TMP;
33
+ *
70
+ al = TCG_REG_TMP0;
34
+ * Store @p->size bytes at @p->addr, which is memory-mapped i/o.
71
tcg_out_movi(s, ext, al, 0);
35
+ * The bytes to store are extracted in little-endian order from @val_le;
72
}
36
+ * return the bytes of @val_le beyond @p->size that have not been stored.
73
tcg_out_insn_3401(s, insn, ext, rl, al, bl);
37
+ */
74
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
38
+static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p,
39
+ uint64_t val_le, int mmu_idx, uintptr_t ra)
40
{
75
{
41
- uintptr_t index, index2;
76
TCGReg a1 = a0;
42
- CPUTLBEntry *entry, *entry2;
77
if (is_ctz) {
43
- target_ulong page1, page2, tlb_addr, tlb_addr2;
78
- a1 = TCG_REG_TMP;
44
- MemOpIdx oi;
79
+ a1 = TCG_REG_TMP0;
45
- size_t size2;
80
tcg_out_insn(s, 3507, RBIT, ext, a1, a0);
46
- int i;
47
+ CPUTLBEntryFull *full = p->full;
48
+ target_ulong addr = p->addr;
49
+ int i, size = p->size;
50
51
- /*
52
- * Ensure the second page is in the TLB. Note that the first page
53
- * is already guaranteed to be filled, and that the second page
54
- * cannot evict the first. An exception to this rule is PAGE_WRITE_INV
55
- * handling: the first page could have evicted itself.
56
- */
57
- page1 = addr & TARGET_PAGE_MASK;
58
- page2 = (addr + size) & TARGET_PAGE_MASK;
59
- size2 = (addr + size) & ~TARGET_PAGE_MASK;
60
- index2 = tlb_index(env, mmu_idx, page2);
61
- entry2 = tlb_entry(env, mmu_idx, page2);
62
-
63
- tlb_addr2 = tlb_addr_write(entry2);
64
- if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
65
- if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
66
- tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
67
- mmu_idx, retaddr);
68
- index2 = tlb_index(env, mmu_idx, page2);
69
- entry2 = tlb_entry(env, mmu_idx, page2);
70
- }
71
- tlb_addr2 = tlb_addr_write(entry2);
72
+ QEMU_IOTHREAD_LOCK_GUARD();
73
+ for (i = 0; i < size; i++, val_le >>= 8) {
74
+ io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
75
}
81
}
76
+ return val_le;
82
if (const_b && b == (ext ? 64 : 32)) {
77
+}
83
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
78
84
AArch64Insn sel = I3506_CSEL;
79
- index = tlb_index(env, mmu_idx, addr);
85
80
- entry = tlb_entry(env, mmu_idx, addr);
86
tcg_out_cmp(s, ext, a0, 0, 1);
81
- tlb_addr = tlb_addr_write(entry);
87
- tcg_out_insn(s, 3507, CLZ, ext, TCG_REG_TMP, a1);
82
+/**
88
+ tcg_out_insn(s, 3507, CLZ, ext, TCG_REG_TMP0, a1);
83
+ * do_st_bytes_leN:
89
84
+ * @p: translation parameters
90
if (const_b) {
85
+ * @val_le: data to store
91
if (b == -1) {
86
+ *
92
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
87
+ * Store @p->size bytes at @p->haddr, which is RAM.
93
b = d;
88
+ * The bytes to store are extracted in little-endian order from @val_le;
94
}
89
+ * return the bytes of @val_le beyond @p->size that have not been stored.
95
}
90
+ */
96
- tcg_out_insn_3506(s, sel, ext, d, TCG_REG_TMP, b, TCG_COND_NE);
91
+static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le)
97
+ tcg_out_insn_3506(s, sel, ext, d, TCG_REG_TMP0, b, TCG_COND_NE);
92
+{
93
+ uint8_t *haddr = p->haddr;
94
+ int i, size = p->size;
95
96
- /*
97
- * Handle watchpoints. Since this may trap, all checks
98
- * must happen before any store.
99
- */
100
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
101
- cpu_check_watchpoint(env_cpu(env), addr, size - size2,
102
- env_tlb(env)->d[mmu_idx].fulltlb[index].attrs,
103
- BP_MEM_WRITE, retaddr);
104
- }
105
- if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
106
- cpu_check_watchpoint(env_cpu(env), page2, size2,
107
- env_tlb(env)->d[mmu_idx].fulltlb[index2].attrs,
108
- BP_MEM_WRITE, retaddr);
109
+ for (i = 0; i < size; i++, val_le >>= 8) {
110
+ haddr[i] = val_le;
111
}
112
+ return val_le;
113
+}
114
115
- /*
116
- * XXX: not efficient, but simple.
117
- * This loop must go in the forward direction to avoid issues
118
- * with self-modifying code in Windows 64-bit.
119
- */
120
- oi = make_memop_idx(MO_UB, mmu_idx);
121
- if (big_endian) {
122
- for (i = 0; i < size; ++i) {
123
- /* Big-endian extract. */
124
- uint8_t val8 = val >> (((size - 1) * 8) - (i * 8));
125
- full_stb_mmu(env, addr + i, val8, oi, retaddr);
126
- }
127
+/*
128
+ * Wrapper for the above.
129
+ */
130
+static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p,
131
+ uint64_t val_le, int mmu_idx, uintptr_t ra)
132
+{
133
+ if (unlikely(p->flags & TLB_MMIO)) {
134
+ return do_st_mmio_leN(env, p, val_le, mmu_idx, ra);
135
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
136
+ return val_le >> (p->size * 8);
137
} else {
138
- for (i = 0; i < size; ++i) {
139
- /* Little-endian extract. */
140
- uint8_t val8 = val >> (i * 8);
141
- full_stb_mmu(env, addr + i, val8, oi, retaddr);
142
- }
143
+ return do_st_bytes_leN(p, val_le);
144
}
98
}
145
}
99
}
146
100
147
-static inline void QEMU_ALWAYS_INLINE
101
@@ -XXX,XX +XXX,XX @@ bool tcg_target_has_memory_bswap(MemOp memop)
148
-store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
102
}
149
- MemOpIdx oi, uintptr_t retaddr, MemOp op)
103
150
+static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val,
104
static const TCGLdstHelperParam ldst_helper_param = {
151
+ int mmu_idx, uintptr_t ra)
105
- .ntmp = 1, .tmp = { TCG_REG_TMP }
152
{
106
+ .ntmp = 1, .tmp = { TCG_REG_TMP0 }
153
- const unsigned a_bits = get_alignment_bits(get_memop(oi));
107
};
154
- const size_t size = memop_size(op);
108
155
- uintptr_t mmu_idx = get_mmuidx(oi);
109
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
156
- uintptr_t index;
110
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto_tb(TCGContext *s, int which)
157
- CPUTLBEntry *entry;
111
158
- target_ulong tlb_addr;
112
set_jmp_insn_offset(s, which);
159
- void *haddr;
113
tcg_out32(s, I3206_B);
160
-
114
- tcg_out_insn(s, 3207, BR, TCG_REG_TMP);
161
- tcg_debug_assert(mmu_idx < NB_MMU_MODES);
115
+ tcg_out_insn(s, 3207, BR, TCG_REG_TMP0);
162
-
116
set_jmp_reset_offset(s, which);
163
- /* Handle CPU specific unaligned behaviour */
117
}
164
- if (addr & ((1 << a_bits) - 1)) {
118
165
- cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE,
119
@@ -XXX,XX +XXX,XX @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
166
- mmu_idx, retaddr);
120
ptrdiff_t i_offset = i_addr - jmp_rx;
167
+ if (unlikely(p->flags & TLB_MMIO)) {
121
168
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, MO_UB);
122
/* Note that we asserted this in range in tcg_out_goto_tb. */
169
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
123
- insn = deposit32(I3305_LDR | TCG_REG_TMP, 5, 19, i_offset >> 2);
170
+ /* nothing */
124
+ insn = deposit32(I3305_LDR | TCG_REG_TMP0, 5, 19, i_offset >> 2);
171
+ } else {
172
+ *(uint8_t *)p->haddr = val;
173
}
125
}
174
-
126
qatomic_set((uint32_t *)jmp_rw, insn);
175
- index = tlb_index(env, mmu_idx, addr);
127
flush_idcache_range(jmp_rx, jmp_rw, 4);
176
- entry = tlb_entry(env, mmu_idx, addr);
128
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
177
- tlb_addr = tlb_addr_write(entry);
129
178
-
130
case INDEX_op_rem_i64:
179
- /* If the TLB entry is for a different page, reload and try again. */
131
case INDEX_op_rem_i32:
180
- if (!tlb_hit(tlb_addr, addr)) {
132
- tcg_out_insn(s, 3508, SDIV, ext, TCG_REG_TMP, a1, a2);
181
- if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
133
- tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP, a2, a1);
182
- addr & TARGET_PAGE_MASK)) {
134
+ tcg_out_insn(s, 3508, SDIV, ext, TCG_REG_TMP0, a1, a2);
183
- tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
135
+ tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP0, a2, a1);
184
- mmu_idx, retaddr);
136
break;
185
- index = tlb_index(env, mmu_idx, addr);
137
case INDEX_op_remu_i64:
186
- entry = tlb_entry(env, mmu_idx, addr);
138
case INDEX_op_remu_i32:
187
- }
139
- tcg_out_insn(s, 3508, UDIV, ext, TCG_REG_TMP, a1, a2);
188
- tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
140
- tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP, a2, a1);
189
- }
141
+ tcg_out_insn(s, 3508, UDIV, ext, TCG_REG_TMP0, a1, a2);
190
-
142
+ tcg_out_insn(s, 3509, MSUB, ext, a0, TCG_REG_TMP0, a2, a1);
191
- /* Handle anything that isn't just a straight memory access. */
143
break;
192
- if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
144
193
- CPUTLBEntryFull *full;
145
case INDEX_op_shl_i64:
194
- bool need_swap;
146
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
195
-
147
if (c2) {
196
- /* For anything that is unaligned, recurse through byte stores. */
148
tcg_out_rotl(s, ext, a0, a1, a2);
197
- if ((addr & (size - 1)) != 0) {
149
} else {
198
- goto do_unaligned_access;
150
- tcg_out_insn(s, 3502, SUB, 0, TCG_REG_TMP, TCG_REG_XZR, a2);
199
- }
151
- tcg_out_insn(s, 3508, RORV, ext, a0, a1, TCG_REG_TMP);
200
-
152
+ tcg_out_insn(s, 3502, SUB, 0, TCG_REG_TMP0, TCG_REG_XZR, a2);
201
- full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
153
+ tcg_out_insn(s, 3508, RORV, ext, a0, a1, TCG_REG_TMP0);
202
-
154
}
203
- /* Handle watchpoints. */
155
break;
204
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
156
205
- /* On watchpoint hit, this will longjmp out. */
157
@@ -XXX,XX +XXX,XX @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
206
- cpu_check_watchpoint(env_cpu(env), addr, size,
158
break;
207
- full->attrs, BP_MEM_WRITE, retaddr);
159
}
208
- }
160
}
209
-
161
- tcg_out_dupi_vec(s, type, MO_8, TCG_VEC_TMP, 0);
210
- need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
162
- a2 = TCG_VEC_TMP;
211
-
163
+ tcg_out_dupi_vec(s, type, MO_8, TCG_VEC_TMP0, 0);
212
- /* Handle I/O access. */
164
+ a2 = TCG_VEC_TMP0;
213
- if (tlb_addr & TLB_MMIO) {
165
}
214
- io_writex(env, full, mmu_idx, val, addr, retaddr,
166
if (is_scalar) {
215
- op ^ (need_swap * MO_BSWAP));
167
insn = cmp_scalar_insn[cond];
216
- return;
168
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
217
- }
169
s->reserved_regs = 0;
218
-
170
tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
219
- /* Ignore writes to ROM. */
171
tcg_regset_set_reg(s->reserved_regs, TCG_REG_FP);
220
- if (unlikely(tlb_addr & TLB_DISCARD_WRITE)) {
172
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP);
221
- return;
173
tcg_regset_set_reg(s->reserved_regs, TCG_REG_X18); /* platform register */
222
- }
174
- tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP);
223
-
175
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
224
- /* Handle clean RAM pages. */
176
+ tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP0);
225
- if (tlb_addr & TLB_NOTDIRTY) {
226
- notdirty_write(env_cpu(env), addr, size, full, retaddr);
227
- }
228
-
229
- haddr = (void *)((uintptr_t)addr + entry->addend);
230
-
231
- /*
232
- * Keep these two store_memop separate to ensure that the compiler
233
- * is able to fold the entire function to a single instruction.
234
- * There is a build-time assert inside to remind you of this. ;-)
235
- */
236
- if (unlikely(need_swap)) {
237
- store_memop(haddr, val, op ^ MO_BSWAP);
238
- } else {
239
- store_memop(haddr, val, op);
240
- }
241
- return;
242
- }
243
-
244
- /* Handle slow unaligned access (it spans two pages or IO). */
245
- if (size > 1
246
- && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
247
- >= TARGET_PAGE_SIZE)) {
248
- do_unaligned_access:
249
- store_helper_unaligned(env, addr, val, retaddr, size,
250
- mmu_idx, memop_big_endian(op));
251
- return;
252
- }
253
-
254
- haddr = (void *)((uintptr_t)addr + entry->addend);
255
- store_memop(haddr, val, op);
256
}
177
}
257
178
258
-static void __attribute__((noinline))
179
/* Saving pairs: (X19, X20) .. (X27, X28), (X29(fp), X30(lr)). */
259
-full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
260
- MemOpIdx oi, uintptr_t retaddr)
261
+static void do_st_2(CPUArchState *env, MMULookupPageData *p, uint16_t val,
262
+ int mmu_idx, MemOp memop, uintptr_t ra)
263
{
264
- validate_memop(oi, MO_UB);
265
- store_helper(env, addr, val, oi, retaddr, MO_UB);
266
+ if (unlikely(p->flags & TLB_MMIO)) {
267
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
268
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
269
+ /* nothing */
270
+ } else {
271
+ /* Swap to host endian if necessary, then store. */
272
+ if (memop & MO_BSWAP) {
273
+ val = bswap16(val);
274
+ }
275
+ store_memop(p->haddr, val, MO_UW);
276
+ }
277
+}
278
+
279
+static void do_st_4(CPUArchState *env, MMULookupPageData *p, uint32_t val,
280
+ int mmu_idx, MemOp memop, uintptr_t ra)
281
+{
282
+ if (unlikely(p->flags & TLB_MMIO)) {
283
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
284
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
285
+ /* nothing */
286
+ } else {
287
+ /* Swap to host endian if necessary, then store. */
288
+ if (memop & MO_BSWAP) {
289
+ val = bswap32(val);
290
+ }
291
+ store_memop(p->haddr, val, MO_UL);
292
+ }
293
+}
294
+
295
+static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val,
296
+ int mmu_idx, MemOp memop, uintptr_t ra)
297
+{
298
+ if (unlikely(p->flags & TLB_MMIO)) {
299
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
300
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
301
+ /* nothing */
302
+ } else {
303
+ /* Swap to host endian if necessary, then store. */
304
+ if (memop & MO_BSWAP) {
305
+ val = bswap64(val);
306
+ }
307
+ store_memop(p->haddr, val, MO_UQ);
308
+ }
309
}
310
311
void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
312
- MemOpIdx oi, uintptr_t retaddr)
313
+ MemOpIdx oi, uintptr_t ra)
314
{
315
- full_stb_mmu(env, addr, val, oi, retaddr);
316
+ MMULookupLocals l;
317
+ bool crosspage;
318
+
319
+ validate_memop(oi, MO_UB);
320
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
321
+ tcg_debug_assert(!crosspage);
322
+
323
+ do_st_1(env, &l.page[0], val, l.mmu_idx, ra);
324
}
325
326
-static void full_le_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
327
- MemOpIdx oi, uintptr_t retaddr)
328
+static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
329
+ MemOpIdx oi, uintptr_t ra)
330
{
331
- validate_memop(oi, MO_LEUW);
332
- store_helper(env, addr, val, oi, retaddr, MO_LEUW);
333
+ MMULookupLocals l;
334
+ bool crosspage;
335
+ uint8_t a, b;
336
+
337
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
338
+ if (likely(!crosspage)) {
339
+ do_st_2(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
340
+ return;
341
+ }
342
+
343
+ if ((l.memop & MO_BSWAP) == MO_LE) {
344
+ a = val, b = val >> 8;
345
+ } else {
346
+ b = val, a = val >> 8;
347
+ }
348
+ do_st_1(env, &l.page[0], a, l.mmu_idx, ra);
349
+ do_st_1(env, &l.page[1], b, l.mmu_idx, ra);
350
}
351
352
void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
353
MemOpIdx oi, uintptr_t retaddr)
354
{
355
- full_le_stw_mmu(env, addr, val, oi, retaddr);
356
-}
357
-
358
-static void full_be_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
359
- MemOpIdx oi, uintptr_t retaddr)
360
-{
361
- validate_memop(oi, MO_BEUW);
362
- store_helper(env, addr, val, oi, retaddr, MO_BEUW);
363
+ validate_memop(oi, MO_LEUW);
364
+ do_st2_mmu(env, addr, val, oi, retaddr);
365
}
366
367
void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
368
MemOpIdx oi, uintptr_t retaddr)
369
{
370
- full_be_stw_mmu(env, addr, val, oi, retaddr);
371
+ validate_memop(oi, MO_BEUW);
372
+ do_st2_mmu(env, addr, val, oi, retaddr);
373
}
374
375
-static void full_le_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
376
- MemOpIdx oi, uintptr_t retaddr)
377
+static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
378
+ MemOpIdx oi, uintptr_t ra)
379
{
380
- validate_memop(oi, MO_LEUL);
381
- store_helper(env, addr, val, oi, retaddr, MO_LEUL);
382
+ MMULookupLocals l;
383
+ bool crosspage;
384
+
385
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
386
+ if (likely(!crosspage)) {
387
+ do_st_4(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
388
+ return;
389
+ }
390
+
391
+ /* Swap to little endian for simplicity, then store by bytes. */
392
+ if ((l.memop & MO_BSWAP) != MO_LE) {
393
+ val = bswap32(val);
394
+ }
395
+ val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra);
396
+ (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra);
397
}
398
399
void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
400
MemOpIdx oi, uintptr_t retaddr)
401
{
402
- full_le_stl_mmu(env, addr, val, oi, retaddr);
403
-}
404
-
405
-static void full_be_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
406
- MemOpIdx oi, uintptr_t retaddr)
407
-{
408
- validate_memop(oi, MO_BEUL);
409
- store_helper(env, addr, val, oi, retaddr, MO_BEUL);
410
+ validate_memop(oi, MO_LEUL);
411
+ do_st4_mmu(env, addr, val, oi, retaddr);
412
}
413
414
void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
415
MemOpIdx oi, uintptr_t retaddr)
416
{
417
- full_be_stl_mmu(env, addr, val, oi, retaddr);
418
+ validate_memop(oi, MO_BEUL);
419
+ do_st4_mmu(env, addr, val, oi, retaddr);
420
+}
421
+
422
+static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
423
+ MemOpIdx oi, uintptr_t ra)
424
+{
425
+ MMULookupLocals l;
426
+ bool crosspage;
427
+
428
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
429
+ if (likely(!crosspage)) {
430
+ do_st_8(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
431
+ return;
432
+ }
433
+
434
+ /* Swap to little endian for simplicity, then store by bytes. */
435
+ if ((l.memop & MO_BSWAP) != MO_LE) {
436
+ val = bswap64(val);
437
+ }
438
+ val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra);
439
+ (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra);
440
}
441
442
void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
443
MemOpIdx oi, uintptr_t retaddr)
444
{
445
validate_memop(oi, MO_LEUQ);
446
- store_helper(env, addr, val, oi, retaddr, MO_LEUQ);
447
+ do_st8_mmu(env, addr, val, oi, retaddr);
448
}
449
450
void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
451
MemOpIdx oi, uintptr_t retaddr)
452
{
453
validate_memop(oi, MO_BEUQ);
454
- store_helper(env, addr, val, oi, retaddr, MO_BEUQ);
455
+ do_st8_mmu(env, addr, val, oi, retaddr);
456
}
457
458
/*
459
* Store Helpers for cpu_ldst.h
460
*/
461
462
-typedef void FullStoreHelper(CPUArchState *env, target_ulong addr,
463
- uint64_t val, MemOpIdx oi, uintptr_t retaddr);
464
-
465
-static inline void cpu_store_helper(CPUArchState *env, target_ulong addr,
466
- uint64_t val, MemOpIdx oi, uintptr_t ra,
467
- FullStoreHelper *full_store)
468
+static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
469
{
470
- full_store(env, addr, val, oi, ra);
471
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
472
}
473
474
void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
475
MemOpIdx oi, uintptr_t retaddr)
476
{
477
- cpu_store_helper(env, addr, val, oi, retaddr, full_stb_mmu);
478
+ helper_ret_stb_mmu(env, addr, val, oi, retaddr);
479
+ plugin_store_cb(env, addr, oi);
480
}
481
482
void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
483
MemOpIdx oi, uintptr_t retaddr)
484
{
485
- cpu_store_helper(env, addr, val, oi, retaddr, full_be_stw_mmu);
486
+ helper_be_stw_mmu(env, addr, val, oi, retaddr);
487
+ plugin_store_cb(env, addr, oi);
488
}
489
490
void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
491
MemOpIdx oi, uintptr_t retaddr)
492
{
493
- cpu_store_helper(env, addr, val, oi, retaddr, full_be_stl_mmu);
494
+ helper_be_stl_mmu(env, addr, val, oi, retaddr);
495
+ plugin_store_cb(env, addr, oi);
496
}
497
498
void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
499
MemOpIdx oi, uintptr_t retaddr)
500
{
501
- cpu_store_helper(env, addr, val, oi, retaddr, helper_be_stq_mmu);
502
+ helper_be_stq_mmu(env, addr, val, oi, retaddr);
503
+ plugin_store_cb(env, addr, oi);
504
}
505
506
void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
507
MemOpIdx oi, uintptr_t retaddr)
508
{
509
- cpu_store_helper(env, addr, val, oi, retaddr, full_le_stw_mmu);
510
+ helper_le_stw_mmu(env, addr, val, oi, retaddr);
511
+ plugin_store_cb(env, addr, oi);
512
}
513
514
void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
515
MemOpIdx oi, uintptr_t retaddr)
516
{
517
- cpu_store_helper(env, addr, val, oi, retaddr, full_le_stl_mmu);
518
+ helper_le_stl_mmu(env, addr, val, oi, retaddr);
519
+ plugin_store_cb(env, addr, oi);
520
}
521
522
void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
523
MemOpIdx oi, uintptr_t retaddr)
524
{
525
- cpu_store_helper(env, addr, val, oi, retaddr, helper_le_stq_mmu);
526
+ helper_le_stq_mmu(env, addr, val, oi, retaddr);
527
+ plugin_store_cb(env, addr, oi);
528
}
529
530
void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
531
--
180
--
532
2.34.1
181
2.34.1
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
and tcg_out_st_helper_args.
3
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
3
---
7
tcg/aarch64/tcg-target.c.inc | 40 +++++++++++++++---------------------
4
tcg/aarch64/tcg-target.c.inc | 9 +++++++--
8
1 file changed, 16 insertions(+), 24 deletions(-)
5
1 file changed, 7 insertions(+), 2 deletions(-)
9
6
10
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
7
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
8
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/aarch64/tcg-target.c.inc
9
--- a/tcg/aarch64/tcg-target.c.inc
13
+++ b/tcg/aarch64/tcg-target.c.inc
10
+++ b/tcg/aarch64/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
11
@@ -XXX,XX +XXX,XX @@ static const int tcg_target_reg_alloc_order[] = {
15
}
12
13
TCG_REG_X8, TCG_REG_X9, TCG_REG_X10, TCG_REG_X11,
14
TCG_REG_X12, TCG_REG_X13, TCG_REG_X14, TCG_REG_X15,
15
- TCG_REG_X16, TCG_REG_X17,
16
17
TCG_REG_X0, TCG_REG_X1, TCG_REG_X2, TCG_REG_X3,
18
TCG_REG_X4, TCG_REG_X5, TCG_REG_X6, TCG_REG_X7,
19
20
+ /* X16 reserved as temporary */
21
+ /* X17 reserved as temporary */
22
/* X18 reserved by system */
23
/* X19 reserved for AREG0 */
24
/* X29 reserved as fp */
25
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
26
return TCG_REG_X0 + slot;
16
}
27
}
17
28
18
-static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
29
-#define TCG_REG_TMP0 TCG_REG_X30
19
-{
30
+#define TCG_REG_TMP0 TCG_REG_X16
20
- ptrdiff_t offset = tcg_pcrel_diff(s, target);
31
+#define TCG_REG_TMP1 TCG_REG_X17
21
- tcg_debug_assert(offset == sextract64(offset, 0, 21));
32
+#define TCG_REG_TMP2 TCG_REG_X30
22
- tcg_out_insn(s, 3406, ADR, rd, offset);
33
#define TCG_VEC_TMP0 TCG_REG_V31
23
-}
34
24
-
35
#ifndef CONFIG_SOFTMMU
25
typedef struct {
36
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
26
TCGReg base;
37
tcg_regset_set_reg(s->reserved_regs, TCG_REG_FP);
27
TCGReg index;
38
tcg_regset_set_reg(s->reserved_regs, TCG_REG_X18); /* platform register */
28
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
39
tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
29
#endif
40
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
30
};
41
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
31
42
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP0);
32
+static const TCGLdstHelperParam ldst_helper_param = {
33
+ .ntmp = 1, .tmp = { TCG_REG_TMP }
34
+};
35
+
36
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
37
{
38
- MemOpIdx oi = lb->oi;
39
- MemOp opc = get_memop(oi);
40
+ MemOp opc = get_memop(lb->oi);
41
42
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
43
return false;
44
}
45
46
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
47
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
48
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X2, oi);
49
- tcg_out_adr(s, TCG_REG_X3, lb->raddr);
50
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
51
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
52
-
53
- tcg_out_movext(s, lb->type, lb->datalo_reg,
54
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_X0);
55
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
56
tcg_out_goto(s, lb->raddr);
57
return true;
58
}
43
}
59
44
60
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
61
{
62
- MemOpIdx oi = lb->oi;
63
- MemOp opc = get_memop(oi);
64
- MemOp size = opc & MO_SIZE;
65
+ MemOp opc = get_memop(lb->oi);
66
67
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
68
return false;
69
}
70
71
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
72
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
73
- tcg_out_mov(s, size == MO_64, TCG_REG_X2, lb->datalo_reg);
74
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X3, oi);
75
- tcg_out_adr(s, TCG_REG_X4, lb->raddr);
76
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
77
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE]);
78
tcg_out_goto(s, lb->raddr);
79
return true;
80
}
81
#else
82
+static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
83
+{
84
+ ptrdiff_t offset = tcg_pcrel_diff(s, target);
85
+ tcg_debug_assert(offset == sextract64(offset, 0, 21));
86
+ tcg_out_insn(s, 3406, ADR, rd, offset);
87
+}
88
+
89
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
90
{
91
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
92
--
45
--
93
2.34.1
46
2.34.1
94
95
1
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
1
Adjust the softmmu tlb to use TMP[0-2], not any of the normally available
2
registers. Now that we handle overlap between inputs and helper arguments,
2
registers. Since we handle overlap between inputs and helper arguments,
3
we can allow any allocatable reg.
3
we can allow any allocatable reg.
4
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
7
---
9
tcg/riscv/tcg-target-con-set.h | 2 --
8
tcg/aarch64/tcg-target-con-set.h | 2 --
10
tcg/riscv/tcg-target-con-str.h | 1 -
9
tcg/aarch64/tcg-target-con-str.h | 1 -
11
tcg/riscv/tcg-target.c.inc | 16 +++-------------
10
tcg/aarch64/tcg-target.c.inc | 45 ++++++++++++++------------------
12
3 files changed, 3 insertions(+), 16 deletions(-)
11
3 files changed, 19 insertions(+), 29 deletions(-)
13
12
14
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
13
diff --git a/tcg/aarch64/tcg-target-con-set.h b/tcg/aarch64/tcg-target-con-set.h
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/tcg/riscv/tcg-target-con-set.h
15
--- a/tcg/aarch64/tcg-target-con-set.h
17
+++ b/tcg/riscv/tcg-target-con-set.h
16
+++ b/tcg/aarch64/tcg-target-con-set.h
18
@@ -XXX,XX +XXX,XX @@
17
@@ -XXX,XX +XXX,XX @@
19
* tcg-target-con-str.h; the constraint combination is inclusive or.
18
* tcg-target-con-str.h; the constraint combination is inclusive or.
20
*/
19
*/
21
C_O0_I1(r)
20
C_O0_I1(r)
22
-C_O0_I2(LZ, L)
21
-C_O0_I2(lZ, l)
22
C_O0_I2(r, rA)
23
C_O0_I2(rZ, r)
23
C_O0_I2(rZ, r)
24
C_O0_I2(rZ, rZ)
24
C_O0_I2(w, r)
25
-C_O1_I1(r, L)
25
-C_O1_I1(r, l)
26
C_O1_I1(r, r)
26
C_O1_I1(r, r)
27
C_O1_I2(r, r, ri)
27
C_O1_I1(w, r)
28
C_O1_I2(r, r, rI)
28
C_O1_I1(w, w)
29
diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
29
diff --git a/tcg/aarch64/tcg-target-con-str.h b/tcg/aarch64/tcg-target-con-str.h
30
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
31
--- a/tcg/riscv/tcg-target-con-str.h
31
--- a/tcg/aarch64/tcg-target-con-str.h
32
+++ b/tcg/riscv/tcg-target-con-str.h
32
+++ b/tcg/aarch64/tcg-target-con-str.h
33
@@ -XXX,XX +XXX,XX @@
33
@@ -XXX,XX +XXX,XX @@
34
* REGS(letter, register_mask)
34
* REGS(letter, register_mask)
35
*/
35
*/
36
REGS('r', ALL_GENERAL_REGS)
36
REGS('r', ALL_GENERAL_REGS)
37
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
37
-REGS('l', ALL_QLDST_REGS)
38
REGS('w', ALL_VECTOR_REGS)
38
39
39
/*
40
/*
40
* Define constraint letters for constants:
41
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
41
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
42
index XXXXXXX..XXXXXXX 100644
42
index XXXXXXX..XXXXXXX 100644
43
--- a/tcg/riscv/tcg-target.c.inc
43
--- a/tcg/aarch64/tcg-target.c.inc
44
+++ b/tcg/riscv/tcg-target.c.inc
44
+++ b/tcg/aarch64/tcg-target.c.inc
45
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
45
@@ -XXX,XX +XXX,XX @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
46
#define TCG_CT_CONST_N12 0x400
46
#define ALL_GENERAL_REGS 0xffffffffu
47
#define TCG_CT_CONST_M12 0x800
47
#define ALL_VECTOR_REGS 0xffffffff00000000ull
48
48
49
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
50
-/*
51
- * For softmmu, we need to avoid conflicts with the first 5
52
- * argument registers to call the helper. Some of these are
53
- * also used for the tlb lookup.
54
- */
55
-#ifdef CONFIG_SOFTMMU
49
-#ifdef CONFIG_SOFTMMU
56
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
50
-#define ALL_QLDST_REGS \
51
- (ALL_GENERAL_REGS & ~((1 << TCG_REG_X0) | (1 << TCG_REG_X1) | \
52
- (1 << TCG_REG_X2) | (1 << TCG_REG_X3)))
57
-#else
53
-#else
58
-#define SOFTMMU_RESERVE_REGS 0
54
-#define ALL_QLDST_REGS ALL_GENERAL_REGS
59
-#endif
55
-#endif
60
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
56
-
61
57
/* Match a constant valid for addition (12-bit, optionally shifted). */
62
#define sextreg sextract64
58
static inline bool is_aimm(uint64_t val)
63
59
{
60
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
61
unsigned s_bits = opc & MO_SIZE;
62
unsigned s_mask = (1u << s_bits) - 1;
63
unsigned mem_index = get_mmuidx(oi);
64
- TCGReg x3;
65
+ TCGReg addr_adj;
66
TCGType mask_type;
67
uint64_t compare_mask;
68
69
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
70
mask_type = (s->page_bits + s->tlb_dyn_max_bits > 32
71
? TCG_TYPE_I64 : TCG_TYPE_I32);
72
73
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
74
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {tmp0,tmp1}. */
75
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
76
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
77
QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
78
QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
79
- tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
80
+ tcg_out_insn(s, 3314, LDP, TCG_REG_TMP0, TCG_REG_TMP1, TCG_AREG0,
81
TLB_MASK_TABLE_OFS(mem_index), 1, 0);
82
83
/* Extract the TLB index from the address into X0. */
84
tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
85
- TCG_REG_X0, TCG_REG_X0, addr_reg,
86
+ TCG_REG_TMP0, TCG_REG_TMP0, addr_reg,
87
s->page_bits - CPU_TLB_ENTRY_BITS);
88
89
- /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
90
- tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
91
+ /* Add the tlb_table pointer, forming the CPUTLBEntry address in TMP1. */
92
+ tcg_out_insn(s, 3502, ADD, 1, TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_TMP0);
93
94
- /* Load the tlb comparator into X0, and the fast path addend into X1. */
95
- tcg_out_ld(s, addr_type, TCG_REG_X0, TCG_REG_X1,
96
+ /* Load the tlb comparator into TMP0, and the fast path addend into TMP1. */
97
+ tcg_out_ld(s, addr_type, TCG_REG_TMP0, TCG_REG_TMP1,
98
is_ld ? offsetof(CPUTLBEntry, addr_read)
99
: offsetof(CPUTLBEntry, addr_write));
100
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
101
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
102
offsetof(CPUTLBEntry, addend));
103
104
/*
105
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
106
* cross pages using the address of the last byte of the access.
107
*/
108
if (a_mask >= s_mask) {
109
- x3 = addr_reg;
110
+ addr_adj = addr_reg;
111
} else {
112
+ addr_adj = TCG_REG_TMP2;
113
tcg_out_insn(s, 3401, ADDI, addr_type,
114
- TCG_REG_X3, addr_reg, s_mask - a_mask);
115
- x3 = TCG_REG_X3;
116
+ addr_adj, addr_reg, s_mask - a_mask);
117
}
118
compare_mask = (uint64_t)s->page_mask | a_mask;
119
120
- /* Store the page mask part of the address into X3. */
121
- tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_X3, x3, compare_mask);
122
+ /* Store the page mask part of the address into TMP2. */
123
+ tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_TMP2,
124
+ addr_adj, compare_mask);
125
126
/* Perform the address comparison. */
127
- tcg_out_cmp(s, addr_type, TCG_REG_X0, TCG_REG_X3, 0);
128
+ tcg_out_cmp(s, addr_type, TCG_REG_TMP0, TCG_REG_TMP2, 0);
129
130
/* If not equal, we jump to the slow path. */
131
ldst->label_ptr[0] = s->code_ptr;
132
tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
133
134
- h->base = TCG_REG_X1,
135
+ h->base = TCG_REG_TMP1;
136
h->index = addr_reg;
137
h->index_ext = addr_type;
138
#else
64
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
139
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
65
140
case INDEX_op_qemu_ld_a64_i32:
66
case INDEX_op_qemu_ld_i32:
141
case INDEX_op_qemu_ld_a32_i64:
67
case INDEX_op_qemu_ld_i64:
142
case INDEX_op_qemu_ld_a64_i64:
68
- return C_O1_I1(r, L);
143
- return C_O1_I1(r, l);
69
+ return C_O1_I1(r, r);
144
+ return C_O1_I1(r, r);
70
case INDEX_op_qemu_st_i32:
145
case INDEX_op_qemu_st_a32_i32:
71
case INDEX_op_qemu_st_i64:
146
case INDEX_op_qemu_st_a64_i32:
72
- return C_O0_I2(LZ, L);
147
case INDEX_op_qemu_st_a32_i64:
148
case INDEX_op_qemu_st_a64_i64:
149
- return C_O0_I2(lZ, l);
73
+ return C_O0_I2(rZ, r);
150
+ return C_O0_I2(rZ, r);
74
151
75
default:
152
case INDEX_op_deposit_i32:
76
g_assert_not_reached();
153
case INDEX_op_deposit_i64:
77
--
154
--
78
2.34.1
155
2.34.1
79
80
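The TLB fast path rewritten in the patch above boils down to: load {mask,table} for the mmu index with a single LDP, shift the guest address down so the page number lands on the entry stride, AND with the pre-scaled mask, and add the table base to obtain the CPUTLBEntry to compare. In C terms, with illustrative names standing in for the CPUTLBDescFast fields:

    #include <stdint.h>

    #define CPU_TLB_ENTRY_BITS 5   /* log2(sizeof(CPUTLBEntry)); illustrative
                                      value for a 64-bit host */

    /* mask is already scaled by the entry size, so the AND yields a byte
       offset into the dynamically sized TLB table. */
    static uintptr_t tlb_entry_addr(uint64_t addr, unsigned page_bits,
                                    uintptr_t mask, uintptr_t table)
    {
        uintptr_t offset = (addr >> (page_bits - CPU_TLB_ENTRY_BITS)) & mask;
        return table + offset;
    }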
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
1
With FEAT_LSE2, LDP/STP suffices. Without FEAT_LSE2, use LDXP+STXP
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
2
when 16-byte atomicity is required, and LDP/STP otherwise.
3
into one function that returns HostAddress and TCGLabelQemuLdst structures.
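For the non-FEAT_LSE2 case mentioned above, the backend emits an LDXP/STXP retry loop so that the 16-byte access is single-copy atomic. A hedged host-side sketch of the equivalent load, assuming an AArch64 host, GCC/Clang inline asm, and a 16-byte-aligned, writable address (the store-exclusive writes the same bytes back, which is why writable pages are required); the function name is illustrative:

    #include <stdint.h>

    /* 16-byte atomic load without FEAT_LSE2: load-exclusive the pair, then
       store-exclusive the same pair back; retry until the monitor succeeds. */
    static inline void atomic16_load_ldxp(void *addr, uint64_t out[2])
    {
        uint64_t lo, hi;
        uint32_t fail;

        do {
            asm volatile("ldxp %0, %1, [%3]\n\t"
                         "stxp %w2, %0, %1, [%3]"
                         : "=&r"(lo), "=&r"(hi), "=&r"(fail)
                         : "r"(addr)
                         : "memory");
        } while (fail);

        out[0] = lo;    /* low 64 bits, little-endian host order */
        out[1] = hi;    /* high 64 bits */
    }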
4
3
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
6
---
8
tcg/mips/tcg-target.c.inc | 404 ++++++++++++++++----------------------
7
tcg/aarch64/tcg-target-con-set.h | 2 +
9
1 file changed, 172 insertions(+), 232 deletions(-)
8
tcg/aarch64/tcg-target.h | 11 ++-
9
tcg/aarch64/tcg-target.c.inc | 141 ++++++++++++++++++++++++++++++-
10
3 files changed, 151 insertions(+), 3 deletions(-)
10
11
11
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
12
diff --git a/tcg/aarch64/tcg-target-con-set.h b/tcg/aarch64/tcg-target-con-set.h
12
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/mips/tcg-target.c.inc
14
--- a/tcg/aarch64/tcg-target-con-set.h
14
+++ b/tcg/mips/tcg-target.c.inc
15
+++ b/tcg/aarch64/tcg-target-con-set.h
15
@@ -XXX,XX +XXX,XX @@ static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
16
@@ -XXX,XX +XXX,XX @@ C_O0_I1(r)
16
return i;
17
C_O0_I2(r, rA)
18
C_O0_I2(rZ, r)
19
C_O0_I2(w, r)
20
+C_O0_I3(rZ, rZ, r)
21
C_O1_I1(r, r)
22
C_O1_I1(w, r)
23
C_O1_I1(w, w)
24
@@ -XXX,XX +XXX,XX @@ C_O1_I2(w, w, wO)
25
C_O1_I2(w, w, wZ)
26
C_O1_I3(w, w, w, w)
27
C_O1_I4(r, r, rA, rZ, rZ)
28
+C_O2_I1(r, r, r)
29
C_O2_I4(r, r, rZ, rZ, rA, rMZ)
30
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/aarch64/tcg-target.h
33
+++ b/tcg/aarch64/tcg-target.h
34
@@ -XXX,XX +XXX,XX @@ typedef enum {
35
#define TCG_TARGET_HAS_muluh_i64 1
36
#define TCG_TARGET_HAS_mulsh_i64 1
37
38
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
39
+/*
40
+ * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
41
+ * which requires writable pages. We must defer to the helper for user-only,
42
+ * but in system mode all ram is writable for the host.
43
+ */
44
+#ifdef CONFIG_USER_ONLY
45
+#define TCG_TARGET_HAS_qemu_ldst_i128 have_lse2
46
+#else
47
+#define TCG_TARGET_HAS_qemu_ldst_i128 1
48
+#endif
49
50
#define TCG_TARGET_HAS_v64 1
51
#define TCG_TARGET_HAS_v128 1
52
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
53
index XXXXXXX..XXXXXXX 100644
54
--- a/tcg/aarch64/tcg-target.c.inc
55
+++ b/tcg/aarch64/tcg-target.c.inc
56
@@ -XXX,XX +XXX,XX @@ typedef enum {
57
I3305_LDR_v64 = 0x5c000000,
58
I3305_LDR_v128 = 0x9c000000,
59
60
+ /* Load/store exclusive. */
61
+ I3306_LDXP = 0xc8600000,
62
+ I3306_STXP = 0xc8200000,
63
+
64
/* Load/store register. Described here as 3.3.12, but the helper
65
that emits them can transform to 3.3.10 or 3.3.13. */
66
I3312_STRB = 0x38000000 | LDST_ST << 22 | MO_8 << 30,
67
@@ -XXX,XX +XXX,XX @@ typedef enum {
68
I3406_ADR = 0x10000000,
69
I3406_ADRP = 0x90000000,
70
71
+ /* Add/subtract extended register instructions. */
72
+ I3501_ADD = 0x0b200000,
73
+
74
/* Add/subtract shifted register instructions (without a shift). */
75
I3502_ADD = 0x0b000000,
76
I3502_ADDS = 0x2b000000,
77
@@ -XXX,XX +XXX,XX @@ static void tcg_out_insn_3305(TCGContext *s, AArch64Insn insn,
78
tcg_out32(s, insn | (imm19 & 0x7ffff) << 5 | rt);
17
}
79
}
18
80
19
-/* We expect to use a 16-bit negative offset from ENV. */
81
+static void tcg_out_insn_3306(TCGContext *s, AArch64Insn insn, TCGReg rs,
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
82
+ TCGReg rt, TCGReg rt2, TCGReg rn)
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
83
+{
22
-
84
+ tcg_out32(s, insn | rs << 16 | rt2 << 10 | rn << 5 | rt);
23
-/*
85
+}
24
- * Perform the tlb comparison operation.
86
+
25
- * The complete host address is placed in BASE.
87
static void tcg_out_insn_3201(TCGContext *s, AArch64Insn insn, TCGType ext,
26
- * Clobbers TMP0, TMP1, TMP2, TMP3.
88
TCGReg rt, int imm19)
27
- */
89
{
28
-static void tcg_out_tlb_load(TCGContext *s, TCGReg base, TCGReg addrl,
90
@@ -XXX,XX +XXX,XX @@ static void tcg_out_insn_3406(TCGContext *s, AArch64Insn insn,
29
- TCGReg addrh, MemOpIdx oi,
91
tcg_out32(s, insn | (disp & 3) << 29 | (disp & 0x1ffffc) << (5 - 2) | rd);
30
- tcg_insn_unit *label_ptr[2], bool is_load)
92
}
31
-{
93
32
- MemOp opc = get_memop(oi);
94
+static inline void tcg_out_insn_3501(TCGContext *s, AArch64Insn insn,
33
- unsigned a_bits = get_alignment_bits(opc);
95
+ TCGType sf, TCGReg rd, TCGReg rn,
96
+ TCGReg rm, int opt, int imm3)
97
+{
98
+ tcg_out32(s, insn | sf << 31 | rm << 16 | opt << 13 |
99
+ imm3 << 10 | rn << 5 | rd);
100
+}
101
+
102
/* This function is for both 3.5.2 (Add/Subtract shifted register), for
103
the rare occasion when we actually want to supply a shift amount. */
104
static inline void tcg_out_insn_3502S(TCGContext *s, AArch64Insn insn,
105
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
106
TCGType addr_type = s->addr_type;
107
TCGLabelQemuLdst *ldst = NULL;
108
MemOp opc = get_memop(oi);
109
+ MemOp s_bits = opc & MO_SIZE;
110
unsigned a_mask;
111
112
h->aa = atom_and_align_for_opc(s, opc,
113
have_lse2 ? MO_ATOM_WITHIN16
114
: MO_ATOM_IFALIGN,
115
- false);
116
+ s_bits == MO_128);
117
a_mask = (1 << h->aa.align) - 1;
118
119
#ifdef CONFIG_SOFTMMU
34
- unsigned s_bits = opc & MO_SIZE;
120
- unsigned s_bits = opc & MO_SIZE;
35
- unsigned a_mask = (1 << a_bits) - 1;
121
unsigned s_mask = (1u << s_bits) - 1;
36
- unsigned s_mask = (1 << s_bits) - 1;
122
unsigned mem_index = get_mmuidx(oi);
37
- int mem_index = get_mmuidx(oi);
123
TCGReg addr_adj;
38
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
124
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
39
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
125
}
40
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
41
- int add_off = offsetof(CPUTLBEntry, addend);
42
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
43
- : offsetof(CPUTLBEntry, addr_write));
44
- target_ulong tlb_mask;
45
-
46
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
47
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
48
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
49
-
50
- /* Extract the TLB index from the address into TMP3. */
51
- tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrl,
52
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
53
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
54
-
55
- /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
56
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
57
-
58
- /* Load the (low-half) tlb comparator. */
59
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
60
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
61
- } else {
62
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
63
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
64
- TCG_TMP0, TCG_TMP3, cmp_off);
65
- }
66
-
67
- /* Zero extend a 32-bit guest address for a 64-bit host. */
68
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
69
- tcg_out_ext32u(s, base, addrl);
70
- addrl = base;
71
- }
72
-
73
- /*
74
- * Mask the page bits, keeping the alignment bits to compare against.
75
- * For unaligned accesses, compare against the end of the access to
76
- * verify that it does not cross a page boundary.
77
- */
78
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
79
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
80
- if (a_mask >= s_mask) {
81
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrl);
82
- } else {
83
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrl, s_mask - a_mask);
84
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
85
- }
86
-
87
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
88
- /* Load the tlb addend for the fast path. */
89
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
90
- }
91
-
92
- label_ptr[0] = s->code_ptr;
93
- tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
94
-
95
- /* Load and test the high half tlb comparator. */
96
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
97
- /* delay slot */
98
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
99
-
100
- /* Load the tlb addend for the fast path. */
101
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
102
-
103
- label_ptr[1] = s->code_ptr;
104
- tcg_out_opc_br(s, OPC_BNE, addrh, TCG_TMP0);
105
- }
106
-
107
- /* delay slot */
108
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrl);
109
-}
110
-
111
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
112
- TCGType ext,
113
- TCGReg datalo, TCGReg datahi,
114
- TCGReg addrlo, TCGReg addrhi,
115
- void *raddr, tcg_insn_unit *label_ptr[2])
116
-{
117
- TCGLabelQemuLdst *label = new_ldst_label(s);
118
-
119
- label->is_ld = is_ld;
120
- label->oi = oi;
121
- label->type = ext;
122
- label->datalo_reg = datalo;
123
- label->datahi_reg = datahi;
124
- label->addrlo_reg = addrlo;
125
- label->addrhi_reg = addrhi;
126
- label->raddr = tcg_splitwx_to_rx(raddr);
127
- label->label_ptr[0] = label_ptr[0];
128
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
129
- label->label_ptr[1] = label_ptr[1];
130
- }
131
-}
132
-
133
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
134
{
135
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
136
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
137
}
126
}
138
127
139
#else
128
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
140
-
129
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
141
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
142
- TCGReg addrhi, unsigned a_bits)
143
-{
144
- unsigned a_mask = (1 << a_bits) - 1;
145
- TCGLabelQemuLdst *l = new_ldst_label(s);
146
-
147
- l->is_ld = is_ld;
148
- l->addrlo_reg = addrlo;
149
- l->addrhi_reg = addrhi;
150
-
151
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
152
- tcg_debug_assert(a_bits < 16);
153
- tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
154
-
155
- l->label_ptr[0] = s->code_ptr;
156
- if (use_mips32r6_instructions) {
157
- tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
158
- } else {
159
- tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
160
- tcg_out_nop(s);
161
- }
162
-
163
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
164
-}
165
-
166
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
167
{
168
void *target;
169
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
170
}
171
#endif /* SOFTMMU */
172
173
+typedef struct {
174
+ TCGReg base;
175
+ MemOp align;
176
+} HostAddress;
177
+
178
+/*
179
+ * For softmmu, perform the TLB load and compare.
180
+ * For useronly, perform any required alignment tests.
181
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
182
+ * is required and fill in @h with the host address for the fast path.
183
+ */
184
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
185
+ TCGReg addrlo, TCGReg addrhi,
186
+ MemOpIdx oi, bool is_ld)
187
+{
130
+{
188
+ TCGLabelQemuLdst *ldst = NULL;
189
+ MemOp opc = get_memop(oi);
190
+ unsigned a_bits = get_alignment_bits(opc);
191
+ unsigned s_bits = opc & MO_SIZE;
192
+ unsigned a_mask = (1 << a_bits) - 1;
193
+ TCGReg base;
194
+
195
+#ifdef CONFIG_SOFTMMU
196
+ unsigned s_mask = (1 << s_bits) - 1;
197
+ int mem_index = get_mmuidx(oi);
198
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
199
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
200
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
201
+ int add_off = offsetof(CPUTLBEntry, addend);
202
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
203
+ : offsetof(CPUTLBEntry, addr_write);
204
+ target_ulong tlb_mask;
205
+
206
+ ldst = new_ldst_label(s);
207
+ ldst->is_ld = is_ld;
208
+ ldst->oi = oi;
209
+ ldst->addrlo_reg = addrlo;
210
+ ldst->addrhi_reg = addrhi;
211
+ base = TCG_REG_A0;
212
+
213
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
214
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
215
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
216
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
217
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
218
+
219
+ /* Extract the TLB index from the address into TMP3. */
220
+ tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrlo,
221
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
222
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
223
+
224
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
225
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
226
+
227
+ /* Load the (low-half) tlb comparator. */
228
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
229
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
230
+ } else {
231
+ tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
232
+ : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
233
+ TCG_TMP0, TCG_TMP3, cmp_off);
234
+ }
235
+
236
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
237
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
238
+ tcg_out_ext32u(s, base, addrlo);
239
+ addrlo = base;
240
+ }
241
+
242
+ /*
243
+ * Mask the page bits, keeping the alignment bits to compare against.
244
+ * For unaligned accesses, compare against the end of the access to
245
+ * verify that it does not cross a page boundary.
246
+ */
247
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
248
+ tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
249
+ if (a_mask >= s_mask) {
250
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
251
+ } else {
252
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
253
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
254
+ }
255
+
256
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
257
+ /* Load the tlb addend for the fast path. */
258
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
259
+ }
260
+
261
+ ldst->label_ptr[0] = s->code_ptr;
262
+ tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
263
+
264
+ /* Load and test the high half tlb comparator. */
265
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
266
+ /* delay slot */
267
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
268
+
269
+ /* Load the tlb addend for the fast path. */
270
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
271
+
272
+ ldst->label_ptr[1] = s->code_ptr;
273
+ tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
274
+ }
275
+
276
+ /* delay slot */
277
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
278
+#else
279
+ if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
280
+ ldst = new_ldst_label(s);
281
+
282
+ ldst->is_ld = is_ld;
283
+ ldst->oi = oi;
284
+ ldst->addrlo_reg = addrlo;
285
+ ldst->addrhi_reg = addrhi;
286
+
287
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
288
+ tcg_debug_assert(a_bits < 16);
289
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
290
+
291
+ ldst->label_ptr[0] = s->code_ptr;
292
+ if (use_mips32r6_instructions) {
293
+ tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
294
+ } else {
295
+ tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
296
+ tcg_out_nop(s);
297
+ }
298
+ }
299
+
300
+ base = addrlo;
301
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
302
+ tcg_out_ext32u(s, TCG_REG_A0, base);
303
+ base = TCG_REG_A0;
304
+ }
305
+ if (guest_base) {
306
+ if (guest_base == (int16_t)guest_base) {
307
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
308
+ } else {
309
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
310
+ TCG_GUEST_BASE_REG);
311
+ }
312
+ base = TCG_REG_A0;
313
+ }
314
+#endif
315
+
316
+ h->base = base;
317
+ h->align = a_bits;
318
+ return ldst;
319
+}
320
+
321
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
322
TCGReg base, MemOp opc, TCGType type)
323
{
324
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
325
MemOpIdx oi, TCGType data_type)
326
{
327
MemOp opc = get_memop(oi);
328
- unsigned a_bits = get_alignment_bits(opc);
329
- unsigned s_bits = opc & MO_SIZE;
330
- TCGReg base;
331
+ TCGLabelQemuLdst *ldst;
131
+ TCGLabelQemuLdst *ldst;
332
+ HostAddress h;
132
+ HostAddress h;
333
133
+ TCGReg base;
334
- /*
134
+ bool use_pair;
335
- * R6 removes the left/right instructions but requires the
135
+
336
- * system to support misaligned memory accesses.
136
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, is_ld);
337
- */
137
+
338
-#if defined(CONFIG_SOFTMMU)
138
+ /* Compose the final address, as LDP/STP have no indexing. */
339
- tcg_insn_unit *label_ptr[2];
139
+ if (h.index == TCG_REG_XZR) {
340
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
140
+ base = h.base;
341
141
+ } else {
342
- base = TCG_REG_A0;
142
+ base = TCG_REG_TMP2;
343
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 1);
143
+ if (h.index_ext == TCG_TYPE_I32) {
344
- if (use_mips32r6_instructions || a_bits >= s_bits) {
144
+ /* add base, base, index, uxtw */
345
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
145
+ tcg_out_insn(s, 3501, ADD, TCG_TYPE_I64, base,
346
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
146
+ h.base, h.index, MO_32, 0);
347
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h.base, opc, data_type);
147
+ } else {
348
} else {
148
+ /* add base, base, index */
349
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
149
+ tcg_out_insn(s, 3502, ADD, 1, base, h.base, h.index);
350
+ tcg_out_qemu_ld_unalign(s, datalo, datahi, h.base, opc, data_type);
150
+ }
351
}
151
+ }
352
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
152
+
353
- addrlo, addrhi, s->code_ptr, label_ptr);
153
+ use_pair = h.aa.atom < MO_128 || have_lse2;
354
-#else
154
+
355
- base = addrlo;
155
+ if (!use_pair) {
356
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
156
+ tcg_insn_unit *branch = NULL;
357
- tcg_out_ext32u(s, TCG_REG_A0, base);
157
+ TCGReg ll, lh, sl, sh;
358
- base = TCG_REG_A0;
158
+
159
+ /*
160
+ * If we have already checked for 16-byte alignment, that's all
161
+ * we need. Otherwise we have determined that misaligned atomicity
162
+ * may be handled with two 8-byte loads.
163
+ */
164
+ if (h.aa.align < MO_128) {
165
+ /*
166
+ * TODO: align should be MO_64, so we only need test bit 3,
167
+ * which means we could use TBNZ instead of ANDS+B_C.
168
+ */
169
+ tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, 15);
170
+ branch = s->code_ptr;
171
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
172
+ use_pair = true;
173
+ }
174
+
175
+ if (is_ld) {
176
+ /*
177
+ * 16-byte atomicity without LSE2 requires LDXP+STXP loop:
178
+ * ldxp lo, hi, [base]
179
+ * stxp t0, lo, hi, [base]
180
+ * cbnz t0, .-8
181
+ * Require no overlap between data{lo,hi} and base.
182
+ */
183
+ if (datalo == base || datahi == base) {
184
+ tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_TMP2, base);
185
+ base = TCG_REG_TMP2;
186
+ }
187
+ ll = sl = datalo;
188
+ lh = sh = datahi;
189
+ } else {
190
+ /*
191
+ * 16-byte atomicity without LSE2 requires LDXP+STXP loop:
192
+ * 1: ldxp t0, t1, [base]
193
+ * stxp t0, lo, hi, [base]
194
+ * cbnz t0, 1b
195
+ */
196
+ tcg_debug_assert(base != TCG_REG_TMP0 && base != TCG_REG_TMP1);
197
+ ll = TCG_REG_TMP0;
198
+ lh = TCG_REG_TMP1;
199
+ sl = datalo;
200
+ sh = datahi;
201
+ }
202
+
203
+ tcg_out_insn(s, 3306, LDXP, TCG_REG_XZR, ll, lh, base);
204
+ tcg_out_insn(s, 3306, STXP, TCG_REG_TMP0, sl, sh, base);
205
+ tcg_out_insn(s, 3201, CBNZ, 0, TCG_REG_TMP0, -2);
206
+
207
+ if (use_pair) {
208
+ /* "b .+8", branching across the one insn of use_pair. */
209
+ tcg_out_insn(s, 3206, B, 2);
210
+ reloc_pc19(branch, tcg_splitwx_to_rx(s->code_ptr));
211
+ }
212
+ }
213
+
214
+ if (use_pair) {
215
+ if (is_ld) {
216
+ tcg_out_insn(s, 3314, LDP, datalo, datahi, base, 0, 1, 0);
217
+ } else {
218
+ tcg_out_insn(s, 3314, STP, datalo, datahi, base, 0, 1, 0);
219
+ }
220
+ }
359
+
221
+
360
+ if (ldst) {
222
+ if (ldst) {
361
+ ldst->type = data_type;
223
+ ldst->type = TCG_TYPE_I128;
362
+ ldst->datalo_reg = datalo;
224
+ ldst->datalo_reg = datalo;
363
+ ldst->datahi_reg = datahi;
225
+ ldst->datahi_reg = datahi;
364
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
226
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
365
}
227
+ }
366
- if (guest_base) {
228
+}
367
- if (guest_base == (int16_t)guest_base) {
229
+
368
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
230
static const tcg_insn_unit *tb_ret_addr;
369
- } else {
231
370
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
232
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
371
- TCG_GUEST_BASE_REG);
233
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
372
- }
234
case INDEX_op_qemu_st_a64_i64:
373
- base = TCG_REG_A0;
235
tcg_out_qemu_st(s, REG0(0), a1, a2, ext);
374
- }
236
break;
375
- if (use_mips32r6_instructions) {
237
+ case INDEX_op_qemu_ld_a32_i128:
376
- if (a_bits) {
238
+ case INDEX_op_qemu_ld_a64_i128:
377
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
239
+ tcg_out_qemu_ldst_i128(s, a0, a1, a2, args[3], true);
378
- }
240
+ break;
379
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
241
+ case INDEX_op_qemu_st_a32_i128:
380
- } else {
242
+ case INDEX_op_qemu_st_a64_i128:
381
- if (a_bits && a_bits != s_bits) {
243
+ tcg_out_qemu_ldst_i128(s, REG0(0), REG0(1), a2, args[3], false);
382
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
244
+ break;
383
- }
245
384
- if (a_bits >= s_bits) {
246
case INDEX_op_bswap64_i64:
385
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
247
tcg_out_rev(s, TCG_TYPE_I64, MO_64, a0, a1);
386
- } else {
248
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
387
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
249
case INDEX_op_qemu_ld_a32_i64:
388
- }
250
case INDEX_op_qemu_ld_a64_i64:
389
- }
251
return C_O1_I1(r, r);
390
-#endif
252
+ case INDEX_op_qemu_ld_a32_i128:
391
}
253
+ case INDEX_op_qemu_ld_a64_i128:
392
254
+ return C_O2_I1(r, r, r);
393
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
255
case INDEX_op_qemu_st_a32_i32:
394
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
256
case INDEX_op_qemu_st_a64_i32:
395
MemOpIdx oi, TCGType data_type)
257
case INDEX_op_qemu_st_a32_i64:
396
{
258
case INDEX_op_qemu_st_a64_i64:
397
MemOp opc = get_memop(oi);
259
return C_O0_I2(rZ, r);
398
- unsigned a_bits = get_alignment_bits(opc);
260
+ case INDEX_op_qemu_st_a32_i128:
399
- unsigned s_bits = opc & MO_SIZE;
261
+ case INDEX_op_qemu_st_a64_i128:
400
- TCGReg base;
262
+ return C_O0_I3(rZ, rZ, r);
401
+ TCGLabelQemuLdst *ldst;
263
402
+ HostAddress h;
264
case INDEX_op_deposit_i32:
403
265
case INDEX_op_deposit_i64:
404
- /*
405
- * R6 removes the left/right instructions but requires the
406
- * system to support misaligned memory accesses.
407
- */
408
-#if defined(CONFIG_SOFTMMU)
409
- tcg_insn_unit *label_ptr[2];
410
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
411
412
- base = TCG_REG_A0;
413
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 0);
414
- if (use_mips32r6_instructions || a_bits >= s_bits) {
415
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
416
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
417
+ tcg_out_qemu_st_direct(s, datalo, datahi, h.base, opc);
418
} else {
419
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
420
+ tcg_out_qemu_st_unalign(s, datalo, datahi, h.base, opc);
421
}
422
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
423
- addrlo, addrhi, s->code_ptr, label_ptr);
424
-#else
425
- base = addrlo;
426
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
427
- tcg_out_ext32u(s, TCG_REG_A0, base);
428
- base = TCG_REG_A0;
429
+
430
+ if (ldst) {
431
+ ldst->type = data_type;
432
+ ldst->datalo_reg = datalo;
433
+ ldst->datahi_reg = datahi;
434
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
435
}
436
- if (guest_base) {
437
- if (guest_base == (int16_t)guest_base) {
438
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
439
- } else {
440
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
441
- TCG_GUEST_BASE_REG);
442
- }
443
- base = TCG_REG_A0;
444
- }
445
- if (use_mips32r6_instructions) {
446
- if (a_bits) {
447
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
448
- }
449
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
450
- } else {
451
- if (a_bits && a_bits != s_bits) {
452
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
453
- }
454
- if (a_bits >= s_bits) {
455
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
456
- } else {
457
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
458
- }
459
- }
460
-#endif
461
}
462
463
static void tcg_out_mb(TCGContext *s, TCGArg a0)
464
--
2.34.1
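
The ldxp/stxp loop emitted in the aarch64 hunks above behaves like a
16-byte compare-and-swap that writes back the value it just read, which
is also why the data registers must not overlap the base register and
why the page must be writable. A rough user-space sketch of the same
idea, using the GCC/Clang __atomic builtins on unsigned __int128
(illustration only, not backend code; it assumes the host provides
16-byte atomics, e.g. via -mcx16 or libatomic):

    typedef unsigned __int128 u128;

    /* Emulate an atomic 16-byte load with a CAS that stores back the
     * value it observed, mirroring ldxp/stxp; requires write access. */
    static inline u128 atomic16_read_rw(u128 *p)
    {
        u128 expected = 0;

        /* On mismatch the CAS copies the current contents of *p into
         * 'expected'; on match it stores 0 over an existing 0, which is
         * a no-op. Either way 'expected' ends up holding *p. */
        __atomic_compare_exchange_n(p, &expected, expected, false,
                                    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
        return expected;
    }
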
The softmmu tlb uses TCG_REG_{TMP1,TMP2,R0}, not any of the normally
available registers. Now that we handle overlap between inputs and
helper arguments, we can allow any allocatable reg.

Use LQ/STQ with ISA v2.07, and 16-byte atomicity is required.
Note that these instructions do not require 16-byte alignment.
4
3
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
4
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
6
---
9
tcg/ppc/tcg-target-con-set.h | 11 ++++-------
7
tcg/ppc/tcg-target-con-set.h | 2 +
10
tcg/ppc/tcg-target-con-str.h | 2 --
8
tcg/ppc/tcg-target-con-str.h | 1 +
11
tcg/ppc/tcg-target.c.inc | 32 ++++++++++----------------------
9
tcg/ppc/tcg-target.h | 3 +-
12
3 files changed, 14 insertions(+), 31 deletions(-)
10
tcg/ppc/tcg-target.c.inc | 108 +++++++++++++++++++++++++++++++----
11
4 files changed, 101 insertions(+), 13 deletions(-)
13
12
14
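
As a hedged illustration of the choice the ppc backend makes below,
where a single 16-byte single-copy-atomic access (LQ/STQ, available
from ISA v2.07) is used when 16-byte atomicity is demanded, and a pair
of 8-byte accesses (LD/STD, or LDBRX/STDBRX when byte-swapping) is used
otherwise, here is the same decision written as plain C. The names u128
and load16 are invented for this sketch, and the 16-byte __atomic_load_n
may need libatomic on some hosts:

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    typedef unsigned __int128 u128;

    static u128 load16(void *p, bool need_16byte_atomicity)
    {
        u128 val;

        if (need_16byte_atomicity) {
            /* One single-copy-atomic 16-byte load (what LQ provides). */
            val = __atomic_load_n((u128 *)p, __ATOMIC_RELAXED);
        } else {
            /* Two 8-byte loads, recombined in memory order so the byte
             * layout matches the 16-byte load on either endianness. */
            const uint64_t *q = p;
            uint64_t half[2] = { q[0], q[1] };
            memcpy(&val, half, sizeof(val));
        }
        return val;
    }
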
diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
13
diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
15
index XXXXXXX..XXXXXXX 100644
14
index XXXXXXX..XXXXXXX 100644
16
--- a/tcg/ppc/tcg-target-con-set.h
15
--- a/tcg/ppc/tcg-target-con-set.h
17
+++ b/tcg/ppc/tcg-target-con-set.h
16
+++ b/tcg/ppc/tcg-target-con-set.h
18
@@ -XXX,XX +XXX,XX @@
17
@@ -XXX,XX +XXX,XX @@ C_O0_I2(r, r)
19
C_O0_I1(r)
20
C_O0_I2(r, r)
21
C_O0_I2(r, ri)
18
C_O0_I2(r, ri)
22
-C_O0_I2(S, S)
23
C_O0_I2(v, r)
19
C_O0_I2(v, r)
24
-C_O0_I3(S, S, S)
20
C_O0_I3(r, r, r)
25
+C_O0_I3(r, r, r)
21
+C_O0_I3(o, m, r)
26
C_O0_I4(r, r, ri, ri)
22
C_O0_I4(r, r, ri, ri)
27
-C_O0_I4(S, S, S, S)
23
C_O0_I4(r, r, r, r)
28
-C_O1_I1(r, L)
29
+C_O0_I4(r, r, r, r)
30
C_O1_I1(r, r)
24
C_O1_I1(r, r)
31
C_O1_I1(v, r)
25
@@ -XXX,XX +XXX,XX @@ C_O1_I3(v, v, v, v)
32
C_O1_I1(v, v)
33
C_O1_I1(v, vr)
34
C_O1_I2(r, 0, rZ)
35
-C_O1_I2(r, L, L)
36
C_O1_I2(r, rI, ri)
37
C_O1_I2(r, rI, rT)
38
C_O1_I2(r, r, r)
39
@@ -XXX,XX +XXX,XX @@ C_O1_I2(v, v, v)
40
C_O1_I3(v, v, v, v)
41
C_O1_I4(r, r, ri, rZ, rZ)
26
C_O1_I4(r, r, ri, rZ, rZ)
42
C_O1_I4(r, r, r, ri, ri)
27
C_O1_I4(r, r, r, ri, ri)
43
-C_O2_I1(L, L, L)
28
C_O2_I1(r, r, r)
44
-C_O2_I2(L, L, L, L)
29
+C_O2_I1(o, m, r)
45
+C_O2_I1(r, r, r)
30
C_O2_I2(r, r, r, r)
46
+C_O2_I2(r, r, r, r)
47
C_O2_I4(r, r, rI, rZM, r, r)
31
C_O2_I4(r, r, rI, rZM, r, r)
48
C_O2_I4(r, r, r, r, rI, rZM)
32
C_O2_I4(r, r, r, r, rI, rZM)
49
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
33
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
50
index XXXXXXX..XXXXXXX 100644
34
index XXXXXXX..XXXXXXX 100644
51
--- a/tcg/ppc/tcg-target-con-str.h
35
--- a/tcg/ppc/tcg-target-con-str.h
52
+++ b/tcg/ppc/tcg-target-con-str.h
36
+++ b/tcg/ppc/tcg-target-con-str.h
53
@@ -XXX,XX +XXX,XX @@ REGS('A', 1u << TCG_REG_R3)
37
@@ -XXX,XX +XXX,XX @@
54
REGS('B', 1u << TCG_REG_R4)
38
* REGS(letter, register_mask)
55
REGS('C', 1u << TCG_REG_R5)
39
*/
56
REGS('D', 1u << TCG_REG_R6)
40
REGS('r', ALL_GENERAL_REGS)
57
-REGS('L', ALL_QLOAD_REGS)
41
+REGS('o', ALL_GENERAL_REGS & 0xAAAAAAAAu) /* odd registers */
58
-REGS('S', ALL_QSTORE_REGS)
42
REGS('v', ALL_VECTOR_REGS)
59
43
60
/*
44
/*
61
* Define constraint letters for constants:
45
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
46
index XXXXXXX..XXXXXXX 100644
47
--- a/tcg/ppc/tcg-target.h
48
+++ b/tcg/ppc/tcg-target.h
49
@@ -XXX,XX +XXX,XX @@ extern bool have_vsx;
50
#define TCG_TARGET_HAS_mulsh_i64 1
51
#endif
52
53
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
54
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
55
+ (TCG_TARGET_REG_BITS == 64 && have_isa_2_07)
56
57
/*
58
* While technically Altivec could support V64, it has no 64-bit store
62
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
59
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
63
index XXXXXXX..XXXXXXX 100644
60
index XXXXXXX..XXXXXXX 100644
64
--- a/tcg/ppc/tcg-target.c.inc
61
--- a/tcg/ppc/tcg-target.c.inc
65
+++ b/tcg/ppc/tcg-target.c.inc
62
+++ b/tcg/ppc/tcg-target.c.inc
66
@@ -XXX,XX +XXX,XX @@
63
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
67
#define ALL_GENERAL_REGS 0xffffffffu
64
68
#define ALL_VECTOR_REGS 0xffffffff00000000ull
65
#define B OPCD( 18)
69
66
#define BC OPCD( 16)
70
-#ifdef CONFIG_SOFTMMU
67
+
71
-#define ALL_QLOAD_REGS \
68
#define LBZ OPCD( 34)
72
- (ALL_GENERAL_REGS & \
69
#define LHZ OPCD( 40)
73
- ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | (1 << TCG_REG_R5)))
70
#define LHA OPCD( 42)
74
-#define ALL_QSTORE_REGS \
71
#define LWZ OPCD( 32)
75
- (ALL_GENERAL_REGS & ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | \
72
#define LWZUX XO31( 55)
76
- (1 << TCG_REG_R5) | (1 << TCG_REG_R6)))
73
-#define STB OPCD( 38)
77
-#else
74
-#define STH OPCD( 44)
78
-#define ALL_QLOAD_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_R3))
75
-#define STW OPCD( 36)
79
-#define ALL_QSTORE_REGS ALL_QLOAD_REGS
80
-#endif
81
-
76
-
82
TCGPowerISA have_isa;
77
-#define STD XO62( 0)
83
static bool have_isel;
78
-#define STDU XO62( 1)
84
bool have_altivec;
79
-#define STDX XO31(149)
80
-
81
#define LD XO58( 0)
82
#define LDX XO31( 21)
83
#define LDU XO58( 1)
84
#define LDUX XO31( 53)
85
#define LWA XO58( 2)
86
#define LWAX XO31(341)
87
+#define LQ OPCD( 56)
88
+
89
+#define STB OPCD( 38)
90
+#define STH OPCD( 44)
91
+#define STW OPCD( 36)
92
+#define STD XO62( 0)
93
+#define STDU XO62( 1)
94
+#define STDX XO31(149)
95
+#define STQ XO62( 2)
96
97
#define ADDIC OPCD( 12)
98
#define ADDI OPCD( 14)
99
@@ -XXX,XX +XXX,XX @@ typedef struct {
100
101
bool tcg_target_has_memory_bswap(MemOp memop)
102
{
103
- return true;
104
+ TCGAtomAlign aa;
105
+
106
+ if ((memop & MO_SIZE) <= MO_64) {
107
+ return true;
108
+ }
109
+
110
+ /*
111
+ * Reject 16-byte memop with 16-byte atomicity,
112
+ * but do allow a pair of 64-bit operations.
113
+ */
114
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
115
+ return aa.atom <= MO_64;
116
}
117
118
/*
119
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
120
{
121
TCGLabelQemuLdst *ldst = NULL;
122
MemOp opc = get_memop(oi);
123
- MemOp a_bits;
124
+ MemOp a_bits, s_bits;
125
126
/*
127
* Book II, Section 1.4, Single-Copy Atomicity, specifies:
128
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
129
* As of 3.0, "the non-atomic access is performed as described in
130
* the corresponding list", which matches MO_ATOM_SUBALIGN.
131
*/
132
+ s_bits = opc & MO_SIZE;
133
h->aa = atom_and_align_for_opc(s, opc,
134
have_isa_3_00 ? MO_ATOM_SUBALIGN
135
: MO_ATOM_IFALIGN,
136
- false);
137
+ s_bits == MO_128);
138
a_bits = h->aa.align;
139
140
#ifdef CONFIG_SOFTMMU
141
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
142
int fast_off = TLB_MASK_TABLE_OFS(mem_index);
143
int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
144
int table_off = fast_off + offsetof(CPUTLBDescFast, table);
145
- unsigned s_bits = opc & MO_SIZE;
146
147
ldst = new_ldst_label(s);
148
ldst->is_ld = is_ld;
149
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
150
}
151
}
152
153
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
154
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
155
+{
156
+ TCGLabelQemuLdst *ldst;
157
+ HostAddress h;
158
+ bool need_bswap;
159
+ uint32_t insn;
160
+ TCGReg index;
161
+
162
+ ldst = prepare_host_addr(s, &h, addr_reg, -1, oi, is_ld);
163
+
164
+ /* Compose the final address, as LQ/STQ have no indexing. */
165
+ index = h.index;
166
+ if (h.base != 0) {
167
+ index = TCG_REG_TMP1;
168
+ tcg_out32(s, ADD | TAB(index, h.base, h.index));
169
+ }
170
+ need_bswap = get_memop(oi) & MO_BSWAP;
171
+
172
+ if (h.aa.atom == MO_128) {
173
+ tcg_debug_assert(!need_bswap);
174
+ tcg_debug_assert(datalo & 1);
175
+ tcg_debug_assert(datahi == datalo - 1);
176
+ insn = is_ld ? LQ : STQ;
177
+ tcg_out32(s, insn | TAI(datahi, index, 0));
178
+ } else {
179
+ TCGReg d1, d2;
180
+
181
+ if (HOST_BIG_ENDIAN ^ need_bswap) {
182
+ d1 = datahi, d2 = datalo;
183
+ } else {
184
+ d1 = datalo, d2 = datahi;
185
+ }
186
+
187
+ if (need_bswap) {
188
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R0, 8);
189
+ insn = is_ld ? LDBRX : STDBRX;
190
+ tcg_out32(s, insn | TAB(d1, 0, index));
191
+ tcg_out32(s, insn | TAB(d2, index, TCG_REG_R0));
192
+ } else {
193
+ insn = is_ld ? LD : STD;
194
+ tcg_out32(s, insn | TAI(d1, index, 0));
195
+ tcg_out32(s, insn | TAI(d2, index, 8));
196
+ }
197
+ }
198
+
199
+ if (ldst) {
200
+ ldst->type = TCG_TYPE_I128;
201
+ ldst->datalo_reg = datalo;
202
+ ldst->datahi_reg = datahi;
203
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
204
+ }
205
+}
206
+
207
static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
208
{
209
int i;
210
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
211
args[4], TCG_TYPE_I64);
212
}
213
break;
214
+ case INDEX_op_qemu_ld_a32_i128:
215
+ case INDEX_op_qemu_ld_a64_i128:
216
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
217
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], true);
218
+ break;
219
220
case INDEX_op_qemu_st_a64_i32:
221
if (TCG_TARGET_REG_BITS == 32) {
222
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
223
args[4], TCG_TYPE_I64);
224
}
225
break;
226
+ case INDEX_op_qemu_st_a32_i128:
227
+ case INDEX_op_qemu_st_a64_i128:
228
+ tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
229
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], false);
230
+ break;
231
232
case INDEX_op_setcond_i32:
233
tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
85
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
234
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
86
235
case INDEX_op_qemu_st_a64_i64:
87
case INDEX_op_qemu_ld_i32:
236
return TCG_TARGET_REG_BITS == 64 ? C_O0_I2(r, r) : C_O0_I4(r, r, r, r);
88
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
237
89
- ? C_O1_I1(r, L)
238
+ case INDEX_op_qemu_ld_a32_i128:
90
- : C_O1_I2(r, L, L));
239
+ case INDEX_op_qemu_ld_a64_i128:
91
+ ? C_O1_I1(r, r)
240
+ return C_O2_I1(o, m, r);
92
+ : C_O1_I2(r, r, r));
241
+ case INDEX_op_qemu_st_a32_i128:
93
242
+ case INDEX_op_qemu_st_a64_i128:
94
case INDEX_op_qemu_st_i32:
243
+ return C_O0_I3(o, m, r);
95
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
244
+
96
- ? C_O0_I2(S, S)
97
- : C_O0_I3(S, S, S));
98
+ ? C_O0_I2(r, r)
99
+ : C_O0_I3(r, r, r));
100
101
case INDEX_op_qemu_ld_i64:
102
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
103
- : TARGET_LONG_BITS == 32 ? C_O2_I1(L, L, L)
104
- : C_O2_I2(L, L, L, L));
105
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
106
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
107
+ : C_O2_I2(r, r, r, r));
108
109
case INDEX_op_qemu_st_i64:
110
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(S, S)
111
- : TARGET_LONG_BITS == 32 ? C_O0_I3(S, S, S)
112
- : C_O0_I4(S, S, S, S));
113
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(r, r)
114
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(r, r, r)
115
+ : C_O0_I4(r, r, r, r));
116
117
case INDEX_op_add_vec:
245
case INDEX_op_add_vec:
118
case INDEX_op_sub_vec:
246
case INDEX_op_sub_vec:
247
case INDEX_op_mul_vec:
119
--
2.34.1

Adjust the softmmu tlb to use R0+R1, not any of the normally available
registers. Since we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.

Use LPQ/STPQ when 16-byte atomicity is required.
Note that these instructions require 16-byte alignment.
4
3
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
6
---
8
tcg/s390x/tcg-target-con-set.h | 2 --
7
tcg/s390x/tcg-target-con-set.h | 2 +
9
tcg/s390x/tcg-target-con-str.h | 1 -
8
tcg/s390x/tcg-target.h | 2 +-
10
tcg/s390x/tcg-target.c.inc | 36 ++++++++++++----------------------
9
tcg/s390x/tcg-target.c.inc | 107 ++++++++++++++++++++++++++++++++-
11
3 files changed, 12 insertions(+), 27 deletions(-)
10
3 files changed, 107 insertions(+), 4 deletions(-)
12
11
13
diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
12
diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/s390x/tcg-target-con-set.h
14
--- a/tcg/s390x/tcg-target-con-set.h
16
+++ b/tcg/s390x/tcg-target-con-set.h
15
+++ b/tcg/s390x/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@ C_O0_I2(r, r)
18
* tcg-target-con-str.h; the constraint combination is inclusive or.
19
*/
20
C_O0_I1(r)
21
-C_O0_I2(L, L)
22
C_O0_I2(r, r)
23
C_O0_I2(r, ri)
17
C_O0_I2(r, ri)
24
C_O0_I2(r, rA)
18
C_O0_I2(r, rA)
25
C_O0_I2(v, r)
19
C_O0_I2(v, r)
26
-C_O1_I1(r, L)
20
+C_O0_I3(o, m, r)
27
C_O1_I1(r, r)
21
C_O1_I1(r, r)
28
C_O1_I1(v, r)
22
C_O1_I1(v, r)
29
C_O1_I1(v, v)
23
C_O1_I1(v, v)
30
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
24
@@ -XXX,XX +XXX,XX @@ C_O1_I2(v, v, v)
25
C_O1_I3(v, v, v, v)
26
C_O1_I4(r, r, ri, rI, r)
27
C_O1_I4(r, r, rA, rI, r)
28
+C_O2_I1(o, m, r)
29
C_O2_I2(o, m, 0, r)
30
C_O2_I2(o, m, r, r)
31
C_O2_I3(o, m, 0, 1, r)
32
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
31
index XXXXXXX..XXXXXXX 100644
33
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/s390x/tcg-target-con-str.h
34
--- a/tcg/s390x/tcg-target.h
33
+++ b/tcg/s390x/tcg-target-con-str.h
35
+++ b/tcg/s390x/tcg-target.h
34
@@ -XXX,XX +XXX,XX @@
36
@@ -XXX,XX +XXX,XX @@ extern uint64_t s390_facilities[3];
35
* REGS(letter, register_mask)
37
#define TCG_TARGET_HAS_muluh_i64 0
36
*/
38
#define TCG_TARGET_HAS_mulsh_i64 0
37
REGS('r', ALL_GENERAL_REGS)
39
38
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
40
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
39
REGS('v', ALL_VECTOR_REGS)
41
+#define TCG_TARGET_HAS_qemu_ldst_i128 1
40
REGS('o', 0xaaaa) /* odd numbered general regs */
42
41
43
#define TCG_TARGET_HAS_v64 HAVE_FACILITY(VECTOR)
44
#define TCG_TARGET_HAS_v128 HAVE_FACILITY(VECTOR)
42
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
45
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
43
index XXXXXXX..XXXXXXX 100644
46
index XXXXXXX..XXXXXXX 100644
44
--- a/tcg/s390x/tcg-target.c.inc
47
--- a/tcg/s390x/tcg-target.c.inc
45
+++ b/tcg/s390x/tcg-target.c.inc
48
+++ b/tcg/s390x/tcg-target.c.inc
46
@@ -XXX,XX +XXX,XX @@
49
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
47
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16)
50
RXY_LLGF = 0xe316,
48
#define ALL_VECTOR_REGS MAKE_64BIT_MASK(32, 32)
51
RXY_LLGH = 0xe391,
49
52
RXY_LMG = 0xeb04,
50
-/*
53
+ RXY_LPQ = 0xe38f,
51
- * For softmmu, we need to avoid conflicts with the first 3
54
RXY_LRV = 0xe31e,
52
- * argument registers to perform the tlb lookup, and to call
55
RXY_LRVG = 0xe30f,
53
- * the helper function.
56
RXY_LRVH = 0xe31f,
54
- */
57
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
55
-#ifdef CONFIG_SOFTMMU
58
RXY_STG = 0xe324,
56
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_R2, 3)
59
RXY_STHY = 0xe370,
57
-#else
60
RXY_STMG = 0xeb24,
58
-#define SOFTMMU_RESERVE_REGS 0
61
+ RXY_STPQ = 0xe38e,
59
-#endif
62
RXY_STRV = 0xe33e,
60
-
63
RXY_STRVG = 0xe32f,
61
-
64
RXY_STRVH = 0xe33f,
62
/* Several places within the instruction set 0 means "no register"
65
@@ -XXX,XX +XXX,XX @@ typedef struct {
63
rather than TCG_REG_R0. */
66
64
#define TCG_REG_NONE 0
67
bool tcg_target_has_memory_bswap(MemOp memop)
68
{
69
- return true;
70
+ TCGAtomAlign aa;
71
+
72
+ if ((memop & MO_SIZE) <= MO_64) {
73
+ return true;
74
+ }
75
+
76
+ /*
77
+ * Reject 16-byte memop with 16-byte atomicity,
78
+ * but do allow a pair of 64-bit operations.
79
+ */
80
+ aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
81
+ return aa.atom <= MO_64;
82
}
83
84
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
65
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
85
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
66
ldst->oi = oi;
86
{
67
ldst->addrlo_reg = addr_reg;
87
TCGLabelQemuLdst *ldst = NULL;
68
88
MemOp opc = get_memop(oi);
69
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
89
+ MemOp s_bits = opc & MO_SIZE;
70
+ tcg_out_sh64(s, RSY_SRLG, TCG_TMP0, addr_reg, TCG_REG_NONE,
90
unsigned a_mask;
71
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
91
72
92
- h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
73
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
93
+ h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
74
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
94
a_mask = (1 << h->aa.align) - 1;
75
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
95
76
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
96
#ifdef CONFIG_SOFTMMU
77
+ tcg_out_insn(s, RXY, NG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, mask_off);
97
- unsigned s_bits = opc & MO_SIZE;
78
+ tcg_out_insn(s, RXY, AG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, table_off);
98
unsigned s_mask = (1 << s_bits) - 1;
79
99
int mem_index = get_mmuidx(oi);
80
/*
100
int fast_off = TLB_MASK_TABLE_OFS(mem_index);
81
* For aligned accesses, we check the first byte and include the alignment
101
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
82
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
83
a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
84
tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
85
if (a_off == 0) {
86
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
87
+ tgen_andi_risbg(s, TCG_REG_R0, addr_reg, tlb_mask);
88
} else {
89
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
90
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
91
+ tcg_out_insn(s, RX, LA, TCG_REG_R0, addr_reg, TCG_REG_NONE, a_off);
92
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R0, tlb_mask);
93
}
102
}
94
103
}
95
if (is_ld) {
104
96
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
105
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
97
ofs = offsetof(CPUTLBEntry, addr_write);
106
+ TCGReg addr_reg, MemOpIdx oi, bool is_ld)
98
}
107
+{
99
if (TARGET_LONG_BITS == 32) {
108
+ TCGLabel *l1 = NULL, *l2 = NULL;
100
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
109
+ TCGLabelQemuLdst *ldst;
101
+ tcg_out_insn(s, RX, C, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
110
+ HostAddress h;
102
} else {
111
+ bool need_bswap;
103
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
112
+ bool use_pair;
104
+ tcg_out_insn(s, RXY, CG, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
113
+ S390Opcode insn;
105
}
114
+
106
115
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, is_ld);
107
tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
116
+
108
ldst->label_ptr[0] = s->code_ptr++;
117
+ use_pair = h.aa.atom < MO_128;
109
118
+ need_bswap = get_memop(oi) & MO_BSWAP;
110
- h->index = TCG_REG_R2;
119
+
111
- tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
120
+ if (!use_pair) {
112
+ h->index = TCG_TMP0;
121
+ /*
113
+ tcg_out_insn(s, RXY, LG, h->index, TCG_TMP0, TCG_REG_NONE,
122
+ * Atomicity requires we use LPQ. If we've already checked for
114
offsetof(CPUTLBEntry, addend));
123
+ * 16-byte alignment, that's all we need. If we arrive with
115
124
+ * lesser alignment, we have determined that less than 16-byte
116
if (TARGET_LONG_BITS == 32) {
125
+ * alignment can be satisfied with two 8-byte loads.
126
+ */
127
+ if (h.aa.align < MO_128) {
128
+ use_pair = true;
129
+ l1 = gen_new_label();
130
+ l2 = gen_new_label();
131
+
132
+ tcg_out_insn(s, RI, TMLL, addr_reg, 15);
133
+ tgen_branch(s, 7, l1); /* CC in {1,2,3} */
134
+ }
135
+
136
+ tcg_debug_assert(!need_bswap);
137
+ tcg_debug_assert(datalo & 1);
138
+ tcg_debug_assert(datahi == datalo - 1);
139
+ insn = is_ld ? RXY_LPQ : RXY_STPQ;
140
+ tcg_out_insn_RXY(s, insn, datahi, h.base, h.index, h.disp);
141
+
142
+ if (use_pair) {
143
+ tgen_branch(s, S390_CC_ALWAYS, l2);
144
+ tcg_out_label(s, l1);
145
+ }
146
+ }
147
+ if (use_pair) {
148
+ TCGReg d1, d2;
149
+
150
+ if (need_bswap) {
151
+ d1 = datalo, d2 = datahi;
152
+ insn = is_ld ? RXY_LRVG : RXY_STRVG;
153
+ } else {
154
+ d1 = datahi, d2 = datalo;
155
+ insn = is_ld ? RXY_LG : RXY_STG;
156
+ }
157
+
158
+ if (h.base == d1 || h.index == d1) {
159
+ tcg_out_insn(s, RXY, LAY, TCG_TMP0, h.base, h.index, h.disp);
160
+ h.base = TCG_TMP0;
161
+ h.index = TCG_REG_NONE;
162
+ h.disp = 0;
163
+ }
164
+ tcg_out_insn_RXY(s, insn, d1, h.base, h.index, h.disp);
165
+ tcg_out_insn_RXY(s, insn, d2, h.base, h.index, h.disp + 8);
166
+ }
167
+ if (l2) {
168
+ tcg_out_label(s, l2);
169
+ }
170
+
171
+ if (ldst) {
172
+ ldst->type = TCG_TYPE_I128;
173
+ ldst->datalo_reg = datalo;
174
+ ldst->datahi_reg = datahi;
175
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
176
+ }
177
+}
178
+
179
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
180
{
181
/* Reuse the zeroing that exists for goto_ptr. */
182
@@ -XXX,XX +XXX,XX @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
183
case INDEX_op_qemu_st_a64_i64:
184
tcg_out_qemu_st(s, args[0], args[1], args[2], TCG_TYPE_I64);
185
break;
186
+ case INDEX_op_qemu_ld_a32_i128:
187
+ case INDEX_op_qemu_ld_a64_i128:
188
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], true);
189
+ break;
190
+ case INDEX_op_qemu_st_a32_i128:
191
+ case INDEX_op_qemu_st_a64_i128:
192
+ tcg_out_qemu_ldst_i128(s, args[0], args[1], args[2], args[3], false);
193
+ break;
194
195
case INDEX_op_ld16s_i64:
196
tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
117
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
197
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
118
198
case INDEX_op_qemu_st_a32_i32:
119
case INDEX_op_qemu_ld_i32:
199
case INDEX_op_qemu_st_a64_i32:
120
case INDEX_op_qemu_ld_i64:
200
return C_O0_I2(r, r);
121
- return C_O1_I1(r, L);
201
+ case INDEX_op_qemu_ld_a32_i128:
122
+ return C_O1_I1(r, r);
202
+ case INDEX_op_qemu_ld_a64_i128:
123
case INDEX_op_qemu_st_i64:
203
+ return C_O2_I1(o, m, r);
124
case INDEX_op_qemu_st_i32:
204
+ case INDEX_op_qemu_st_a32_i128:
125
- return C_O0_I2(L, L);
205
+ case INDEX_op_qemu_st_a64_i128:
126
+ return C_O0_I2(r, r);
206
+ return C_O0_I3(o, m, r);
127
207
128
case INDEX_op_deposit_i32:
208
case INDEX_op_deposit_i32:
129
case INDEX_op_deposit_i64:
209
case INDEX_op_deposit_i64:
130
--
2.34.1

Instead of trying to unify all operations on uint64_t, pull out
mmu_lookup() to perform the basic tlb hit and resolution.
Create individual functions to handle access by size.
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
3
---
9
accel/tcg/cputlb.c | 645 +++++++++++++++++++++++++++++----------------
4
.../generic/host/load-extract-al16-al8.h | 45 +++++++++++++++++++
10
1 file changed, 424 insertions(+), 221 deletions(-)
5
accel/tcg/ldst_atomicity.c.inc | 36 +--------------
6
2 files changed, 47 insertions(+), 34 deletions(-)
7
create mode 100644 host/include/generic/host/load-extract-al16-al8.h
11
8
12
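
In the reorganized helpers added below, a load that crosses a page
boundary is assembled by folding each page's bytes into one big-endian
accumulator (do_ld_mmio_beN/do_ld_bytes_beN) and byte-swapping the result
at the end for little-endian operations. A small stand-alone illustration
of that accumulation; accum_be and the page*_bytes names are invented for
the example:

    #include <stdint.h>
    #include <stddef.h>

    /* Fold n bytes into acc in big-endian order, the same loop that
     * do_ld_bytes_beN() runs over the RAM half of a cross-page access. */
    static uint64_t accum_be(uint64_t acc, const uint8_t *bytes, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            acc = (acc << 8) | bytes[i];
        }
        return acc;
    }

    /* A 4-byte load split 1+3 across a page boundary:
     *
     *   uint64_t ret = accum_be(0, page0_bytes, 1);
     *   ret = accum_be(ret, page1_bytes, 3);
     *   if (the MemOp is little-endian) {
     *       ret = __builtin_bswap32((uint32_t)ret);
     *   }
     */
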
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
9
diff --git a/host/include/generic/host/load-extract-al16-al8.h b/host/include/generic/host/load-extract-al16-al8.h
13
index XXXXXXX..XXXXXXX 100644
10
new file mode 100644
14
--- a/accel/tcg/cputlb.c
11
index XXXXXXX..XXXXXXX
15
+++ b/accel/tcg/cputlb.c
12
--- /dev/null
16
@@ -XXX,XX +XXX,XX @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong addr, int mmu_idx,
13
+++ b/host/include/generic/host/load-extract-al16-al8.h
17
14
@@ -XXX,XX +XXX,XX @@
18
#endif
19
20
+/*
15
+/*
21
+ * Probe for a load/store operation.
16
+ * SPDX-License-Identifier: GPL-2.0-or-later
22
+ * Return the host address and into @flags.
17
+ * Atomic extract 64 from 128-bit, generic version.
18
+ *
19
+ * Copyright (C) 2023 Linaro, Ltd.
23
+ */
20
+ */
24
+
21
+
25
+typedef struct MMULookupPageData {
22
+#ifndef HOST_LOAD_EXTRACT_AL16_AL8_H
26
+ CPUTLBEntryFull *full;
23
+#define HOST_LOAD_EXTRACT_AL16_AL8_H
27
+ void *haddr;
28
+ target_ulong addr;
29
+ int flags;
30
+ int size;
31
+} MMULookupPageData;
32
+
33
+typedef struct MMULookupLocals {
34
+ MMULookupPageData page[2];
35
+ MemOp memop;
36
+ int mmu_idx;
37
+} MMULookupLocals;
38
+
24
+
39
+/**
25
+/**
40
+ * mmu_lookup1: translate one page
26
+ * load_atom_extract_al16_or_al8:
41
+ * @env: cpu context
27
+ * @pv: host address
42
+ * @data: lookup parameters
28
+ * @s: object size in bytes, @s <= 8.
43
+ * @mmu_idx: virtual address context
44
+ * @access_type: load/store/code
45
+ * @ra: return address into tcg generated code, or 0
46
+ *
29
+ *
47
+ * Resolve the translation for the one page at @data.addr, filling in
30
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
48
+ * the rest of @data with the results. If the translation fails,
31
+ * cross an 16-byte boundary then the access must be 16-byte atomic,
49
+ * tlb_fill will longjmp out. Return true if the softmmu tlb for
32
+ * otherwise the access must be 8-byte atomic.
50
+ * @mmu_idx may have resized.
51
+ */
33
+ */
52
+static bool mmu_lookup1(CPUArchState *env, MMULookupPageData *data,
34
+static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
53
+ int mmu_idx, MMUAccessType access_type, uintptr_t ra)
35
+load_atom_extract_al16_or_al8(void *pv, int s)
54
+{
36
+{
55
+ target_ulong addr = data->addr;
37
+ uintptr_t pi = (uintptr_t)pv;
56
+ uintptr_t index = tlb_index(env, mmu_idx, addr);
38
+ int o = pi & 7;
57
+ CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
39
+ int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8;
58
+ target_ulong tlb_addr = tlb_read_idx(entry, access_type);
40
+ Int128 r;
59
+ bool maybe_resized = false;
60
+
41
+
61
+ /* If the TLB entry is for a different page, reload and try again. */
42
+ pv = (void *)(pi & ~7);
62
+ if (!tlb_hit(tlb_addr, addr)) {
43
+ if (pi & 8) {
63
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type,
44
+ uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8);
64
+ addr & TARGET_PAGE_MASK)) {
45
+ uint64_t a = qatomic_read__nocheck(p8);
65
+ tlb_fill(env_cpu(env), addr, data->size, access_type, mmu_idx, ra);
46
+ uint64_t b = qatomic_read__nocheck(p8 + 1);
66
+ maybe_resized = true;
47
+
67
+ index = tlb_index(env, mmu_idx, addr);
48
+ if (HOST_BIG_ENDIAN) {
68
+ entry = tlb_entry(env, mmu_idx, addr);
49
+ r = int128_make128(b, a);
50
+ } else {
51
+ r = int128_make128(a, b);
69
+ }
52
+ }
70
+ tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
53
+ } else {
54
+ r = atomic16_read_ro(pv);
71
+ }
55
+ }
72
+
56
+ return int128_getlo(int128_urshift(r, shr));
73
+ data->flags = tlb_addr & TLB_FLAGS_MASK;
74
+ data->full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
75
+ /* Compute haddr speculatively; depending on flags it might be invalid. */
76
+ data->haddr = (void *)((uintptr_t)addr + entry->addend);
77
+
78
+ return maybe_resized;
79
+}
57
+}
80
+
58
+
81
+/**
59
+#endif /* HOST_LOAD_EXTRACT_AL16_AL8_H */
82
+ * mmu_watch_or_dirty
60
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
83
+ * @env: cpu context
61
index XXXXXXX..XXXXXXX 100644
84
+ * @data: lookup parameters
62
--- a/accel/tcg/ldst_atomicity.c.inc
85
+ * @access_type: load/store/code
63
+++ b/accel/tcg/ldst_atomicity.c.inc
86
+ * @ra: return address into tcg generated code, or 0
64
@@ -XXX,XX +XXX,XX @@
87
+ *
65
* See the COPYING file in the top-level directory.
88
+ * Trigger watchpoints for @data.addr:@data.size;
66
*/
89
+ * record writes to protected clean pages.
67
90
+ */
68
+#include "host/load-extract-al16-al8.h"
91
+static void mmu_watch_or_dirty(CPUArchState *env, MMULookupPageData *data,
92
+ MMUAccessType access_type, uintptr_t ra)
93
+{
94
+ CPUTLBEntryFull *full = data->full;
95
+ target_ulong addr = data->addr;
96
+ int flags = data->flags;
97
+ int size = data->size;
98
+
69
+
99
+ /* On watchpoint hit, this will longjmp out. */
70
#ifdef CONFIG_ATOMIC64
100
+ if (flags & TLB_WATCHPOINT) {
71
# define HAVE_al8 true
101
+ int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ;
72
#else
102
+ cpu_check_watchpoint(env_cpu(env), addr, size, full->attrs, wp, ra);
73
@@ -XXX,XX +XXX,XX @@ static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra,
103
+ flags &= ~TLB_WATCHPOINT;
74
return int128_getlo(r);
104
+ }
105
+
106
+ /* Note that notdirty is only set for writes. */
107
+ if (flags & TLB_NOTDIRTY) {
108
+ notdirty_write(env_cpu(env), addr, size, full, ra);
109
+ flags &= ~TLB_NOTDIRTY;
110
+ }
111
+ data->flags = flags;
112
+}
113
+
114
+/**
115
+ * mmu_lookup: translate page(s)
116
+ * @env: cpu context
117
+ * @addr: virtual address
118
+ * @oi: combined mmu_idx and MemOp
119
+ * @ra: return address into tcg generated code, or 0
120
+ * @access_type: load/store/code
121
+ * @l: output result
122
+ *
123
+ * Resolve the translation for the page(s) beginning at @addr, for MemOp.size
124
+ * bytes. Return true if the lookup crosses a page boundary.
125
+ */
126
+static bool mmu_lookup(CPUArchState *env, target_ulong addr, MemOpIdx oi,
127
+ uintptr_t ra, MMUAccessType type, MMULookupLocals *l)
128
+{
129
+ unsigned a_bits;
130
+ bool crosspage;
131
+ int flags;
132
+
133
+ l->memop = get_memop(oi);
134
+ l->mmu_idx = get_mmuidx(oi);
135
+
136
+ tcg_debug_assert(l->mmu_idx < NB_MMU_MODES);
137
+
138
+ /* Handle CPU specific unaligned behaviour */
139
+ a_bits = get_alignment_bits(l->memop);
140
+ if (addr & ((1 << a_bits) - 1)) {
141
+ cpu_unaligned_access(env_cpu(env), addr, type, l->mmu_idx, ra);
142
+ }
143
+
144
+ l->page[0].addr = addr;
145
+ l->page[0].size = memop_size(l->memop);
146
+ l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK;
147
+ l->page[1].size = 0;
148
+ crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK;
149
+
150
+ if (likely(!crosspage)) {
151
+ mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra);
152
+
153
+ flags = l->page[0].flags;
154
+ if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
155
+ mmu_watch_or_dirty(env, &l->page[0], type, ra);
156
+ }
157
+ if (unlikely(flags & TLB_BSWAP)) {
158
+ l->memop ^= MO_BSWAP;
159
+ }
160
+ } else {
161
+ /* Finish compute of page crossing. */
162
+ int size0 = l->page[1].addr - addr;
163
+ l->page[1].size = l->page[0].size - size0;
164
+ l->page[0].size = size0;
165
+
166
+ /*
167
+ * Lookup both pages, recognizing exceptions from either. If the
168
+ * second lookup potentially resized, refresh first CPUTLBEntryFull.
169
+ */
170
+ mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra);
171
+ if (mmu_lookup1(env, &l->page[1], l->mmu_idx, type, ra)) {
172
+ uintptr_t index = tlb_index(env, l->mmu_idx, addr);
173
+ l->page[0].full = &env_tlb(env)->d[l->mmu_idx].fulltlb[index];
174
+ }
175
+
176
+ flags = l->page[0].flags | l->page[1].flags;
177
+ if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
178
+ mmu_watch_or_dirty(env, &l->page[0], type, ra);
179
+ mmu_watch_or_dirty(env, &l->page[1], type, ra);
180
+ }
181
+
182
+ /*
183
+ * Since target/sparc is the only user of TLB_BSWAP, and all
184
+ * Sparc accesses are aligned, any treatment across two pages
185
+ * would be arbitrary. Refuse it until there's a use.
186
+ */
187
+ tcg_debug_assert((flags & TLB_BSWAP) == 0);
188
+ }
189
+
190
+ return crosspage;
191
+}
192
+
193
/*
194
* Probe for an atomic operation. Do not allow unaligned operations,
195
* or io operations to proceed. Return the host address.
196
@@ -XXX,XX +XXX,XX @@ load_memop(const void *haddr, MemOp op)
197
}
198
}
75
}
199
76
200
-static inline uint64_t QEMU_ALWAYS_INLINE
77
-/**
201
-load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
78
- * load_atom_extract_al16_or_al8:
202
- uintptr_t retaddr, MemOp op, MMUAccessType access_type,
79
- * @p: host address
203
- FullLoadHelper *full_load)
80
- * @s: object size in bytes, @s <= 8.
81
- *
82
- * Load @s bytes from @p, when p % s != 0. If [p, p+s-1] does not
83
- * cross an 16-byte boundary then the access must be 16-byte atomic,
84
- * otherwise the access must be 8-byte atomic.
85
- */
86
-static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
87
-load_atom_extract_al16_or_al8(void *pv, int s)
204
-{
88
-{
205
- const unsigned a_bits = get_alignment_bits(get_memop(oi));
89
- uintptr_t pi = (uintptr_t)pv;
206
- const size_t size = memop_size(op);
90
- int o = pi & 7;
207
- uintptr_t mmu_idx = get_mmuidx(oi);
91
- int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8;
208
- uintptr_t index;
92
- Int128 r;
209
- CPUTLBEntry *entry;
210
- target_ulong tlb_addr;
211
- void *haddr;
212
- uint64_t res;
213
-
93
-
214
- tcg_debug_assert(mmu_idx < NB_MMU_MODES);
94
- pv = (void *)(pi & ~7);
95
- if (pi & 8) {
96
- uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8);
97
- uint64_t a = qatomic_read__nocheck(p8);
98
- uint64_t b = qatomic_read__nocheck(p8 + 1);
215
-
99
-
216
- /* Handle CPU specific unaligned behaviour */
100
- if (HOST_BIG_ENDIAN) {
217
- if (addr & ((1 << a_bits) - 1)) {
101
- r = int128_make128(b, a);
218
- cpu_unaligned_access(env_cpu(env), addr, access_type,
102
- } else {
219
- mmu_idx, retaddr);
103
- r = int128_make128(a, b);
104
- }
105
- } else {
106
- r = atomic16_read_ro(pv);
220
- }
107
- }
221
-
108
- return int128_getlo(int128_urshift(r, shr));
222
- index = tlb_index(env, mmu_idx, addr);
223
- entry = tlb_entry(env, mmu_idx, addr);
224
- tlb_addr = tlb_read_idx(entry, access_type);
225
-
226
- /* If the TLB entry is for a different page, reload and try again. */
227
- if (!tlb_hit(tlb_addr, addr)) {
228
- if (!victim_tlb_hit(env, mmu_idx, index, access_type,
229
- addr & TARGET_PAGE_MASK)) {
230
- tlb_fill(env_cpu(env), addr, size,
231
- access_type, mmu_idx, retaddr);
232
- index = tlb_index(env, mmu_idx, addr);
233
- entry = tlb_entry(env, mmu_idx, addr);
234
- }
235
- tlb_addr = tlb_read_idx(entry, access_type);
236
- tlb_addr &= ~TLB_INVALID_MASK;
237
- }
238
-
239
- /* Handle anything that isn't just a straight memory access. */
240
- if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
241
- CPUTLBEntryFull *full;
242
- bool need_swap;
243
-
244
- /* For anything that is unaligned, recurse through full_load. */
245
- if ((addr & (size - 1)) != 0) {
246
- goto do_unaligned_access;
247
- }
248
-
249
- full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
250
-
251
- /* Handle watchpoints. */
252
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
253
- /* On watchpoint hit, this will longjmp out. */
254
- cpu_check_watchpoint(env_cpu(env), addr, size,
255
- full->attrs, BP_MEM_READ, retaddr);
256
- }
257
-
258
- need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
259
-
260
- /* Handle I/O access. */
261
- if (likely(tlb_addr & TLB_MMIO)) {
262
- return io_readx(env, full, mmu_idx, addr, retaddr,
263
- access_type, op ^ (need_swap * MO_BSWAP));
264
- }
265
-
266
- haddr = (void *)((uintptr_t)addr + entry->addend);
267
-
268
- /*
269
- * Keep these two load_memop separate to ensure that the compiler
270
- * is able to fold the entire function to a single instruction.
271
- * There is a build-time assert inside to remind you of this. ;-)
272
- */
273
- if (unlikely(need_swap)) {
274
- return load_memop(haddr, op ^ MO_BSWAP);
275
- }
276
- return load_memop(haddr, op);
277
- }
278
-
279
- /* Handle slow unaligned access (it spans two pages or IO). */
280
- if (size > 1
281
- && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
282
- >= TARGET_PAGE_SIZE)) {
283
- target_ulong addr1, addr2;
284
- uint64_t r1, r2;
285
- unsigned shift;
286
- do_unaligned_access:
287
- addr1 = addr & ~((target_ulong)size - 1);
288
- addr2 = addr1 + size;
289
- r1 = full_load(env, addr1, oi, retaddr);
290
- r2 = full_load(env, addr2, oi, retaddr);
291
- shift = (addr & (size - 1)) * 8;
292
-
293
- if (memop_big_endian(op)) {
294
- /* Big-endian combine. */
295
- res = (r1 << shift) | (r2 >> ((size * 8) - shift));
296
- } else {
297
- /* Little-endian combine. */
298
- res = (r1 >> shift) | (r2 << ((size * 8) - shift));
299
- }
300
- return res & MAKE_64BIT_MASK(0, size * 8);
301
- }
302
-
303
- haddr = (void *)((uintptr_t)addr + entry->addend);
304
- return load_memop(haddr, op);
305
-}
109
-}
306
-
110
-
307
/*
111
/**
308
* For the benefit of TCG generated code, we want to avoid the
112
* load_atom_4_by_2:
309
* complication of ABI-specific return type promotion and always
113
* @pv: host address
310
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
311
* We don't bother with this widened value for SOFTMMU_CODE_ACCESS.
312
*/
313
314
-static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
315
- MemOpIdx oi, uintptr_t retaddr)
316
+/**
317
+ * do_ld_mmio_beN:
318
+ * @env: cpu context
319
+ * @p: translation parameters
320
+ * @ret_be: accumulated data
321
+ * @mmu_idx: virtual address context
322
+ * @ra: return address into tcg generated code, or 0
323
+ *
324
+ * Load @p->size bytes from @p->addr, which is memory-mapped i/o.
325
+ * The bytes are concatenated in big-endian order with @ret_be.
326
+ */
327
+static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p,
328
+ uint64_t ret_be, int mmu_idx,
329
+ MMUAccessType type, uintptr_t ra)
330
{
331
- validate_memop(oi, MO_UB);
332
- return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
333
- full_ldub_mmu);
334
+ CPUTLBEntryFull *full = p->full;
335
+ target_ulong addr = p->addr;
336
+ int i, size = p->size;
337
+
338
+ QEMU_IOTHREAD_LOCK_GUARD();
339
+ for (i = 0; i < size; i++) {
340
+ uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
341
+ ret_be = (ret_be << 8) | x;
342
+ }
343
+ return ret_be;
344
+}
345
+
346
+/**
347
+ * do_ld_bytes_beN
348
+ * @p: translation parameters
349
+ * @ret_be: accumulated data
350
+ *
351
+ * Load @p->size bytes from @p->haddr, which is RAM.
352
+ * The bytes are concatenated in big-endian order with @ret_be.
353
+ */
354
+static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
355
+{
356
+ uint8_t *haddr = p->haddr;
357
+ int i, size = p->size;
358
+
359
+ for (i = 0; i < size; i++) {
360
+ ret_be = (ret_be << 8) | haddr[i];
361
+ }
362
+ return ret_be;
363
+}
364
+
365
+/*
366
+ * Wrapper for the above.
367
+ */
368
+static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p,
369
+ uint64_t ret_be, int mmu_idx,
370
+ MMUAccessType type, uintptr_t ra)
371
+{
372
+ if (unlikely(p->flags & TLB_MMIO)) {
373
+ return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra);
374
+ } else {
375
+ return do_ld_bytes_beN(p, ret_be);
376
+ }
377
+}
378
+
379
+static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
380
+ MMUAccessType type, uintptr_t ra)
381
+{
382
+ if (unlikely(p->flags & TLB_MMIO)) {
383
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, MO_UB);
384
+ } else {
385
+ return *(uint8_t *)p->haddr;
386
+ }
387
+}
388
+
389
+static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
390
+ MMUAccessType type, MemOp memop, uintptr_t ra)
391
+{
392
+ uint64_t ret;
393
+
394
+ if (unlikely(p->flags & TLB_MMIO)) {
395
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
396
+ }
397
+
398
+ /* Perform the load host endian, then swap if necessary. */
399
+ ret = load_memop(p->haddr, MO_UW);
400
+ if (memop & MO_BSWAP) {
401
+ ret = bswap16(ret);
402
+ }
403
+ return ret;
404
+}
405
+
406
+static uint32_t do_ld_4(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
407
+ MMUAccessType type, MemOp memop, uintptr_t ra)
408
+{
409
+ uint32_t ret;
410
+
411
+ if (unlikely(p->flags & TLB_MMIO)) {
412
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
413
+ }
414
+
415
+ /* Perform the load host endian. */
416
+ ret = load_memop(p->haddr, MO_UL);
417
+ if (memop & MO_BSWAP) {
418
+ ret = bswap32(ret);
419
+ }
420
+ return ret;
421
+}
422
+
423
+static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
424
+ MMUAccessType type, MemOp memop, uintptr_t ra)
425
+{
426
+ uint64_t ret;
427
+
428
+ if (unlikely(p->flags & TLB_MMIO)) {
429
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
430
+ }
431
+
432
+ /* Perform the load host endian. */
433
+ ret = load_memop(p->haddr, MO_UQ);
434
+ if (memop & MO_BSWAP) {
435
+ ret = bswap64(ret);
436
+ }
437
+ return ret;
438
+}
439
+
440
+static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
441
+ uintptr_t ra, MMUAccessType access_type)
442
+{
443
+ MMULookupLocals l;
444
+ bool crosspage;
445
+
446
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
447
+ tcg_debug_assert(!crosspage);
448
+
449
+ return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra);
450
}
451
452
tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
453
MemOpIdx oi, uintptr_t retaddr)
454
{
455
- return full_ldub_mmu(env, addr, oi, retaddr);
456
+ validate_memop(oi, MO_UB);
457
+ return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
458
}
459
460
-static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
461
- MemOpIdx oi, uintptr_t retaddr)
462
+static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
463
+ uintptr_t ra, MMUAccessType access_type)
464
{
465
- validate_memop(oi, MO_LEUW);
466
- return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
467
- full_le_lduw_mmu);
468
+ MMULookupLocals l;
469
+ bool crosspage;
470
+ uint16_t ret;
471
+ uint8_t a, b;
472
+
473
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
474
+ if (likely(!crosspage)) {
475
+ return do_ld_2(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
476
+ }
477
+
478
+ a = do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra);
479
+ b = do_ld_1(env, &l.page[1], l.mmu_idx, access_type, ra);
480
+
481
+ if ((l.memop & MO_BSWAP) == MO_LE) {
482
+ ret = a | (b << 8);
483
+ } else {
484
+ ret = b | (a << 8);
485
+ }
486
+ return ret;
487
}
488
489
tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr,
490
MemOpIdx oi, uintptr_t retaddr)
491
{
492
- return full_le_lduw_mmu(env, addr, oi, retaddr);
493
-}
494
-
495
-static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
496
- MemOpIdx oi, uintptr_t retaddr)
497
-{
498
- validate_memop(oi, MO_BEUW);
499
- return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
500
- full_be_lduw_mmu);
501
+ validate_memop(oi, MO_LEUW);
502
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
503
}
504
505
tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr,
506
MemOpIdx oi, uintptr_t retaddr)
507
{
508
- return full_be_lduw_mmu(env, addr, oi, retaddr);
509
+ validate_memop(oi, MO_BEUW);
510
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
511
}
512
513
-static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
514
- MemOpIdx oi, uintptr_t retaddr)
515
+static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
516
+ uintptr_t ra, MMUAccessType access_type)
517
{
518
- validate_memop(oi, MO_LEUL);
519
- return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
520
- full_le_ldul_mmu);
521
+ MMULookupLocals l;
522
+ bool crosspage;
523
+ uint32_t ret;
524
+
525
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
526
+ if (likely(!crosspage)) {
527
+ return do_ld_4(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
528
+ }
529
+
530
+ ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra);
531
+ ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra);
532
+ if ((l.memop & MO_BSWAP) == MO_LE) {
533
+ ret = bswap32(ret);
534
+ }
535
+ return ret;
536
}
537
538
tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
539
MemOpIdx oi, uintptr_t retaddr)
540
{
541
- return full_le_ldul_mmu(env, addr, oi, retaddr);
542
-}
543
-
544
-static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
545
- MemOpIdx oi, uintptr_t retaddr)
546
-{
547
- validate_memop(oi, MO_BEUL);
548
- return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
549
- full_be_ldul_mmu);
550
+ validate_memop(oi, MO_LEUL);
551
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
552
}
553
554
tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr,
555
MemOpIdx oi, uintptr_t retaddr)
556
{
557
- return full_be_ldul_mmu(env, addr, oi, retaddr);
558
+ validate_memop(oi, MO_BEUL);
559
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
560
+}
561
+
562
+static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
563
+ uintptr_t ra, MMUAccessType access_type)
564
+{
565
+ MMULookupLocals l;
566
+ bool crosspage;
567
+ uint64_t ret;
568
+
569
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
570
+ if (likely(!crosspage)) {
571
+ return do_ld_8(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
572
+ }
573
+
574
+ ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra);
575
+ ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra);
576
+ if ((l.memop & MO_BSWAP) == MO_LE) {
577
+ ret = bswap64(ret);
578
+ }
579
+ return ret;
580
}
581
582
uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
583
MemOpIdx oi, uintptr_t retaddr)
584
{
585
validate_memop(oi, MO_LEUQ);
586
- return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
587
- helper_le_ldq_mmu);
588
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
589
}
590
591
uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
592
MemOpIdx oi, uintptr_t retaddr)
593
{
594
validate_memop(oi, MO_BEUQ);
595
- return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
596
- helper_be_ldq_mmu);
597
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
598
}
599
600
/*
601
@@ -XXX,XX +XXX,XX @@ tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr,
602
* Load helpers for cpu_ldst.h.
603
*/
604
605
-static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
606
- MemOpIdx oi, uintptr_t retaddr,
607
- FullLoadHelper *full_load)
608
+static void plugin_load_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
609
{
610
- uint64_t ret;
611
-
612
- ret = full_load(env, addr, oi, retaddr);
613
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
614
- return ret;
615
}
616
617
uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra)
618
{
619
- return cpu_load_helper(env, addr, oi, ra, full_ldub_mmu);
620
+ uint8_t ret;
621
+
622
+ validate_memop(oi, MO_UB);
623
+ ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
624
+ plugin_load_cb(env, addr, oi);
625
+ return ret;
626
}
627
628
uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr,
629
MemOpIdx oi, uintptr_t ra)
630
{
631
- return cpu_load_helper(env, addr, oi, ra, full_be_lduw_mmu);
632
+ uint16_t ret;
633
+
634
+ validate_memop(oi, MO_BEUW);
635
+ ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
636
+ plugin_load_cb(env, addr, oi);
637
+ return ret;
638
}
639
640
uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr,
641
MemOpIdx oi, uintptr_t ra)
642
{
643
- return cpu_load_helper(env, addr, oi, ra, full_be_ldul_mmu);
644
+ uint32_t ret;
645
+
646
+ validate_memop(oi, MO_BEUL);
647
+ ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
648
+ plugin_load_cb(env, addr, oi);
649
+ return ret;
650
}
651
652
uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr,
653
MemOpIdx oi, uintptr_t ra)
654
{
655
- return cpu_load_helper(env, addr, oi, ra, helper_be_ldq_mmu);
656
+ uint64_t ret;
657
+
658
+ validate_memop(oi, MO_BEUQ);
659
+ ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
660
+ plugin_load_cb(env, addr, oi);
661
+ return ret;
662
}
663
664
uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr,
665
MemOpIdx oi, uintptr_t ra)
666
{
667
- return cpu_load_helper(env, addr, oi, ra, full_le_lduw_mmu);
668
+ uint16_t ret;
669
+
670
+ validate_memop(oi, MO_LEUW);
671
+ ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
672
+ plugin_load_cb(env, addr, oi);
673
+ return ret;
674
}
675
676
uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr,
677
MemOpIdx oi, uintptr_t ra)
678
{
679
- return cpu_load_helper(env, addr, oi, ra, full_le_ldul_mmu);
680
+ uint32_t ret;
681
+
682
+ validate_memop(oi, MO_LEUL);
683
+ ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
684
+ plugin_load_cb(env, addr, oi);
685
+ return ret;
686
}
687
688
uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr,
689
MemOpIdx oi, uintptr_t ra)
690
{
691
- return cpu_load_helper(env, addr, oi, ra, helper_le_ldq_mmu);
692
+ uint64_t ret;
693
+
694
+ validate_memop(oi, MO_LEUQ);
695
+ ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
696
+ plugin_load_cb(env, addr, oi);
697
+ return ret;
698
}
699
700
Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr,
701
@@ -XXX,XX +XXX,XX @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
702
703
/* Code access functions. */
704
705
-static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr,
706
- MemOpIdx oi, uintptr_t retaddr)
707
-{
708
- return load_helper(env, addr, oi, retaddr, MO_8,
709
- MMU_INST_FETCH, full_ldub_code);
710
-}
711
-
712
uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
713
{
714
MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(env, true));
715
- return full_ldub_code(env, addr, oi, 0);
716
-}
717
-
718
-static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr,
719
- MemOpIdx oi, uintptr_t retaddr)
720
-{
721
- return load_helper(env, addr, oi, retaddr, MO_TEUW,
722
- MMU_INST_FETCH, full_lduw_code);
723
+ return do_ld1_mmu(env, addr, oi, 0, MMU_INST_FETCH);
724
}
725
726
uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
727
{
728
MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(env, true));
729
- return full_lduw_code(env, addr, oi, 0);
730
-}
731
-
732
-static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr,
733
- MemOpIdx oi, uintptr_t retaddr)
734
-{
735
- return load_helper(env, addr, oi, retaddr, MO_TEUL,
736
- MMU_INST_FETCH, full_ldl_code);
737
+ return do_ld2_mmu(env, addr, oi, 0, MMU_INST_FETCH);
738
}
739
740
uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
741
{
742
MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(env, true));
743
- return full_ldl_code(env, addr, oi, 0);
744
-}
745
-
746
-static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr,
747
- MemOpIdx oi, uintptr_t retaddr)
748
-{
749
- return load_helper(env, addr, oi, retaddr, MO_TEUQ,
750
- MMU_INST_FETCH, full_ldq_code);
751
+ return do_ld4_mmu(env, addr, oi, 0, MMU_INST_FETCH);
752
}
753
754
uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
755
{
756
MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(env, true));
757
- return full_ldq_code(env, addr, oi, 0);
758
+ return do_ld8_mmu(env, addr, oi, 0, MMU_INST_FETCH);
759
}
760
761
uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
762
MemOpIdx oi, uintptr_t retaddr)
763
{
764
- return full_ldub_code(env, addr, oi, retaddr);
765
+ return do_ld1_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
766
}
767
768
uint16_t cpu_ldw_code_mmu(CPUArchState *env, abi_ptr addr,
769
MemOpIdx oi, uintptr_t retaddr)
770
{
771
- MemOp mop = get_memop(oi);
772
- int idx = get_mmuidx(oi);
773
- uint16_t ret;
774
-
775
- ret = full_lduw_code(env, addr, make_memop_idx(MO_TEUW, idx), retaddr);
776
- if ((mop & MO_BSWAP) != MO_TE) {
777
- ret = bswap16(ret);
778
- }
779
- return ret;
780
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
781
}
782
783
uint32_t cpu_ldl_code_mmu(CPUArchState *env, abi_ptr addr,
784
MemOpIdx oi, uintptr_t retaddr)
785
{
786
- MemOp mop = get_memop(oi);
787
- int idx = get_mmuidx(oi);
788
- uint32_t ret;
789
-
790
- ret = full_ldl_code(env, addr, make_memop_idx(MO_TEUL, idx), retaddr);
791
- if ((mop & MO_BSWAP) != MO_TE) {
792
- ret = bswap32(ret);
793
- }
794
- return ret;
795
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
796
}
797
798
uint64_t cpu_ldq_code_mmu(CPUArchState *env, abi_ptr addr,
799
MemOpIdx oi, uintptr_t retaddr)
800
{
801
- MemOp mop = get_memop(oi);
802
- int idx = get_mmuidx(oi);
803
- uint64_t ret;
804
-
805
- ret = full_ldq_code(env, addr, make_memop_idx(MO_TEUQ, idx), retaddr);
806
- if ((mop & MO_BSWAP) != MO_TE) {
807
- ret = bswap64(ret);
808
- }
809
- return ret;
810
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
811
}
812
--
114
--
813
2.34.1
115
2.34.1
814
815
1
Instead of playing with offsetof in various places, use
2
MMUAccessType to index an array. This is easily defined
3
instead of the previous dummy padding array in the union.
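
As a rough, self-contained sketch of the layout trick (types are simplified to uint64_t and the array size hard-coded; TLBEntrySketch and tlb_read_idx_sketch are names invented for this sketch, while the real CPUTLBEntry, tlb_read_idx and qatomic_read are in the hunks below):

    #include <assert.h>
    #include <stdint.h>

    typedef enum { MMU_DATA_LOAD, MMU_DATA_STORE, MMU_INST_FETCH } MMUAccessTypeSketch;

    typedef struct {
        union {
            struct {
                uint64_t addr_read;
                uint64_t addr_write;
                uint64_t addr_code;
                uint64_t addend;
            };
            /* The same storage, viewed as an array indexed by access type. */
            uint64_t addr_idx[4];
        };
    } TLBEntrySketch;

    static uint64_t tlb_read_idx_sketch(const TLBEntrySketch *e,
                                        MMUAccessTypeSketch type)
    {
        /* Only valid because the comparators are laid out in enum order. */
        assert(type <= MMU_INST_FETCH);
        return e->addr_idx[type];
    }

Callers can then write tlb_read_idx(entry, access_type) instead of computing offsetof(CPUTLBEntry, addr_write) and friends by hand, which is what the cputlb.c hunk below does for victim_tlb_hit, probe_access_internal and load_helper.
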
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
3
---
10
include/exec/cpu-defs.h | 7 ++-
4
host/include/generic/host/store-insert-al16.h | 50 +++++++++++++++++++
11
include/exec/cpu_ldst.h | 26 ++++++++--
5
accel/tcg/ldst_atomicity.c.inc | 40 +--------------
12
accel/tcg/cputlb.c | 104 +++++++++++++---------------------------
6
2 files changed, 51 insertions(+), 39 deletions(-)
13
3 files changed, 59 insertions(+), 78 deletions(-)
7
create mode 100644 host/include/generic/host/store-insert-al16.h
14
8
15
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
9
diff --git a/host/include/generic/host/store-insert-al16.h b/host/include/generic/host/store-insert-al16.h
16
index XXXXXXX..XXXXXXX 100644
10
new file mode 100644
17
--- a/include/exec/cpu-defs.h
11
index XXXXXXX..XXXXXXX
18
+++ b/include/exec/cpu-defs.h
12
--- /dev/null
19
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBEntry {
13
+++ b/host/include/generic/host/store-insert-al16.h
20
use the corresponding iotlb value. */
14
@@ -XXX,XX +XXX,XX @@
21
uintptr_t addend;
15
+/*
22
};
16
+ * SPDX-License-Identifier: GPL-2.0-or-later
23
- /* padding to get a power of two size */
17
+ * Atomic store insert into 128-bit, generic version.
24
- uint8_t dummy[1 << CPU_TLB_ENTRY_BITS];
18
+ *
25
+ /*
19
+ * Copyright (C) 2023 Linaro, Ltd.
26
+ * Padding to get a power of two size, as well as index
20
+ */
27
+ * access to addr_{read,write,code}.
21
+
28
+ */
22
+#ifndef HOST_STORE_INSERT_AL16_H
29
+ target_ulong addr_idx[(1 << CPU_TLB_ENTRY_BITS) / TARGET_LONG_SIZE];
23
+#define HOST_STORE_INSERT_AL16_H
30
};
24
+
31
} CPUTLBEntry;
25
+/**
32
26
+ * store_atom_insert_al16:
33
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
27
+ * @p: host address
34
index XXXXXXX..XXXXXXX 100644
28
+ * @val: shifted value to store
35
--- a/include/exec/cpu_ldst.h
29
+ * @msk: mask for value to store
36
+++ b/include/exec/cpu_ldst.h
30
+ *
37
@@ -XXX,XX +XXX,XX @@ static inline void clear_helper_retaddr(void)
31
+ * Atomically store @val to @p masked by @msk.
38
/* Needed for TCG_OVERSIZED_GUEST */
32
+ */
39
#include "tcg/tcg.h"
33
+static inline void ATTRIBUTE_ATOMIC128_OPT
40
34
+store_atom_insert_al16(Int128 *ps, Int128 val, Int128 msk)
41
+static inline target_ulong tlb_read_idx(const CPUTLBEntry *entry,
42
+ MMUAccessType access_type)
43
+{
35
+{
44
+ /* Do not rearrange the CPUTLBEntry structure members. */
36
+#if defined(CONFIG_ATOMIC128)
45
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_read) !=
37
+ __uint128_t *pu;
46
+ MMU_DATA_LOAD * TARGET_LONG_SIZE);
38
+ Int128Alias old, new;
47
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
48
+ MMU_DATA_STORE * TARGET_LONG_SIZE);
49
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
50
+ MMU_INST_FETCH * TARGET_LONG_SIZE);
51
+
39
+
52
+ const target_ulong *ptr = &entry->addr_idx[access_type];
40
+ /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */
53
+#if TCG_OVERSIZED_GUEST
41
+ pu = __builtin_assume_aligned(ps, 16);
54
+ return *ptr;
42
+ old.u = *pu;
43
+ msk = int128_not(msk);
44
+ do {
45
+ new.s = int128_and(old.s, msk);
46
+ new.s = int128_or(new.s, val);
47
+ } while (!__atomic_compare_exchange_n(pu, &old.u, new.u, true,
48
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED));
55
+#else
49
+#else
56
+ /* ofs might correspond to .addr_write, so use qatomic_read */
50
+ Int128 old, new, cmp;
57
+ return qatomic_read(ptr);
51
+
52
+ ps = __builtin_assume_aligned(ps, 16);
53
+ old = *ps;
54
+ msk = int128_not(msk);
55
+ do {
56
+ cmp = old;
57
+ new = int128_and(old, msk);
58
+ new = int128_or(new, val);
59
+ old = atomic16_cmpxchg(ps, cmp, new);
60
+ } while (int128_ne(cmp, old));
58
+#endif
61
+#endif
59
+}
62
+}
60
+
63
+
61
static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
64
+#endif /* HOST_STORE_INSERT_AL16_H */
62
{
65
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
63
-#if TCG_OVERSIZED_GUEST
66
index XXXXXXX..XXXXXXX 100644
64
- return entry->addr_write;
67
--- a/accel/tcg/ldst_atomicity.c.inc
68
+++ b/accel/tcg/ldst_atomicity.c.inc
69
@@ -XXX,XX +XXX,XX @@
70
*/
71
72
#include "host/load-extract-al16-al8.h"
73
+#include "host/store-insert-al16.h"
74
75
#ifdef CONFIG_ATOMIC64
76
# define HAVE_al8 true
77
@@ -XXX,XX +XXX,XX @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk)
78
__ATOMIC_RELAXED, __ATOMIC_RELAXED));
79
}
80
81
-/**
82
- * store_atom_insert_al16:
83
- * @p: host address
84
- * @val: shifted value to store
85
- * @msk: mask for value to store
86
- *
87
- * Atomically store @val to @p masked by @msk.
88
- */
89
-static void ATTRIBUTE_ATOMIC128_OPT
90
-store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk)
91
-{
92
-#if defined(CONFIG_ATOMIC128)
93
- __uint128_t *pu, old, new;
94
-
95
- /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */
96
- pu = __builtin_assume_aligned(ps, 16);
97
- old = *pu;
98
- do {
99
- new = (old & ~msk.u) | val.u;
100
- } while (!__atomic_compare_exchange_n(pu, &old, new, true,
101
- __ATOMIC_RELAXED, __ATOMIC_RELAXED));
102
-#elif defined(CONFIG_CMPXCHG128)
103
- __uint128_t *pu, old, new;
104
-
105
- /*
106
- * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always
107
- * defer to libatomic, so we must use __sync_*_compare_and_swap_16
108
- * and accept the sequential consistency that comes with it.
109
- */
110
- pu = __builtin_assume_aligned(ps, 16);
111
- do {
112
- old = *pu;
113
- new = (old & ~msk.u) | val.u;
114
- } while (!__sync_bool_compare_and_swap_16(pu, old, new));
65
-#else
115
-#else
66
- return qatomic_read(&entry->addr_write);
116
- qemu_build_not_reached();
67
-#endif
68
+ return tlb_read_idx(entry, MMU_DATA_STORE);
69
}
70
71
/* Find the TLB index corresponding to the mmu_idx + address pair. */
72
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
73
index XXXXXXX..XXXXXXX 100644
74
--- a/accel/tcg/cputlb.c
75
+++ b/accel/tcg/cputlb.c
76
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
77
}
78
}
79
80
-static inline target_ulong tlb_read_ofs(CPUTLBEntry *entry, size_t ofs)
81
-{
82
-#if TCG_OVERSIZED_GUEST
83
- return *(target_ulong *)((uintptr_t)entry + ofs);
84
-#else
85
- /* ofs might correspond to .addr_write, so use qatomic_read */
86
- return qatomic_read((target_ulong *)((uintptr_t)entry + ofs));
87
-#endif
117
-#endif
88
-}
118
-}
89
-
119
-
90
/* Return true if ADDR is present in the victim tlb, and has been copied
120
/**
91
back to the main tlb. */
121
* store_bytes_leN:
92
static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
122
* @pv: host address
93
- size_t elt_ofs, target_ulong page)
94
+ MMUAccessType access_type, target_ulong page)
95
{
96
size_t vidx;
97
98
assert_cpu_is_self(env_cpu(env));
99
for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
100
CPUTLBEntry *vtlb = &env_tlb(env)->d[mmu_idx].vtable[vidx];
101
- target_ulong cmp;
102
-
103
- /* elt_ofs might correspond to .addr_write, so use qatomic_read */
104
-#if TCG_OVERSIZED_GUEST
105
- cmp = *(target_ulong *)((uintptr_t)vtlb + elt_ofs);
106
-#else
107
- cmp = qatomic_read((target_ulong *)((uintptr_t)vtlb + elt_ofs));
108
-#endif
109
+ target_ulong cmp = tlb_read_idx(vtlb, access_type);
110
111
if (cmp == page) {
112
/* Found entry in victim tlb, swap tlb and iotlb. */
113
@@ -XXX,XX +XXX,XX @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
114
return false;
115
}
116
117
-/* Macro to call the above, with local variables from the use context. */
118
-#define VICTIM_TLB_HIT(TY, ADDR) \
119
- victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
120
- (ADDR) & TARGET_PAGE_MASK)
121
-
122
static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
123
CPUTLBEntryFull *full, uintptr_t retaddr)
124
{
125
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
126
{
127
uintptr_t index = tlb_index(env, mmu_idx, addr);
128
CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
129
- target_ulong tlb_addr, page_addr;
130
- size_t elt_ofs;
131
- int flags;
132
+ target_ulong tlb_addr = tlb_read_idx(entry, access_type);
133
+ target_ulong page_addr = addr & TARGET_PAGE_MASK;
134
+ int flags = TLB_FLAGS_MASK;
135
136
- switch (access_type) {
137
- case MMU_DATA_LOAD:
138
- elt_ofs = offsetof(CPUTLBEntry, addr_read);
139
- break;
140
- case MMU_DATA_STORE:
141
- elt_ofs = offsetof(CPUTLBEntry, addr_write);
142
- break;
143
- case MMU_INST_FETCH:
144
- elt_ofs = offsetof(CPUTLBEntry, addr_code);
145
- break;
146
- default:
147
- g_assert_not_reached();
148
- }
149
- tlb_addr = tlb_read_ofs(entry, elt_ofs);
150
-
151
- flags = TLB_FLAGS_MASK;
152
- page_addr = addr & TARGET_PAGE_MASK;
153
if (!tlb_hit_page(tlb_addr, page_addr)) {
154
- if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
155
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type, page_addr)) {
156
CPUState *cs = env_cpu(env);
157
158
if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
159
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
160
*/
161
flags &= ~TLB_INVALID_MASK;
162
}
163
- tlb_addr = tlb_read_ofs(entry, elt_ofs);
164
+ tlb_addr = tlb_read_idx(entry, access_type);
165
}
166
flags &= tlb_addr;
167
168
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
169
if (prot & PAGE_WRITE) {
170
tlb_addr = tlb_addr_write(tlbe);
171
if (!tlb_hit(tlb_addr, addr)) {
172
- if (!VICTIM_TLB_HIT(addr_write, addr)) {
173
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
174
+ addr & TARGET_PAGE_MASK)) {
175
tlb_fill(env_cpu(env), addr, size,
176
MMU_DATA_STORE, mmu_idx, retaddr);
177
index = tlb_index(env, mmu_idx, addr);
178
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
179
} else /* if (prot & PAGE_READ) */ {
180
tlb_addr = tlbe->addr_read;
181
if (!tlb_hit(tlb_addr, addr)) {
182
- if (!VICTIM_TLB_HIT(addr_read, addr)) {
183
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_LOAD,
184
+ addr & TARGET_PAGE_MASK)) {
185
tlb_fill(env_cpu(env), addr, size,
186
MMU_DATA_LOAD, mmu_idx, retaddr);
187
index = tlb_index(env, mmu_idx, addr);
188
@@ -XXX,XX +XXX,XX @@ load_memop(const void *haddr, MemOp op)
189
190
static inline uint64_t QEMU_ALWAYS_INLINE
191
load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
192
- uintptr_t retaddr, MemOp op, bool code_read,
193
+ uintptr_t retaddr, MemOp op, MMUAccessType access_type,
194
FullLoadHelper *full_load)
195
{
196
- const size_t tlb_off = code_read ?
197
- offsetof(CPUTLBEntry, addr_code) : offsetof(CPUTLBEntry, addr_read);
198
- const MMUAccessType access_type =
199
- code_read ? MMU_INST_FETCH : MMU_DATA_LOAD;
200
const unsigned a_bits = get_alignment_bits(get_memop(oi));
201
const size_t size = memop_size(op);
202
uintptr_t mmu_idx = get_mmuidx(oi);
203
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
204
205
index = tlb_index(env, mmu_idx, addr);
206
entry = tlb_entry(env, mmu_idx, addr);
207
- tlb_addr = code_read ? entry->addr_code : entry->addr_read;
208
+ tlb_addr = tlb_read_idx(entry, access_type);
209
210
/* If the TLB entry is for a different page, reload and try again. */
211
if (!tlb_hit(tlb_addr, addr)) {
212
- if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
213
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type,
214
addr & TARGET_PAGE_MASK)) {
215
tlb_fill(env_cpu(env), addr, size,
216
access_type, mmu_idx, retaddr);
217
index = tlb_index(env, mmu_idx, addr);
218
entry = tlb_entry(env, mmu_idx, addr);
219
}
220
- tlb_addr = code_read ? entry->addr_code : entry->addr_read;
221
+ tlb_addr = tlb_read_idx(entry, access_type);
222
tlb_addr &= ~TLB_INVALID_MASK;
223
}
224
225
@@ -XXX,XX +XXX,XX @@ static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
226
MemOpIdx oi, uintptr_t retaddr)
227
{
228
validate_memop(oi, MO_UB);
229
- return load_helper(env, addr, oi, retaddr, MO_UB, false, full_ldub_mmu);
230
+ return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
231
+ full_ldub_mmu);
232
}
233
234
tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
235
@@ -XXX,XX +XXX,XX @@ static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
236
MemOpIdx oi, uintptr_t retaddr)
237
{
238
validate_memop(oi, MO_LEUW);
239
- return load_helper(env, addr, oi, retaddr, MO_LEUW, false,
240
+ return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
241
full_le_lduw_mmu);
242
}
243
244
@@ -XXX,XX +XXX,XX @@ static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
245
MemOpIdx oi, uintptr_t retaddr)
246
{
247
validate_memop(oi, MO_BEUW);
248
- return load_helper(env, addr, oi, retaddr, MO_BEUW, false,
249
+ return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
250
full_be_lduw_mmu);
251
}
252
253
@@ -XXX,XX +XXX,XX @@ static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
254
MemOpIdx oi, uintptr_t retaddr)
255
{
256
validate_memop(oi, MO_LEUL);
257
- return load_helper(env, addr, oi, retaddr, MO_LEUL, false,
258
+ return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
259
full_le_ldul_mmu);
260
}
261
262
@@ -XXX,XX +XXX,XX @@ static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
263
MemOpIdx oi, uintptr_t retaddr)
264
{
265
validate_memop(oi, MO_BEUL);
266
- return load_helper(env, addr, oi, retaddr, MO_BEUL, false,
267
+ return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
268
full_be_ldul_mmu);
269
}
270
271
@@ -XXX,XX +XXX,XX @@ uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
272
MemOpIdx oi, uintptr_t retaddr)
273
{
274
validate_memop(oi, MO_LEUQ);
275
- return load_helper(env, addr, oi, retaddr, MO_LEUQ, false,
276
+ return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
277
helper_le_ldq_mmu);
278
}
279
280
@@ -XXX,XX +XXX,XX @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
281
MemOpIdx oi, uintptr_t retaddr)
282
{
283
validate_memop(oi, MO_BEUQ);
284
- return load_helper(env, addr, oi, retaddr, MO_BEUQ, false,
285
+ return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
286
helper_be_ldq_mmu);
287
}
288
289
@@ -XXX,XX +XXX,XX @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
290
uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
291
bool big_endian)
292
{
293
- const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
294
uintptr_t index, index2;
295
CPUTLBEntry *entry, *entry2;
296
target_ulong page1, page2, tlb_addr, tlb_addr2;
297
@@ -XXX,XX +XXX,XX @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
298
299
tlb_addr2 = tlb_addr_write(entry2);
300
if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
301
- if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
302
+ if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
303
tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
304
mmu_idx, retaddr);
305
index2 = tlb_index(env, mmu_idx, page2);
306
@@ -XXX,XX +XXX,XX @@ static inline void QEMU_ALWAYS_INLINE
307
store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
308
MemOpIdx oi, uintptr_t retaddr, MemOp op)
309
{
310
- const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
311
const unsigned a_bits = get_alignment_bits(get_memop(oi));
312
const size_t size = memop_size(op);
313
uintptr_t mmu_idx = get_mmuidx(oi);
314
@@ -XXX,XX +XXX,XX @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
315
316
/* If the TLB entry is for a different page, reload and try again. */
317
if (!tlb_hit(tlb_addr, addr)) {
318
- if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
319
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
320
addr & TARGET_PAGE_MASK)) {
321
tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
322
mmu_idx, retaddr);
323
@@ -XXX,XX +XXX,XX @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
324
static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr,
325
MemOpIdx oi, uintptr_t retaddr)
326
{
327
- return load_helper(env, addr, oi, retaddr, MO_8, true, full_ldub_code);
328
+ return load_helper(env, addr, oi, retaddr, MO_8,
329
+ MMU_INST_FETCH, full_ldub_code);
330
}
331
332
uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
333
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
334
static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr,
335
MemOpIdx oi, uintptr_t retaddr)
336
{
337
- return load_helper(env, addr, oi, retaddr, MO_TEUW, true, full_lduw_code);
338
+ return load_helper(env, addr, oi, retaddr, MO_TEUW,
339
+ MMU_INST_FETCH, full_lduw_code);
340
}
341
342
uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
343
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
344
static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr,
345
MemOpIdx oi, uintptr_t retaddr)
346
{
347
- return load_helper(env, addr, oi, retaddr, MO_TEUL, true, full_ldl_code);
348
+ return load_helper(env, addr, oi, retaddr, MO_TEUL,
349
+ MMU_INST_FETCH, full_ldl_code);
350
}
351
352
uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
353
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
354
static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr,
355
MemOpIdx oi, uintptr_t retaddr)
356
{
357
- return load_helper(env, addr, oi, retaddr, MO_TEUQ, true, full_ldq_code);
358
+ return load_helper(env, addr, oi, retaddr, MO_TEUQ,
359
+ MMU_INST_FETCH, full_ldq_code);
360
}
361
362
uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
363
--
123
--
364
2.34.1
124
2.34.1
365
366
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, and some code that lived
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that
3
returns HostAddress and TCGLabelQemuLdst structures.
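
In outline, the emitters now follow the shape sketched below (the stub types, the _sketch suffixes and the printf stand-in are invented for illustration only; the real HostAddress, TCGLabelQemuLdst and the Arm prepare_host_addr are in the hunk that follows):

    #include <stdbool.h>
    #include <stdio.h>

    /* Cut-down stand-ins for the TCG-internal structures. */
    typedef struct { int base, index; bool index_scratch; } HostAddressSketch;
    typedef struct { bool is_ld; int oi, type, datalo_reg, datahi_reg; } LdstLabelSketch;

    static LdstLabelSketch label_pool[1];   /* the real code allocates one per site */

    /*
     * Combined TLB load / alignment test: fills *h for the fast path and
     * returns a slow-path label when one is needed, otherwise NULL.
     */
    static LdstLabelSketch *prepare_host_addr_sketch(HostAddressSketch *h,
                                                     int addrlo, int oi, bool is_ld)
    {
        h->base = addrlo;
        h->index = -1;
        h->index_scratch = false;
        label_pool[0].is_ld = is_ld;
        label_pool[0].oi = oi;
        return &label_pool[0];              /* pretend we are in the softmmu case */
    }

    static void emit_fast_path_sketch(const HostAddressSketch *h, int datalo)
    {
        printf("emit load via base r%d into r%d\n", h->base, datalo);
    }

    int main(void)
    {
        HostAddressSketch h;
        LdstLabelSketch *ldst = prepare_host_addr_sketch(&h, 0, 0, true);

        if (ldst) {
            /* A slow path exists: record the data registers on the label ... */
            ldst->type = 0;
            ldst->datalo_reg = 4;
            emit_fast_path_sketch(&h, ldst->datalo_reg);
            /* ... and the real code then records the fast-path return address. */
        } else {
            emit_fast_path_sketch(&h, 4);   /* aligned user-mode: no slow path */
        }
        return 0;
    }

The same if (ldst) pattern replaces the old #ifdef CONFIG_SOFTMMU split in both tcg_out_qemu_ld and tcg_out_qemu_st in the diff below.
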
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
3
---
8
tcg/arm/tcg-target.c.inc | 351 ++++++++++++++++++---------------------
4
.../x86_64/host/load-extract-al16-al8.h | 50 +++++++++++++++++++
9
1 file changed, 159 insertions(+), 192 deletions(-)
5
1 file changed, 50 insertions(+)
6
create mode 100644 host/include/x86_64/host/load-extract-al16-al8.h
10
7
11
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
8
diff --git a/host/include/x86_64/host/load-extract-al16-al8.h b/host/include/x86_64/host/load-extract-al16-al8.h
12
index XXXXXXX..XXXXXXX 100644
9
new file mode 100644
13
--- a/tcg/arm/tcg-target.c.inc
10
index XXXXXXX..XXXXXXX
14
+++ b/tcg/arm/tcg-target.c.inc
11
--- /dev/null
15
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
12
+++ b/host/include/x86_64/host/load-extract-al16-al8.h
16
}
13
@@ -XXX,XX +XXX,XX @@
17
}
14
+/*
18
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
19
-#define TLB_SHIFT    (CPU_TLB_ENTRY_BITS + CPU_TLB_BITS)
16
+ * Atomic extract 64 from 128-bit, x86_64 version.
20
-
17
+ *
21
-/* We expect to use an 9-bit sign-magnitude negative offset from ENV. */
18
+ * Copyright (C) 2023 Linaro, Ltd.
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
19
+ */
23
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
20
+
24
-
21
+#ifndef X86_64_LOAD_EXTRACT_AL16_AL8_H
25
-/* These offsets are built into the LDRD below. */
22
+#define X86_64_LOAD_EXTRACT_AL16_AL8_H
26
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
23
+
27
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
24
+#ifdef CONFIG_INT128_TYPE
28
-
25
+#include "host/cpuinfo.h"
29
-/* Load and compare a TLB entry, leaving the flags set. Returns the register
26
+
30
- containing the addend of the tlb entry. Clobbers R0, R1, R2, TMP. */
27
+/**
31
-
28
+ * load_atom_extract_al16_or_al8:
32
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
29
+ * @pv: host address
33
- MemOp opc, int mem_index, bool is_load)
30
+ * @s: object size in bytes, @s <= 8.
34
-{
31
+ *
35
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
32
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
36
- : offsetof(CPUTLBEntry, addr_write));
33
+ * cross a 16-byte boundary then the access must be 16-byte atomic,
37
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
34
+ * otherwise the access must be 8-byte atomic.
38
- unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
35
+ */
39
- unsigned a_mask = (1 << get_alignment_bits(opc)) - 1;
36
+static inline uint64_t ATTRIBUTE_ATOMIC128_OPT
40
- TCGReg t_addr;
37
+load_atom_extract_al16_or_al8(void *pv, int s)
41
-
42
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
43
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
44
-
45
- /* Extract the tlb index from the address into R0. */
46
- tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
47
- SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
48
-
49
- /*
50
- * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
51
- * Load the tlb comparator into R2/R3 and the fast path addend into R1.
52
- */
53
- if (cmp_off == 0) {
54
- if (TARGET_LONG_BITS == 64) {
55
- tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
56
- } else {
57
- tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
58
- }
59
- } else {
60
- tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
61
- TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
62
- if (TARGET_LONG_BITS == 64) {
63
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
64
- } else {
65
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
66
- }
67
- }
68
-
69
- /* Load the tlb addend. */
70
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
71
- offsetof(CPUTLBEntry, addend));
72
-
73
- /*
74
- * Check alignment, check comparators.
75
- * Do this in 2-4 insns. Use MOVW for v7, if possible,
76
- * to reduce the number of sequential conditional instructions.
77
- * Almost all guests have at least 4k pages, which means that we need
78
- * to clear at least 9 bits even for an 8-byte memory, which means it
79
- * isn't worth checking for an immediate operand for BIC.
80
- *
81
- * For unaligned accesses, test the page of the last unit of alignment.
82
- * This leaves the least significant alignment bits unchanged, and of
83
- * course must be zero.
84
- */
85
- t_addr = addrlo;
86
- if (a_mask < s_mask) {
87
- t_addr = TCG_REG_R0;
88
- tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
89
- addrlo, s_mask - a_mask);
90
- }
91
- if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
92
- tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
93
- tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
94
- t_addr, TCG_REG_TMP, 0);
95
- tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
96
- } else {
97
- if (a_mask) {
98
- tcg_debug_assert(a_mask <= 0xff);
99
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
100
- }
101
- tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
102
- SHIFT_IMM_LSR(TARGET_PAGE_BITS));
103
- tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
104
- 0, TCG_REG_R2, TCG_REG_TMP,
105
- SHIFT_IMM_LSL(TARGET_PAGE_BITS));
106
- }
107
-
108
- if (TARGET_LONG_BITS == 64) {
109
- tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
110
- }
111
-
112
- return TCG_REG_R1;
113
-}
114
-
115
-/* Record the context of a call to the out of line helper code for the slow
116
- path for a load or store, so that we can later generate the correct
117
- helper code. */
118
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
119
- MemOpIdx oi, TCGType type,
120
- TCGReg datalo, TCGReg datahi,
121
- TCGReg addrlo, TCGReg addrhi,
122
- tcg_insn_unit *raddr,
123
- tcg_insn_unit *label_ptr)
124
-{
125
- TCGLabelQemuLdst *label = new_ldst_label(s);
126
-
127
- label->is_ld = is_ld;
128
- label->oi = oi;
129
- label->type = type;
130
- label->datalo_reg = datalo;
131
- label->datahi_reg = datahi;
132
- label->addrlo_reg = addrlo;
133
- label->addrhi_reg = addrhi;
134
- label->raddr = tcg_splitwx_to_rx(raddr);
135
- label->label_ptr[0] = label_ptr;
136
-}
137
-
138
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
139
{
140
TCGReg argreg;
141
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
142
return true;
143
}
144
#else
145
-
146
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
147
- TCGReg addrhi, unsigned a_bits)
148
-{
149
- unsigned a_mask = (1 << a_bits) - 1;
150
- TCGLabelQemuLdst *label = new_ldst_label(s);
151
-
152
- label->is_ld = is_ld;
153
- label->addrlo_reg = addrlo;
154
- label->addrhi_reg = addrhi;
155
-
156
- /* We are expecting a_bits to max out at 7, and can easily support 8. */
157
- tcg_debug_assert(a_mask <= 0xff);
158
- /* tst addr, #mask */
159
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
160
-
161
- /* blne slow_path */
162
- label->label_ptr[0] = s->code_ptr;
163
- tcg_out_bl_imm(s, COND_NE, 0);
164
-
165
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
166
-}
167
-
168
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
169
{
170
if (!reloc_pc24(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
171
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
172
}
173
#endif /* SOFTMMU */
174
175
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
176
+ TCGReg addrlo, TCGReg addrhi,
177
+ MemOpIdx oi, bool is_ld)
178
+{
38
+{
179
+ TCGLabelQemuLdst *ldst = NULL;
39
+ uintptr_t pi = (uintptr_t)pv;
180
+ MemOp opc = get_memop(oi);
40
+ __int128_t *ptr_align = (__int128_t *)(pi & ~7);
181
+ MemOp a_bits = get_alignment_bits(opc);
41
+ int shr = (pi & 7) * 8;
182
+ unsigned a_mask = (1 << a_bits) - 1;
42
+ Int128Alias r;
183
+
184
+#ifdef CONFIG_SOFTMMU
185
+ int mem_index = get_mmuidx(oi);
186
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
187
+ : offsetof(CPUTLBEntry, addr_write);
188
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
189
+ unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
190
+ TCGReg t_addr;
191
+
192
+ ldst = new_ldst_label(s);
193
+ ldst->is_ld = is_ld;
194
+ ldst->oi = oi;
195
+ ldst->addrlo_reg = addrlo;
196
+ ldst->addrhi_reg = addrhi;
197
+
198
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
199
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
200
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
201
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
202
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
203
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
204
+
205
+ /* Extract the tlb index from the address into R0. */
206
+ tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
207
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
208
+
43
+
209
+ /*
44
+ /*
210
+ * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
45
+ * ptr_align % 16 is now only 0 or 8.
211
+ * Load the tlb comparator into R2/R3 and the fast path addend into R1.
46
+ * If the host supports atomic loads with VMOVDQU, then always use that,
47
+ * making the branch highly predictable. Otherwise we must use VMOVDQA
48
+ * when ptr_align % 16 == 0 for 16-byte atomicity.
212
+ */
49
+ */
213
+ if (cmp_off == 0) {
50
+ if ((cpuinfo & CPUINFO_ATOMIC_VMOVDQU) || (pi & 8)) {
214
+ if (TARGET_LONG_BITS == 64) {
51
+ asm("vmovdqu %1, %0" : "=x" (r.i) : "m" (*ptr_align));
215
+ tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
216
+ } else {
217
+ tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
218
+ }
219
+ } else {
52
+ } else {
220
+ tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
53
+ asm("vmovdqa %1, %0" : "=x" (r.i) : "m" (*ptr_align));
221
+ TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
222
+ if (TARGET_LONG_BITS == 64) {
223
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
224
+ } else {
225
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
226
+ }
227
+ }
54
+ }
228
+
55
+ return int128_getlo(int128_urshift(r.s, shr));
229
+ /* Load the tlb addend. */
56
+}
230
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
231
+ offsetof(CPUTLBEntry, addend));
232
+
233
+ /*
234
+ * Check alignment, check comparators.
235
+ * Do this in 2-4 insns. Use MOVW for v7, if possible,
236
+ * to reduce the number of sequential conditional instructions.
237
+ * Almost all guests have at least 4k pages, which means that we need
238
+ * to clear at least 9 bits even for an 8-byte memory, which means it
239
+ * isn't worth checking for an immediate operand for BIC.
240
+ *
241
+ * For unaligned accesses, test the page of the last unit of alignment.
242
+ * This leaves the least significant alignment bits unchanged, and of
243
+ * course must be zero.
244
+ */
245
+ t_addr = addrlo;
246
+ if (a_mask < s_mask) {
247
+ t_addr = TCG_REG_R0;
248
+ tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
249
+ addrlo, s_mask - a_mask);
250
+ }
251
+ if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
252
+ tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
253
+ tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
254
+ t_addr, TCG_REG_TMP, 0);
255
+ tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
256
+ } else {
257
+ if (a_mask) {
258
+ tcg_debug_assert(a_mask <= 0xff);
259
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
260
+ }
261
+ tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
262
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS));
263
+ tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
264
+ 0, TCG_REG_R2, TCG_REG_TMP,
265
+ SHIFT_IMM_LSL(TARGET_PAGE_BITS));
266
+ }
267
+
268
+ if (TARGET_LONG_BITS == 64) {
269
+ tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
270
+ }
271
+
272
+ *h = (HostAddress){
273
+ .cond = COND_AL,
274
+ .base = addrlo,
275
+ .index = TCG_REG_R1,
276
+ .index_scratch = true,
277
+ };
278
+#else
57
+#else
279
+ if (a_mask) {
58
+/* Fallback definition that must be optimized away, or error. */
280
+ ldst = new_ldst_label(s);
59
+uint64_t QEMU_ERROR("unsupported atomic")
281
+ ldst->is_ld = is_ld;
60
+ load_atom_extract_al16_or_al8(void *pv, int s);
282
+ ldst->oi = oi;
283
+ ldst->addrlo_reg = addrlo;
284
+ ldst->addrhi_reg = addrhi;
285
+
286
+ /* We are expecting a_bits to max out at 7 */
287
+ tcg_debug_assert(a_mask <= 0xff);
288
+ /* tst addr, #mask */
289
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
290
+ }
291
+
292
+ *h = (HostAddress){
293
+ .cond = COND_AL,
294
+ .base = addrlo,
295
+ .index = guest_base ? TCG_REG_GUEST_BASE : -1,
296
+ .index_scratch = false,
297
+ };
298
+#endif
61
+#endif
299
+
62
+
300
+ return ldst;
63
+#endif /* X86_64_LOAD_EXTRACT_AL16_AL8_H */
301
+}
302
+
303
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
304
TCGReg datahi, HostAddress h)
305
{
306
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
307
MemOpIdx oi, TCGType data_type)
308
{
309
MemOp opc = get_memop(oi);
310
+ TCGLabelQemuLdst *ldst;
311
HostAddress h;
312
313
-#ifdef CONFIG_SOFTMMU
314
- h.cond = COND_AL;
315
- h.base = addrlo;
316
- h.index_scratch = true;
317
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
318
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
319
+ if (ldst) {
320
+ ldst->type = data_type;
321
+ ldst->datalo_reg = datalo;
322
+ ldst->datahi_reg = datahi;
323
324
- /*
325
- * This a conditional BL only to load a pointer within this opcode into
326
- * LR for the slow path. We will not be using the value for a tail call.
327
- */
328
- tcg_insn_unit *label_ptr = s->code_ptr;
329
- tcg_out_bl_imm(s, COND_NE, 0);
330
+ /*
331
+ * This is a conditional BL only to load a pointer within this
332
+ * opcode into LR for the slow path. We will not be using
333
+ * the value for a tail call.
334
+ */
335
+ ldst->label_ptr[0] = s->code_ptr;
336
+ tcg_out_bl_imm(s, COND_NE, 0);
337
338
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
339
-
340
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
341
- addrlo, addrhi, s->code_ptr, label_ptr);
342
-#else
343
- unsigned a_bits = get_alignment_bits(opc);
344
- if (a_bits) {
345
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
346
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
347
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
348
+ } else {
349
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
350
}
351
-
352
- h.cond = COND_AL;
353
- h.base = addrlo;
354
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
355
- h.index_scratch = false;
356
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
357
-#endif
358
}
359
360
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
361
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
362
MemOpIdx oi, TCGType data_type)
363
{
364
MemOp opc = get_memop(oi);
365
+ TCGLabelQemuLdst *ldst;
366
HostAddress h;
367
368
-#ifdef CONFIG_SOFTMMU
369
- h.cond = COND_EQ;
370
- h.base = addrlo;
371
- h.index_scratch = true;
372
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
373
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
374
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
375
+ if (ldst) {
376
+ ldst->type = data_type;
377
+ ldst->datalo_reg = datalo;
378
+ ldst->datahi_reg = datahi;
379
380
- /* The conditional call must come last, as we're going to return here. */
381
- tcg_insn_unit *label_ptr = s->code_ptr;
382
- tcg_out_bl_imm(s, COND_NE, 0);
383
-
384
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
385
- addrlo, addrhi, s->code_ptr, label_ptr);
386
-#else
387
- unsigned a_bits = get_alignment_bits(opc);
388
-
389
- h.cond = COND_AL;
390
- if (a_bits) {
391
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
392
h.cond = COND_EQ;
393
- }
394
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
395
396
- h.base = addrlo;
397
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
398
- h.index_scratch = false;
399
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
400
-#endif
401
+ /* The conditional call is last, as we're going to return here. */
402
+ ldst->label_ptr[0] = s->code_ptr;
403
+ tcg_out_bl_imm(s, COND_NE, 0);
404
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
405
+ } else {
406
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
407
+ }
408
}
409
410
static void tcg_out_epilogue(TCGContext *s);
411
--
64
--
412
2.34.1
65
2.34.1
413
414
1
Like cpu_in_exclusive_context, but also true if
2
there is no other cpu against which we could race.
3
4
Use it in tb_flush as a direct replacement.
5
Use it in cpu_loop_exit_atomic to ensure that there
6
is no loop against cpu_exec_step_atomic.
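
For illustration, a stand-alone sketch of the predicate (CPUStateSketch and the CF_PARALLEL value below are placeholders for this sketch; the real helper tests cs->tcg_cflags and cpu_in_exclusive_context(), exactly as in the internal.h hunk below):

    #include <stdbool.h>

    #define CF_PARALLEL 0x0008u   /* illustrative bit; not QEMU's actual value */

    typedef struct {
        unsigned tcg_cflags;      /* CF_PARALLEL set when other vCPUs may run */
        bool exclusive;           /* stand-in for cpu_in_exclusive_context() */
    } CPUStateSketch;

    /*
     * True when nothing can race with this vCPU: either it was started
     * without CF_PARALLEL, or it currently holds the exclusive section.
     */
    static bool cpu_in_serial_context_sketch(const CPUStateSketch *cs)
    {
        return !(cs->tcg_cflags & CF_PARALLEL) || cs->exclusive;
    }

With that guarantee, tb_flush can call do_tb_flush directly instead of scheduling it with async_safe_run_on_cpu, and cpu_loop_exit_atomic can assert that it never restarts an already-serial execution, as the tb-maint.c and cpu-exec-common.c hunks show.
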
7
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
---
3
---
13
accel/tcg/internal.h | 9 +++++++++
4
.../aarch64/host/load-extract-al16-al8.h | 40 +++++++++++++++++++
14
accel/tcg/cpu-exec-common.c | 3 +++
5
1 file changed, 40 insertions(+)
15
accel/tcg/tb-maint.c | 2 +-
6
create mode 100644 host/include/aarch64/host/load-extract-al16-al8.h
16
3 files changed, 13 insertions(+), 1 deletion(-)
17
7
18
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
8
diff --git a/host/include/aarch64/host/load-extract-al16-al8.h b/host/include/aarch64/host/load-extract-al16-al8.h
19
index XXXXXXX..XXXXXXX 100644
9
new file mode 100644
20
--- a/accel/tcg/internal.h
10
index XXXXXXX..XXXXXXX
21
+++ b/accel/tcg/internal.h
11
--- /dev/null
22
@@ -XXX,XX +XXX,XX @@ static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
12
+++ b/host/include/aarch64/host/load-extract-al16-al8.h
23
}
13
@@ -XXX,XX +XXX,XX @@
24
}
25
26
+/*
14
+/*
27
+ * Return true if CS is not running in parallel with other cpus, either
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
28
+ * because there are no other cpus or we are within an exclusive context.
16
+ * Atomic extract 64 from 128-bit, AArch64 version.
17
+ *
18
+ * Copyright (C) 2023 Linaro, Ltd.
29
+ */
19
+ */
30
+static inline bool cpu_in_serial_context(CPUState *cs)
20
+
21
+#ifndef AARCH64_LOAD_EXTRACT_AL16_AL8_H
22
+#define AARCH64_LOAD_EXTRACT_AL16_AL8_H
23
+
24
+#include "host/cpuinfo.h"
25
+#include "tcg/debug-assert.h"
26
+
27
+/**
28
+ * load_atom_extract_al16_or_al8:
29
+ * @pv: host address
30
+ * @s: object size in bytes, @s <= 8.
31
+ *
32
+ * Load @s bytes from @pv, when pv % s != 0. If [p, p+s-1] does not
33
+ * cross a 16-byte boundary then the access must be 16-byte atomic,
34
+ * otherwise the access must be 8-byte atomic.
35
+ */
36
+static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s)
31
+{
37
+{
32
+ return !(cs->tcg_cflags & CF_PARALLEL) || cpu_in_exclusive_context(cs);
38
+ uintptr_t pi = (uintptr_t)pv;
39
+ __int128_t *ptr_align = (__int128_t *)(pi & ~7);
40
+ int shr = (pi & 7) * 8;
41
+ uint64_t l, h;
42
+
43
+ /*
44
+ * With FEAT_LSE2, LDP is single-copy atomic if 16-byte aligned
45
+ * and single-copy atomic on the parts if 8-byte aligned.
46
+ * All we need do is align the pointer mod 8.
47
+ */
48
+ tcg_debug_assert(HAVE_ATOMIC128_RO);
49
+ asm("ldp %0, %1, %2" : "=r"(l), "=r"(h) : "m"(*ptr_align));
50
+ return (l >> shr) | (h << (-shr & 63));
33
+}
51
+}
34
+
52
+
35
extern int64_t max_delay;
53
+#endif /* AARCH64_LOAD_EXTRACT_AL16_AL8_H */
36
extern int64_t max_advance;
37
38
diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/accel/tcg/cpu-exec-common.c
41
+++ b/accel/tcg/cpu-exec-common.c
42
@@ -XXX,XX +XXX,XX @@
43
#include "sysemu/tcg.h"
44
#include "exec/exec-all.h"
45
#include "qemu/plugin.h"
46
+#include "internal.h"
47
48
bool tcg_allowed;
49
50
@@ -XXX,XX +XXX,XX @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
51
52
void cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc)
53
{
54
+ /* Prevent looping if already executing in a serial context. */
55
+ g_assert(!cpu_in_serial_context(cpu));
56
cpu->exception_index = EXCP_ATOMIC;
57
cpu_loop_exit_restore(cpu, pc);
58
}
59
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
60
index XXXXXXX..XXXXXXX 100644
61
--- a/accel/tcg/tb-maint.c
62
+++ b/accel/tcg/tb-maint.c
63
@@ -XXX,XX +XXX,XX @@ void tb_flush(CPUState *cpu)
64
if (tcg_enabled()) {
65
unsigned tb_flush_count = qatomic_read(&tb_ctx.tb_flush_count);
66
67
- if (cpu_in_exclusive_context(cpu)) {
68
+ if (cpu_in_serial_context(cpu)) {
69
do_tb_flush(cpu, RUN_ON_CPU_HOST_INT(tb_flush_count));
70
} else {
71
async_safe_run_on_cpu(cpu, do_tb_flush,
72
--
54
--
73
2.34.1
55
2.34.1
74
75
1
From: Thomas Huth <thuth@redhat.com>
1
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2
3
We'd like to move disas.c into the common code source set, where
4
CONFIG_USER_ONLY is not available anymore. So we have to move
5
the related code into a separate file instead.
6
7
Signed-off-by: Thomas Huth <thuth@redhat.com>
8
Message-Id: <20230508133745.109463-2-thuth@redhat.com>
9
[rth: Type change done in a separate patch]
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
---
3
---
12
disas/disas-internal.h | 21 ++++++++++++
4
host/include/aarch64/host/store-insert-al16.h | 47 +++++++++++++++++++
13
disas/disas-mon.c | 65 ++++++++++++++++++++++++++++++++++++
5
1 file changed, 47 insertions(+)
14
disas/disas.c | 76 ++++--------------------------------------
6
create mode 100644 host/include/aarch64/host/store-insert-al16.h
15
disas/meson.build | 1 +
16
4 files changed, 93 insertions(+), 70 deletions(-)
17
create mode 100644 disas/disas-internal.h
18
create mode 100644 disas/disas-mon.c
19
7
20
diff --git a/disas/disas-internal.h b/disas/disas-internal.h
8
diff --git a/host/include/aarch64/host/store-insert-al16.h b/host/include/aarch64/host/store-insert-al16.h
21
new file mode 100644
9
new file mode 100644
22
index XXXXXXX..XXXXXXX
10
index XXXXXXX..XXXXXXX
23
--- /dev/null
11
--- /dev/null
24
+++ b/disas/disas-internal.h
12
+++ b/host/include/aarch64/host/store-insert-al16.h
25
@@ -XXX,XX +XXX,XX @@
13
@@ -XXX,XX +XXX,XX @@
26
+/*
14
+/*
27
+ * Definitions used internally in the disassembly code
15
+ * SPDX-License-Identifier: GPL-2.0-or-later
16
+ * Atomic store insert into 128-bit, AArch64 version.
28
+ *
17
+ *
29
+ * SPDX-License-Identifier: GPL-2.0-or-later
18
+ * Copyright (C) 2023 Linaro, Ltd.
30
+ */
19
+ */
31
+
20
+
32
+#ifndef DISAS_INTERNAL_H
21
+#ifndef AARCH64_STORE_INSERT_AL16_H
33
+#define DISAS_INTERNAL_H
22
+#define AARCH64_STORE_INSERT_AL16_H
34
+
23
+
35
+#include "disas/dis-asm.h"
24
+/**
25
+ * store_atom_insert_al16:
26
+ * @p: host address
27
+ * @val: shifted value to store
28
+ * @msk: mask for value to store
29
+ *
30
+ * Atomically store @val to @p masked by @msk.
31
+ */
32
+static inline void ATTRIBUTE_ATOMIC128_OPT
33
+store_atom_insert_al16(Int128 *ps, Int128 val, Int128 msk)
34
+{
35
+ /*
36
+ * GCC only implements __sync* primitives for int128 on aarch64.
37
+ * We can do better without the barriers, and integrating the
38
+ * arithmetic into the load-exclusive/store-conditional pair.
39
+ */
40
+ uint64_t tl, th, vl, vh, ml, mh;
41
+ uint32_t fail;
36
+
42
+
37
+typedef struct CPUDebug {
43
+ qemu_build_assert(!HOST_BIG_ENDIAN);
38
+ struct disassemble_info info;
44
+ vl = int128_getlo(val);
39
+ CPUState *cpu;
45
+ vh = int128_gethi(val);
40
+} CPUDebug;
46
+ ml = int128_getlo(msk);
47
+ mh = int128_gethi(msk);
41
+
48
+
42
+void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu);
49
+ asm("0: ldxp %[l], %[h], %[mem]\n\t"
43
+int disas_gstring_printf(FILE *stream, const char *fmt, ...)
50
+ "bic %[l], %[l], %[ml]\n\t"
44
+ G_GNUC_PRINTF(2, 3);
51
+ "bic %[h], %[h], %[mh]\n\t"
45
+
52
+ "orr %[l], %[l], %[vl]\n\t"
46
+#endif
53
+ "orr %[h], %[h], %[vh]\n\t"
47
diff --git a/disas/disas-mon.c b/disas/disas-mon.c
54
+ "stxp %w[f], %[l], %[h], %[mem]\n\t"
48
new file mode 100644
55
+ "cbnz %w[f], 0b\n"
49
index XXXXXXX..XXXXXXX
56
+ : [mem] "+Q"(*ps), [f] "=&r"(fail), [l] "=&r"(tl), [h] "=&r"(th)
50
--- /dev/null
57
+ : [vl] "r"(vl), [vh] "r"(vh), [ml] "r"(ml), [mh] "r"(mh));
51
+++ b/disas/disas-mon.c
52
@@ -XXX,XX +XXX,XX @@
53
+/*
54
+ * Functions related to disassembly from the monitor
55
+ *
56
+ * SPDX-License-Identifier: GPL-2.0-or-later
57
+ */
58
+
59
+#include "qemu/osdep.h"
60
+#include "disas-internal.h"
61
+#include "disas/disas.h"
62
+#include "exec/memory.h"
63
+#include "hw/core/cpu.h"
64
+#include "monitor/monitor.h"
65
+
66
+static int
67
+physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
68
+ struct disassemble_info *info)
69
+{
70
+ CPUDebug *s = container_of(info, CPUDebug, info);
71
+ MemTxResult res;
72
+
73
+ res = address_space_read(s->cpu->as, memaddr, MEMTXATTRS_UNSPECIFIED,
74
+ myaddr, length);
75
+ return res == MEMTX_OK ? 0 : EIO;
76
+}
58
+}
77
+
59
+
78
+/* Disassembler for the monitor. */
60
+#endif /* AARCH64_STORE_INSERT_AL16_H */
79
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
80
+ int nb_insn, bool is_physical)
81
+{
82
+ int count, i;
83
+ CPUDebug s;
84
+ g_autoptr(GString) ds = g_string_new("");
85
+
86
+ disas_initialize_debug_target(&s, cpu);
87
+ s.info.fprintf_func = disas_gstring_printf;
88
+ s.info.stream = (FILE *)ds; /* abuse this slot */
89
+
90
+ if (is_physical) {
91
+ s.info.read_memory_func = physical_read_memory;
92
+ }
93
+ s.info.buffer_vma = pc;
94
+
95
+ if (s.info.cap_arch >= 0 && cap_disas_monitor(&s.info, pc, nb_insn)) {
96
+ monitor_puts(mon, ds->str);
97
+ return;
98
+ }
99
+
100
+ if (!s.info.print_insn) {
101
+ monitor_printf(mon, "0x%08" PRIx64
102
+ ": Asm output not supported on this arch\n", pc);
103
+ return;
104
+ }
105
+
106
+ for (i = 0; i < nb_insn; i++) {
107
+ g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
108
+ count = s.info.print_insn(pc, &s.info);
109
+ g_string_append_c(ds, '\n');
110
+ if (count < 0) {
111
+ break;
112
+ }
113
+ pc += count;
114
+ }
115
+
116
+ monitor_puts(mon, ds->str);
117
+}
118
diff --git a/disas/disas.c b/disas/disas.c
119
index XXXXXXX..XXXXXXX 100644
120
--- a/disas/disas.c
121
+++ b/disas/disas.c
122
@@ -XXX,XX +XXX,XX @@
123
/* General "disassemble this chunk" code. Used for debugging. */
124
#include "qemu/osdep.h"
125
-#include "disas/dis-asm.h"
126
+#include "disas/disas-internal.h"
127
#include "elf.h"
128
#include "qemu/qemu-print.h"
129
#include "disas/disas.h"
130
@@ -XXX,XX +XXX,XX @@
131
#include "hw/core/cpu.h"
132
#include "exec/memory.h"
133
134
-typedef struct CPUDebug {
135
- struct disassemble_info info;
136
- CPUState *cpu;
137
-} CPUDebug;
138
-
139
/* Filled in by elfload.c. Simplistic, but will do for now. */
140
struct syminfo *syminfos = NULL;
141
142
@@ -XXX,XX +XXX,XX @@ static void initialize_debug(CPUDebug *s)
143
s->info.symbol_at_address_func = symbol_at_address;
144
}
145
146
-static void initialize_debug_target(CPUDebug *s, CPUState *cpu)
147
+void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu)
148
{
149
initialize_debug(s);
150
151
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
152
int count;
153
CPUDebug s;
154
155
- initialize_debug_target(&s, cpu);
156
+ disas_initialize_debug_target(&s, cpu);
157
s.info.fprintf_func = fprintf;
158
s.info.stream = out;
159
s.info.buffer_vma = code;
160
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
161
}
162
}
163
164
-static int G_GNUC_PRINTF(2, 3)
165
-gstring_printf(FILE *stream, const char *fmt, ...)
166
+int disas_gstring_printf(FILE *stream, const char *fmt, ...)
167
{
168
/* We abuse the FILE parameter to pass a GString. */
169
GString *s = (GString *)stream;
170
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size)
171
CPUDebug s;
172
GString *ds = g_string_new(NULL);
173
174
- initialize_debug_target(&s, cpu);
175
- s.info.fprintf_func = gstring_printf;
176
+ disas_initialize_debug_target(&s, cpu);
177
+ s.info.fprintf_func = disas_gstring_printf;
178
s.info.stream = (FILE *)ds; /* abuse this slot */
179
s.info.buffer_vma = addr;
180
s.info.buffer_length = size;
181
@@ -XXX,XX +XXX,XX @@ const char *lookup_symbol(uint64_t orig_addr)
182
183
return symbol;
184
}
185
-
186
-#if !defined(CONFIG_USER_ONLY)
187
-
188
-#include "monitor/monitor.h"
189
-
190
-static int
191
-physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
192
- struct disassemble_info *info)
193
-{
194
- CPUDebug *s = container_of(info, CPUDebug, info);
195
- MemTxResult res;
196
-
197
- res = address_space_read(s->cpu->as, memaddr, MEMTXATTRS_UNSPECIFIED,
198
- myaddr, length);
199
- return res == MEMTX_OK ? 0 : EIO;
200
-}
201
-
202
-/* Disassembler for the monitor. */
203
-void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
204
- int nb_insn, bool is_physical)
205
-{
206
- int count, i;
207
- CPUDebug s;
208
- g_autoptr(GString) ds = g_string_new("");
209
-
210
- initialize_debug_target(&s, cpu);
211
- s.info.fprintf_func = gstring_printf;
212
- s.info.stream = (FILE *)ds; /* abuse this slot */
213
-
214
- if (is_physical) {
215
- s.info.read_memory_func = physical_read_memory;
216
- }
217
- s.info.buffer_vma = pc;
218
-
219
- if (s.info.cap_arch >= 0 && cap_disas_monitor(&s.info, pc, nb_insn)) {
220
- monitor_puts(mon, ds->str);
221
- return;
222
- }
223
-
224
- if (!s.info.print_insn) {
225
- monitor_printf(mon, "0x%08" PRIx64
226
- ": Asm output not supported on this arch\n", pc);
227
- return;
228
- }
229
-
230
- for (i = 0; i < nb_insn; i++) {
231
- g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
232
- count = s.info.print_insn(pc, &s.info);
233
- g_string_append_c(ds, '\n');
234
- if (count < 0) {
235
- break;
236
- }
237
- pc += count;
238
- }
239
-
240
- monitor_puts(mon, ds->str);
241
-}
242
-#endif
243
diff --git a/disas/meson.build b/disas/meson.build
244
index XXXXXXX..XXXXXXX 100644
245
--- a/disas/meson.build
246
+++ b/disas/meson.build
247
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
248
common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
249
common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
250
251
+softmmu_ss.add(files('disas-mon.c'))
252
specific_ss.add(files('disas.c'), capstone)
253
--
61
--
254
2.34.1
62
2.34.1
1
While performing the load in the delay slot of the call to the common
1
The last use was removed by e77c89fb086a.
2
bswap helper function is cute, it is not worth the added complexity.
3
2
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
3
Fixes: e77c89fb086a ("cputlb: Remove static tlb sizing")
4
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
6
---
7
tcg/mips/tcg-target.h | 4 +-
7
tcg/aarch64/tcg-target.h | 1 -
8
tcg/mips/tcg-target.c.inc | 284 ++++++--------------------------------
8
tcg/arm/tcg-target.h | 1 -
9
2 files changed, 48 insertions(+), 240 deletions(-)
9
tcg/i386/tcg-target.h | 1 -
10
tcg/mips/tcg-target.h | 1 -
11
tcg/ppc/tcg-target.h | 1 -
12
tcg/riscv/tcg-target.h | 1 -
13
tcg/s390x/tcg-target.h | 1 -
14
tcg/sparc64/tcg-target.h | 1 -
15
tcg/tci/tcg-target.h | 1 -
16
9 files changed, 9 deletions(-)
10
17
18
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/tcg/aarch64/tcg-target.h
21
+++ b/tcg/aarch64/tcg-target.h
22
@@ -XXX,XX +XXX,XX @@
23
#include "host/cpuinfo.h"
24
25
#define TCG_TARGET_INSN_UNIT_SIZE 4
26
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
27
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
28
29
typedef enum {
30
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/arm/tcg-target.h
33
+++ b/tcg/arm/tcg-target.h
34
@@ -XXX,XX +XXX,XX @@ extern int arm_arch;
35
#define use_armv7_instructions (__ARM_ARCH >= 7 || arm_arch >= 7)
36
37
#define TCG_TARGET_INSN_UNIT_SIZE 4
38
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
39
#define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
40
41
typedef enum {
42
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
43
index XXXXXXX..XXXXXXX 100644
44
--- a/tcg/i386/tcg-target.h
45
+++ b/tcg/i386/tcg-target.h
46
@@ -XXX,XX +XXX,XX @@
47
#include "host/cpuinfo.h"
48
49
#define TCG_TARGET_INSN_UNIT_SIZE 1
50
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 31
51
52
#ifdef __x86_64__
53
# define TCG_TARGET_REG_BITS 64
11
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
54
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
12
index XXXXXXX..XXXXXXX 100644
55
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/mips/tcg-target.h
56
--- a/tcg/mips/tcg-target.h
14
+++ b/tcg/mips/tcg-target.h
57
+++ b/tcg/mips/tcg-target.h
15
@@ -XXX,XX +XXX,XX @@ extern bool use_mips32r2_instructions;
58
@@ -XXX,XX +XXX,XX @@
16
#define TCG_TARGET_HAS_ext16u_i64 0 /* andi rt, rs, 0xffff */
17
#endif
59
#endif
18
60
19
-#define TCG_TARGET_DEFAULT_MO (0)
61
#define TCG_TARGET_INSN_UNIT_SIZE 4
20
-#define TCG_TARGET_HAS_MEMORY_BSWAP 1
62
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
21
+#define TCG_TARGET_DEFAULT_MO 0
63
#define TCG_TARGET_NB_REGS 32
22
+#define TCG_TARGET_HAS_MEMORY_BSWAP 0
64
23
65
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
24
#define TCG_TARGET_NEED_LDST_LABELS
66
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
25
26
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
27
index XXXXXXX..XXXXXXX 100644
67
index XXXXXXX..XXXXXXX 100644
28
--- a/tcg/mips/tcg-target.c.inc
68
--- a/tcg/ppc/tcg-target.h
29
+++ b/tcg/mips/tcg-target.c.inc
69
+++ b/tcg/ppc/tcg-target.h
30
@@ -XXX,XX +XXX,XX @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg,
70
@@ -XXX,XX +XXX,XX @@
31
}
71
32
72
#define TCG_TARGET_NB_REGS 64
33
#if defined(CONFIG_SOFTMMU)
73
#define TCG_TARGET_INSN_UNIT_SIZE 4
34
-static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = {
74
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
35
+static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
75
36
[MO_UB] = helper_ret_ldub_mmu,
76
typedef enum {
37
[MO_SB] = helper_ret_ldsb_mmu,
77
TCG_REG_R0, TCG_REG_R1, TCG_REG_R2, TCG_REG_R3,
38
- [MO_LEUW] = helper_le_lduw_mmu,
78
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
39
- [MO_LESW] = helper_le_ldsw_mmu,
79
index XXXXXXX..XXXXXXX 100644
40
- [MO_LEUL] = helper_le_ldul_mmu,
80
--- a/tcg/riscv/tcg-target.h
41
- [MO_LEUQ] = helper_le_ldq_mmu,
81
+++ b/tcg/riscv/tcg-target.h
42
- [MO_BEUW] = helper_be_lduw_mmu,
82
@@ -XXX,XX +XXX,XX @@
43
- [MO_BESW] = helper_be_ldsw_mmu,
83
#define TCG_TARGET_REG_BITS 64
44
- [MO_BEUL] = helper_be_ldul_mmu,
84
45
- [MO_BEUQ] = helper_be_ldq_mmu,
85
#define TCG_TARGET_INSN_UNIT_SIZE 4
46
-#if TCG_TARGET_REG_BITS == 64
86
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
47
- [MO_LESL] = helper_le_ldsl_mmu,
87
#define TCG_TARGET_NB_REGS 32
48
- [MO_BESL] = helper_be_ldsl_mmu,
88
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
49
+#if HOST_BIG_ENDIAN
89
50
+ [MO_UW] = helper_be_lduw_mmu,
90
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
51
+ [MO_SW] = helper_be_ldsw_mmu,
91
index XXXXXXX..XXXXXXX 100644
52
+ [MO_UL] = helper_be_ldul_mmu,
92
--- a/tcg/s390x/tcg-target.h
53
+ [MO_SL] = helper_be_ldsl_mmu,
93
+++ b/tcg/s390x/tcg-target.h
54
+ [MO_UQ] = helper_be_ldq_mmu,
94
@@ -XXX,XX +XXX,XX @@
55
+#else
95
#define S390_TCG_TARGET_H
56
+ [MO_UW] = helper_le_lduw_mmu,
96
57
+ [MO_SW] = helper_le_ldsw_mmu,
97
#define TCG_TARGET_INSN_UNIT_SIZE 2
58
+ [MO_UL] = helper_le_ldul_mmu,
98
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 19
59
+ [MO_UQ] = helper_le_ldq_mmu,
99
60
+ [MO_SL] = helper_le_ldsl_mmu,
100
/* We have a +- 4GB range on the branches; leave some slop. */
61
#endif
101
#define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
62
};
102
diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h
63
103
index XXXXXXX..XXXXXXX 100644
64
-static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
104
--- a/tcg/sparc64/tcg-target.h
65
+static void * const qemu_st_helpers[MO_SIZE + 1] = {
105
+++ b/tcg/sparc64/tcg-target.h
66
[MO_UB] = helper_ret_stb_mmu,
106
@@ -XXX,XX +XXX,XX @@
67
- [MO_LEUW] = helper_le_stw_mmu,
107
#define SPARC_TCG_TARGET_H
68
- [MO_LEUL] = helper_le_stl_mmu,
108
69
- [MO_LEUQ] = helper_le_stq_mmu,
109
#define TCG_TARGET_INSN_UNIT_SIZE 4
70
- [MO_BEUW] = helper_be_stw_mmu,
110
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
71
- [MO_BEUL] = helper_be_stl_mmu,
111
#define TCG_TARGET_NB_REGS 32
72
- [MO_BEUQ] = helper_be_stq_mmu,
112
#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
73
+#if HOST_BIG_ENDIAN
113
74
+ [MO_UW] = helper_be_stw_mmu,
114
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
75
+ [MO_UL] = helper_be_stl_mmu,
115
index XXXXXXX..XXXXXXX 100644
76
+ [MO_UQ] = helper_be_stq_mmu,
116
--- a/tcg/tci/tcg-target.h
77
+#else
117
+++ b/tcg/tci/tcg-target.h
78
+ [MO_UW] = helper_le_stw_mmu,
118
@@ -XXX,XX +XXX,XX @@
79
+ [MO_UL] = helper_le_stl_mmu,
119
80
+ [MO_UQ] = helper_le_stq_mmu,
120
#define TCG_TARGET_INTERPRETER 1
81
+#endif
121
#define TCG_TARGET_INSN_UNIT_SIZE 4
82
};
122
-#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
83
123
#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
84
/* We have four temps, we might as well expose three of them. */
124
85
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
125
#if UINTPTR_MAX == UINT32_MAX
86
87
tcg_out_ld_helper_args(s, l, &ldst_helper_param);
88
89
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
90
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
91
/* delay slot */
92
tcg_out_nop(s);
93
94
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
95
96
tcg_out_st_helper_args(s, l, &ldst_helper_param);
97
98
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
99
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
100
/* delay slot */
101
tcg_out_nop(s);
102
103
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
104
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
105
TCGReg base, MemOp opc, TCGType type)
106
{
107
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
108
+ switch (opc & MO_SSIZE) {
109
case MO_UB:
110
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
111
break;
112
case MO_SB:
113
tcg_out_opc_imm(s, OPC_LB, lo, base, 0);
114
break;
115
- case MO_UW | MO_BSWAP:
116
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
117
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
118
- break;
119
case MO_UW:
120
tcg_out_opc_imm(s, OPC_LHU, lo, base, 0);
121
break;
122
- case MO_SW | MO_BSWAP:
123
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
124
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OS);
125
- break;
126
case MO_SW:
127
tcg_out_opc_imm(s, OPC_LH, lo, base, 0);
128
break;
129
- case MO_UL | MO_BSWAP:
130
- if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
131
- if (use_mips32r2_instructions) {
132
- tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
133
- tcg_out_bswap32(s, lo, lo, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
134
- } else {
135
- tcg_out_bswap_subr(s, bswap32u_addr);
136
- /* delay slot */
137
- tcg_out_opc_imm(s, OPC_LWU, TCG_TMP0, base, 0);
138
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
139
- }
140
- break;
141
- }
142
- /* FALLTHRU */
143
- case MO_SL | MO_BSWAP:
144
- if (use_mips32r2_instructions) {
145
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
146
- tcg_out_bswap32(s, lo, lo, 0);
147
- } else {
148
- tcg_out_bswap_subr(s, bswap32_addr);
149
- /* delay slot */
150
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
151
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_TMP3);
152
- }
153
- break;
154
case MO_UL:
155
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
156
tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
157
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
158
case MO_SL:
159
tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
160
break;
161
- case MO_UQ | MO_BSWAP:
162
- if (TCG_TARGET_REG_BITS == 64) {
163
- if (use_mips32r2_instructions) {
164
- tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
165
- tcg_out_bswap64(s, lo, lo);
166
- } else {
167
- tcg_out_bswap_subr(s, bswap64_addr);
168
- /* delay slot */
169
- tcg_out_opc_imm(s, OPC_LD, TCG_TMP0, base, 0);
170
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
171
- }
172
- } else if (use_mips32r2_instructions) {
173
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
174
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP1, base, 4);
175
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
176
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
177
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
178
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
179
- } else {
180
- tcg_out_bswap_subr(s, bswap32_addr);
181
- /* delay slot */
182
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
183
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 4);
184
- tcg_out_bswap_subr(s, bswap32_addr);
185
- /* delay slot */
186
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
187
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
188
- }
189
- break;
190
case MO_UQ:
191
/* Prefer to load from offset 0 first, but allow for overlap. */
192
if (TCG_TARGET_REG_BITS == 64) {
193
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
194
const MIPSInsn lw2 = MIPS_BE ? OPC_LWR : OPC_LWL;
195
const MIPSInsn ld1 = MIPS_BE ? OPC_LDL : OPC_LDR;
196
const MIPSInsn ld2 = MIPS_BE ? OPC_LDR : OPC_LDL;
197
+ bool sgn = opc & MO_SIGN;
198
199
- bool sgn = (opc & MO_SIGN);
200
-
201
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
202
- case MO_SW | MO_BE:
203
- case MO_UW | MO_BE:
204
- tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
205
- tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
206
- if (use_mips32r2_instructions) {
207
- tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
208
- } else {
209
- tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
210
- tcg_out_opc_reg(s, OPC_OR, lo, TCG_TMP0, TCG_TMP1);
211
- }
212
- break;
213
-
214
- case MO_SW | MO_LE:
215
- case MO_UW | MO_LE:
216
- if (use_mips32r2_instructions && lo != base) {
217
+ switch (opc & MO_SIZE) {
218
+ case MO_16:
219
+ if (HOST_BIG_ENDIAN) {
220
+ tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
221
+ tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
222
+ if (use_mips32r2_instructions) {
223
+ tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
224
+ } else {
225
+ tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
226
+ tcg_out_opc_reg(s, OPC_OR, lo, lo, TCG_TMP0);
227
+ }
228
+ } else if (use_mips32r2_instructions && lo != base) {
229
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
230
tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 1);
231
tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
232
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
233
}
234
break;
235
236
- case MO_SL:
237
- case MO_UL:
238
+ case MO_32:
239
tcg_out_opc_imm(s, lw1, lo, base, 0);
240
tcg_out_opc_imm(s, lw2, lo, base, 3);
241
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn) {
242
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
243
}
244
break;
245
246
- case MO_UL | MO_BSWAP:
247
- case MO_SL | MO_BSWAP:
248
- if (use_mips32r2_instructions) {
249
- tcg_out_opc_imm(s, lw1, lo, base, 0);
250
- tcg_out_opc_imm(s, lw2, lo, base, 3);
251
- tcg_out_bswap32(s, lo, lo,
252
- TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64
253
- ? (sgn ? TCG_BSWAP_OS : TCG_BSWAP_OZ) : 0);
254
- } else {
255
- const tcg_insn_unit *subr =
256
- (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn
257
- ? bswap32u_addr : bswap32_addr);
258
-
259
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0);
260
- tcg_out_bswap_subr(s, subr);
261
- /* delay slot */
262
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 3);
263
- tcg_out_mov(s, type, lo, TCG_TMP3);
264
- }
265
- break;
266
-
267
- case MO_UQ:
268
+ case MO_64:
269
if (TCG_TARGET_REG_BITS == 64) {
270
tcg_out_opc_imm(s, ld1, lo, base, 0);
271
tcg_out_opc_imm(s, ld2, lo, base, 7);
272
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
273
}
274
break;
275
276
- case MO_UQ | MO_BSWAP:
277
- if (TCG_TARGET_REG_BITS == 64) {
278
- if (use_mips32r2_instructions) {
279
- tcg_out_opc_imm(s, ld1, lo, base, 0);
280
- tcg_out_opc_imm(s, ld2, lo, base, 7);
281
- tcg_out_bswap64(s, lo, lo);
282
- } else {
283
- tcg_out_opc_imm(s, ld1, TCG_TMP0, base, 0);
284
- tcg_out_bswap_subr(s, bswap64_addr);
285
- /* delay slot */
286
- tcg_out_opc_imm(s, ld2, TCG_TMP0, base, 7);
287
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
288
- }
289
- } else if (use_mips32r2_instructions) {
290
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
291
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
292
- tcg_out_opc_imm(s, lw1, TCG_TMP1, base, 4 + 0);
293
- tcg_out_opc_imm(s, lw2, TCG_TMP1, base, 4 + 3);
294
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
295
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
296
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
297
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
298
- } else {
299
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
300
- tcg_out_bswap_subr(s, bswap32_addr);
301
- /* delay slot */
302
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
303
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 4 + 0);
304
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
305
- tcg_out_bswap_subr(s, bswap32_addr);
306
- /* delay slot */
307
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 4 + 3);
308
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
309
- }
310
- break;
311
-
312
default:
313
g_assert_not_reached();
314
}
315
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
316
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
317
TCGReg base, MemOp opc)
318
{
319
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
320
- if ((lo | hi) == 0) {
321
- opc &= ~MO_BSWAP;
322
- }
323
-
324
- switch (opc & (MO_SIZE | MO_BSWAP)) {
325
+ switch (opc & MO_SIZE) {
326
case MO_8:
327
tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
328
break;
329
-
330
- case MO_16 | MO_BSWAP:
331
- tcg_out_bswap16(s, TCG_TMP1, lo, 0);
332
- lo = TCG_TMP1;
333
- /* FALLTHRU */
334
case MO_16:
335
tcg_out_opc_imm(s, OPC_SH, lo, base, 0);
336
break;
337
-
338
- case MO_32 | MO_BSWAP:
339
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
340
- lo = TCG_TMP3;
341
- /* FALLTHRU */
342
case MO_32:
343
tcg_out_opc_imm(s, OPC_SW, lo, base, 0);
344
break;
345
-
346
- case MO_64 | MO_BSWAP:
347
- if (TCG_TARGET_REG_BITS == 64) {
348
- tcg_out_bswap64(s, TCG_TMP3, lo);
349
- tcg_out_opc_imm(s, OPC_SD, TCG_TMP3, base, 0);
350
- } else if (use_mips32r2_instructions) {
351
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? lo : hi);
352
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? hi : lo);
353
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
354
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
355
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP0, base, 0);
356
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP1, base, 4);
357
- } else {
358
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
359
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 0);
360
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
361
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 4);
362
- }
363
- break;
364
case MO_64:
365
if (TCG_TARGET_REG_BITS == 64) {
366
tcg_out_opc_imm(s, OPC_SD, lo, base, 0);
367
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
368
tcg_out_opc_imm(s, OPC_SW, MIPS_BE ? lo : hi, base, 4);
369
}
370
break;
371
-
372
default:
373
g_assert_not_reached();
374
}
375
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
376
const MIPSInsn sd1 = MIPS_BE ? OPC_SDL : OPC_SDR;
377
const MIPSInsn sd2 = MIPS_BE ? OPC_SDR : OPC_SDL;
378
379
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
380
- if ((lo | hi) == 0) {
381
- opc &= ~MO_BSWAP;
382
- }
383
-
384
- switch (opc & (MO_SIZE | MO_BSWAP)) {
385
- case MO_16 | MO_BE:
386
+ switch (opc & MO_SIZE) {
387
+ case MO_16:
388
tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
389
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 0);
390
- tcg_out_opc_imm(s, OPC_SB, lo, base, 1);
391
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? TCG_TMP0 : lo, base, 0);
392
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? lo : TCG_TMP0, base, 1);
393
break;
394
395
- case MO_16 | MO_LE:
396
- tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
397
- tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
398
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 1);
399
- break;
400
-
401
- case MO_32 | MO_BSWAP:
402
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
403
- lo = TCG_TMP3;
404
- /* fall through */
405
case MO_32:
406
tcg_out_opc_imm(s, sw1, lo, base, 0);
407
tcg_out_opc_imm(s, sw2, lo, base, 3);
408
break;
409
410
- case MO_64 | MO_BSWAP:
411
- if (TCG_TARGET_REG_BITS == 64) {
412
- tcg_out_bswap64(s, TCG_TMP3, lo);
413
- lo = TCG_TMP3;
414
- } else if (use_mips32r2_instructions) {
415
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? hi : lo);
416
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? lo : hi);
417
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
418
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
419
- hi = MIPS_BE ? TCG_TMP0 : TCG_TMP1;
420
- lo = MIPS_BE ? TCG_TMP1 : TCG_TMP0;
421
- } else {
422
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
423
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 0 + 0);
424
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 0 + 3);
425
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
426
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 4 + 0);
427
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 4 + 3);
428
- break;
429
- }
430
- /* fall through */
431
case MO_64:
432
if (TCG_TARGET_REG_BITS == 64) {
433
tcg_out_opc_imm(s, sd1, lo, base, 0);
434
--
126
--
435
2.34.1
127
2.34.1
436
128
437
129
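As an aside on the MO_BSWAP removal above, a rough and non-authoritative Python model of the new slow-path helper selection may help: the lookup table is indexed only by the size/sign part of the memory operation (opc & MO_SSIZE in the diff), and the big- vs little-endian helper entry points are chosen once from the host byte order instead of per-operation from MO_BSWAP. The string keys and the print at the end are purely illustrative; only the helper names are taken from the diff.

    import sys

    # Host byte order picks between the helper_be_* and helper_le_* entry
    # points once; the table itself is indexed only by size and signedness.
    HOST_BIG_ENDIAN = sys.byteorder == "big"
    E = "be" if HOST_BIG_ENDIAN else "le"

    qemu_ld_helpers = {
        "MO_UB": "helper_ret_ldub_mmu",
        "MO_SB": "helper_ret_ldsb_mmu",
        "MO_UW": f"helper_{E}_lduw_mmu",
        "MO_SW": f"helper_{E}_ldsw_mmu",
        "MO_UL": f"helper_{E}_ldul_mmu",
        "MO_SL": f"helper_{E}_ldsl_mmu",
        "MO_UQ": f"helper_{E}_ldq_mmu",
    }

    def slow_path_ld_helper(ssize: str) -> str:
        # Before the change the key also carried MO_BSWAP; any byte swap
        # is no longer the backend's problem.
        return qemu_ld_helpers[ssize]

    print(slow_path_ld_helper("MO_UL"))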
1
In gen_ldx/gen_stx, the only two locations for memory operations,
1
Invert the exit code, for use with the testsuite.
2
mark the operation as either aligned (softmmu) or unaligned
3
(user-only, as if emulated by the kernel).
4
2
5
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
4
---
8
configs/targets/nios2-softmmu.mak | 1 -
5
scripts/decodetree.py | 9 +++++++--
9
target/nios2/translate.c | 10 ++++++++++
6
1 file changed, 7 insertions(+), 2 deletions(-)
10
2 files changed, 10 insertions(+), 1 deletion(-)
11
7
12
diff --git a/configs/targets/nios2-softmmu.mak b/configs/targets/nios2-softmmu.mak
8
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
13
index XXXXXXX..XXXXXXX 100644
9
index XXXXXXX..XXXXXXX 100644
14
--- a/configs/targets/nios2-softmmu.mak
10
--- a/scripts/decodetree.py
15
+++ b/configs/targets/nios2-softmmu.mak
11
+++ b/scripts/decodetree.py
16
@@ -XXX,XX +XXX,XX @@
12
@@ -XXX,XX +XXX,XX @@
17
TARGET_ARCH=nios2
13
formats = {}
18
-TARGET_ALIGNED_ONLY=y
14
allpatterns = []
19
TARGET_NEED_FDT=y
15
anyextern = False
20
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
16
+testforerror = False
21
index XXXXXXX..XXXXXXX 100644
17
22
--- a/target/nios2/translate.c
18
translate_prefix = 'trans'
23
+++ b/target/nios2/translate.c
19
translate_scope = 'static '
24
@@ -XXX,XX +XXX,XX @@ static void gen_ldx(DisasContext *dc, uint32_t code, uint32_t flags)
20
@@ -XXX,XX +XXX,XX @@ def error_with_file(file, lineno, *args):
25
TCGv data = dest_gpr(dc, instr.b);
21
if output_file and output_fd:
26
22
output_fd.close()
27
tcg_gen_addi_tl(addr, load_gpr(dc, instr.a), instr.imm16.s);
23
os.remove(output_file)
28
+#ifdef CONFIG_USER_ONLY
24
- exit(1)
29
+ flags |= MO_UNALN;
25
+ exit(0 if testforerror else 1)
30
+#else
26
# end error_with_file
31
+ flags |= MO_ALIGN;
27
32
+#endif
28
33
tcg_gen_qemu_ld_tl(data, addr, dc->mem_idx, flags);
29
@@ -XXX,XX +XXX,XX @@ def main():
34
}
30
global bitop_width
35
31
global variablewidth
36
@@ -XXX,XX +XXX,XX @@ static void gen_stx(DisasContext *dc, uint32_t code, uint32_t flags)
32
global anyextern
37
33
+ global testforerror
38
TCGv addr = tcg_temp_new();
34
39
tcg_gen_addi_tl(addr, load_gpr(dc, instr.a), instr.imm16.s);
35
decode_scope = 'static '
40
+#ifdef CONFIG_USER_ONLY
36
41
+ flags |= MO_UNALN;
37
long_opts = ['decode=', 'translate=', 'output=', 'insnwidth=',
42
+#else
38
- 'static-decode=', 'varinsnwidth=']
43
+ flags |= MO_ALIGN;
39
+ 'static-decode=', 'varinsnwidth=', 'test-for-error']
44
+#endif
40
try:
45
tcg_gen_qemu_st_tl(val, addr, dc->mem_idx, flags);
41
(opts, args) = getopt.gnu_getopt(sys.argv[1:], 'o:vw:', long_opts)
46
}
42
except getopt.GetoptError as err:
43
@@ -XXX,XX +XXX,XX @@ def main():
44
bitop_width = 64
45
elif insnwidth != 32:
46
error(0, 'cannot handle insns of width', insnwidth)
47
+ elif o == '--test-for-error':
48
+ testforerror = True
49
else:
50
assert False, 'unhandled option'
51
52
@@ -XXX,XX +XXX,XX @@ def main():
53
54
if output_file:
55
output_fd.close()
56
+ exit(1 if testforerror else 0)
57
# end main
58
47
59
48
--
60
--
49
2.34.1
61
2.34.1
50
51
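As a side note on the decodetree.py change above ("Invert the exit code, for use with the testsuite"): with --test-for-error a diagnosed parse error becomes the expected, successful outcome, and finishing without an error becomes the failure. A minimal sketch of that inversion follows; run_decoder() is a hypothetical stand-in for the real parsing work, not part of the script.

    import sys

    test_for_error = "--test-for-error" in sys.argv[1:]

    def report_error(msg: str) -> None:
        # A diagnosed error counts as success when we are deliberately
        # testing the error paths, and as failure otherwise.
        print(f"error: {msg}", file=sys.stderr)
        sys.exit(0 if test_for_error else 1)

    def run_decoder() -> None:
        # Hypothetical placeholder for parsing a .decode file.
        pass

    def main() -> None:
        try:
            run_decoder()
        except ValueError as e:
            report_error(str(e))
        # Reaching the end without an error fails in error-test mode.
        sys.exit(1 if test_for_error else 0)

    if __name__ == "__main__":
        main()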
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
1
Fix two copy-paste errors in the code that walks the parse tree.
2
and tcg_out_st_helper_args. This allows our local
3
tcg_out_arg_* infrastructure to be removed.
4
2
5
We are no longer filling the call or return branch
6
delay slots, nor are we tail-calling for the store,
7
but this seems a small price to pay.
8
9
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
---
4
---
12
tcg/mips/tcg-target.c.inc | 154 ++++++--------------------------------
5
scripts/decodetree.py | 4 ++--
13
1 file changed, 22 insertions(+), 132 deletions(-)
6
1 file changed, 2 insertions(+), 2 deletions(-)
14
7
15
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
8
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
16
index XXXXXXX..XXXXXXX 100644
9
index XXXXXXX..XXXXXXX 100644
17
--- a/tcg/mips/tcg-target.c.inc
10
--- a/scripts/decodetree.py
18
+++ b/tcg/mips/tcg-target.c.inc
11
+++ b/scripts/decodetree.py
19
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
12
@@ -XXX,XX +XXX,XX @@ def build_tree(self):
20
[MO_BEUQ] = helper_be_stq_mmu,
13
21
};
14
def prop_format(self):
22
15
for p in self.pats:
23
-/* Helper routines for marshalling helper function arguments into
16
- p.build_tree()
24
- * the correct registers and stack.
17
+ p.prop_format()
25
- * I is where we want to put this argument, and is updated and returned
18
26
- * for the next call. ARG is the argument itself.
19
def prop_width(self):
27
- *
20
width = None
28
- * We provide routines for arguments which are: immediate, 32 bit
21
@@ -XXX,XX +XXX,XX @@ def __build_tree(pats, outerbits, outermask):
29
- * value in register, 16 and 8 bit values in register (which must be zero
22
return t
30
- * extended before use) and 64 bit value in a lo:hi register pair.
23
31
- */
24
def build_tree(self):
32
-
25
- super().prop_format()
33
-static int tcg_out_call_iarg_reg(TCGContext *s, int i, TCGReg arg)
26
+ super().build_tree()
34
-{
27
self.tree = self.__build_tree(self.pats, self.fixedbits,
35
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
28
self.fixedmask)
36
- tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[i], arg);
37
- } else {
38
- /* For N32 and N64, the initial offset is different. But there
39
- we also have 8 argument register so we don't run out here. */
40
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
41
- tcg_out_st(s, TCG_TYPE_REG, arg, TCG_REG_SP, 4 * i);
42
- }
43
- return i + 1;
44
-}
45
-
46
-static int tcg_out_call_iarg_reg8(TCGContext *s, int i, TCGReg arg)
47
-{
48
- TCGReg tmp = TCG_TMP0;
49
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
50
- tmp = tcg_target_call_iarg_regs[i];
51
- }
52
- tcg_out_ext8u(s, tmp, arg);
53
- return tcg_out_call_iarg_reg(s, i, tmp);
54
-}
55
-
56
-static int tcg_out_call_iarg_reg16(TCGContext *s, int i, TCGReg arg)
57
-{
58
- TCGReg tmp = TCG_TMP0;
59
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
60
- tmp = tcg_target_call_iarg_regs[i];
61
- }
62
- tcg_out_opc_imm(s, OPC_ANDI, tmp, arg, 0xffff);
63
- return tcg_out_call_iarg_reg(s, i, tmp);
64
-}
65
-
66
-static int tcg_out_call_iarg_imm(TCGContext *s, int i, TCGArg arg)
67
-{
68
- TCGReg tmp = TCG_TMP0;
69
- if (arg == 0) {
70
- tmp = TCG_REG_ZERO;
71
- } else {
72
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
73
- tmp = tcg_target_call_iarg_regs[i];
74
- }
75
- tcg_out_movi(s, TCG_TYPE_REG, tmp, arg);
76
- }
77
- return tcg_out_call_iarg_reg(s, i, tmp);
78
-}
79
-
80
-static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
81
-{
82
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
83
- i = (i + 1) & ~1;
84
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? ah : al));
85
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? al : ah));
86
- return i;
87
-}
88
+/* We have four temps, we might as well expose three of them. */
89
+static const TCGLdstHelperParam ldst_helper_param = {
90
+ .ntmp = 3, .tmp = { TCG_TMP0, TCG_TMP1, TCG_TMP2 }
91
+};
92
93
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
94
{
95
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
96
- MemOpIdx oi = l->oi;
97
- MemOp opc = get_memop(oi);
98
- TCGReg v0;
99
- int i;
100
+ MemOp opc = get_memop(l->oi);
101
102
/* resolve label address */
103
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
104
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
105
return false;
106
}
107
108
- i = 1;
109
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
110
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
111
- } else {
112
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
113
- }
114
- i = tcg_out_call_iarg_imm(s, i, oi);
115
- i = tcg_out_call_iarg_imm(s, i, (intptr_t)l->raddr);
116
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
117
+
118
tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
119
/* delay slot */
120
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
121
+ tcg_out_nop(s);
122
123
- v0 = l->datalo_reg;
124
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
125
- /* We eliminated V0 from the possible output registers, so it
126
- cannot be clobbered here. So we must move V1 first. */
127
- if (MIPS_BE) {
128
- tcg_out_mov(s, TCG_TYPE_I32, v0, TCG_REG_V1);
129
- v0 = l->datahi_reg;
130
- } else {
131
- tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_V1);
132
- }
133
- }
134
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
135
136
tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
137
if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
138
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
139
}
140
141
/* delay slot */
142
- if (TCG_TARGET_REG_BITS == 64 && l->type == TCG_TYPE_I32) {
143
- /* we always sign-extend 32-bit loads */
144
- tcg_out_ext32s(s, v0, TCG_REG_V0);
145
- } else {
146
- tcg_out_opc_reg(s, OPC_OR, v0, TCG_REG_V0, TCG_REG_ZERO);
147
- }
148
+ tcg_out_nop(s);
149
return true;
150
}
151
152
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
153
{
154
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
155
- MemOpIdx oi = l->oi;
156
- MemOp opc = get_memop(oi);
157
- MemOp s_bits = opc & MO_SIZE;
158
- int i;
159
+ MemOp opc = get_memop(l->oi);
160
161
/* resolve label address */
162
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
163
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
164
return false;
165
}
166
167
- i = 1;
168
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
169
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
170
- } else {
171
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
172
- }
173
- switch (s_bits) {
174
- case MO_8:
175
- i = tcg_out_call_iarg_reg8(s, i, l->datalo_reg);
176
- break;
177
- case MO_16:
178
- i = tcg_out_call_iarg_reg16(s, i, l->datalo_reg);
179
- break;
180
- case MO_32:
181
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
182
- break;
183
- case MO_64:
184
- if (TCG_TARGET_REG_BITS == 32) {
185
- i = tcg_out_call_iarg_reg2(s, i, l->datalo_reg, l->datahi_reg);
186
- } else {
187
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
188
- }
189
- break;
190
- default:
191
- g_assert_not_reached();
192
- }
193
- i = tcg_out_call_iarg_imm(s, i, oi);
194
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
195
196
- /* Tail call to the store helper. Thus force the return address
197
- computation to take place in the return address register. */
198
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (intptr_t)l->raddr);
199
- i = tcg_out_call_iarg_reg(s, i, TCG_REG_RA);
200
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], true);
201
+ tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
202
/* delay slot */
203
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
204
+ tcg_out_nop(s);
205
+
206
+ tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
207
+ if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
208
+ return false;
209
+ }
210
+
211
+ /* delay slot */
212
+ tcg_out_nop(s);
213
return true;
214
}
215
29
216
--
30
--
217
2.34.1
31
2.34.1
218
219
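For the decodetree recursion fix above ("two copy-paste errors"), the essence is that each tree-walking method must recurse into the method of the same name, not into its sibling. The sketch below is a simplified, hypothetical model of that shape; the class and method names follow the diff, but the class layout and bodies are placeholders rather than the real script.

    class Pattern:
        def build_tree(self):
            pass        # leaf: nothing to build

        def prop_format(self):
            pass        # leaf: nothing to propagate

    class MultiPattern(Pattern):
        def __init__(self, pats):
            self.pats = pats

        def build_tree(self):
            for p in self.pats:
                p.build_tree()      # recurse with the same method ...

        def prop_format(self):
            for p in self.pats:
                p.prop_format()     # ... not with its sibling (the old bug)

    class IncMultiPattern(MultiPattern):
        def build_tree(self):
            super().build_tree()    # was super().prop_format() before the fix

    group = IncMultiPattern([Pattern(), MultiPattern([Pattern()])])
    group.build_tree()
    group.prop_format()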
1
Memory operations that are not already aligned, or otherwise
1
Test err_pattern_group_empty.decode failed with exception:
2
marked up, require addition of ctx->default_tcg_memop_mask.
2
3
Traceback (most recent call last):
4
File "./scripts/decodetree.py", line 1424, in <module> main()
5
File "./scripts/decodetree.py", line 1342, in main toppat.build_tree()
6
File "./scripts/decodetree.py", line 627, in build_tree
7
self.tree = self.__build_tree(self.pats, self.fixedbits,
8
File "./scripts/decodetree.py", line 607, in __build_tree
9
fb = i.fixedbits & innermask
10
TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'
3
11
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
---
13
---
6
target/mips/tcg/mxu_translate.c | 3 ++-
14
scripts/decodetree.py | 6 ++++++
7
target/mips/tcg/micromips_translate.c.inc | 24 ++++++++++++++--------
15
1 file changed, 6 insertions(+)
8
target/mips/tcg/mips16e_translate.c.inc | 18 ++++++++++------
9
target/mips/tcg/nanomips_translate.c.inc | 25 +++++++++++------------
10
4 files changed, 42 insertions(+), 28 deletions(-)
11
16
12
diff --git a/target/mips/tcg/mxu_translate.c b/target/mips/tcg/mxu_translate.c
17
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
13
index XXXXXXX..XXXXXXX 100644
18
index XXXXXXX..XXXXXXX 100644
14
--- a/target/mips/tcg/mxu_translate.c
19
--- a/scripts/decodetree.py
15
+++ b/target/mips/tcg/mxu_translate.c
20
+++ b/scripts/decodetree.py
16
@@ -XXX,XX +XXX,XX @@ static void gen_mxu_s32ldd_s32lddr(DisasContext *ctx)
21
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
17
tcg_gen_ori_tl(t1, t1, 0xFFFFF000);
22
output(ind, '}\n')
18
}
23
else:
19
tcg_gen_add_tl(t1, t0, t1);
24
p.output_code(i, extracted, p.fixedbits, p.fixedmask)
20
- tcg_gen_qemu_ld_tl(t1, t1, ctx->mem_idx, MO_TESL ^ (sel * MO_BSWAP));
25
+
21
+ tcg_gen_qemu_ld_tl(t1, t1, ctx->mem_idx, (MO_TESL ^ (sel * MO_BSWAP)) |
26
+ def build_tree(self):
22
+ ctx->default_tcg_memop_mask);
27
+ if not self.pats:
23
28
+ error_with_file(self.file, self.lineno, 'empty pattern group')
24
gen_store_mxu_gpr(t1, XRa);
29
+ super().build_tree()
25
}
30
+
26
diff --git a/target/mips/tcg/micromips_translate.c.inc b/target/mips/tcg/micromips_translate.c.inc
31
#end IncMultiPattern
27
index XXXXXXX..XXXXXXX 100644
32
28
--- a/target/mips/tcg/micromips_translate.c.inc
33
29
+++ b/target/mips/tcg/micromips_translate.c.inc
30
@@ -XXX,XX +XXX,XX @@ static void gen_ldst_pair(DisasContext *ctx, uint32_t opc, int rd,
31
gen_reserved_instruction(ctx);
32
return;
33
}
34
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL);
35
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL |
36
+ ctx->default_tcg_memop_mask);
37
gen_store_gpr(t1, rd);
38
tcg_gen_movi_tl(t1, 4);
39
gen_op_addr_add(ctx, t0, t0, t1);
40
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL);
41
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL |
42
+ ctx->default_tcg_memop_mask);
43
gen_store_gpr(t1, rd + 1);
44
break;
45
case SWP:
46
gen_load_gpr(t1, rd);
47
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
48
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
49
+ ctx->default_tcg_memop_mask);
50
tcg_gen_movi_tl(t1, 4);
51
gen_op_addr_add(ctx, t0, t0, t1);
52
gen_load_gpr(t1, rd + 1);
53
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
54
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
55
+ ctx->default_tcg_memop_mask);
56
break;
57
#ifdef TARGET_MIPS64
58
case LDP:
59
@@ -XXX,XX +XXX,XX @@ static void gen_ldst_pair(DisasContext *ctx, uint32_t opc, int rd,
60
gen_reserved_instruction(ctx);
61
return;
62
}
63
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
64
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
65
+ ctx->default_tcg_memop_mask);
66
gen_store_gpr(t1, rd);
67
tcg_gen_movi_tl(t1, 8);
68
gen_op_addr_add(ctx, t0, t0, t1);
69
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
70
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
71
+ ctx->default_tcg_memop_mask);
72
gen_store_gpr(t1, rd + 1);
73
break;
74
case SDP:
75
gen_load_gpr(t1, rd);
76
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
77
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
78
+ ctx->default_tcg_memop_mask);
79
tcg_gen_movi_tl(t1, 8);
80
gen_op_addr_add(ctx, t0, t0, t1);
81
gen_load_gpr(t1, rd + 1);
82
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
83
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
84
+ ctx->default_tcg_memop_mask);
85
break;
86
#endif
87
}
88
diff --git a/target/mips/tcg/mips16e_translate.c.inc b/target/mips/tcg/mips16e_translate.c.inc
89
index XXXXXXX..XXXXXXX 100644
90
--- a/target/mips/tcg/mips16e_translate.c.inc
91
+++ b/target/mips/tcg/mips16e_translate.c.inc
92
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_save(DisasContext *ctx,
93
case 4:
94
gen_base_offset_addr(ctx, t0, 29, 12);
95
gen_load_gpr(t1, 7);
96
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
97
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
98
+ ctx->default_tcg_memop_mask);
99
/* Fall through */
100
case 3:
101
gen_base_offset_addr(ctx, t0, 29, 8);
102
gen_load_gpr(t1, 6);
103
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
104
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
105
+ ctx->default_tcg_memop_mask);
106
/* Fall through */
107
case 2:
108
gen_base_offset_addr(ctx, t0, 29, 4);
109
gen_load_gpr(t1, 5);
110
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
111
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
112
+ ctx->default_tcg_memop_mask);
113
/* Fall through */
114
case 1:
115
gen_base_offset_addr(ctx, t0, 29, 0);
116
gen_load_gpr(t1, 4);
117
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
118
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
119
+ ctx->default_tcg_memop_mask);
120
}
121
122
gen_load_gpr(t0, 29);
123
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_save(DisasContext *ctx,
124
tcg_gen_movi_tl(t2, -4); \
125
gen_op_addr_add(ctx, t0, t0, t2); \
126
gen_load_gpr(t1, reg); \
127
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL); \
128
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL | \
129
+ ctx->default_tcg_memop_mask); \
130
} while (0)
131
132
if (do_ra) {
133
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_restore(DisasContext *ctx,
134
#define DECR_AND_LOAD(reg) do { \
135
tcg_gen_movi_tl(t2, -4); \
136
gen_op_addr_add(ctx, t0, t0, t2); \
137
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL); \
138
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL | \
139
+ ctx->default_tcg_memop_mask); \
140
gen_store_gpr(t1, reg); \
141
} while (0)
142
143
diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/mips/tcg/nanomips_translate.c.inc
146
+++ b/target/mips/tcg/nanomips_translate.c.inc
147
@@ -XXX,XX +XXX,XX @@ static void gen_p_lsx(DisasContext *ctx, int rd, int rs, int rt)
148
149
switch (extract32(ctx->opcode, 7, 4)) {
150
case NM_LBX:
151
- tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
152
- MO_SB);
153
+ tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_SB);
154
gen_store_gpr(t0, rd);
155
break;
156
case NM_LHX:
157
/*case NM_LHXS:*/
158
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
159
- MO_TESW);
160
+ MO_TESW | ctx->default_tcg_memop_mask);
161
gen_store_gpr(t0, rd);
162
break;
163
case NM_LWX:
164
/*case NM_LWXS:*/
165
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
166
- MO_TESL);
167
+ MO_TESL | ctx->default_tcg_memop_mask);
168
gen_store_gpr(t0, rd);
169
break;
170
case NM_LBUX:
171
- tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
172
- MO_UB);
173
+ tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_UB);
174
gen_store_gpr(t0, rd);
175
break;
176
case NM_LHUX:
177
/*case NM_LHUXS:*/
178
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
179
- MO_TEUW);
180
+ MO_TEUW | ctx->default_tcg_memop_mask);
181
gen_store_gpr(t0, rd);
182
break;
183
case NM_SBX:
184
check_nms(ctx);
185
gen_load_gpr(t1, rd);
186
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
187
- MO_8);
188
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_8);
189
break;
190
case NM_SHX:
191
/*case NM_SHXS:*/
192
check_nms(ctx);
193
gen_load_gpr(t1, rd);
194
tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
195
- MO_TEUW);
196
+ MO_TEUW | ctx->default_tcg_memop_mask);
197
break;
198
case NM_SWX:
199
/*case NM_SWXS:*/
200
check_nms(ctx);
201
gen_load_gpr(t1, rd);
202
tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
203
- MO_TEUL);
204
+ MO_TEUL | ctx->default_tcg_memop_mask);
205
break;
206
case NM_LWC1X:
207
/*case NM_LWC1XS:*/
208
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
209
addr_off);
210
211
tcg_gen_movi_tl(t0, addr);
212
- tcg_gen_qemu_ld_tl(cpu_gpr[rt], t0, ctx->mem_idx, MO_TESL);
213
+ tcg_gen_qemu_ld_tl(cpu_gpr[rt], t0, ctx->mem_idx,
214
+ MO_TESL | ctx->default_tcg_memop_mask);
215
}
216
break;
217
case NM_SWPC48:
218
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
219
tcg_gen_movi_tl(t0, addr);
220
gen_load_gpr(t1, rt);
221
222
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
223
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
224
+ MO_TEUL | ctx->default_tcg_memop_mask);
225
}
226
break;
227
default:
228
--
34
--
229
2.34.1
35
2.34.1
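The empty-pattern-group change above replaces a confusing TypeError deep inside __build_tree with an explicit diagnostic at the point where the problem can first be detected. A rough sketch of the idea follows; the exception class and attribute names here are invented for illustration and do not match the script's real error_with_file() machinery.

    class PatternGroupError(Exception):
        pass

    class MultiPattern:
        def __init__(self, pats, file="x.decode", lineno=1):
            self.pats = pats
            self.file = file
            self.lineno = lineno
            self.fixedbits = None   # only computed once there are patterns

        def build_tree(self):
            if not self.pats:
                # Fail early with a readable message instead of letting a
                # later "None & int" expression raise TypeError.
                raise PatternGroupError(
                    f"{self.file}:{self.lineno}: empty pattern group")
            # ... real tree building would combine self.pats here ...

    try:
        MultiPattern([]).build_tree()
    except PatternGroupError as e:
        print(e)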
1
These are atomic operations, so mark as requiring alignment.
1
Also, do not report any PermissionError on the remove.
2
The primary purpose is testing with -o /dev/null.
2
3
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
---
5
---
5
target/mips/tcg/nanomips_translate.c.inc | 5 +++--
6
scripts/decodetree.py | 7 ++++++-
6
1 file changed, 3 insertions(+), 2 deletions(-)
7
1 file changed, 6 insertions(+), 1 deletion(-)
7
8
8
diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
9
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
9
index XXXXXXX..XXXXXXX 100644
10
index XXXXXXX..XXXXXXX 100644
10
--- a/target/mips/tcg/nanomips_translate.c.inc
11
--- a/scripts/decodetree.py
11
+++ b/target/mips/tcg/nanomips_translate.c.inc
12
+++ b/scripts/decodetree.py
12
@@ -XXX,XX +XXX,XX @@ static void gen_llwp(DisasContext *ctx, uint32_t base, int16_t offset,
13
@@ -XXX,XX +XXX,XX @@ def error_with_file(file, lineno, *args):
13
TCGv tmp2 = tcg_temp_new();
14
14
15
if output_file and output_fd:
15
gen_base_offset_addr(ctx, taddr, base, offset);
16
output_fd.close()
16
- tcg_gen_qemu_ld_i64(tval, taddr, ctx->mem_idx, MO_TEUQ);
17
- os.remove(output_file)
17
+ tcg_gen_qemu_ld_i64(tval, taddr, ctx->mem_idx, MO_TEUQ | MO_ALIGN);
18
+ # Do not try to remove e.g. -o /dev/null
18
if (cpu_is_bigendian(ctx)) {
19
+ if not output_file.startswith("/dev"):
19
tcg_gen_extr_i64_tl(tmp2, tmp1, tval);
20
+ try:
20
} else {
21
+ os.remove(output_file)
21
@@ -XXX,XX +XXX,XX @@ static void gen_scwp(DisasContext *ctx, uint32_t base, int16_t offset,
22
+ except PermissionError:
22
23
+ pass
23
tcg_gen_ld_i64(llval, cpu_env, offsetof(CPUMIPSState, llval_wp));
24
exit(0 if testforerror else 1)
24
tcg_gen_atomic_cmpxchg_i64(val, taddr, llval, tval,
25
# end error_with_file
25
- eva ? MIPS_HFLAG_UM : ctx->mem_idx, MO_64);
26
26
+ eva ? MIPS_HFLAG_UM : ctx->mem_idx,
27
+ MO_64 | MO_ALIGN);
28
if (reg1 != 0) {
29
tcg_gen_movi_tl(cpu_gpr[reg1], 1);
30
}
31
--
27
--
32
2.34.1
28
2.34.1
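The cleanup-path tweak to decodetree.py above boils down to: never try to unlink a /dev path, and swallow a PermissionError if the unlink is refused, so that the decode error rather than the cleanup failure is what gets reported. A small stand-alone version of that guard, with an invented helper name:

    import os

    def remove_partial_output(output_file: str) -> None:
        # Do not try to remove e.g. -o /dev/null.
        if output_file.startswith("/dev"):
            return
        try:
            os.remove(output_file)
        except PermissionError:
            # Best-effort cleanup only; the interesting error is the one
            # already being reported about the .decode input.
            pass

    remove_partial_output("/dev/null")   # no-op by design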
1
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
1
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
---
2
---
3
configs/targets/mips-linux-user.mak | 1 -
3
tests/decode/check.sh | 24 ----------------
4
configs/targets/mips-softmmu.mak | 1 -
4
tests/decode/meson.build | 59 ++++++++++++++++++++++++++++++++++++++++
5
configs/targets/mips64-linux-user.mak | 1 -
5
tests/meson.build | 5 +---
6
configs/targets/mips64-softmmu.mak | 1 -
6
3 files changed, 60 insertions(+), 28 deletions(-)
7
configs/targets/mips64el-linux-user.mak | 1 -
7
delete mode 100755 tests/decode/check.sh
8
configs/targets/mips64el-softmmu.mak | 1 -
8
create mode 100644 tests/decode/meson.build
9
configs/targets/mipsel-linux-user.mak | 1 -
10
configs/targets/mipsel-softmmu.mak | 1 -
11
configs/targets/mipsn32-linux-user.mak | 1 -
12
configs/targets/mipsn32el-linux-user.mak | 1 -
13
10 files changed, 10 deletions(-)
14
9
15
diff --git a/configs/targets/mips-linux-user.mak b/configs/targets/mips-linux-user.mak
10
diff --git a/tests/decode/check.sh b/tests/decode/check.sh
11
deleted file mode 100755
12
index XXXXXXX..XXXXXXX
13
--- a/tests/decode/check.sh
14
+++ /dev/null
15
@@ -XXX,XX +XXX,XX @@
16
-#!/bin/sh
17
-# This work is licensed under the terms of the GNU LGPL, version 2 or later.
18
-# See the COPYING.LIB file in the top-level directory.
19
-
20
-PYTHON=$1
21
-DECODETREE=$2
22
-E=0
23
-
24
-# All of these tests should produce errors
25
-for i in err_*.decode; do
26
- if $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
27
- # Pass, aka failed to fail.
28
- echo FAIL: $i 1>&2
29
- E=1
30
- fi
31
-done
32
-
33
-for i in succ_*.decode; do
34
- if ! $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
35
- echo FAIL:$i 1>&2
36
- fi
37
-done
38
-
39
-exit $E
40
diff --git a/tests/decode/meson.build b/tests/decode/meson.build
41
new file mode 100644
42
index XXXXXXX..XXXXXXX
43
--- /dev/null
44
+++ b/tests/decode/meson.build
45
@@ -XXX,XX +XXX,XX @@
46
+err_tests = [
47
+ 'err_argset1.decode',
48
+ 'err_argset2.decode',
49
+ 'err_field1.decode',
50
+ 'err_field2.decode',
51
+ 'err_field3.decode',
52
+ 'err_field4.decode',
53
+ 'err_field5.decode',
54
+ 'err_field6.decode',
55
+ 'err_init1.decode',
56
+ 'err_init2.decode',
57
+ 'err_init3.decode',
58
+ 'err_init4.decode',
59
+ 'err_overlap1.decode',
60
+ 'err_overlap2.decode',
61
+ 'err_overlap3.decode',
62
+ 'err_overlap4.decode',
63
+ 'err_overlap5.decode',
64
+ 'err_overlap6.decode',
65
+ 'err_overlap7.decode',
66
+ 'err_overlap8.decode',
67
+ 'err_overlap9.decode',
68
+ 'err_pattern_group_empty.decode',
69
+ 'err_pattern_group_ident1.decode',
70
+ 'err_pattern_group_ident2.decode',
71
+ 'err_pattern_group_nest1.decode',
72
+ 'err_pattern_group_nest2.decode',
73
+ 'err_pattern_group_nest3.decode',
74
+ 'err_pattern_group_overlap1.decode',
75
+ 'err_width1.decode',
76
+ 'err_width2.decode',
77
+ 'err_width3.decode',
78
+ 'err_width4.decode',
79
+]
80
+
81
+succ_tests = [
82
+ 'succ_argset_type1.decode',
83
+ 'succ_function.decode',
84
+ 'succ_ident1.decode',
85
+ 'succ_pattern_group_nest1.decode',
86
+ 'succ_pattern_group_nest2.decode',
87
+ 'succ_pattern_group_nest3.decode',
88
+ 'succ_pattern_group_nest4.decode',
89
+]
90
+
91
+suite = 'decodetree'
92
+decodetree = find_program(meson.project_source_root() / 'scripts/decodetree.py')
93
+
94
+foreach t: err_tests
95
+ test(fs.replace_suffix(t, ''),
96
+ decodetree, args: ['-o', '/dev/null', '--test-for-error', files(t)],
97
+ suite: suite)
98
+endforeach
99
+
100
+foreach t: succ_tests
101
+ test(fs.replace_suffix(t, ''),
102
+ decodetree, args: ['-o', '/dev/null', files(t)],
103
+ suite: suite)
104
+endforeach
105
diff --git a/tests/meson.build b/tests/meson.build
16
index XXXXXXX..XXXXXXX 100644
106
index XXXXXXX..XXXXXXX 100644
17
--- a/configs/targets/mips-linux-user.mak
107
--- a/tests/meson.build
18
+++ b/configs/targets/mips-linux-user.mak
108
+++ b/tests/meson.build
19
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=mips
109
@@ -XXX,XX +XXX,XX @@ if have_tools and have_vhost_user and 'CONFIG_LINUX' in config_host
20
TARGET_ABI_MIPSO32=y
110
dependencies: [qemuutil, vhost_user])
21
TARGET_SYSTBL_ABI=o32
111
endif
22
TARGET_SYSTBL=syscall_o32.tbl
112
23
-TARGET_ALIGNED_ONLY=y
113
-test('decodetree', sh,
24
TARGET_BIG_ENDIAN=y
114
- args: [ files('decode/check.sh'), config_host['PYTHON'], files('../scripts/decodetree.py') ],
25
diff --git a/configs/targets/mips-softmmu.mak b/configs/targets/mips-softmmu.mak
115
- workdir: meson.current_source_dir() / 'decode',
26
index XXXXXXX..XXXXXXX 100644
116
- suite: 'decodetree')
27
--- a/configs/targets/mips-softmmu.mak
117
+subdir('decode')
28
+++ b/configs/targets/mips-softmmu.mak
118
29
@@ -XXX,XX +XXX,XX @@
119
if 'CONFIG_TCG' in config_all
30
TARGET_ARCH=mips
120
subdir('fp')
31
-TARGET_ALIGNED_ONLY=y
32
TARGET_BIG_ENDIAN=y
33
TARGET_SUPPORTS_MTTCG=y
34
diff --git a/configs/targets/mips64-linux-user.mak b/configs/targets/mips64-linux-user.mak
35
index XXXXXXX..XXXXXXX 100644
36
--- a/configs/targets/mips64-linux-user.mak
37
+++ b/configs/targets/mips64-linux-user.mak
38
@@ -XXX,XX +XXX,XX @@ TARGET_ABI_MIPSN64=y
39
TARGET_BASE_ARCH=mips
40
TARGET_SYSTBL_ABI=n64
41
TARGET_SYSTBL=syscall_n64.tbl
42
-TARGET_ALIGNED_ONLY=y
43
TARGET_BIG_ENDIAN=y
44
diff --git a/configs/targets/mips64-softmmu.mak b/configs/targets/mips64-softmmu.mak
45
index XXXXXXX..XXXXXXX 100644
46
--- a/configs/targets/mips64-softmmu.mak
47
+++ b/configs/targets/mips64-softmmu.mak
48
@@ -XXX,XX +XXX,XX @@
49
TARGET_ARCH=mips64
50
TARGET_BASE_ARCH=mips
51
-TARGET_ALIGNED_ONLY=y
52
TARGET_BIG_ENDIAN=y
53
diff --git a/configs/targets/mips64el-linux-user.mak b/configs/targets/mips64el-linux-user.mak
54
index XXXXXXX..XXXXXXX 100644
55
--- a/configs/targets/mips64el-linux-user.mak
56
+++ b/configs/targets/mips64el-linux-user.mak
57
@@ -XXX,XX +XXX,XX @@ TARGET_ABI_MIPSN64=y
58
TARGET_BASE_ARCH=mips
59
TARGET_SYSTBL_ABI=n64
60
TARGET_SYSTBL=syscall_n64.tbl
61
-TARGET_ALIGNED_ONLY=y
62
diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
63
index XXXXXXX..XXXXXXX 100644
64
--- a/configs/targets/mips64el-softmmu.mak
65
+++ b/configs/targets/mips64el-softmmu.mak
66
@@ -XXX,XX +XXX,XX @@
67
TARGET_ARCH=mips64
68
TARGET_BASE_ARCH=mips
69
-TARGET_ALIGNED_ONLY=y
70
TARGET_NEED_FDT=y
71
diff --git a/configs/targets/mipsel-linux-user.mak b/configs/targets/mipsel-linux-user.mak
72
index XXXXXXX..XXXXXXX 100644
73
--- a/configs/targets/mipsel-linux-user.mak
74
+++ b/configs/targets/mipsel-linux-user.mak
75
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=mips
76
TARGET_ABI_MIPSO32=y
77
TARGET_SYSTBL_ABI=o32
78
TARGET_SYSTBL=syscall_o32.tbl
79
-TARGET_ALIGNED_ONLY=y
80
diff --git a/configs/targets/mipsel-softmmu.mak b/configs/targets/mipsel-softmmu.mak
81
index XXXXXXX..XXXXXXX 100644
82
--- a/configs/targets/mipsel-softmmu.mak
83
+++ b/configs/targets/mipsel-softmmu.mak
84
@@ -XXX,XX +XXX,XX @@
85
TARGET_ARCH=mips
86
-TARGET_ALIGNED_ONLY=y
87
TARGET_SUPPORTS_MTTCG=y
88
diff --git a/configs/targets/mipsn32-linux-user.mak b/configs/targets/mipsn32-linux-user.mak
89
index XXXXXXX..XXXXXXX 100644
90
--- a/configs/targets/mipsn32-linux-user.mak
91
+++ b/configs/targets/mipsn32-linux-user.mak
92
@@ -XXX,XX +XXX,XX @@ TARGET_ABI32=y
93
TARGET_BASE_ARCH=mips
94
TARGET_SYSTBL_ABI=n32
95
TARGET_SYSTBL=syscall_n32.tbl
96
-TARGET_ALIGNED_ONLY=y
97
TARGET_BIG_ENDIAN=y
98
diff --git a/configs/targets/mipsn32el-linux-user.mak b/configs/targets/mipsn32el-linux-user.mak
99
index XXXXXXX..XXXXXXX 100644
100
--- a/configs/targets/mipsn32el-linux-user.mak
101
+++ b/configs/targets/mipsn32el-linux-user.mak
102
@@ -XXX,XX +XXX,XX @@ TARGET_ABI32=y
103
TARGET_BASE_ARCH=mips
104
TARGET_SYSTBL_ABI=n32
105
TARGET_SYSTBL=syscall_n32.tbl
106
-TARGET_ALIGNED_ONLY=y
107
--
121
--
108
2.34.1
122
2.34.1
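For readers wondering what the meson conversion above actually executes, each registered test is one decodetree invocation: err_* inputs are run with --test-for-error and must be rejected, succ_* inputs must be accepted. A rough Python equivalent of two such checks, assuming it is run from a QEMU source tree:

    import subprocess
    import sys

    DECODETREE = "scripts/decodetree.py"

    def run(*args: str) -> int:
        return subprocess.run([sys.executable, DECODETREE, *args]).returncode

    # err_* inputs must be diagnosed; --test-for-error inverts the exit
    # code so that "decodetree reported an error" is test success.
    assert run("-o", "/dev/null", "--test-for-error",
               "tests/decode/err_pattern_group_empty.decode") == 0

    # succ_* inputs must be accepted as-is.
    assert run("-o", "/dev/null", "tests/decode/succ_ident1.decode") == 0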
1
From: Thomas Huth <thuth@redhat.com>
1
From: Peter Maydell <peter.maydell@linaro.org>
2
2
3
By using target_words_bigendian() instead of an ifdef,
3
Document the named field syntax that we want to implement for the
4
we can build this code once.
4
decodetree script. This allows a field to be defined in terms of
5
some other field that the instruction pattern has already set, for
6
example:
5
7
6
Signed-off-by: Thomas Huth <thuth@redhat.com>
8
%sz_imm 10:3 sz:3 !function=expand_sz_imm
7
Message-Id: <20230508133745.109463-3-thuth@redhat.com>
9
8
[rth: Type change done in a separate patch]
10
to allow a function to be passed both an immediate field from the
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
instruction and also a sz value which might have been specified by
12
the instruction pattern directly (sz=1, etc) rather than being a
13
simple field within the instruction.
14
15
Note that the restriction on not having the format referring to the
16
pattern and the pattern referring to the format simultaneously is a
17
restriction of the decoder generator rather than inherently being a
18
silly thing to do.
19
20
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
21
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
22
Message-Id: <20230523120447.728365-3-peter.maydell@linaro.org>
10
---
23
---
11
disas/disas.c | 10 +++++-----
24
docs/devel/decodetree.rst | 33 ++++++++++++++++++++++++++++-----
12
disas/meson.build | 3 ++-
25
1 file changed, 28 insertions(+), 5 deletions(-)
13
2 files changed, 7 insertions(+), 6 deletions(-)
14
26
15
diff --git a/disas/disas.c b/disas/disas.c
27
diff --git a/docs/devel/decodetree.rst b/docs/devel/decodetree.rst
16
index XXXXXXX..XXXXXXX 100644
28
index XXXXXXX..XXXXXXX 100644
17
--- a/disas/disas.c
29
--- a/docs/devel/decodetree.rst
18
+++ b/disas/disas.c
30
+++ b/docs/devel/decodetree.rst
19
@@ -XXX,XX +XXX,XX @@ void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu)
31
@@ -XXX,XX +XXX,XX @@ Fields
20
s->cpu = cpu;
32
21
s->info.read_memory_func = target_read_memory;
33
Syntax::
22
s->info.print_address_func = print_address;
34
23
-#if TARGET_BIG_ENDIAN
35
- field_def := '%' identifier ( unnamed_field )* ( !function=identifier )?
24
- s->info.endian = BFD_ENDIAN_BIG;
36
+ field_def := '%' identifier ( field )* ( !function=identifier )?
25
-#else
37
+ field := unnamed_field | named_field
26
- s->info.endian = BFD_ENDIAN_LITTLE;
38
unnamed_field := number ':' ( 's' ) number
27
-#endif
39
+ named_field := identifier ':' ( 's' ) number
28
+ if (target_words_bigendian()) {
40
29
+ s->info.endian = BFD_ENDIAN_BIG;
41
For *unnamed_field*, the first number is the least-significant bit position
30
+ } else {
42
of the field and the second number is the length of the field. If the 's' is
31
+ s->info.endian = BFD_ENDIAN_LITTLE;
43
-present, the field is considered signed. If multiple ``unnamed_fields`` are
32
+ }
44
-present, they are concatenated. In this way one can define disjoint fields.
33
45
+present, the field is considered signed.
34
CPUClass *cc = CPU_GET_CLASS(cpu);
46
+
35
if (cc->disas_set_info) {
47
+A *named_field* refers to some other field in the instruction pattern
36
diff --git a/disas/meson.build b/disas/meson.build
48
+or format. Regardless of the length of the other field where it is
37
index XXXXXXX..XXXXXXX 100644
49
+defined, it will be inserted into this field with the specified
38
--- a/disas/meson.build
50
+signedness and bit width.
39
+++ b/disas/meson.build
51
+
40
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_SH4_DIS', if_true: files('sh4.c'))
52
+Field definitions that involve loops (i.e. where a field is defined
41
common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
53
+directly or indirectly in terms of itself) are errors.
42
common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
54
+
43
common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
55
+A format can include fields that refer to named fields that are
44
+common_ss.add(files('disas.c'))
56
+defined in the instruction pattern(s) that use the format.
45
57
+Conversely, an instruction pattern can include fields that refer to
46
softmmu_ss.add(files('disas-mon.c'))
58
+named fields that are defined in the format it uses. However you
47
-specific_ss.add(files('disas.c'), capstone)
59
+cannot currently do both at once (i.e. pattern P uses format F; F has
48
+specific_ss.add(capstone)
60
+a field A that refers to a named field B that is defined in P, and P
61
+has a field C that refers to a named field D that is defined in F).
62
+
63
+If multiple ``fields`` are present, they are concatenated.
64
+In this way one can define disjoint fields.
65
66
If ``!function`` is specified, the concatenated result is passed through the
67
named function, taking and returning an integral value.
68
69
-One may use ``!function`` with zero ``unnamed_fields``. This case is called
70
+One may use ``!function`` with zero ``fields``. This case is called
71
a *parameter*, and the named function is only passed the ``DisasContext``
72
and returns an integral value extracted from there.
73
74
-A field with no ``unnamed_fields`` and no ``!function`` is in error.
75
+A field with no ``fields`` and no ``!function`` is in error.
76
77
Field examples:
78
79
@@ -XXX,XX +XXX,XX @@ Field examples:
80
| %shimm8 5:s8 13:1 | expand_shimm8(sextract(i, 5, 8) << 1 | |
81
| !function=expand_shimm8 | extract(i, 13, 1)) |
82
+---------------------------+---------------------------------------------+
83
+| %sz_imm 10:2 sz:3 | expand_sz_imm(extract(i, 10, 2) << 3 | |
84
+| !function=expand_sz_imm | extract(a->sz, 0, 3)) |
85
++---------------------------+---------------------------------------------+
86
87
Argument Sets
88
=============
49
--
89
--
50
2.34.1
90
2.34.1
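To make the %sz_imm example in the documentation above concrete, here is a small Python model of the expansion it describes: the unnamed 10:2 field comes from the instruction word, the named sz:3 field comes from the already-decoded argument, and the two are concatenated before being passed to the !function. Both extract() and expand_sz_imm() are placeholders standing in for the generated C helpers.

    def extract(value: int, pos: int, length: int) -> int:
        return (value >> pos) & ((1 << length) - 1)

    def expand_sz_imm(ctx, value: int) -> int:
        # Placeholder for the target-specific helper named by !function.
        return value

    # %sz_imm 10:2 sz:3 !function=expand_sz_imm
    def decode_sz_imm(ctx, insn: int, a_sz: int) -> int:
        return expand_sz_imm(ctx, (extract(insn, 10, 2) << 3)
                                  | extract(a_sz, 0, 3))

    print(bin(decode_sz_imm(None, 0b11_0000000000, a_sz=0b101)))  # 0b11101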
1
From: Jamie Iles <quic_jiles@quicinc.com>
1
From: Peter Maydell <peter.maydell@linaro.org>
2
2
3
The round-robin scheduler will iterate over the CPU list with an
3
To support referring to other named fields in field definitions, we
4
assigned budget until the next timer expiry and may exit early because
4
need to pass the str_extract() method a function which tells it how
5
of a TB exit. This is fine under normal operation but with icount
5
to emit the code for a previously initialized named field. (In
6
enabled and SMP it is possible for a CPU to be starved of run time and
6
Pattern::output_code() the other field will be "u.f_foo.field", and
7
the system live-locks.
7
in Format::output_extract() it is "a->field".)
8
8
9
For example, booting a riscv64 platform with '-icount
9
Refactor the two callsites that currently do "output code to
10
shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel
10
initialize each field", and have them pass a lambda that defines how
11
has timers enabled and starts performing TLB shootdowns. In this case
11
to format the lvalue in each case. This is then used both in
12
we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU
12
emitting the LHS of the assignment and also passed down to
13
1. As we enter the TCG loop, we assign the icount budget up to the next timer
13
str_extract() as a new argument (unused at the moment, but will be
14
interrupt to CPU 0 and begin executing where the guest is sat in a busy
14
used in the following patch).
15
loop exhausting all of the budget before we try to execute CPU 1 which
16
is the target of the IPI but CPU 1 is left with no budget with which to
17
execute and the process repeats.
18
15
19
We try here to add some fairness by splitting the budget across all of
16
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
the CPUs on the thread fairly before entering each one. The CPU count
17
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
21
is cached on CPU list generation ID to avoid iterating the list on each
18
Message-Id: <20230523120447.728365-4-peter.maydell@linaro.org>
22
loop iteration. With this change it is possible to boot an SMP rv64
19
---
23
guest with icount enabled and no hangs.
20
scripts/decodetree.py | 26 +++++++++++++++-----------
21
1 file changed, 15 insertions(+), 11 deletions(-)
24
22
25
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
23
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
26
Tested-by: Peter Maydell <peter.maydell@linaro.org>
27
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
28
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
29
Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com>
30
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
31
---
32
accel/tcg/tcg-accel-ops-icount.h | 3 ++-
33
accel/tcg/tcg-accel-ops-icount.c | 21 ++++++++++++++----
34
accel/tcg/tcg-accel-ops-rr.c | 37 +++++++++++++++++++++++++++++++-
35
replay/replay.c | 3 +--
36
4 files changed, 56 insertions(+), 8 deletions(-)
37
38
diff --git a/accel/tcg/tcg-accel-ops-icount.h b/accel/tcg/tcg-accel-ops-icount.h
39
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
40
--- a/accel/tcg/tcg-accel-ops-icount.h
25
--- a/scripts/decodetree.py
41
+++ b/accel/tcg/tcg-accel-ops-icount.h
26
+++ b/scripts/decodetree.py
42
@@ -XXX,XX +XXX,XX @@
27
@@ -XXX,XX +XXX,XX @@ def __str__(self):
43
#define TCG_ACCEL_OPS_ICOUNT_H
28
s = ''
44
29
return str(self.pos) + ':' + s + str(self.len)
45
void icount_handle_deadline(void);
30
46
-void icount_prepare_for_run(CPUState *cpu);
31
- def str_extract(self):
47
+void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget);
32
+ def str_extract(self, lvalue_formatter):
48
+int64_t icount_percpu_budget(int cpu_count);
33
global bitop_width
49
void icount_process_data(CPUState *cpu);
34
s = 's' if self.sign else ''
50
35
return f'{s}extract{bitop_width}(insn, {self.pos}, {self.len})'
51
void icount_handle_interrupt(CPUState *cpu, int mask);
36
@@ -XXX,XX +XXX,XX @@ def __init__(self, subs, mask):
52
diff --git a/accel/tcg/tcg-accel-ops-icount.c b/accel/tcg/tcg-accel-ops-icount.c
37
def __str__(self):
53
index XXXXXXX..XXXXXXX 100644
38
return str(self.subs)
54
--- a/accel/tcg/tcg-accel-ops-icount.c
39
55
+++ b/accel/tcg/tcg-accel-ops-icount.c
40
- def str_extract(self):
56
@@ -XXX,XX +XXX,XX @@ void icount_handle_deadline(void)
41
+ def str_extract(self, lvalue_formatter):
57
}
42
global bitop_width
58
}
43
ret = '0'
59
44
pos = 0
60
-void icount_prepare_for_run(CPUState *cpu)
45
for f in reversed(self.subs):
61
+/* Distribute the budget evenly across all CPUs */
46
- ext = f.str_extract()
62
+int64_t icount_percpu_budget(int cpu_count)
47
+ ext = f.str_extract(lvalue_formatter)
63
+{
48
if pos == 0:
64
+ int64_t limit = icount_get_limit();
49
ret = ext
65
+ int64_t timeslice = limit / cpu_count;
50
else:
51
@@ -XXX,XX +XXX,XX @@ def __init__(self, value):
52
def __str__(self):
53
return str(self.value)
54
55
- def str_extract(self):
56
+ def str_extract(self, lvalue_formatter):
57
return str(self.value)
58
59
def __cmp__(self, other):
60
@@ -XXX,XX +XXX,XX @@ def __init__(self, func, base):
61
def __str__(self):
62
return self.func + '(' + str(self.base) + ')'
63
64
- def str_extract(self):
65
- return self.func + '(ctx, ' + self.base.str_extract() + ')'
66
+ def str_extract(self, lvalue_formatter):
67
+ return (self.func + '(ctx, '
68
+ + self.base.str_extract(lvalue_formatter) + ')')
69
70
def __eq__(self, other):
71
return self.func == other.func and self.base == other.base
72
@@ -XXX,XX +XXX,XX @@ def __init__(self, func):
73
def __str__(self):
74
return self.func
75
76
- def str_extract(self):
77
+ def str_extract(self, lvalue_formatter):
78
return self.func + '(ctx)'
79
80
def __eq__(self, other):
81
@@ -XXX,XX +XXX,XX @@ def __str__(self):
82
83
def str1(self, i):
84
return str_indent(i) + self.__str__()
66
+
85
+
67
+ if (timeslice == 0) {
86
+ def output_fields(self, indent, lvalue_formatter):
68
+ timeslice = limit;
87
+ for n, f in self.fields.items():
69
+ }
88
+ output(indent, lvalue_formatter(n), ' = ',
70
+
89
+ f.str_extract(lvalue_formatter), ';\n')
71
+ return timeslice;
90
# end General
72
+}
91
73
+
92
74
+void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget)
93
@@ -XXX,XX +XXX,XX @@ def extract_name(self):
75
{
94
def output_extract(self):
76
int insns_left;
95
output('static void ', self.extract_name(), '(DisasContext *ctx, ',
77
96
self.base.struct_name(), ' *a, ', insntype, ' insn)\n{\n')
78
@@ -XXX,XX +XXX,XX @@ void icount_prepare_for_run(CPUState *cpu)
97
- for n, f in self.fields.items():
79
g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0);
98
- output(' a->', n, ' = ', f.str_extract(), ';\n')
80
g_assert(cpu->icount_extra == 0);
99
+ self.output_fields(str_indent(4), lambda n: 'a->' + n)
81
100
output('}\n\n')
82
- cpu->icount_budget = icount_get_limit();
101
# end Format
83
+ replay_mutex_lock();
102
84
+
103
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
85
+ cpu->icount_budget = MIN(icount_get_limit(), cpu_budget);
104
if not extracted:
86
insns_left = MIN(0xffff, cpu->icount_budget);
105
output(ind, self.base.extract_name(),
87
cpu_neg(cpu)->icount_decr.u16.low = insns_left;
106
'(ctx, &u.f_', arg, ', insn);\n')
88
cpu->icount_extra = cpu->icount_budget - insns_left;
107
- for n, f in self.fields.items():
89
108
- output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
90
- replay_mutex_lock();
109
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
91
-
110
output(ind, 'if (', translate_prefix, '_', self.name,
92
if (cpu->icount_budget == 0) {
111
'(ctx, &u.f_', arg, ')) return true;\n')
93
/*
94
* We're called without the iothread lock, so must take it while
95
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
96
index XXXXXXX..XXXXXXX 100644
97
--- a/accel/tcg/tcg-accel-ops-rr.c
98
+++ b/accel/tcg/tcg-accel-ops-rr.c
99
@@ -XXX,XX +XXX,XX @@
100
*/
101
102
#include "qemu/osdep.h"
103
+#include "qemu/lockable.h"
104
#include "sysemu/tcg.h"
105
#include "sysemu/replay.h"
106
#include "sysemu/cpu-timers.h"
107
@@ -XXX,XX +XXX,XX @@ static void rr_force_rcu(Notifier *notify, void *data)
108
rr_kick_next_cpu();
109
}
110
111
+/*
112
+ * Calculate the number of CPUs that we will process in a single iteration of
113
+ * the main CPU thread loop so that we can fairly distribute the instruction
114
+ * count across CPUs.
115
+ *
116
+ * The CPU count is cached based on the CPU list generation ID to avoid
117
+ * iterating the list every time.
118
+ */
119
+static int rr_cpu_count(void)
120
+{
121
+ static unsigned int last_gen_id = ~0;
122
+ static int cpu_count;
123
+ CPUState *cpu;
124
+
125
+ QEMU_LOCK_GUARD(&qemu_cpu_list_lock);
126
+
127
+ if (cpu_list_generation_id_get() != last_gen_id) {
128
+ cpu_count = 0;
129
+ CPU_FOREACH(cpu) {
130
+ ++cpu_count;
131
+ }
132
+ last_gen_id = cpu_list_generation_id_get();
133
+ }
134
+
135
+ return cpu_count;
136
+}
137
+
138
/*
139
* In the single-threaded case each vCPU is simulated in turn. If
140
* there is more than a single vCPU we create a simple timer to kick
141
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
142
cpu->exit_request = 1;
143
144
while (1) {
145
+ /* Only used for icount_enabled() */
146
+ int64_t cpu_budget = 0;
147
+
148
qemu_mutex_unlock_iothread();
149
replay_mutex_lock();
150
qemu_mutex_lock_iothread();
151
152
if (icount_enabled()) {
153
+ int cpu_count = rr_cpu_count();
154
+
155
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
156
icount_account_warp_timer();
157
/*
158
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
159
* waking up the I/O thread and waiting for completion.
160
*/
161
icount_handle_deadline();
162
+
163
+ cpu_budget = icount_percpu_budget(cpu_count);
164
}
165
166
replay_mutex_unlock();
167
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
168
169
qemu_mutex_unlock_iothread();
170
if (icount_enabled()) {
171
- icount_prepare_for_run(cpu);
172
+ icount_prepare_for_run(cpu, cpu_budget);
173
}
174
r = tcg_cpus_exec(cpu);
175
if (icount_enabled()) {
176
diff --git a/replay/replay.c b/replay/replay.c
177
index XXXXXXX..XXXXXXX 100644
178
--- a/replay/replay.c
179
+++ b/replay/replay.c
180
@@ -XXX,XX +XXX,XX @@ uint64_t replay_get_current_icount(void)
181
int replay_get_instructions(void)
182
{
183
int res = 0;
184
- replay_mutex_lock();
185
+ g_assert(replay_mutex_locked());
186
if (replay_next_event_is(EVENT_INSTRUCTION)) {
187
res = replay_state.instruction_count;
188
if (replay_break_icount != -1LL) {
189
@@ -XXX,XX +XXX,XX @@ int replay_get_instructions(void)
190
}
191
}
192
}
193
- replay_mutex_unlock();
194
return res;
195
}
196
112
197
--
113
--
198
2.34.1
114
2.34.1
199
200
diff view generated by jsdifflib
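[Editor's note: to make the budget split in icount_percpu_budget() above concrete, the same logic as a small Python sketch (illustrative only; the real code is the C shown in the diff):]

    def icount_percpu_budget(limit, cpu_count):
        timeslice = limit // cpu_count
        # if the deadline is closer than one instruction per CPU, fall back
        # to handing the whole budget to the current CPU rather than zero
        return timeslice if timeslice > 0 else limit

Each vCPU then runs with MIN(icount_get_limit(), cpu_budget), so one CPU can no longer exhaust the whole timeslice before its siblings get a turn.

[Editor's note: the str_extract() refactoring on the decodetree side can likewise be pictured with a much-simplified sketch; the class and names below are illustrative, not the script's real ones.]

    class BitsField:
        # plain field extracted straight from the instruction word
        def __init__(self, pos, length):
            self.pos, self.len = pos, length

        def str_extract(self, lvalue_formatter):
            # ordinary fields ignore the formatter; a NamedField (added in a
            # later patch) uses it to name an already-assigned field
            return f'extract32(insn, {self.pos}, {self.len})'

    def output_fields(fields, lvalue_formatter):
        return [f'{lvalue_formatter(n)} = {f.str_extract(lvalue_formatter)};'
                for n, f in fields.items()]

    fields = {'rd': BitsField(0, 5)}
    print(output_fields(fields, lambda n: 'a->' + n))       # Format::output_extract()
    print(output_fields(fields, lambda n: 'u.f_foo.' + n))  # Pattern::output_code()
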
1
Reviewed-by: Thomas Huth <thuth@redhat.com>
1
From: Peter Maydell <peter.maydell@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
3
Message-Id: <20230503072331.1747057-83-richard.henderson@linaro.org>
3
To support named fields, we will need to be able to do a topological
4
sort (so that we ensure that we output the assignment to field A
5
before the assignment to field B if field B refers to field A by
6
name). The good news is that there is a tsort in the python standard
7
library; the bad news is that it was only added in Python 3.9.
8
9
To bridge the gap between our current minimum supported Python
10
version and 3.9, provide a local implementation that has the
11
same API as the stdlib version for the parts we care about.
12
In future when QEMU's minimum Python version requirement reaches
13
3.9 we can delete this code and replace it with an 'import' line.
14
15
The core of this implementation is based on
16
https://code.activestate.com/recipes/578272-topological-sort/
17
which is MIT-licensed.
18
19
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
20
Acked-by: Richard Henderson <richard.henderson@linaro.org>
21
Message-Id: <20230523120447.728365-5-peter.maydell@linaro.org>
4
---
22
---
5
include/disas/disas.h | 6 ------
23
scripts/decodetree.py | 74 +++++++++++++++++++++++++++++++++++++++++++
6
disas/disas.c | 3 ++-
24
1 file changed, 74 insertions(+)
7
2 files changed, 2 insertions(+), 7 deletions(-)
8
25
9
diff --git a/include/disas/disas.h b/include/disas/disas.h
26
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
10
index XXXXXXX..XXXXXXX 100644
27
index XXXXXXX..XXXXXXX 100644
11
--- a/include/disas/disas.h
28
--- a/scripts/decodetree.py
12
+++ b/include/disas/disas.h
29
+++ b/scripts/decodetree.py
13
@@ -XXX,XX +XXX,XX @@
30
@@ -XXX,XX +XXX,XX @@
14
#ifndef QEMU_DISAS_H
31
re_fmt_ident = '@[a-zA-Z0-9_]*'
15
#define QEMU_DISAS_H
32
re_pat_ident = '[a-zA-Z0-9_]*'
16
33
17
-#include "exec/hwaddr.h"
34
+# Local implementation of a topological sort. We use the same API that
18
-
35
+# the Python graphlib does, so that when QEMU moves forward to a
19
-#ifdef NEED_CPU_H
36
+# baseline of Python 3.9 or newer this code can all be dropped and
20
-#include "cpu.h"
37
+# replaced with:
21
-
38
+# from graphlib import TopologicalSorter, CycleError
22
/* Disassemble this for me please... (debugging). */
39
+#
23
void disas(FILE *out, const void *code, size_t size);
40
+# https://docs.python.org/3.9/library/graphlib.html#graphlib.TopologicalSorter
24
void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size);
41
+#
25
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size);
42
+# We only implement the parts of TopologicalSorter we care about:
26
43
+# ts = TopologicalSorter(graph=None)
27
/* Look up symbol for debugging purpose. Returns "" if unknown. */
44
+# create the sorter. graph is a dictionary whose keys are
28
const char *lookup_symbol(uint64_t orig_addr);
45
+# nodes and whose values are lists of the predecessors of that node.
29
-#endif
46
+# (That is, if graph contains "A" -> ["B", "C"] then we must output
30
47
+# B and C before A.)
31
struct syminfo;
48
+# ts.static_order()
32
struct elf32_sym;
49
+# returns a list of all the nodes in sorted order, or raises CycleError
33
diff --git a/disas/disas.c b/disas/disas.c
50
+# CycleError
34
index XXXXXXX..XXXXXXX 100644
51
+# exception raised if there are cycles in the graph. The second
35
--- a/disas/disas.c
52
+# element in the args attribute is a list of nodes which form a
36
+++ b/disas/disas.c
53
+# cycle; the first and last element are the same, eg [a, b, c, a]
37
@@ -XXX,XX +XXX,XX @@
54
+# (Our implementation doesn't give the order correctly.)
38
#include "disas/dis-asm.h"
55
+#
39
#include "elf.h"
56
+# For our purposes we can assume that the data set is always small
40
#include "qemu/qemu-print.h"
57
+# (typically 10 nodes or less, actual links in the graph very rare),
41
-
58
+# so we don't need to worry about efficiency of implementation.
42
#include "disas/disas.h"
59
+#
43
#include "disas/capstone.h"
60
+# The core of this implementation is from
44
+#include "hw/core/cpu.h"
61
+# https://code.activestate.com/recipes/578272-topological-sort/
45
+#include "exec/memory.h"
62
+# (but updated to Python 3), and is under the MIT license.
46
63
+
47
typedef struct CPUDebug {
64
+class CycleError(ValueError):
48
struct disassemble_info info;
65
+ """Subclass of ValueError raised if cycles exist in the graph"""
66
+ pass
67
+
68
+class TopologicalSorter:
69
+ """Topologically sort a graph"""
70
+ def __init__(self, graph=None):
71
+ self.graph = graph
72
+
73
+ def static_order(self):
74
+ # We do the sort right here, unlike the stdlib version
75
+ from functools import reduce
76
+ data = {}
77
+ r = []
78
+
79
+ if not self.graph:
80
+ return []
81
+
82
+ # This code wants the values in the dict to be specifically sets
83
+ for k, v in self.graph.items():
84
+ data[k] = set(v)
85
+
86
+ # Find all items that don't depend on anything.
87
+ extra_items_in_deps = (reduce(set.union, data.values())
88
+ - set(data.keys()))
89
+ # Add empty dependencies where needed
90
+ data.update({item:{} for item in extra_items_in_deps})
91
+ while True:
92
+ ordered = set(item for item, dep in data.items() if not dep)
93
+ if not ordered:
94
+ break
95
+ r.extend(ordered)
96
+ data = {item: (dep - ordered)
97
+ for item, dep in data.items()
98
+ if item not in ordered}
99
+ if data:
100
+ # This doesn't give as nice results as the stdlib, which
101
+ # gives you the cycle by listing the nodes in order. Here
102
+ # we only know the nodes in the cycle but not their order.
103
+ raise CycleError(f'nodes are in a cycle', list(data.keys()))
104
+
105
+ return r
106
+# end TopologicalSorter
107
+
108
def error_with_file(file, lineno, *args):
109
"""Print an error message from file:line and args and exit."""
110
global output_file
49
--
111
--
50
2.34.1
112
2.34.1
diff view generated by jsdifflib
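[Editor's note: a small usage sketch of the class added above, assuming TopologicalSorter and CycleError are in scope, either from scripts/decodetree.py or, on Python 3.9+, from graphlib.]

    # Hypothetical field-dependency graph: 'imm' references 'sz', so 'sz'
    # must be assigned first.  Values are lists of predecessors, as in graphlib.
    order = list(TopologicalSorter({'imm': ['sz'], 'sz': []}).static_order())
    assert order.index('sz') < order.index('imm')

    # A cyclic definition raises CycleError; args[1] names the offending nodes.
    try:
        list(TopologicalSorter({'a': ['b'], 'b': ['a']}).static_order())
    except CycleError as e:
        print('cycle through:', e.args[1])
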
1
Fix these before moving the file, for checkpatch.pl.
1
From: Peter Maydell <peter.maydell@linaro.org>
2
2
3
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
3
Implement support for named fields, i.e. where one field is defined
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
in terms of another, rather than directly in terms of bits extracted
5
Message-Id: <20230510170812.663149-1-richard.henderson@linaro.org>
5
from the instruction.
6
7
The new method referenced_fields() on all the Field classes returns a
8
list of fields that this field references. This just passes through,
9
except for the new NamedField class.
10
11
We can then use referenced_fields() to:
12
* construct a list of 'dangling references' for a format or
13
pattern, which is the fields that the format/pattern uses but
14
doesn't define itself
15
* do a topological sort, so that we output "field = value"
16
assignments in an order that means that we assign a field before
17
we reference it in a subsequent assignment
18
* check when we output the code for a pattern whether we need to
19
fill in the format fields before or after the pattern fields, and
20
do other error checking
21
22
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
23
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
24
Message-Id: <20230523120447.728365-6-peter.maydell@linaro.org>
6
---
25
---
7
disas.c | 11 ++++++-----
26
scripts/decodetree.py | 145 ++++++++++++++++++++++++++++++++++++++++--
8
1 file changed, 6 insertions(+), 5 deletions(-)
27
1 file changed, 139 insertions(+), 6 deletions(-)
9
28
10
diff --git a/disas.c b/disas.c
29
diff --git a/scripts/decodetree.py b/scripts/decodetree.py
11
index XXXXXXX..XXXXXXX 100644
30
index XXXXXXX..XXXXXXX 100644
12
--- a/disas.c
31
--- a/scripts/decodetree.py
13
+++ b/disas.c
32
+++ b/scripts/decodetree.py
14
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, target_ulong code,
33
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
15
}
34
s = 's' if self.sign else ''
16
35
return f'{s}extract{bitop_width}(insn, {self.pos}, {self.len})'
17
for (pc = code; size > 0; pc += count, size -= count) {
36
18
-    fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
37
+ def referenced_fields(self):
19
-    count = s.info.print_insn(pc, &s.info);
38
+ return []
20
-    fprintf(out, "\n");
39
+
21
-    if (count < 0)
40
def __eq__(self, other):
22
-     break;
41
return self.sign == other.sign and self.mask == other.mask
23
+ fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
42
24
+ count = s.info.print_insn(pc, &s.info);
43
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
25
+ fprintf(out, "\n");
44
pos += f.len
26
+ if (count < 0) {
45
return ret
27
+ break;
46
28
+ }
47
+ def referenced_fields(self):
29
if (size < count) {
48
+ l = []
30
fprintf(out,
49
+ for f in self.subs:
31
"Disassembler disagrees with translator over instruction "
50
+ l.extend(f.referenced_fields())
51
+ return l
52
+
53
def __ne__(self, other):
54
if len(self.subs) != len(other.subs):
55
return True
56
@@ -XXX,XX +XXX,XX @@ def __str__(self):
57
def str_extract(self, lvalue_formatter):
58
return str(self.value)
59
60
+ def referenced_fields(self):
61
+ return []
62
+
63
def __cmp__(self, other):
64
return self.value - other.value
65
# end ConstField
66
@@ -XXX,XX +XXX,XX @@ def str_extract(self, lvalue_formatter):
67
return (self.func + '(ctx, '
68
+ self.base.str_extract(lvalue_formatter) + ')')
69
70
+ def referenced_fields(self):
71
+ return self.base.referenced_fields()
72
+
73
def __eq__(self, other):
74
return self.func == other.func and self.base == other.base
75
76
@@ -XXX,XX +XXX,XX @@ def __str__(self):
77
def str_extract(self, lvalue_formatter):
78
return self.func + '(ctx)'
79
80
+ def referenced_fields(self):
81
+ return []
82
+
83
def __eq__(self, other):
84
return self.func == other.func
85
86
@@ -XXX,XX +XXX,XX @@ def __ne__(self, other):
87
return not self.__eq__(other)
88
# end ParameterField
89
90
+class NamedField:
91
+ """Class representing a field already named in the pattern"""
92
+ def __init__(self, name, sign, len):
93
+ self.mask = 0
94
+ self.sign = sign
95
+ self.len = len
96
+ self.name = name
97
+
98
+ def __str__(self):
99
+ return self.name
100
+
101
+ def str_extract(self, lvalue_formatter):
102
+ global bitop_width
103
+ s = 's' if self.sign else ''
104
+ lvalue = lvalue_formatter(self.name)
105
+ return f'{s}extract{bitop_width}({lvalue}, 0, {self.len})'
106
+
107
+ def referenced_fields(self):
108
+ return [self.name]
109
+
110
+ def __eq__(self, other):
111
+ return self.name == other.name
112
+
113
+ def __ne__(self, other):
114
+ return not self.__eq__(other)
115
+# end NamedField
116
117
class Arguments:
118
"""Class representing the extracted fields of a format"""
119
@@ -XXX,XX +XXX,XX @@ def output_def(self):
120
output('} ', self.struct_name(), ';\n\n')
121
# end Arguments
122
123
-
124
class General:
125
"""Common code between instruction formats and instruction patterns"""
126
def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
127
@@ -XXX,XX +XXX,XX @@ def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
128
self.fieldmask = fldm
129
self.fields = flds
130
self.width = w
131
+ self.dangling = None
132
133
def __str__(self):
134
return self.name + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
135
@@ -XXX,XX +XXX,XX @@ def __str__(self):
136
def str1(self, i):
137
return str_indent(i) + self.__str__()
138
139
+ def dangling_references(self):
140
+ # Return a list of all named references which aren't satisfied
141
+ # directly by this format/pattern. This will be either:
142
+ # * a format referring to a field which is specified by the
143
+ # pattern(s) using it
144
+ # * a pattern referring to a field which is specified by the
145
+ # format it uses
146
+ # * a user error (referring to a field that doesn't exist at all)
147
+ if self.dangling is None:
148
+ # Compute this once and cache the answer
149
+ dangling = []
150
+ for n, f in self.fields.items():
151
+ for r in f.referenced_fields():
152
+ if r not in self.fields:
153
+ dangling.append(r)
154
+ self.dangling = dangling
155
+ return self.dangling
156
+
157
def output_fields(self, indent, lvalue_formatter):
158
+ # We use a topological sort to ensure that any use of NamedField
159
+ # comes after the initialization of the field it is referencing.
160
+ graph = {}
161
for n, f in self.fields.items():
162
- output(indent, lvalue_formatter(n), ' = ',
163
- f.str_extract(lvalue_formatter), ';\n')
164
+ refs = f.referenced_fields()
165
+ graph[n] = refs
166
+
167
+ try:
168
+ ts = TopologicalSorter(graph)
169
+ for n in ts.static_order():
170
+ # We only want to emit assignments for the keys
171
+ # in our fields list, not for anything that ends up
172
+ # in the tsort graph only because it was referenced as
173
+ # a NamedField.
174
+ try:
175
+ f = self.fields[n]
176
+ output(indent, lvalue_formatter(n), ' = ',
177
+ f.str_extract(lvalue_formatter), ';\n')
178
+ except KeyError:
179
+ pass
180
+ except CycleError as e:
181
+ # The second element of args is a list of nodes which form
182
+ # a cycle (there might be others too, but only one is reported).
183
+ # Pretty-print it to tell the user.
184
+ cycle = ' => '.join(e.args[1])
185
+ error(self.lineno, 'field definitions form a cycle: ' + cycle)
186
# end General
187
188
189
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
190
ind = str_indent(i)
191
arg = self.base.base.name
192
output(ind, '/* ', self.file, ':', str(self.lineno), ' */\n')
193
+ # We might have named references in the format that refer to fields
194
+ # in the pattern, or named references in the pattern that refer
195
+ # to fields in the format. This affects whether we extract the fields
196
+ # for the format before or after the ones for the pattern.
197
+ # For simplicity we don't allow cross references in both directions.
198
+ # This is also where we catch the syntax error of referring to
199
+ # a nonexistent field.
200
+ fmt_refs = self.base.dangling_references()
201
+ for r in fmt_refs:
202
+ if r not in self.fields:
203
+ error(self.lineno, f'format refers to undefined field {r}')
204
+ pat_refs = self.dangling_references()
205
+ for r in pat_refs:
206
+ if r not in self.base.fields:
207
+ error(self.lineno, f'pattern refers to undefined field {r}')
208
+ if pat_refs and fmt_refs:
209
+ error(self.lineno, ('pattern that uses fields defined in format '
210
+ 'cannot use format that uses fields defined '
211
+ 'in pattern'))
212
+ if fmt_refs:
213
+ # pattern fields first
214
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
215
+ assert not extracted, "dangling fmt refs but it was already extracted"
216
if not extracted:
217
output(ind, self.base.extract_name(),
218
'(ctx, &u.f_', arg, ', insn);\n')
219
- self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
220
+ if not fmt_refs:
221
+ # pattern fields last
222
+ self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
223
+
224
output(ind, 'if (', translate_prefix, '_', self.name,
225
'(ctx, &u.f_', arg, ')) return true;\n')
226
227
@@ -XXX,XX +XXX,XX @@ def output_code(self, i, extracted, outerbits, outermask):
228
ind = str_indent(i)
229
230
# If we identified all nodes below have the same format,
231
- # extract the fields now.
232
- if not extracted and self.base:
233
+ # extract the fields now. But don't do it if the format relies
234
+ # on named fields from the insn pattern, as those won't have
235
+ # been initialised at this point.
236
+ if not extracted and self.base and not self.base.dangling_references():
237
output(ind, self.base.extract_name(),
238
'(ctx, &u.f_', self.base.base.name, ', insn);\n')
239
extracted = True
240
@@ -XXX,XX +XXX,XX @@ def parse_field(lineno, name, toks):
241
"""Parse one instruction field from TOKS at LINENO"""
242
global fields
243
global insnwidth
244
+ global re_C_ident
245
246
# A "simple" field will have only one entry;
247
# a "multifield" will have several.
248
@@ -XXX,XX +XXX,XX @@ def parse_field(lineno, name, toks):
249
func = func[1]
250
continue
251
252
+ if re.fullmatch(re_C_ident + ':s[0-9]+', t):
253
+ # Signed named field
254
+ subtoks = t.split(':')
255
+ n = subtoks[0]
256
+ le = int(subtoks[1])
257
+ f = NamedField(n, True, le)
258
+ subs.append(f)
259
+ width += le
260
+ continue
261
+ if re.fullmatch(re_C_ident + ':[0-9]+', t):
262
+ # Unsigned named field
263
+ subtoks = t.split(':')
264
+ n = subtoks[0]
265
+ le = int(subtoks[1])
266
+ f = NamedField(n, False, le)
267
+ subs.append(f)
268
+ width += le
269
+ continue
270
+
271
if re.fullmatch('[0-9]+:s[0-9]+', t):
272
# Signed field extract
273
subtoks = t.split(':s')
32
--
274
--
33
2.34.1
275
2.34.1
diff view generated by jsdifflib
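[Editor's note: the key difference between a NamedField and an ordinary field is where the bits come from. A condensed Python mirror of NamedField.str_extract() above, with illustrative names:]

    def named_field_extract(name, sign, length, lvalue_formatter, bitop_width=32):
        # the value is re-extracted from the already-assigned field 'name',
        # not from the instruction word
        s = 's' if sign else ''
        return f'{s}extract{bitop_width}({lvalue_formatter(name)}, 0, {length})'

    print(named_field_extract('sz', False, 3, lambda n: 'a->' + n))
    # -> extract32(a->sz, 0, 3)
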
1
Use uint64_t for the pc, and size_t for the size.
1
From: Peter Maydell <peter.maydell@linaro.org>
2
2
3
Reviewed-by: Thomas Huth <thuth@redhat.com>
3
Add some tests for various cases of named-field use, both ones that
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
should work and ones that should be diagnosed as errors.
5
Message-Id: <20230503072331.1747057-81-richard.henderson@linaro.org>
5
6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-Id: <20230523120447.728365-7-peter.maydell@linaro.org>
6
---
9
---
7
include/disas/disas.h | 17 ++++++-----------
10
tests/decode/err_field10.decode | 7 +++++++
8
bsd-user/elfload.c | 5 +++--
11
tests/decode/err_field7.decode | 7 +++++++
9
disas/disas.c | 19 +++++++++----------
12
tests/decode/err_field8.decode | 8 ++++++++
10
linux-user/elfload.c | 5 +++--
13
tests/decode/err_field9.decode | 14 ++++++++++++++
11
4 files changed, 21 insertions(+), 25 deletions(-)
14
tests/decode/succ_named_field.decode | 19 +++++++++++++++++++
15
tests/decode/meson.build | 5 +++++
16
6 files changed, 60 insertions(+)
17
create mode 100644 tests/decode/err_field10.decode
18
create mode 100644 tests/decode/err_field7.decode
19
create mode 100644 tests/decode/err_field8.decode
20
create mode 100644 tests/decode/err_field9.decode
21
create mode 100644 tests/decode/succ_named_field.decode
12
22
13
diff --git a/include/disas/disas.h b/include/disas/disas.h
23
diff --git a/tests/decode/err_field10.decode b/tests/decode/err_field10.decode
24
new file mode 100644
25
index XXXXXXX..XXXXXXX
26
--- /dev/null
27
+++ b/tests/decode/err_field10.decode
28
@@ -XXX,XX +XXX,XX @@
29
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
30
+# See the COPYING.LIB file in the top-level directory.
31
+
32
+# Diagnose formats which refer to undefined fields
33
+%field1 field2:3
34
+@fmt ........ ........ ........ ........ %field1
35
+insn 00000000 00000000 00000000 00000000 @fmt
36
diff --git a/tests/decode/err_field7.decode b/tests/decode/err_field7.decode
37
new file mode 100644
38
index XXXXXXX..XXXXXXX
39
--- /dev/null
40
+++ b/tests/decode/err_field7.decode
41
@@ -XXX,XX +XXX,XX @@
42
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
43
+# See the COPYING.LIB file in the top-level directory.
44
+
45
+# Diagnose fields whose definitions form a loop
46
+%field1 field2:3
47
+%field2 field1:4
48
+insn 00000000 00000000 00000000 00000000 %field1 %field2
49
diff --git a/tests/decode/err_field8.decode b/tests/decode/err_field8.decode
50
new file mode 100644
51
index XXXXXXX..XXXXXXX
52
--- /dev/null
53
+++ b/tests/decode/err_field8.decode
54
@@ -XXX,XX +XXX,XX @@
55
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
56
+# See the COPYING.LIB file in the top-level directory.
57
+
58
+# Diagnose patterns which refer to undefined fields
59
+&f1 f1 a
60
+%field1 field2:3
61
+@fmt ........ ........ ........ .... a:4 &f1
62
+insn 00000000 00000000 00000000 0000 .... @fmt f1=%field1
63
diff --git a/tests/decode/err_field9.decode b/tests/decode/err_field9.decode
64
new file mode 100644
65
index XXXXXXX..XXXXXXX
66
--- /dev/null
67
+++ b/tests/decode/err_field9.decode
68
@@ -XXX,XX +XXX,XX @@
69
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
70
+# See the COPYING.LIB file in the top-level directory.
71
+
72
+# Diagnose fields where the format refers to a field defined in the
73
+# pattern and the pattern refers to a field defined in the format.
74
+# This is theoretically not impossible to implement, but is not
75
+# supported by the script at this time.
76
+&abcd a b c d
77
+%refa a:3
78
+%refc c:4
79
+# Format defines 'c' and sets 'b' to an indirect ref to 'a'
80
+@fmt ........ ........ ........ c:8 &abcd b=%refa
81
+# Pattern defines 'a' and sets 'd' to an indirect ref to 'c'
82
+insn 00000000 00000000 00000000 ........ @fmt d=%refc a=6
83
diff --git a/tests/decode/succ_named_field.decode b/tests/decode/succ_named_field.decode
84
new file mode 100644
85
index XXXXXXX..XXXXXXX
86
--- /dev/null
87
+++ b/tests/decode/succ_named_field.decode
88
@@ -XXX,XX +XXX,XX @@
89
+# This work is licensed under the terms of the GNU LGPL, version 2 or later.
90
+# See the COPYING.LIB file in the top-level directory.
91
+
92
+# field using a named_field
93
+%imm_sz    8:8 sz:3
94
+insn 00000000 00000000 ........ 00000000 imm_sz=%imm_sz sz=1
95
+
96
+# Ditto, via a format. Here a field in the format
97
+# references a named field defined in the insn pattern:
98
+&imm_a imm alpha
99
+%foo 0:16 alpha:4
100
+@foo 00000001 ........ ........ ........ &imm_a imm=%foo
101
+i1 ........ 00000000 ........ ........ @foo alpha=1
102
+i2 ........ 00000001 ........ ........ @foo alpha=2
103
+
104
+# Here the named field is defined in the format and referenced
105
+# from the insn pattern:
106
+@bar 00000010 ........ ........ ........ &imm_a alpha=4
107
+i3 ........ 00000000 ........ ........ @bar imm=%foo
108
diff --git a/tests/decode/meson.build b/tests/decode/meson.build
14
index XXXXXXX..XXXXXXX 100644
109
index XXXXXXX..XXXXXXX 100644
15
--- a/include/disas/disas.h
110
--- a/tests/decode/meson.build
16
+++ b/include/disas/disas.h
111
+++ b/tests/decode/meson.build
17
@@ -XXX,XX +XXX,XX @@
112
@@ -XXX,XX +XXX,XX @@ err_tests = [
18
#include "cpu.h"
113
'err_field4.decode',
19
114
'err_field5.decode',
20
/* Disassemble this for me please... (debugging). */
115
'err_field6.decode',
21
-void disas(FILE *out, const void *code, unsigned long size);
116
+ 'err_field7.decode',
22
-void target_disas(FILE *out, CPUState *cpu, target_ulong code,
117
+ 'err_field8.decode',
23
- target_ulong size);
118
+ 'err_field9.decode',
24
+void disas(FILE *out, const void *code, size_t size);
119
+ 'err_field10.decode',
25
+void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size);
120
'err_init1.decode',
26
121
'err_init2.decode',
27
-void monitor_disas(Monitor *mon, CPUState *cpu,
122
'err_init3.decode',
28
- target_ulong pc, int nb_insn, int is_physical);
123
@@ -XXX,XX +XXX,XX @@ succ_tests = [
29
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
124
'succ_argset_type1.decode',
30
+ int nb_insn, bool is_physical);
125
'succ_function.decode',
31
126
'succ_ident1.decode',
32
char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size);
127
+ 'succ_named_field.decode',
33
128
'succ_pattern_group_nest1.decode',
34
/* Look up symbol for debugging purpose. Returns "" if unknown. */
129
'succ_pattern_group_nest2.decode',
35
-const char *lookup_symbol(target_ulong orig_addr);
130
'succ_pattern_group_nest3.decode',
36
+const char *lookup_symbol(uint64_t orig_addr);
37
#endif
38
39
struct syminfo;
40
struct elf32_sym;
41
struct elf64_sym;
42
43
-#if defined(CONFIG_USER_ONLY)
44
-typedef const char *(*lookup_symbol_t)(struct syminfo *s, target_ulong orig_addr);
45
-#else
46
-typedef const char *(*lookup_symbol_t)(struct syminfo *s, hwaddr orig_addr);
47
-#endif
48
+typedef const char *(*lookup_symbol_t)(struct syminfo *s, uint64_t orig_addr);
49
50
struct syminfo {
51
lookup_symbol_t lookup_symbol;
52
diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
53
index XXXXXXX..XXXXXXX 100644
54
--- a/bsd-user/elfload.c
55
+++ b/bsd-user/elfload.c
56
@@ -XXX,XX +XXX,XX @@ static abi_ulong load_elf_interp(struct elfhdr *interp_elf_ex,
57
58
static int symfind(const void *s0, const void *s1)
59
{
60
- target_ulong addr = *(target_ulong *)s0;
61
+ __typeof(sym->st_value) addr = *(uint64_t *)s0;
62
struct elf_sym *sym = (struct elf_sym *)s1;
63
int result = 0;
64
+
65
if (addr < sym->st_value) {
66
result = -1;
67
} else if (addr >= sym->st_value + sym->st_size) {
68
@@ -XXX,XX +XXX,XX @@ static int symfind(const void *s0, const void *s1)
69
return result;
70
}
71
72
-static const char *lookup_symbolxx(struct syminfo *s, target_ulong orig_addr)
73
+static const char *lookup_symbolxx(struct syminfo *s, uint64_t orig_addr)
74
{
75
#if ELF_CLASS == ELFCLASS32
76
struct elf_sym *syms = s->disas_symtab.elf32;
77
diff --git a/disas/disas.c b/disas/disas.c
78
index XXXXXXX..XXXXXXX 100644
79
--- a/disas/disas.c
80
+++ b/disas/disas.c
81
@@ -XXX,XX +XXX,XX @@ static void initialize_debug_host(CPUDebug *s)
82
}
83
84
/* Disassemble this for me please... (debugging). */
85
-void target_disas(FILE *out, CPUState *cpu, target_ulong code,
86
- target_ulong size)
87
+void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
88
{
89
- target_ulong pc;
90
+ uint64_t pc;
91
int count;
92
CPUDebug s;
93
94
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, target_ulong code,
95
}
96
97
for (pc = code; size > 0; pc += count, size -= count) {
98
- fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
99
+ fprintf(out, "0x%08" PRIx64 ": ", pc);
100
count = s.info.print_insn(pc, &s.info);
101
fprintf(out, "\n");
102
if (count < 0) {
103
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size)
104
}
105
106
/* Disassemble this for me please... (debugging). */
107
-void disas(FILE *out, const void *code, unsigned long size)
108
+void disas(FILE *out, const void *code, size_t size)
109
{
110
uintptr_t pc;
111
int count;
112
@@ -XXX,XX +XXX,XX @@ void disas(FILE *out, const void *code, unsigned long size)
113
}
114
115
/* Look up symbol for debugging purpose. Returns "" if unknown. */
116
-const char *lookup_symbol(target_ulong orig_addr)
117
+const char *lookup_symbol(uint64_t orig_addr)
118
{
119
const char *symbol = "";
120
struct syminfo *s;
121
@@ -XXX,XX +XXX,XX @@ physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
122
}
123
124
/* Disassembler for the monitor. */
125
-void monitor_disas(Monitor *mon, CPUState *cpu,
126
- target_ulong pc, int nb_insn, int is_physical)
127
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
128
+ int nb_insn, bool is_physical)
129
{
130
int count, i;
131
CPUDebug s;
132
@@ -XXX,XX +XXX,XX @@ void monitor_disas(Monitor *mon, CPUState *cpu,
133
}
134
135
if (!s.info.print_insn) {
136
- monitor_printf(mon, "0x" TARGET_FMT_lx
137
+ monitor_printf(mon, "0x%08" PRIx64
138
": Asm output not supported on this arch\n", pc);
139
return;
140
}
141
142
for (i = 0; i < nb_insn; i++) {
143
- g_string_append_printf(ds, "0x" TARGET_FMT_lx ": ", pc);
144
+ g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
145
count = s.info.print_insn(pc, &s.info);
146
g_string_append_c(ds, '\n');
147
if (count < 0) {
148
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
149
index XXXXXXX..XXXXXXX 100644
150
--- a/linux-user/elfload.c
151
+++ b/linux-user/elfload.c
152
@@ -XXX,XX +XXX,XX @@ static void load_elf_interp(const char *filename, struct image_info *info,
153
154
static int symfind(const void *s0, const void *s1)
155
{
156
- target_ulong addr = *(target_ulong *)s0;
157
struct elf_sym *sym = (struct elf_sym *)s1;
158
+ __typeof(sym->st_value) addr = *(uint64_t *)s0;
159
int result = 0;
160
+
161
if (addr < sym->st_value) {
162
result = -1;
163
} else if (addr >= sym->st_value + sym->st_size) {
164
@@ -XXX,XX +XXX,XX @@ static int symfind(const void *s0, const void *s1)
165
return result;
166
}
167
168
-static const char *lookup_symbolxx(struct syminfo *s, target_ulong orig_addr)
169
+static const char *lookup_symbolxx(struct syminfo *s, uint64_t orig_addr)
170
{
171
#if ELF_CLASS == ELFCLASS32
172
struct elf_sym *syms = s->disas_symtab.elf32;
173
--
131
--
174
2.34.1
132
2.34.1
diff view generated by jsdifflib
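[Editor's note: as a worked example of the first pattern in succ_named_field.decode above; the instruction word is made up, and the field semantics follow the documentation patch earlier in this series.]

    def extract(val, pos, length):
        return (val >> pos) & ((1 << length) - 1)

    insn = 0x0000ab00   # matches 'insn 00000000 00000000 ........ 00000000'
    sz = 1              # fixed by 'sz=1' in the pattern
    # '%imm_sz 8:8 sz:3': insn[15:8] forms the high bits and the named
    # field 'sz' the low three bits
    imm_sz = extract(insn, 8, 8) << 3 | extract(sz, 0, 3)
    assert imm_sz == 0x559
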
Deleted patch
1
From: Jamie Iles <quic_jiles@quicinc.com>
2
1
3
Expose qemu_cpu_list_lock globally so that we can use
4
WITH_QEMU_LOCK_GUARD and QEMU_LOCK_GUARD to simplify a few code paths
5
now and in future.
6
7
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
9
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Message-Id: <20230427020925.51003-2-quic_jiles@quicinc.com>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
---
13
include/exec/cpu-common.h | 1 +
14
cpus-common.c | 2 +-
15
linux-user/elfload.c | 13 +++++++------
16
migration/dirtyrate.c | 26 +++++++++++++-------------
17
trace/control-target.c | 9 ++++-----
18
5 files changed, 26 insertions(+), 25 deletions(-)
19
20
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
21
index XXXXXXX..XXXXXXX 100644
22
--- a/include/exec/cpu-common.h
23
+++ b/include/exec/cpu-common.h
24
@@ -XXX,XX +XXX,XX @@ extern intptr_t qemu_host_page_mask;
25
#define REAL_HOST_PAGE_ALIGN(addr) ROUND_UP((addr), qemu_real_host_page_size())
26
27
/* The CPU list lock nests outside page_(un)lock or mmap_(un)lock */
28
+extern QemuMutex qemu_cpu_list_lock;
29
void qemu_init_cpu_list(void);
30
void cpu_list_lock(void);
31
void cpu_list_unlock(void);
32
diff --git a/cpus-common.c b/cpus-common.c
33
index XXXXXXX..XXXXXXX 100644
34
--- a/cpus-common.c
35
+++ b/cpus-common.c
36
@@ -XXX,XX +XXX,XX @@
37
#include "qemu/lockable.h"
38
#include "trace/trace-root.h"
39
40
-static QemuMutex qemu_cpu_list_lock;
41
+QemuMutex qemu_cpu_list_lock;
42
static QemuCond exclusive_cond;
43
static QemuCond exclusive_resume;
44
static QemuCond qemu_work_cond;
45
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
46
index XXXXXXX..XXXXXXX 100644
47
--- a/linux-user/elfload.c
48
+++ b/linux-user/elfload.c
49
@@ -XXX,XX +XXX,XX @@
50
#include "qemu/guest-random.h"
51
#include "qemu/units.h"
52
#include "qemu/selfmap.h"
53
+#include "qemu/lockable.h"
54
#include "qapi/error.h"
55
#include "qemu/error-report.h"
56
#include "target_signal.h"
57
@@ -XXX,XX +XXX,XX @@ static int fill_note_info(struct elf_note_info *info,
58
info->notes_size += note_size(&info->notes[i]);
59
60
/* read and fill status of all threads */
61
- cpu_list_lock();
62
- CPU_FOREACH(cpu) {
63
- if (cpu == thread_cpu) {
64
- continue;
65
+ WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
66
+ CPU_FOREACH(cpu) {
67
+ if (cpu == thread_cpu) {
68
+ continue;
69
+ }
70
+ fill_thread_info(info, cpu->env_ptr);
71
}
72
- fill_thread_info(info, cpu->env_ptr);
73
}
74
- cpu_list_unlock();
75
76
return (0);
77
}
78
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
79
index XXXXXXX..XXXXXXX 100644
80
--- a/migration/dirtyrate.c
81
+++ b/migration/dirtyrate.c
82
@@ -XXX,XX +XXX,XX @@ int64_t vcpu_calculate_dirtyrate(int64_t calc_time_ms,
83
retry:
84
init_time_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
85
86
- cpu_list_lock();
87
- gen_id = cpu_list_generation_id_get();
88
- records = vcpu_dirty_stat_alloc(stat);
89
- vcpu_dirty_stat_collect(stat, records, true);
90
- cpu_list_unlock();
91
+ WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
92
+ gen_id = cpu_list_generation_id_get();
93
+ records = vcpu_dirty_stat_alloc(stat);
94
+ vcpu_dirty_stat_collect(stat, records, true);
95
+ }
96
97
duration = dirty_stat_wait(calc_time_ms, init_time_ms);
98
99
global_dirty_log_sync(flag, one_shot);
100
101
- cpu_list_lock();
102
- if (gen_id != cpu_list_generation_id_get()) {
103
- g_free(records);
104
- g_free(stat->rates);
105
- cpu_list_unlock();
106
- goto retry;
107
+ WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
108
+ if (gen_id != cpu_list_generation_id_get()) {
109
+ g_free(records);
110
+ g_free(stat->rates);
111
+ cpu_list_unlock();
112
+ goto retry;
113
+ }
114
+ vcpu_dirty_stat_collect(stat, records, false);
115
}
116
- vcpu_dirty_stat_collect(stat, records, false);
117
- cpu_list_unlock();
118
119
for (i = 0; i < stat->nvcpu; i++) {
120
dirtyrate = do_calculate_dirtyrate(records[i], duration);
121
diff --git a/trace/control-target.c b/trace/control-target.c
122
index XXXXXXX..XXXXXXX 100644
123
--- a/trace/control-target.c
124
+++ b/trace/control-target.c
125
@@ -XXX,XX +XXX,XX @@
126
*/
127
128
#include "qemu/osdep.h"
129
+#include "qemu/lockable.h"
130
#include "cpu.h"
131
#include "trace/trace-root.h"
132
#include "trace/control.h"
133
@@ -XXX,XX +XXX,XX @@ static bool adding_first_cpu1(void)
134
135
static bool adding_first_cpu(void)
136
{
137
- bool res;
138
- cpu_list_lock();
139
- res = adding_first_cpu1();
140
- cpu_list_unlock();
141
- return res;
142
+ QEMU_LOCK_GUARD(&qemu_cpu_list_lock);
143
+
144
+ return adding_first_cpu1();
145
}
146
147
void trace_init_vcpu(CPUState *vcpu)
148
--
149
2.34.1
150
151
diff view generated by jsdifflib
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label,
2
tcg_out_test_alignment, and some code that lived in both
3
tcg_out_qemu_ld and tcg_out_qemu_st into one function
4
that returns HostAddress and TCGLabelQemuLdst structures.
5
1
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/i386/tcg-target.c.inc | 346 ++++++++++++++++----------------------
10
1 file changed, 145 insertions(+), 201 deletions(-)
11
12
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/i386/tcg-target.c.inc
15
+++ b/tcg/i386/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
17
[MO_BEUQ] = helper_be_stq_mmu,
18
};
19
20
-/* Perform the TLB load and compare.
21
-
22
- Inputs:
23
- ADDRLO and ADDRHI contain the low and high part of the address.
24
-
25
- MEM_INDEX and S_BITS are the memory context and log2 size of the load.
26
-
27
- WHICH is the offset into the CPUTLBEntry structure of the slot to read.
28
- This should be offsetof addr_read or addr_write.
29
-
30
- Outputs:
31
- LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses)
32
- positions of the displacements of forward jumps to the TLB miss case.
33
-
34
- Second argument register is loaded with the low part of the address.
35
- In the TLB hit case, it has been adjusted as indicated by the TLB
36
- and so is a host address. In the TLB miss case, it continues to
37
- hold a guest address.
38
-
39
- First argument register is clobbered. */
40
-
41
-static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
42
- int mem_index, MemOp opc,
43
- tcg_insn_unit **label_ptr, int which)
44
-{
45
- TCGType ttype = TCG_TYPE_I32;
46
- TCGType tlbtype = TCG_TYPE_I32;
47
- int trexw = 0, hrexw = 0, tlbrexw = 0;
48
- unsigned a_bits = get_alignment_bits(opc);
49
- unsigned s_bits = opc & MO_SIZE;
50
- unsigned a_mask = (1 << a_bits) - 1;
51
- unsigned s_mask = (1 << s_bits) - 1;
52
- target_ulong tlb_mask;
53
-
54
- if (TCG_TARGET_REG_BITS == 64) {
55
- if (TARGET_LONG_BITS == 64) {
56
- ttype = TCG_TYPE_I64;
57
- trexw = P_REXW;
58
- }
59
- if (TCG_TYPE_PTR == TCG_TYPE_I64) {
60
- hrexw = P_REXW;
61
- if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
62
- tlbtype = TCG_TYPE_I64;
63
- tlbrexw = P_REXW;
64
- }
65
- }
66
- }
67
-
68
- tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
69
- tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
70
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
71
-
72
- tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
73
- TLB_MASK_TABLE_OFS(mem_index) +
74
- offsetof(CPUTLBDescFast, mask));
75
-
76
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
77
- TLB_MASK_TABLE_OFS(mem_index) +
78
- offsetof(CPUTLBDescFast, table));
79
-
80
- /* If the required alignment is at least as large as the access, simply
81
- copy the address and mask. For lesser alignments, check that we don't
82
- cross pages for the complete access. */
83
- if (a_bits >= s_bits) {
84
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
85
- } else {
86
- tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
87
- addrlo, s_mask - a_mask);
88
- }
89
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
90
- tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
91
-
92
- /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
93
- tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
94
- TCG_REG_L1, TCG_REG_L0, which);
95
-
96
- /* Prepare for both the fast path add of the tlb addend, and the slow
97
- path function argument setup. */
98
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
99
-
100
- /* jne slow_path */
101
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
102
- label_ptr[0] = s->code_ptr;
103
- s->code_ptr += 4;
104
-
105
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
106
- /* cmp 4(TCG_REG_L0), addrhi */
107
- tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, which + 4);
108
-
109
- /* jne slow_path */
110
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
111
- label_ptr[1] = s->code_ptr;
112
- s->code_ptr += 4;
113
- }
114
-
115
- /* TLB Hit. */
116
-
117
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
118
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L1, TCG_REG_L0,
119
- offsetof(CPUTLBEntry, addend));
120
-}
121
-
122
-/*
123
- * Record the context of a call to the out of line helper code for the slow path
124
- * for a load or store, so that we can later generate the correct helper code
125
- */
126
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
127
- TCGType type, MemOpIdx oi,
128
- TCGReg datalo, TCGReg datahi,
129
- TCGReg addrlo, TCGReg addrhi,
130
- tcg_insn_unit *raddr,
131
- tcg_insn_unit **label_ptr)
132
-{
133
- TCGLabelQemuLdst *label = new_ldst_label(s);
134
-
135
- label->is_ld = is_ld;
136
- label->oi = oi;
137
- label->type = type;
138
- label->datalo_reg = datalo;
139
- label->datahi_reg = datahi;
140
- label->addrlo_reg = addrlo;
141
- label->addrhi_reg = addrhi;
142
- label->raddr = tcg_splitwx_to_rx(raddr);
143
- label->label_ptr[0] = label_ptr[0];
144
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
145
- label->label_ptr[1] = label_ptr[1];
146
- }
147
-}
148
-
149
/*
150
* Generate code for the slow path for a load at the end of block
151
*/
152
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
153
return true;
154
}
155
#else
156
-
157
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
158
- TCGReg addrhi, unsigned a_bits)
159
-{
160
- unsigned a_mask = (1 << a_bits) - 1;
161
- TCGLabelQemuLdst *label;
162
-
163
- tcg_out_testi(s, addrlo, a_mask);
164
- /* jne slow_path */
165
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
166
-
167
- label = new_ldst_label(s);
168
- label->is_ld = is_ld;
169
- label->addrlo_reg = addrlo;
170
- label->addrhi_reg = addrhi;
171
- label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
172
- label->label_ptr[0] = s->code_ptr;
173
-
174
- s->code_ptr += 4;
175
-}
176
-
177
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
178
{
179
/* resolve label address */
180
@@ -XXX,XX +XXX,XX @@ static inline int setup_guest_base_seg(void)
181
#endif /* setup_guest_base_seg */
182
#endif /* SOFTMMU */
183
184
+/*
185
+ * For softmmu, perform the TLB load and compare.
186
+ * For useronly, perform any required alignment tests.
187
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
188
+ * is required and fill in @h with the host address for the fast path.
189
+ */
190
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
191
+ TCGReg addrlo, TCGReg addrhi,
192
+ MemOpIdx oi, bool is_ld)
193
+{
194
+ TCGLabelQemuLdst *ldst = NULL;
195
+ MemOp opc = get_memop(oi);
196
+ unsigned a_bits = get_alignment_bits(opc);
197
+ unsigned a_mask = (1 << a_bits) - 1;
198
+
199
+#ifdef CONFIG_SOFTMMU
200
+ int cmp_ofs = is_ld ? offsetof(CPUTLBEntry, addr_read)
201
+ : offsetof(CPUTLBEntry, addr_write);
202
+ TCGType ttype = TCG_TYPE_I32;
203
+ TCGType tlbtype = TCG_TYPE_I32;
204
+ int trexw = 0, hrexw = 0, tlbrexw = 0;
205
+ unsigned mem_index = get_mmuidx(oi);
206
+ unsigned s_bits = opc & MO_SIZE;
207
+ unsigned s_mask = (1 << s_bits) - 1;
208
+ target_ulong tlb_mask;
209
+
210
+ ldst = new_ldst_label(s);
211
+ ldst->is_ld = is_ld;
212
+ ldst->oi = oi;
213
+ ldst->addrlo_reg = addrlo;
214
+ ldst->addrhi_reg = addrhi;
215
+
216
+ if (TCG_TARGET_REG_BITS == 64) {
217
+ if (TARGET_LONG_BITS == 64) {
218
+ ttype = TCG_TYPE_I64;
219
+ trexw = P_REXW;
220
+ }
221
+ if (TCG_TYPE_PTR == TCG_TYPE_I64) {
222
+ hrexw = P_REXW;
223
+ if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
224
+ tlbtype = TCG_TYPE_I64;
225
+ tlbrexw = P_REXW;
226
+ }
227
+ }
228
+ }
229
+
230
+ tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
231
+ tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
232
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
233
+
234
+ tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
235
+ TLB_MASK_TABLE_OFS(mem_index) +
236
+ offsetof(CPUTLBDescFast, mask));
237
+
238
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
239
+ TLB_MASK_TABLE_OFS(mem_index) +
240
+ offsetof(CPUTLBDescFast, table));
241
+
242
+ /*
243
+ * If the required alignment is at least as large as the access, simply
244
+ * copy the address and mask. For lesser alignments, check that we don't
245
+ * cross pages for the complete access.
246
+ */
247
+ if (a_bits >= s_bits) {
248
+ tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
249
+ } else {
250
+ tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
251
+ addrlo, s_mask - a_mask);
252
+ }
253
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
254
+ tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
255
+
256
+ /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
257
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
258
+ TCG_REG_L1, TCG_REG_L0, cmp_ofs);
259
+
260
+ /*
261
+ * Prepare for both the fast path add of the tlb addend, and the slow
262
+ * path function argument setup.
263
+ */
264
+ *h = (HostAddress) {
265
+ .base = TCG_REG_L1,
266
+ .index = -1
267
+ };
268
+ tcg_out_mov(s, ttype, h->base, addrlo);
269
+
270
+ /* jne slow_path */
271
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
272
+ ldst->label_ptr[0] = s->code_ptr;
273
+ s->code_ptr += 4;
274
+
275
+ if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
276
+ /* cmp 4(TCG_REG_L0), addrhi */
277
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, cmp_ofs + 4);
278
+
279
+ /* jne slow_path */
280
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
281
+ ldst->label_ptr[1] = s->code_ptr;
282
+ s->code_ptr += 4;
283
+ }
284
+
285
+ /* TLB Hit. */
286
+
287
+ /* add addend(TCG_REG_L0), TCG_REG_L1 */
288
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
289
+ offsetof(CPUTLBEntry, addend));
290
+#else
291
+ if (a_bits) {
292
+ ldst = new_ldst_label(s);
293
+
294
+ ldst->is_ld = is_ld;
295
+ ldst->oi = oi;
296
+ ldst->addrlo_reg = addrlo;
297
+ ldst->addrhi_reg = addrhi;
298
+
299
+ tcg_out_testi(s, addrlo, a_mask);
300
+ /* jne slow_path */
301
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
302
+ ldst->label_ptr[0] = s->code_ptr;
303
+ s->code_ptr += 4;
304
+ }
305
+
306
+ *h = x86_guest_base;
307
+ h->base = addrlo;
308
+#endif
309
+
310
+ return ldst;
311
+}
312
+
313
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
314
HostAddress h, TCGType type, MemOp memop)
315
{
316
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
317
TCGReg addrlo, TCGReg addrhi,
318
MemOpIdx oi, TCGType data_type)
319
{
320
- MemOp opc = get_memop(oi);
321
+ TCGLabelQemuLdst *ldst;
322
HostAddress h;
323
324
-#if defined(CONFIG_SOFTMMU)
325
- tcg_insn_unit *label_ptr[2];
326
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
327
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));
328
329
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
330
- label_ptr, offsetof(CPUTLBEntry, addr_read));
331
-
332
- /* TLB Hit. */
333
- h.base = TCG_REG_L1;
334
- h.index = -1;
335
- h.ofs = 0;
336
- h.seg = 0;
337
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
338
-
339
- /* Record the current context of a load into ldst label */
340
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
341
- addrlo, addrhi, s->code_ptr, label_ptr);
342
-#else
343
- unsigned a_bits = get_alignment_bits(opc);
344
- if (a_bits) {
345
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
346
+ if (ldst) {
347
+ ldst->type = data_type;
348
+ ldst->datalo_reg = datalo;
349
+ ldst->datahi_reg = datahi;
350
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
351
}
352
-
353
- h = x86_guest_base;
354
- h.base = addrlo;
355
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
356
-#endif
357
}
358
359
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
360
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
361
TCGReg addrlo, TCGReg addrhi,
362
MemOpIdx oi, TCGType data_type)
363
{
364
- MemOp opc = get_memop(oi);
365
+ TCGLabelQemuLdst *ldst;
366
HostAddress h;
367
368
-#if defined(CONFIG_SOFTMMU)
369
- tcg_insn_unit *label_ptr[2];
370
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
371
+ tcg_out_qemu_st_direct(s, datalo, datahi, h, get_memop(oi));
372
373
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
374
- label_ptr, offsetof(CPUTLBEntry, addr_write));
375
-
376
- /* TLB Hit. */
377
- h.base = TCG_REG_L1;
378
- h.index = -1;
379
- h.ofs = 0;
380
- h.seg = 0;
381
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
382
-
383
- /* Record the current context of a store into ldst label */
384
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
385
- addrlo, addrhi, s->code_ptr, label_ptr);
386
-#else
387
- unsigned a_bits = get_alignment_bits(opc);
388
- if (a_bits) {
389
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
390
+ if (ldst) {
391
+ ldst->type = data_type;
392
+ ldst->datalo_reg = datalo;
393
+ ldst->datahi_reg = datahi;
394
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
395
}
396
-
397
- h = x86_guest_base;
398
- h.base = addrlo;
399
-
400
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
401
-#endif
402
}
403
404
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
405
--
406
2.34.1
407
408
diff view generated by jsdifflib
Deleted patch
1
Since tcg_out_{ld,st}_helper_args, the slow path no longer requires
2
the address argument to be set up by the tlb load sequence. Use a
3
plain load for the addend and indexed addressing with the original
4
input address register.
5
1
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/i386/tcg-target.c.inc | 25 ++++++++++---------------
10
1 file changed, 10 insertions(+), 15 deletions(-)
11
12
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/i386/tcg-target.c.inc
15
+++ b/tcg/i386/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
17
tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
18
} else {
19
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
20
- /* The second argument is already loaded with addrlo. */
21
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
22
+ l->addrlo_reg);
23
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
24
tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
25
(uintptr_t)l->raddr);
26
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
27
tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
28
} else {
29
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
30
- /* The second argument is already loaded with addrlo. */
31
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
32
+ l->addrlo_reg);
33
tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
34
tcg_target_call_iarg_regs[2], l->datalo_reg);
35
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
36
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
37
tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
38
TCG_REG_L1, TCG_REG_L0, cmp_ofs);
39
40
- /*
41
- * Prepare for both the fast path add of the tlb addend, and the slow
42
- * path function argument setup.
43
- */
44
- *h = (HostAddress) {
45
- .base = TCG_REG_L1,
46
- .index = -1
47
- };
48
- tcg_out_mov(s, ttype, h->base, addrlo);
49
-
50
/* jne slow_path */
51
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
52
ldst->label_ptr[0] = s->code_ptr;
53
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
54
}
55
56
/* TLB Hit. */
57
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_L0,
58
+ offsetof(CPUTLBEntry, addend));
59
60
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
61
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
62
- offsetof(CPUTLBEntry, addend));
63
+ *h = (HostAddress) {
64
+ .base = addrlo,
65
+ .index = TCG_REG_L0,
66
+ };
67
#else
68
if (a_bits) {
69
ldst = new_ldst_label(s);
70
--
71
2.34.1
72
73
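The i386 patch above stops copying the guest address into TCG_REG_L1; it loads the TLB addend into TCG_REG_L0 and uses base+index addressing with the original address register. Both sequences compute the same host address, as the tiny C model below shows (the addend value is made up for illustration).

#include <stdint.h>
#include <stdio.h>

/* Both sequences compute host = guest_addr + tlb_addend; the new one
 * does it as a single indexed access [addr_reg + addend_reg] instead of
 * first copying the address into a scratch register and adding in place. */
static uint64_t host_via_copy(uint64_t addend, uint64_t guest)
{
    uint64_t scratch = guest;    /* old: mov L1, addr */
    scratch += addend;           /* old: add addend(L0), L1 */
    return scratch;
}

static uint64_t host_indexed(uint64_t addend, uint64_t guest)
{
    return guest + addend;       /* new: access [addr + L0] directly */
}

int main(void)
{
    uint64_t addend = 0x7f0000000000ull - 0x400000ull;
    printf("%d\n", host_via_copy(addend, 0x401234u)
                   == host_indexed(addend, 0x401234u));   /* prints 1 */
    return 0;
}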
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
into one function that returns HostAddress and TCGLabelQemuLdst structures.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/aarch64/tcg-target.c.inc | 313 +++++++++++++++--------------------
9
1 file changed, 133 insertions(+), 180 deletions(-)
10
11
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/aarch64/tcg-target.c.inc
14
+++ b/tcg/aarch64/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
16
tcg_out_goto(s, lb->raddr);
17
return true;
18
}
19
-
20
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
21
- TCGType ext, TCGReg data_reg, TCGReg addr_reg,
22
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
23
-{
24
- TCGLabelQemuLdst *label = new_ldst_label(s);
25
-
26
- label->is_ld = is_ld;
27
- label->oi = oi;
28
- label->type = ext;
29
- label->datalo_reg = data_reg;
30
- label->addrlo_reg = addr_reg;
31
- label->raddr = tcg_splitwx_to_rx(raddr);
32
- label->label_ptr[0] = label_ptr;
33
-}
34
-
35
-/* We expect to use a 7-bit scaled negative offset from ENV. */
36
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
37
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
38
-
39
-/* These offsets are built into the LDP below. */
40
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
41
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
42
-
43
-/* Load and compare a TLB entry, emitting the conditional jump to the
44
- slow path for the failure case, which will be patched later when finalizing
45
- the slow path. Generated code returns the host addend in X1,
46
- clobbers X0,X2,X3,TMP. */
47
-static void tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
48
- tcg_insn_unit **label_ptr, int mem_index,
49
- bool is_read)
50
-{
51
- unsigned a_bits = get_alignment_bits(opc);
52
- unsigned s_bits = opc & MO_SIZE;
53
- unsigned a_mask = (1u << a_bits) - 1;
54
- unsigned s_mask = (1u << s_bits) - 1;
55
- TCGReg x3;
56
- TCGType mask_type;
57
- uint64_t compare_mask;
58
-
59
- mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
60
- ? TCG_TYPE_I64 : TCG_TYPE_I32);
61
-
62
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
63
- tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
64
- TLB_MASK_TABLE_OFS(mem_index), 1, 0);
65
-
66
- /* Extract the TLB index from the address into X0. */
67
- tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
68
- TCG_REG_X0, TCG_REG_X0, addr_reg,
69
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
70
-
71
- /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
72
- tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
73
-
74
- /* Load the tlb comparator into X0, and the fast path addend into X1. */
75
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1, is_read
76
- ? offsetof(CPUTLBEntry, addr_read)
77
- : offsetof(CPUTLBEntry, addr_write));
78
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
79
- offsetof(CPUTLBEntry, addend));
80
-
81
- /* For aligned accesses, we check the first byte and include the alignment
82
- bits within the address. For unaligned access, we check that we don't
83
- cross pages using the address of the last byte of the access. */
84
- if (a_bits >= s_bits) {
85
- x3 = addr_reg;
86
- } else {
87
- tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
88
- TCG_REG_X3, addr_reg, s_mask - a_mask);
89
- x3 = TCG_REG_X3;
90
- }
91
- compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
92
-
93
- /* Store the page mask part of the address into X3. */
94
- tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
95
- TCG_REG_X3, x3, compare_mask);
96
-
97
- /* Perform the address comparison. */
98
- tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
99
-
100
- /* If not equal, we jump to the slow path. */
101
- *label_ptr = s->code_ptr;
102
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
103
-}
104
-
105
#else
106
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
107
- unsigned a_bits)
108
-{
109
- unsigned a_mask = (1 << a_bits) - 1;
110
- TCGLabelQemuLdst *label = new_ldst_label(s);
111
-
112
- label->is_ld = is_ld;
113
- label->addrlo_reg = addr_reg;
114
-
115
- /* tst addr, #mask */
116
- tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
117
-
118
- label->label_ptr[0] = s->code_ptr;
119
-
120
- /* b.ne slow_path */
121
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
122
-
123
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
124
-}
125
-
126
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
127
{
128
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
129
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
130
}
131
#endif /* CONFIG_SOFTMMU */
132
133
+/*
134
+ * For softmmu, perform the TLB load and compare.
135
+ * For useronly, perform any required alignment tests.
136
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
137
+ * is required and fill in @h with the host address for the fast path.
138
+ */
139
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
140
+ TCGReg addr_reg, MemOpIdx oi,
141
+ bool is_ld)
142
+{
143
+ TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
144
+ TCGLabelQemuLdst *ldst = NULL;
145
+ MemOp opc = get_memop(oi);
146
+ unsigned a_bits = get_alignment_bits(opc);
147
+ unsigned a_mask = (1u << a_bits) - 1;
148
+
149
+#ifdef CONFIG_SOFTMMU
150
+ unsigned s_bits = opc & MO_SIZE;
151
+ unsigned s_mask = (1u << s_bits) - 1;
152
+ unsigned mem_index = get_mmuidx(oi);
153
+ TCGReg x3;
154
+ TCGType mask_type;
155
+ uint64_t compare_mask;
156
+
157
+ ldst = new_ldst_label(s);
158
+ ldst->is_ld = is_ld;
159
+ ldst->oi = oi;
160
+ ldst->addrlo_reg = addr_reg;
161
+
162
+ mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
163
+ ? TCG_TYPE_I64 : TCG_TYPE_I32);
164
+
165
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
166
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
167
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
168
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
169
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
170
+ tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
171
+ TLB_MASK_TABLE_OFS(mem_index), 1, 0);
172
+
173
+ /* Extract the TLB index from the address into X0. */
174
+ tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
175
+ TCG_REG_X0, TCG_REG_X0, addr_reg,
176
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
177
+
178
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
179
+ tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
180
+
181
+ /* Load the tlb comparator into X0, and the fast path addend into X1. */
182
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1,
183
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
184
+ : offsetof(CPUTLBEntry, addr_write));
185
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
186
+ offsetof(CPUTLBEntry, addend));
187
+
188
+ /*
189
+ * For aligned accesses, we check the first byte and include the alignment
190
+ * bits within the address. For unaligned access, we check that we don't
191
+ * cross pages using the address of the last byte of the access.
192
+ */
193
+ if (a_bits >= s_bits) {
194
+ x3 = addr_reg;
195
+ } else {
196
+ tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
197
+ TCG_REG_X3, addr_reg, s_mask - a_mask);
198
+ x3 = TCG_REG_X3;
199
+ }
200
+ compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
201
+
202
+ /* Store the page mask part of the address into X3. */
203
+ tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
204
+ TCG_REG_X3, x3, compare_mask);
205
+
206
+ /* Perform the address comparison. */
207
+ tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
208
+
209
+ /* If not equal, we jump to the slow path. */
210
+ ldst->label_ptr[0] = s->code_ptr;
211
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
212
+
213
+ *h = (HostAddress){
214
+ .base = TCG_REG_X1,
215
+ .index = addr_reg,
216
+ .index_ext = addr_type
217
+ };
218
+#else
219
+ if (a_mask) {
220
+ ldst = new_ldst_label(s);
221
+
222
+ ldst->is_ld = is_ld;
223
+ ldst->oi = oi;
224
+ ldst->addrlo_reg = addr_reg;
225
+
226
+ /* tst addr, #mask */
227
+ tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
228
+
229
+ /* b.ne slow_path */
230
+ ldst->label_ptr[0] = s->code_ptr;
231
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
232
+ }
233
+
234
+ if (USE_GUEST_BASE) {
235
+ *h = (HostAddress){
236
+ .base = TCG_REG_GUEST_BASE,
237
+ .index = addr_reg,
238
+ .index_ext = addr_type
239
+ };
240
+ } else {
241
+ *h = (HostAddress){
242
+ .base = addr_reg,
243
+ .index = TCG_REG_XZR,
244
+ .index_ext = TCG_TYPE_I64
245
+ };
246
+ }
247
+#endif
248
+
249
+ return ldst;
250
+}
251
+
252
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
253
TCGReg data_r, HostAddress h)
254
{
255
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp memop,
256
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
257
MemOpIdx oi, TCGType data_type)
258
{
259
- MemOp memop = get_memop(oi);
260
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
261
+ TCGLabelQemuLdst *ldst;
262
HostAddress h;
263
264
- /* Byte swapping is left to middle-end expansion. */
265
- tcg_debug_assert((memop & MO_BSWAP) == 0);
266
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
267
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_type, data_reg, h);
268
269
-#ifdef CONFIG_SOFTMMU
270
- tcg_insn_unit *label_ptr;
271
-
272
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 1);
273
-
274
- h = (HostAddress){
275
- .base = TCG_REG_X1,
276
- .index = addr_reg,
277
- .index_ext = addr_type
278
- };
279
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
280
-
281
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
282
- s->code_ptr, label_ptr);
283
-#else /* !CONFIG_SOFTMMU */
284
- unsigned a_bits = get_alignment_bits(memop);
285
- if (a_bits) {
286
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
287
+ if (ldst) {
288
+ ldst->type = data_type;
289
+ ldst->datalo_reg = data_reg;
290
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
291
}
292
- if (USE_GUEST_BASE) {
293
- h = (HostAddress){
294
- .base = TCG_REG_GUEST_BASE,
295
- .index = addr_reg,
296
- .index_ext = addr_type
297
- };
298
- } else {
299
- h = (HostAddress){
300
- .base = addr_reg,
301
- .index = TCG_REG_XZR,
302
- .index_ext = TCG_TYPE_I64
303
- };
304
- }
305
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
306
-#endif /* CONFIG_SOFTMMU */
307
}
308
309
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
310
MemOpIdx oi, TCGType data_type)
311
{
312
- MemOp memop = get_memop(oi);
313
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
314
+ TCGLabelQemuLdst *ldst;
315
HostAddress h;
316
317
- /* Byte swapping is left to middle-end expansion. */
318
- tcg_debug_assert((memop & MO_BSWAP) == 0);
319
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
320
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
321
322
-#ifdef CONFIG_SOFTMMU
323
- tcg_insn_unit *label_ptr;
324
-
325
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 0);
326
-
327
- h = (HostAddress){
328
- .base = TCG_REG_X1,
329
- .index = addr_reg,
330
- .index_ext = addr_type
331
- };
332
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
333
-
334
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
335
- s->code_ptr, label_ptr);
336
-#else /* !CONFIG_SOFTMMU */
337
- unsigned a_bits = get_alignment_bits(memop);
338
- if (a_bits) {
339
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
340
+ if (ldst) {
341
+ ldst->type = data_type;
342
+ ldst->datalo_reg = data_reg;
343
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
344
}
345
- if (USE_GUEST_BASE) {
346
- h = (HostAddress){
347
- .base = TCG_REG_GUEST_BASE,
348
- .index = addr_reg,
349
- .index_ext = addr_type
350
- };
351
- } else {
352
- h = (HostAddress){
353
- .base = addr_reg,
354
- .index = TCG_REG_XZR,
355
- .index_ext = TCG_TYPE_I64
356
- };
357
- }
358
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
359
-#endif /* CONFIG_SOFTMMU */
360
}
361
362
static const tcg_insn_unit *tb_ret_addr;
363
--
364
2.34.1
365
366
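The LDP/AND_LSR/ADD sequence in the aarch64 diff above computes the CPUTLBEntry address as table + ((addr >> (PAGE_BITS - ENTRY_BITS)) & mask), where the mask is pre-scaled by the entry size. The standalone model below uses invented constants and a fake entry layout purely to show the index arithmetic.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_BITS  12
#define ENTRY_BITS 5                      /* log2(sizeof(FakeTLBEntry)) */

typedef struct {
    uint64_t comparator;
    uint64_t addend;
    uint8_t  pad[16];
} FakeTLBEntry;                           /* 32 bytes = 1 << ENTRY_BITS */

/* entry address = table + ((addr >> (PAGE_BITS - ENTRY_BITS)) & mask),
 * with mask equal to (table size - 1) pre-scaled by the entry size;
 * this is what the AND_LSR + ADD pair in the diff computes. */
static FakeTLBEntry *tlb_entry_for(FakeTLBEntry *table, uint64_t mask,
                                   uint64_t addr)
{
    uint64_t ofs = (addr >> (PAGE_BITS - ENTRY_BITS)) & mask;
    return (FakeTLBEntry *)((uintptr_t)table + ofs);
}

int main(void)
{
    enum { N = 8 };                        /* 8-entry direct-mapped TLB */
    FakeTLBEntry table[N];
    uint64_t mask = (N - 1) << ENTRY_BITS; /* pre-scaled index mask */

    memset(table, 0, sizeof(table));
    FakeTLBEntry *e = tlb_entry_for(table, mask, 0x7000);
    printf("index %td\n", e - table);      /* 0x7000 >> 12 = 7, mod 8 = 7 */
    return 0;
}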
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
tcg_out_zext_addr_if_32_bit, and some code that lived in both
3
tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns
4
HostAddress and TCGLabelQemuLdst structures.
5
1
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/loongarch64/tcg-target.c.inc | 255 +++++++++++++------------------
10
1 file changed, 105 insertions(+), 150 deletions(-)
11
12
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/loongarch64/tcg-target.c.inc
15
+++ b/tcg/loongarch64/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[4] = {
17
[MO_64] = helper_le_stq_mmu,
18
};
19
20
-/* We expect to use a 12-bit negative offset from ENV. */
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
23
-
24
static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
25
{
26
tcg_out_opc_b(s, 0);
27
return reloc_br_sd10k16(s->code_ptr - 1, target);
28
}
29
30
-/*
31
- * Emits common code for TLB addend lookup, that eventually loads the
32
- * addend in TCG_REG_TMP2.
33
- */
34
-static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl, MemOpIdx oi,
35
- tcg_insn_unit **label_ptr, bool is_load)
36
-{
37
- MemOp opc = get_memop(oi);
38
- unsigned s_bits = opc & MO_SIZE;
39
- unsigned a_bits = get_alignment_bits(opc);
40
- tcg_target_long compare_mask;
41
- int mem_index = get_mmuidx(oi);
42
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
43
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
44
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
45
-
46
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
47
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
48
-
49
- tcg_out_opc_srli_d(s, TCG_REG_TMP2, addrl,
50
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
51
- tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
52
- tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
53
-
54
- /* Load the tlb comparator and the addend. */
55
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
56
- is_load ? offsetof(CPUTLBEntry, addr_read)
57
- : offsetof(CPUTLBEntry, addr_write));
58
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
59
- offsetof(CPUTLBEntry, addend));
60
-
61
- /* We don't support unaligned accesses. */
62
- if (a_bits < s_bits) {
63
- a_bits = s_bits;
64
- }
65
- /* Clear the non-page, non-alignment bits from the address. */
66
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
67
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
68
- tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addrl);
69
-
70
- /* Compare masked address with the TLB entry. */
71
- label_ptr[0] = s->code_ptr;
72
- tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
73
-
74
- /* TLB Hit - addend in TCG_REG_TMP2, ready for use. */
75
-}
76
-
77
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
78
- TCGType type,
79
- TCGReg datalo, TCGReg addrlo,
80
- void *raddr, tcg_insn_unit **label_ptr)
81
-{
82
- TCGLabelQemuLdst *label = new_ldst_label(s);
83
-
84
- label->is_ld = is_ld;
85
- label->oi = oi;
86
- label->type = type;
87
- label->datalo_reg = datalo;
88
- label->datahi_reg = 0; /* unused */
89
- label->addrlo_reg = addrlo;
90
- label->addrhi_reg = 0; /* unused */
91
- label->raddr = tcg_splitwx_to_rx(raddr);
92
- label->label_ptr[0] = label_ptr[0];
93
-}
94
-
95
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
96
{
97
MemOpIdx oi = l->oi;
98
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
99
return tcg_out_goto(s, l->raddr);
100
}
101
#else
102
-
103
-/*
104
- * Alignment helpers for user-mode emulation
105
- */
106
-
107
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
108
- unsigned a_bits)
109
-{
110
- TCGLabelQemuLdst *l = new_ldst_label(s);
111
-
112
- l->is_ld = is_ld;
113
- l->addrlo_reg = addr_reg;
114
-
115
- /*
116
- * Without micro-architecture details, we don't know which of bstrpick or
117
- * andi is faster, so use bstrpick as it's not constrained by imm field
118
- * width. (Not to say alignments >= 2^12 are going to happen any time
119
- * soon, though)
120
- */
121
- tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
122
-
123
- l->label_ptr[0] = s->code_ptr;
124
- tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
125
-
126
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
127
-}
128
-
129
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
130
{
131
/* resolve label address */
132
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
133
134
#endif /* CONFIG_SOFTMMU */
135
136
-/*
137
- * `ext32u` the address register into the temp register given,
138
- * if target is 32-bit, no-op otherwise.
139
- *
140
- * Returns the address register ready for use with TLB addend.
141
- */
142
-static TCGReg tcg_out_zext_addr_if_32_bit(TCGContext *s,
143
- TCGReg addr, TCGReg tmp)
144
-{
145
- if (TARGET_LONG_BITS == 32) {
146
- tcg_out_ext32u(s, tmp, addr);
147
- return tmp;
148
- }
149
- return addr;
150
-}
151
-
152
typedef struct {
153
TCGReg base;
154
TCGReg index;
155
} HostAddress;
156
157
+/*
158
+ * For softmmu, perform the TLB load and compare.
159
+ * For useronly, perform any required alignment tests.
160
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
161
+ * is required and fill in @h with the host address for the fast path.
162
+ */
163
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
164
+ TCGReg addr_reg, MemOpIdx oi,
165
+ bool is_ld)
166
+{
167
+ TCGLabelQemuLdst *ldst = NULL;
168
+ MemOp opc = get_memop(oi);
169
+ unsigned a_bits = get_alignment_bits(opc);
170
+
171
+#ifdef CONFIG_SOFTMMU
172
+ unsigned s_bits = opc & MO_SIZE;
173
+ int mem_index = get_mmuidx(oi);
174
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
175
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
176
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
177
+ tcg_target_long compare_mask;
178
+
179
+ ldst = new_ldst_label(s);
180
+ ldst->is_ld = is_ld;
181
+ ldst->oi = oi;
182
+ ldst->addrlo_reg = addr_reg;
183
+
184
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
185
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
186
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
187
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
188
+
189
+ tcg_out_opc_srli_d(s, TCG_REG_TMP2, addr_reg,
190
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
191
+ tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
192
+ tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
193
+
194
+ /* Load the tlb comparator and the addend. */
195
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
196
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
197
+ : offsetof(CPUTLBEntry, addr_write));
198
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
199
+ offsetof(CPUTLBEntry, addend));
200
+
201
+ /* We don't support unaligned accesses. */
202
+ if (a_bits < s_bits) {
203
+ a_bits = s_bits;
204
+ }
205
+ /* Clear the non-page, non-alignment bits from the address. */
206
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
207
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
208
+ tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
209
+
210
+ /* Compare masked address with the TLB entry. */
211
+ ldst->label_ptr[0] = s->code_ptr;
212
+ tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
213
+
214
+ h->index = TCG_REG_TMP2;
215
+#else
216
+ if (a_bits) {
217
+ ldst = new_ldst_label(s);
218
+
219
+ ldst->is_ld = is_ld;
220
+ ldst->oi = oi;
221
+ ldst->addrlo_reg = addr_reg;
222
+
223
+ /*
224
+ * Without micro-architecture details, we don't know which of
225
+ * bstrpick or andi is faster, so use bstrpick as it's not
226
+ * constrained by imm field width. Not to say alignments >= 2^12
227
+ * are going to happen any time soon.
228
+ */
229
+ tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
230
+
231
+ ldst->label_ptr[0] = s->code_ptr;
232
+ tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
233
+ }
234
+
235
+ h->index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
236
+#endif
237
+
238
+ if (TARGET_LONG_BITS == 32) {
239
+ h->base = TCG_REG_TMP0;
240
+ tcg_out_ext32u(s, h->base, addr_reg);
241
+ } else {
242
+ h->base = addr_reg;
243
+ }
244
+
245
+ return ldst;
246
+}
247
+
248
static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
249
TCGReg rd, HostAddress h)
250
{
251
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
252
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
253
MemOpIdx oi, TCGType data_type)
254
{
255
- MemOp opc = get_memop(oi);
256
+ TCGLabelQemuLdst *ldst;
257
HostAddress h;
258
259
-#ifdef CONFIG_SOFTMMU
260
- tcg_insn_unit *label_ptr[1];
261
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
262
+ tcg_out_qemu_ld_indexed(s, get_memop(oi), data_type, data_reg, h);
263
264
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
265
- h.index = TCG_REG_TMP2;
266
-#else
267
- unsigned a_bits = get_alignment_bits(opc);
268
- if (a_bits) {
269
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
270
+ if (ldst) {
271
+ ldst->type = data_type;
272
+ ldst->datalo_reg = data_reg;
273
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
274
}
275
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
276
-#endif
277
-
278
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
279
- tcg_out_qemu_ld_indexed(s, opc, data_type, data_reg, h);
280
-
281
-#ifdef CONFIG_SOFTMMU
282
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
283
- s->code_ptr, label_ptr);
284
-#endif
285
}
286
287
static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
288
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
289
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
290
MemOpIdx oi, TCGType data_type)
291
{
292
- MemOp opc = get_memop(oi);
293
+ TCGLabelQemuLdst *ldst;
294
HostAddress h;
295
296
-#ifdef CONFIG_SOFTMMU
297
- tcg_insn_unit *label_ptr[1];
298
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
299
+ tcg_out_qemu_st_indexed(s, get_memop(oi), data_reg, h);
300
301
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
302
- h.index = TCG_REG_TMP2;
303
-#else
304
- unsigned a_bits = get_alignment_bits(opc);
305
- if (a_bits) {
306
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
307
+ if (ldst) {
308
+ ldst->type = data_type;
309
+ ldst->datalo_reg = data_reg;
310
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
311
}
312
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
313
-#endif
314
-
315
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
316
- tcg_out_qemu_st_indexed(s, opc, data_reg, h);
317
-
318
-#ifdef CONFIG_SOFTMMU
319
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
320
- s->code_ptr, label_ptr);
321
-#endif
322
}
323
324
/*
325
--
326
2.34.1
327
328
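The compare_mask carried into prepare_host_addr above, TARGET_PAGE_MASK | ((1 << a_bits) - 1), deliberately keeps the low alignment bits of the address, so an insufficiently aligned access can never match the TLB comparator and falls through to the slow path. A standalone illustration with invented page-size constants:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12
#define PAGE_MASK (~((uint64_t)(1u << PAGE_BITS) - 1))

/* The comparator stores the page number; preserving the alignment bits
 * in the masked address forces a mismatch for unaligned accesses. */
static bool tlb_fast_hit(uint64_t addr, uint64_t tlb_comparator,
                         unsigned a_bits)
{
    uint64_t compare_mask = PAGE_MASK | ((1u << a_bits) - 1);
    return (addr & compare_mask) == tlb_comparator;
}

int main(void)
{
    uint64_t page = 0x12345000u;          /* comparator holds the page */
    printf("%d\n", tlb_fast_hit(page + 0x10, page, 2));  /* aligned: 1 */
    printf("%d\n", tlb_fast_hit(page + 0x11, page, 2));  /* unaligned: 0 */
    return 0;
}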
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
into one function that returns HostAddress and TCGLabelQemuLdst structures.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/ppc/tcg-target.c.inc | 381 ++++++++++++++++++---------------------
9
1 file changed, 172 insertions(+), 209 deletions(-)
10
11
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target.c.inc
14
+++ b/tcg/ppc/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
16
[MO_BEUQ] = helper_be_stq_mmu,
17
};
18
19
-/* We expect to use a 16-bit negative offset from ENV. */
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
22
-
23
-/* Perform the TLB load and compare. Places the result of the comparison
24
- in CR7, loads the addend of the TLB into R3, and returns the register
25
- containing the guest address (zero-extended into R4). Clobbers R0 and R2. */
26
-
27
-static TCGReg tcg_out_tlb_read(TCGContext *s, MemOp opc,
28
- TCGReg addrlo, TCGReg addrhi,
29
- int mem_index, bool is_read)
30
-{
31
- int cmp_off
32
- = (is_read
33
- ? offsetof(CPUTLBEntry, addr_read)
34
- : offsetof(CPUTLBEntry, addr_write));
35
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
36
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
37
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
38
- unsigned s_bits = opc & MO_SIZE;
39
- unsigned a_bits = get_alignment_bits(opc);
40
-
41
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
42
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
44
-
45
- /* Extract the page index, shifted into place for tlb index. */
46
- if (TCG_TARGET_REG_BITS == 32) {
47
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
48
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
49
- } else {
50
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
51
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
52
- }
53
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
54
-
55
- /* Load the TLB comparator. */
56
- if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
57
- uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
58
- ? LWZUX : LDUX);
59
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
60
- } else {
61
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
62
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
63
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
64
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
65
- } else {
66
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
67
- }
68
- }
69
-
70
- /* Load the TLB addend for use on the fast path. Do this asap
71
- to minimize any load use delay. */
72
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_REG_R3,
73
- offsetof(CPUTLBEntry, addend));
74
-
75
- /* Clear the non-page, non-alignment bits from the address */
76
- if (TCG_TARGET_REG_BITS == 32) {
77
- /* We don't support unaligned accesses on 32-bits.
78
- * Preserve the bottom bits and thus trigger a comparison
79
- * failure on unaligned accesses.
80
- */
81
- if (a_bits < s_bits) {
82
- a_bits = s_bits;
83
- }
84
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
85
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
86
- } else {
87
- TCGReg t = addrlo;
88
-
89
- /* If the access is unaligned, we need to make sure we fail if we
90
- * cross a page boundary. The trick is to add the access size-1
91
- * to the address before masking the low bits. That will make the
92
- * address overflow to the next page if we cross a page boundary,
93
- * which will then force a mismatch of the TLB compare.
94
- */
95
- if (a_bits < s_bits) {
96
- unsigned a_mask = (1 << a_bits) - 1;
97
- unsigned s_mask = (1 << s_bits) - 1;
98
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
99
- t = TCG_REG_R0;
100
- }
101
-
102
- /* Mask the address for the requested alignment. */
103
- if (TARGET_LONG_BITS == 32) {
104
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
105
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
106
- /* Zero-extend the address for use in the final address. */
107
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
108
- addrlo = TCG_REG_R4;
109
- } else if (a_bits == 0) {
110
- tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
111
- } else {
112
- tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
113
- 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
114
- tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
115
- }
116
- }
117
-
118
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
119
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
120
- 0, 7, TCG_TYPE_I32);
121
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
122
- tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
123
- } else {
124
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
125
- 0, 7, TCG_TYPE_TL);
126
- }
127
-
128
- return addrlo;
129
-}
130
-
131
-/* Record the context of a call to the out of line helper code for the slow
132
- path for a load or store, so that we can later generate the correct
133
- helper code. */
134
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
135
- TCGType type, MemOpIdx oi,
136
- TCGReg datalo_reg, TCGReg datahi_reg,
137
- TCGReg addrlo_reg, TCGReg addrhi_reg,
138
- tcg_insn_unit *raddr, tcg_insn_unit *lptr)
139
-{
140
- TCGLabelQemuLdst *label = new_ldst_label(s);
141
-
142
- label->is_ld = is_ld;
143
- label->type = type;
144
- label->oi = oi;
145
- label->datalo_reg = datalo_reg;
146
- label->datahi_reg = datahi_reg;
147
- label->addrlo_reg = addrlo_reg;
148
- label->addrhi_reg = addrhi_reg;
149
- label->raddr = tcg_splitwx_to_rx(raddr);
150
- label->label_ptr[0] = lptr;
151
-}
152
-
153
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
154
{
155
MemOpIdx oi = lb->oi;
156
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
157
return true;
158
}
159
#else
160
-
161
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
162
- TCGReg addrhi, unsigned a_bits)
163
-{
164
- unsigned a_mask = (1 << a_bits) - 1;
165
- TCGLabelQemuLdst *label = new_ldst_label(s);
166
-
167
- label->is_ld = is_ld;
168
- label->addrlo_reg = addrlo;
169
- label->addrhi_reg = addrhi;
170
-
171
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
172
- tcg_debug_assert(a_bits < 16);
173
- tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, a_mask));
174
-
175
- label->label_ptr[0] = s->code_ptr;
176
- tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
177
-
178
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
179
-}
180
-
181
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
182
{
183
if (!reloc_pc14(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
184
@@ -XXX,XX +XXX,XX @@ typedef struct {
185
TCGReg index;
186
} HostAddress;
187
188
+/*
189
+ * For softmmu, perform the TLB load and compare.
190
+ * For useronly, perform any required alignment tests.
191
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
192
+ * is required and fill in @h with the host address for the fast path.
193
+ */
194
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
195
+ TCGReg addrlo, TCGReg addrhi,
196
+ MemOpIdx oi, bool is_ld)
197
+{
198
+ TCGLabelQemuLdst *ldst = NULL;
199
+ MemOp opc = get_memop(oi);
200
+ unsigned a_bits = get_alignment_bits(opc);
201
+
202
+#ifdef CONFIG_SOFTMMU
203
+ int mem_index = get_mmuidx(oi);
204
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
205
+ : offsetof(CPUTLBEntry, addr_write);
206
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
207
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
208
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
209
+ unsigned s_bits = opc & MO_SIZE;
210
+
211
+ ldst = new_ldst_label(s);
212
+ ldst->is_ld = is_ld;
213
+ ldst->oi = oi;
214
+ ldst->addrlo_reg = addrlo;
215
+ ldst->addrhi_reg = addrhi;
216
+
217
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
218
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
219
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
220
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
221
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
222
+
223
+ /* Extract the page index, shifted into place for tlb index. */
224
+ if (TCG_TARGET_REG_BITS == 32) {
225
+ tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
226
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
227
+ } else {
228
+ tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
229
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
230
+ }
231
+ tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
232
+
233
+ /* Load the TLB comparator. */
234
+ if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
235
+ uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
236
+ ? LWZUX : LDUX);
237
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
238
+ } else {
239
+ tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
240
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
241
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
242
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
243
+ } else {
244
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
245
+ }
246
+ }
247
+
248
+ /*
249
+ * Load the TLB addend for use on the fast path.
250
+ * Do this asap to minimize any load use delay.
251
+ */
252
+ h->base = TCG_REG_R3;
253
+ tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
254
+ offsetof(CPUTLBEntry, addend));
255
+
256
+ /* Clear the non-page, non-alignment bits from the address */
257
+ if (TCG_TARGET_REG_BITS == 32) {
258
+ /*
259
+ * We don't support unaligned accesses on 32-bits.
260
+ * Preserve the bottom bits and thus trigger a comparison
261
+ * failure on unaligned accesses.
262
+ */
263
+ if (a_bits < s_bits) {
264
+ a_bits = s_bits;
265
+ }
266
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
267
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
268
+ } else {
269
+ TCGReg t = addrlo;
270
+
271
+ /*
272
+ * If the access is unaligned, we need to make sure we fail if we
273
+ * cross a page boundary. The trick is to add the access size-1
274
+ * to the address before masking the low bits. That will make the
275
+ * address overflow to the next page if we cross a page boundary,
276
+ * which will then force a mismatch of the TLB compare.
277
+ */
278
+ if (a_bits < s_bits) {
279
+ unsigned a_mask = (1 << a_bits) - 1;
280
+ unsigned s_mask = (1 << s_bits) - 1;
281
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
282
+ t = TCG_REG_R0;
283
+ }
284
+
285
+ /* Mask the address for the requested alignment. */
286
+ if (TARGET_LONG_BITS == 32) {
287
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
288
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
289
+ /* Zero-extend the address for use in the final address. */
290
+ tcg_out_ext32u(s, TCG_REG_R4, addrlo);
291
+ addrlo = TCG_REG_R4;
292
+ } else if (a_bits == 0) {
293
+ tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
294
+ } else {
295
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
296
+ 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
297
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
298
+ }
299
+ }
300
+ h->index = addrlo;
301
+
302
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
303
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
304
+ 0, 7, TCG_TYPE_I32);
305
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
306
+ tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
307
+ } else {
308
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
309
+ 0, 7, TCG_TYPE_TL);
310
+ }
311
+
312
+ /* Load a pointer into the current opcode w/conditional branch-link. */
313
+ ldst->label_ptr[0] = s->code_ptr;
314
+ tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
315
+#else
316
+ if (a_bits) {
317
+ ldst = new_ldst_label(s);
318
+ ldst->is_ld = is_ld;
319
+ ldst->oi = oi;
320
+ ldst->addrlo_reg = addrlo;
321
+ ldst->addrhi_reg = addrhi;
322
+
323
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
324
+ tcg_debug_assert(a_bits < 16);
325
+ tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, (1 << a_bits) - 1));
326
+
327
+ ldst->label_ptr[0] = s->code_ptr;
328
+ tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
329
+ }
330
+
331
+ h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
332
+ h->index = addrlo;
333
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
334
+ tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
335
+ h->index = TCG_REG_TMP1;
336
+ }
337
+#endif
338
+
339
+ return ldst;
340
+}
341
+
342
static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
343
TCGReg addrlo, TCGReg addrhi,
344
MemOpIdx oi, TCGType data_type)
345
{
346
MemOp opc = get_memop(oi);
347
- MemOp s_bits = opc & MO_SIZE;
348
+ TCGLabelQemuLdst *ldst;
349
HostAddress h;
350
351
-#ifdef CONFIG_SOFTMMU
352
- tcg_insn_unit *label_ptr;
353
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
354
355
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
356
- h.base = TCG_REG_R3;
357
-
358
- /* Load a pointer into the current opcode w/conditional branch-link. */
359
- label_ptr = s->code_ptr;
360
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
361
-#else /* !CONFIG_SOFTMMU */
362
- unsigned a_bits = get_alignment_bits(opc);
363
- if (a_bits) {
364
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
365
- }
366
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
367
- h.index = addrlo;
368
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
369
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
370
- h.index = TCG_REG_TMP1;
371
- }
372
-#endif
373
-
374
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
375
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
376
if (opc & MO_BSWAP) {
377
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
378
tcg_out32(s, LWBRX | TAB(datalo, h.base, h.index));
379
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
380
}
381
}
382
383
-#ifdef CONFIG_SOFTMMU
384
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
385
- addrlo, addrhi, s->code_ptr, label_ptr);
386
-#endif
387
+ if (ldst) {
388
+ ldst->type = data_type;
389
+ ldst->datalo_reg = datalo;
390
+ ldst->datahi_reg = datahi;
391
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
392
+ }
393
}
394
395
static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
396
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
397
MemOpIdx oi, TCGType data_type)
398
{
399
MemOp opc = get_memop(oi);
400
- MemOp s_bits = opc & MO_SIZE;
401
+ TCGLabelQemuLdst *ldst;
402
HostAddress h;
403
404
-#ifdef CONFIG_SOFTMMU
405
- tcg_insn_unit *label_ptr;
406
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
407
408
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
409
- h.base = TCG_REG_R3;
410
-
411
- /* Load a pointer into the current opcode w/conditional branch-link. */
412
- label_ptr = s->code_ptr;
413
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
414
-#else /* !CONFIG_SOFTMMU */
415
- unsigned a_bits = get_alignment_bits(opc);
416
- if (a_bits) {
417
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
418
- }
419
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
420
- h.index = addrlo;
421
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
422
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
423
- h.index = TCG_REG_TMP1;
424
- }
425
-#endif
426
-
427
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
428
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
429
if (opc & MO_BSWAP) {
430
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
431
tcg_out32(s, STWBRX | SAB(datalo, h.base, h.index));
432
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
433
}
434
}
435
436
-#ifdef CONFIG_SOFTMMU
437
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
438
- addrlo, addrhi, s->code_ptr, label_ptr);
439
-#endif
440
+ if (ldst) {
441
+ ldst->type = data_type;
442
+ ldst->datalo_reg = datalo;
443
+ ldst->datahi_reg = datahi;
444
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
445
+ }
446
}
447
448
static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
449
--
450
2.34.1
451
452
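The ppc diff above keeps the trick of adding (access size - 1) to the address before masking: an access that straddles a page boundary then overflows into the next page number and fails the TLB compare. A minimal model of that check, with illustrative constants rather than the real TARGET_PAGE_BITS:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12
#define PAGE_MASK (~((uint64_t)(1u << PAGE_BITS) - 1))

static bool same_page_fast_path(uint64_t addr, unsigned s_bits,
                                unsigned a_bits)
{
    uint64_t probe = addr;
    if (a_bits < s_bits) {
        unsigned s_mask = (1u << s_bits) - 1;
        unsigned a_mask = (1u << a_bits) - 1;
        probe += s_mask - a_mask;          /* address of the last byte */
    }
    /* Compare the page number of the probe with the page of addr. */
    return (probe & PAGE_MASK) == (addr & PAGE_MASK);
}

int main(void)
{
    /* 8-byte access (s_bits = 3), byte-aligned (a_bits = 0). */
    printf("%d\n", same_page_fast_path(0x1ff0, 3, 0));  /* fits: 1 */
    printf("%d\n", same_page_fast_path(0x1ffc, 3, 0));  /* crosses: 0 */
    return 0;
}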
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
into one function that returns TCGReg and TCGLabelQemuLdst.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/riscv/tcg-target.c.inc | 253 +++++++++++++++++--------------------
9
1 file changed, 114 insertions(+), 139 deletions(-)
10
11
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/riscv/tcg-target.c.inc
14
+++ b/tcg/riscv/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
16
#endif
17
};
18
19
-/* We expect to use a 12-bit negative offset from ENV. */
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
22
-
23
static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
24
{
25
tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, 0);
26
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
27
tcg_debug_assert(ok);
28
}
29
30
-static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addr, MemOpIdx oi,
31
- tcg_insn_unit **label_ptr, bool is_load)
32
-{
33
- MemOp opc = get_memop(oi);
34
- unsigned s_bits = opc & MO_SIZE;
35
- unsigned a_bits = get_alignment_bits(opc);
36
- tcg_target_long compare_mask;
37
- int mem_index = get_mmuidx(oi);
38
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
39
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
40
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
41
- TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
42
-
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
44
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
45
-
46
- tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr,
47
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
48
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
49
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
50
-
51
- /* Load the tlb comparator and the addend. */
52
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
53
- is_load ? offsetof(CPUTLBEntry, addr_read)
54
- : offsetof(CPUTLBEntry, addr_write));
55
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
56
- offsetof(CPUTLBEntry, addend));
57
-
58
- /* We don't support unaligned accesses. */
59
- if (a_bits < s_bits) {
60
- a_bits = s_bits;
61
- }
62
- /* Clear the non-page, non-alignment bits from the address. */
63
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
64
- if (compare_mask == sextreg(compare_mask, 0, 12)) {
65
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr, compare_mask);
66
- } else {
67
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
68
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr);
69
- }
70
-
71
- /* Compare masked address with the TLB entry. */
72
- label_ptr[0] = s->code_ptr;
73
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
74
-
75
- /* TLB Hit - translate address using addend. */
76
- if (TARGET_LONG_BITS == 32) {
77
- tcg_out_ext32u(s, TCG_REG_TMP0, addr);
78
- addr = TCG_REG_TMP0;
79
- }
80
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr);
81
- return TCG_REG_TMP0;
82
-}
83
-
84
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
85
- TCGType data_type, TCGReg data_reg,
86
- TCGReg addr_reg, void *raddr,
87
- tcg_insn_unit **label_ptr)
88
-{
89
- TCGLabelQemuLdst *label = new_ldst_label(s);
90
-
91
- label->is_ld = is_ld;
92
- label->oi = oi;
93
- label->type = data_type;
94
- label->datalo_reg = data_reg;
95
- label->addrlo_reg = addr_reg;
96
- label->raddr = tcg_splitwx_to_rx(raddr);
97
- label->label_ptr[0] = label_ptr[0];
98
-}
99
-
100
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
101
{
102
MemOpIdx oi = l->oi;
103
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
104
return true;
105
}
106
#else
107
-
108
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
109
- unsigned a_bits)
110
-{
111
- unsigned a_mask = (1 << a_bits) - 1;
112
- TCGLabelQemuLdst *l = new_ldst_label(s);
113
-
114
- l->is_ld = is_ld;
115
- l->addrlo_reg = addr_reg;
116
-
117
- /* We are expecting a_bits to max out at 7, so we can always use andi. */
118
- tcg_debug_assert(a_bits < 12);
119
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
120
-
121
- l->label_ptr[0] = s->code_ptr;
122
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
123
-
124
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
125
-}
126
-
127
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
128
{
129
/* resolve label address */
130
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
131
{
132
return tcg_out_fail_alignment(s, l);
133
}
134
-
135
#endif /* CONFIG_SOFTMMU */
136
137
+/*
138
+ * For softmmu, perform the TLB load and compare.
139
+ * For useronly, perform any required alignment tests.
140
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
141
+ * is required and fill in @h with the host address for the fast path.
142
+ */
143
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, TCGReg *pbase,
144
+ TCGReg addr_reg, MemOpIdx oi,
145
+ bool is_ld)
146
+{
147
+ TCGLabelQemuLdst *ldst = NULL;
148
+ MemOp opc = get_memop(oi);
149
+ unsigned a_bits = get_alignment_bits(opc);
150
+ unsigned a_mask = (1u << a_bits) - 1;
151
+
152
+#ifdef CONFIG_SOFTMMU
153
+ unsigned s_bits = opc & MO_SIZE;
154
+ int mem_index = get_mmuidx(oi);
155
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
156
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
157
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
158
+ TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
159
+ tcg_target_long compare_mask;
160
+
161
+ ldst = new_ldst_label(s);
162
+ ldst->is_ld = is_ld;
163
+ ldst->oi = oi;
164
+ ldst->addrlo_reg = addr_reg;
165
+
166
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
167
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
168
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
169
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
170
+
171
+ tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr_reg,
172
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
173
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
174
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
175
+
176
+ /* Load the tlb comparator and the addend. */
177
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
178
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
179
+ : offsetof(CPUTLBEntry, addr_write));
180
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
181
+ offsetof(CPUTLBEntry, addend));
182
+
183
+ /* We don't support unaligned accesses. */
184
+ if (a_bits < s_bits) {
185
+ a_bits = s_bits;
186
+ }
187
+ /* Clear the non-page, non-alignment bits from the address. */
188
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | a_mask;
189
+ if (compare_mask == sextreg(compare_mask, 0, 12)) {
190
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, compare_mask);
191
+ } else {
192
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
193
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
194
+ }
195
+
196
+ /* Compare masked address with the TLB entry. */
197
+ ldst->label_ptr[0] = s->code_ptr;
198
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
199
+
200
+ /* TLB Hit - translate address using addend. */
201
+ if (TARGET_LONG_BITS == 32) {
202
+ tcg_out_ext32u(s, TCG_REG_TMP0, addr_reg);
203
+ addr_reg = TCG_REG_TMP0;
204
+ }
205
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_reg);
206
+ *pbase = TCG_REG_TMP0;
207
+#else
208
+ if (a_mask) {
209
+ ldst = new_ldst_label(s);
210
+ ldst->is_ld = is_ld;
211
+ ldst->oi = oi;
212
+ ldst->addrlo_reg = addr_reg;
213
+
214
+ /* We are expecting a_bits max 7, so we can always use andi. */
215
+ tcg_debug_assert(a_bits < 12);
216
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
217
+
218
+ ldst->label_ptr[0] = s->code_ptr;
219
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
220
+ }
221
+
222
+ TCGReg base = addr_reg;
223
+ if (TARGET_LONG_BITS == 32) {
224
+ tcg_out_ext32u(s, TCG_REG_TMP0, base);
225
+ base = TCG_REG_TMP0;
226
+ }
227
+ if (guest_base != 0) {
228
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
229
+ base = TCG_REG_TMP0;
230
+ }
231
+ *pbase = base;
232
+#endif
233
+
234
+ return ldst;
235
+}
236
+
237
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
238
TCGReg base, MemOp opc, TCGType type)
239
{
240
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
241
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
242
MemOpIdx oi, TCGType data_type)
243
{
244
- MemOp opc = get_memop(oi);
245
+ TCGLabelQemuLdst *ldst;
246
TCGReg base;
247
248
-#if defined(CONFIG_SOFTMMU)
249
- tcg_insn_unit *label_ptr[1];
250
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, true);
251
+ tcg_out_qemu_ld_direct(s, data_reg, base, get_memop(oi), data_type);
252
253
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
254
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
255
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
256
- s->code_ptr, label_ptr);
257
-#else
258
- unsigned a_bits = get_alignment_bits(opc);
259
- if (a_bits) {
260
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
261
+ if (ldst) {
262
+ ldst->type = data_type;
263
+ ldst->datalo_reg = data_reg;
264
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
265
}
266
- base = addr_reg;
267
- if (TARGET_LONG_BITS == 32) {
268
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
269
- base = TCG_REG_TMP0;
270
- }
271
- if (guest_base != 0) {
272
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
273
- base = TCG_REG_TMP0;
274
- }
275
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
276
-#endif
277
}
278
279
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
280
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
281
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
282
MemOpIdx oi, TCGType data_type)
283
{
284
- MemOp opc = get_memop(oi);
285
+ TCGLabelQemuLdst *ldst;
286
TCGReg base;
287
288
-#if defined(CONFIG_SOFTMMU)
289
- tcg_insn_unit *label_ptr[1];
290
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, false);
291
+ tcg_out_qemu_st_direct(s, data_reg, base, get_memop(oi));
292
293
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
294
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
295
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
296
- s->code_ptr, label_ptr);
297
-#else
298
- unsigned a_bits = get_alignment_bits(opc);
299
- if (a_bits) {
300
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
301
+ if (ldst) {
302
+ ldst->type = data_type;
303
+ ldst->datalo_reg = data_reg;
304
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
305
}
306
- base = addr_reg;
307
- if (TARGET_LONG_BITS == 32) {
308
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
309
- base = TCG_REG_TMP0;
310
- }
311
- if (guest_base != 0) {
312
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
313
- base = TCG_REG_TMP0;
314
- }
315
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
316
-#endif
317
}
318
319
static const tcg_insn_unit *tb_ret_addr;
320
--
321
2.34.1
322
323
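In the user-only half of the riscv prepare_host_addr above, the guest address is zero-extended when TARGET_LONG_BITS is 32 and then offset by guest_base to form the single base register handed back via *pbase. A rough standalone equivalent; the guest_base value and the 32-bit flag below are just example inputs.

#include <stdint.h>
#include <stdio.h>

static uint64_t user_only_base(uint64_t guest_addr, uint64_t guest_base,
                               int guest_is_32bit)
{
    /* ext32u of the address when the guest is 32-bit, then ADD guest_base. */
    uint64_t a = guest_is_32bit ? (uint32_t)guest_addr : guest_addr;
    return a + guest_base;
}

int main(void)
{
    unsigned long long base =
        user_only_base(0xffffffff80001000ull, 0x10000, 1);
    printf("%#llx\n", base);   /* 0x80011000: high bits dropped, offset added */
    return 0;
}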
Deleted patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
tcg_prepare_user_ldst, and some code that lived in both tcg_out_qemu_ld
3
and tcg_out_qemu_st into one function that returns HostAddress and
4
TCGLabelQemuLdst structures.
5
1
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/s390x/tcg-target.c.inc | 263 ++++++++++++++++---------------------
10
1 file changed, 113 insertions(+), 150 deletions(-)
11
12
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/s390x/tcg-target.c.inc
15
+++ b/tcg/s390x/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
17
}
18
19
#if defined(CONFIG_SOFTMMU)
20
-/* We're expecting to use a 20-bit negative offset on the tlb memory ops. */
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
23
-
24
-/* Load and compare a TLB entry, leaving the flags set. Loads the TLB
25
- addend into R2. Returns a register with the santitized guest address. */
26
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
27
- int mem_index, bool is_ld)
28
-{
29
- unsigned s_bits = opc & MO_SIZE;
30
- unsigned a_bits = get_alignment_bits(opc);
31
- unsigned s_mask = (1 << s_bits) - 1;
32
- unsigned a_mask = (1 << a_bits) - 1;
33
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
34
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
35
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
36
- int ofs, a_off;
37
- uint64_t tlb_mask;
38
-
39
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
40
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
41
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
42
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
43
-
44
- /* For aligned accesses, we check the first byte and include the alignment
45
- bits within the address. For unaligned access, we check that we don't
46
- cross pages using the address of the last byte of the access. */
47
- a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
48
- tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
49
- if (a_off == 0) {
50
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
51
- } else {
52
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
53
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
54
- }
55
-
56
- if (is_ld) {
57
- ofs = offsetof(CPUTLBEntry, addr_read);
58
- } else {
59
- ofs = offsetof(CPUTLBEntry, addr_write);
60
- }
61
- if (TARGET_LONG_BITS == 32) {
62
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
63
- } else {
64
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
65
- }
66
-
67
- tcg_out_insn(s, RXY, LG, TCG_REG_R2, TCG_REG_R2, TCG_REG_NONE,
68
- offsetof(CPUTLBEntry, addend));
69
-
70
- if (TARGET_LONG_BITS == 32) {
71
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
72
- return TCG_REG_R3;
73
- }
74
- return addr_reg;
75
-}
76
-
77
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
78
- TCGType type, TCGReg data, TCGReg addr,
79
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
80
-{
81
- TCGLabelQemuLdst *label = new_ldst_label(s);
82
-
83
- label->is_ld = is_ld;
84
- label->oi = oi;
85
- label->type = type;
86
- label->datalo_reg = data;
87
- label->addrlo_reg = addr;
88
- label->raddr = tcg_splitwx_to_rx(raddr);
89
- label->label_ptr[0] = label_ptr;
90
-}
91
-
92
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
93
{
94
TCGReg addr_reg = lb->addrlo_reg;
95
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
96
return true;
97
}
98
#else
99
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld,
100
- TCGReg addrlo, unsigned a_bits)
101
-{
102
- unsigned a_mask = (1 << a_bits) - 1;
103
- TCGLabelQemuLdst *l = new_ldst_label(s);
104
-
105
- l->is_ld = is_ld;
106
- l->addrlo_reg = addrlo;
107
-
108
- /* We are expecting a_bits to max out at 7, much lower than TMLL. */
109
- tcg_debug_assert(a_bits < 16);
110
- tcg_out_insn(s, RI, TMLL, addrlo, a_mask);
111
-
112
- tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
113
- l->label_ptr[0] = s->code_ptr;
114
- s->code_ptr += 1;
115
-
116
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
117
-}
118
-
119
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
120
{
121
if (!patch_reloc(l->label_ptr[0], R_390_PC16DBL,
122
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
123
{
124
return tcg_out_fail_alignment(s, l);
125
}
126
+#endif /* CONFIG_SOFTMMU */
127
128
-static HostAddress tcg_prepare_user_ldst(TCGContext *s, TCGReg addr_reg)
129
+/*
130
+ * For softmmu, perform the TLB load and compare.
131
+ * For useronly, perform any required alignment tests.
132
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
133
+ * is required and fill in @h with the host address for the fast path.
134
+ */
135
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
136
+ TCGReg addr_reg, MemOpIdx oi,
137
+ bool is_ld)
138
{
139
- TCGReg index;
140
- int disp;
141
+ TCGLabelQemuLdst *ldst = NULL;
142
+ MemOp opc = get_memop(oi);
143
+ unsigned a_bits = get_alignment_bits(opc);
144
+ unsigned a_mask = (1u << a_bits) - 1;
145
146
+#ifdef CONFIG_SOFTMMU
147
+ unsigned s_bits = opc & MO_SIZE;
148
+ unsigned s_mask = (1 << s_bits) - 1;
149
+ int mem_index = get_mmuidx(oi);
150
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
151
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
152
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
153
+ int ofs, a_off;
154
+ uint64_t tlb_mask;
155
+
156
+ ldst = new_ldst_label(s);
157
+ ldst->is_ld = is_ld;
158
+ ldst->oi = oi;
159
+ ldst->addrlo_reg = addr_reg;
160
+
161
+ tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
162
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
163
+
164
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
165
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
166
+ tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
167
+ tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
168
+
169
+ /*
170
+ * For aligned accesses, we check the first byte and include the alignment
171
+ * bits within the address. For unaligned access, we check that we don't
172
+ * cross pages using the address of the last byte of the access.
173
+ */
174
+ a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
175
+ tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
176
+ if (a_off == 0) {
177
+ tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
178
+ } else {
179
+ tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
180
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
181
+ }
182
+
183
+ if (is_ld) {
184
+ ofs = offsetof(CPUTLBEntry, addr_read);
185
+ } else {
186
+ ofs = offsetof(CPUTLBEntry, addr_write);
187
+ }
188
+ if (TARGET_LONG_BITS == 32) {
189
+ tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
190
+ } else {
191
+ tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
192
+ }
193
+
194
+ tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
195
+ ldst->label_ptr[0] = s->code_ptr++;
196
+
197
+ h->index = TCG_REG_R2;
198
+ tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
199
+ offsetof(CPUTLBEntry, addend));
200
+
201
+ h->base = addr_reg;
202
+ if (TARGET_LONG_BITS == 32) {
203
+ tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
204
+ h->base = TCG_REG_R3;
205
+ }
206
+ h->disp = 0;
207
+#else
208
+ if (a_mask) {
209
+ ldst = new_ldst_label(s);
210
+ ldst->is_ld = is_ld;
211
+ ldst->oi = oi;
212
+ ldst->addrlo_reg = addr_reg;
213
+
214
+ /* We are expecting a_bits to max out at 7, much lower than TMLL. */
215
+ tcg_debug_assert(a_bits < 16);
216
+ tcg_out_insn(s, RI, TMLL, addr_reg, a_mask);
217
+
218
+ tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
219
+ ldst->label_ptr[0] = s->code_ptr++;
220
+ }
221
+
222
+ h->base = addr_reg;
223
if (TARGET_LONG_BITS == 32) {
224
tcg_out_ext32u(s, TCG_TMP0, addr_reg);
225
- addr_reg = TCG_TMP0;
226
+ h->base = TCG_TMP0;
227
}
228
if (guest_base < 0x80000) {
229
- index = TCG_REG_NONE;
230
- disp = guest_base;
231
+ h->index = TCG_REG_NONE;
232
+ h->disp = guest_base;
233
} else {
234
- index = TCG_GUEST_BASE_REG;
235
- disp = 0;
236
+ h->index = TCG_GUEST_BASE_REG;
237
+ h->disp = 0;
238
}
239
- return (HostAddress){ .base = addr_reg, .index = index, .disp = disp };
240
+#endif
241
+
242
+ return ldst;
243
}
244
-#endif /* CONFIG_SOFTMMU */
245
246
static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
247
MemOpIdx oi, TCGType data_type)
248
{
249
- MemOp opc = get_memop(oi);
250
+ TCGLabelQemuLdst *ldst;
251
HostAddress h;
252
253
-#ifdef CONFIG_SOFTMMU
254
- unsigned mem_index = get_mmuidx(oi);
255
- tcg_insn_unit *label_ptr;
256
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
257
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_reg, h);
258
259
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 1);
260
- h.index = TCG_REG_R2;
261
- h.disp = 0;
262
-
263
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
264
- label_ptr = s->code_ptr;
265
- s->code_ptr += 1;
266
-
267
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
268
-
269
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
270
- s->code_ptr, label_ptr);
271
-#else
272
- unsigned a_bits = get_alignment_bits(opc);
273
-
274
- if (a_bits) {
275
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
276
+ if (ldst) {
277
+ ldst->type = data_type;
278
+ ldst->datalo_reg = data_reg;
279
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
280
}
281
- h = tcg_prepare_user_ldst(s, addr_reg);
282
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
283
-#endif
284
}
285
286
static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
287
MemOpIdx oi, TCGType data_type)
288
{
289
- MemOp opc = get_memop(oi);
290
+ TCGLabelQemuLdst *ldst;
291
HostAddress h;
292
293
-#ifdef CONFIG_SOFTMMU
294
- unsigned mem_index = get_mmuidx(oi);
295
- tcg_insn_unit *label_ptr;
296
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
297
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
298
299
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 0);
300
- h.index = TCG_REG_R2;
301
- h.disp = 0;
302
-
303
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
304
- label_ptr = s->code_ptr;
305
- s->code_ptr += 1;
306
-
307
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
308
-
309
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
310
- s->code_ptr, label_ptr);
311
-#else
312
- unsigned a_bits = get_alignment_bits(opc);
313
-
314
- if (a_bits) {
315
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
316
+ if (ldst) {
317
+ ldst->type = data_type;
318
+ ldst->datalo_reg = data_reg;
319
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
320
}
321
- h = tcg_prepare_user_ldst(s, addr_reg);
322
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
323
-#endif
324
}
325
326
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
327
--
328
2.34.1
329
330
Deleted patch
1
Use tcg_out_st_helper_args. This eliminates the use of a tail call to
2
the store helper. This may or may not be an improvement, depending on
3
the call/return branch prediction of the host microarchitecture.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/i386/tcg-target.c.inc | 57 +++------------------------------------
9
1 file changed, 4 insertions(+), 53 deletions(-)
10
11
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/i386/tcg-target.c.inc
14
+++ b/tcg/i386/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
16
*/
17
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
18
{
19
- MemOpIdx oi = l->oi;
20
- MemOp opc = get_memop(oi);
21
- MemOp s_bits = opc & MO_SIZE;
22
+ MemOp opc = get_memop(l->oi);
23
tcg_insn_unit **label_ptr = &l->label_ptr[0];
24
- TCGReg retaddr;
25
26
/* resolve label address */
27
tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
28
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
29
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
30
}
31
32
- if (TCG_TARGET_REG_BITS == 32) {
33
- int ofs = 0;
34
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
35
+ tcg_out_branch(s, 1, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
36
37
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
38
- ofs += 4;
39
-
40
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
41
- ofs += 4;
42
-
43
- if (TARGET_LONG_BITS == 64) {
44
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
45
- ofs += 4;
46
- }
47
-
48
- tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs);
49
- ofs += 4;
50
-
51
- if (s_bits == MO_64) {
52
- tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs);
53
- ofs += 4;
54
- }
55
-
56
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
57
- ofs += 4;
58
-
59
- retaddr = TCG_REG_EAX;
60
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
61
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
62
- } else {
63
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
64
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
65
- l->addrlo_reg);
66
- tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
67
- tcg_target_call_iarg_regs[2], l->datalo_reg);
68
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
69
-
70
- if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
71
- retaddr = tcg_target_call_iarg_regs[4];
72
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
73
- } else {
74
- retaddr = TCG_REG_RAX;
75
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
76
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP,
77
- TCG_TARGET_CALL_STACK_OFFSET);
78
- }
79
- }
80
-
81
- /* "Tail call" to the helper, with the return address back inline. */
82
- tcg_out_push(s, retaddr);
83
- tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
84
+ tcg_out_jmp(s, l->raddr);
85
return true;
86
}
87
#else
88
--
89
2.34.1
90
91
Deleted patch
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args. This allows our local
3
tcg_out_arg_* infrastructure to be removed.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/arm/tcg-target.c.inc | 140 +++++----------------------------------
9
1 file changed, 18 insertions(+), 122 deletions(-)
10
11
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/arm/tcg-target.c.inc
14
+++ b/tcg/arm/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ tcg_out_ldrd_rwb(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, TCGReg rm)
16
tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1);
17
}
18
19
-static void tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt,
20
- TCGReg rn, int imm8)
21
+static void __attribute__((unused))
22
+tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, int imm8)
23
{
24
tcg_out_memop_8(s, cond, INSN_STRD_IMM, rt, rn, imm8, 1, 0);
25
}
26
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ext8u(TCGContext *s, TCGReg rd, TCGReg rn)
27
tcg_out_dat_imm(s, COND_AL, ARITH_AND, rd, rn, 0xff);
28
}
29
30
-static void __attribute__((unused))
31
-tcg_out_ext8u_cond(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
32
-{
33
- tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
34
-}
35
-
36
static void tcg_out_ext16s(TCGContext *s, TCGType t, TCGReg rd, TCGReg rn)
37
{
38
/* sxth */
39
tcg_out32(s, 0x06bf0070 | (COND_AL << 28) | (rd << 12) | rn);
40
}
41
42
-static void tcg_out_ext16u_cond(TCGContext *s, ARMCond cond,
43
- TCGReg rd, TCGReg rn)
44
-{
45
- /* uxth */
46
- tcg_out32(s, 0x06ff0070 | (cond << 28) | (rd << 12) | rn);
47
-}
48
-
49
static void tcg_out_ext16u(TCGContext *s, TCGReg rd, TCGReg rn)
50
{
51
- tcg_out_ext16u_cond(s, COND_AL, rd, rn);
52
+ /* uxth */
53
+ tcg_out32(s, 0x06ff0070 | (COND_AL << 28) | (rd << 12) | rn);
54
}
55
56
static void tcg_out_ext32s(TCGContext *s, TCGReg rd, TCGReg rn)
57
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
58
#endif
59
};
60
61
-/* Helper routines for marshalling helper function arguments into
62
- * the correct registers and stack.
63
- * argreg is where we want to put this argument, arg is the argument itself.
64
- * Return value is the updated argreg ready for the next call.
65
- * Note that argreg 0..3 is real registers, 4+ on stack.
66
- *
67
- * We provide routines for arguments which are: immediate, 32 bit
68
- * value in register, 16 and 8 bit values in register (which must be zero
69
- * extended before use) and 64 bit value in a lo:hi register pair.
70
- */
71
-#define DEFINE_TCG_OUT_ARG(NAME, ARGTYPE, MOV_ARG, EXT_ARG) \
72
-static TCGReg NAME(TCGContext *s, TCGReg argreg, ARGTYPE arg) \
73
-{ \
74
- if (argreg < 4) { \
75
- MOV_ARG(s, COND_AL, argreg, arg); \
76
- } else { \
77
- int ofs = (argreg - 4) * 4; \
78
- EXT_ARG; \
79
- tcg_debug_assert(ofs + 4 <= TCG_STATIC_CALL_ARGS_SIZE); \
80
- tcg_out_st32_12(s, COND_AL, arg, TCG_REG_CALL_STACK, ofs); \
81
- } \
82
- return argreg + 1; \
83
-}
84
-
85
-DEFINE_TCG_OUT_ARG(tcg_out_arg_imm32, uint32_t, tcg_out_movi32,
86
- (tcg_out_movi32(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
87
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg8, TCGReg, tcg_out_ext8u_cond,
88
- (tcg_out_ext8u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
89
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg16, TCGReg, tcg_out_ext16u_cond,
90
- (tcg_out_ext16u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
91
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg32, TCGReg, tcg_out_mov_reg, )
92
-
93
-static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
94
- TCGReg arglo, TCGReg arghi)
95
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
96
{
97
- /* 64 bit arguments must go in even/odd register pairs
98
- * and in 8-aligned stack slots.
99
- */
100
- if (argreg & 1) {
101
- argreg++;
102
- }
103
- if (argreg >= 4 && (arglo & 1) == 0 && arghi == arglo + 1) {
104
- tcg_out_strd_8(s, COND_AL, arglo,
105
- TCG_REG_CALL_STACK, (argreg - 4) * 4);
106
- return argreg + 2;
107
- } else {
108
- argreg = tcg_out_arg_reg32(s, argreg, arglo);
109
- argreg = tcg_out_arg_reg32(s, argreg, arghi);
110
- return argreg;
111
- }
112
+ /* We arrive at the slow path via "BLNE", so R14 contains l->raddr. */
113
+ return TCG_REG_R14;
114
}
115
116
+static const TCGLdstHelperParam ldst_helper_param = {
117
+ .ra_gen = ldst_ra_gen,
118
+ .ntmp = 1,
119
+ .tmp = { TCG_REG_TMP },
120
+};
121
+
122
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
123
{
124
- TCGReg argreg;
125
- MemOpIdx oi = lb->oi;
126
- MemOp opc = get_memop(oi);
127
+ MemOp opc = get_memop(lb->oi);
128
129
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
130
return false;
131
}
132
133
- argreg = tcg_out_arg_reg32(s, TCG_REG_R0, TCG_AREG0);
134
- if (TARGET_LONG_BITS == 64) {
135
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
136
- } else {
137
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
138
- }
139
- argreg = tcg_out_arg_imm32(s, argreg, oi);
140
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
141
-
142
- /* Use the canonical unsigned helpers and minimize icache usage. */
143
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
144
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
145
-
146
- if ((opc & MO_SIZE) == MO_64) {
147
- TCGMovExtend ext[2] = {
148
- { .dst = lb->datalo_reg, .dst_type = TCG_TYPE_I32,
149
- .src = TCG_REG_R0, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
150
- { .dst = lb->datahi_reg, .dst_type = TCG_TYPE_I32,
151
- .src = TCG_REG_R1, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
152
- };
153
- tcg_out_movext2(s, &ext[0], &ext[1], TCG_REG_TMP);
154
- } else {
155
- tcg_out_movext(s, TCG_TYPE_I32, lb->datalo_reg,
156
- TCG_TYPE_I32, opc & MO_SSIZE, TCG_REG_R0);
157
- }
158
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
159
160
tcg_out_goto(s, COND_AL, lb->raddr);
161
return true;
162
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
163
164
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
165
{
166
- TCGReg argreg, datalo, datahi;
167
- MemOpIdx oi = lb->oi;
168
- MemOp opc = get_memop(oi);
169
+ MemOp opc = get_memop(lb->oi);
170
171
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
172
return false;
173
}
174
175
- argreg = TCG_REG_R0;
176
- argreg = tcg_out_arg_reg32(s, argreg, TCG_AREG0);
177
- if (TARGET_LONG_BITS == 64) {
178
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
179
- } else {
180
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
181
- }
182
-
183
- datalo = lb->datalo_reg;
184
- datahi = lb->datahi_reg;
185
- switch (opc & MO_SIZE) {
186
- case MO_8:
187
- argreg = tcg_out_arg_reg8(s, argreg, datalo);
188
- break;
189
- case MO_16:
190
- argreg = tcg_out_arg_reg16(s, argreg, datalo);
191
- break;
192
- case MO_32:
193
- default:
194
- argreg = tcg_out_arg_reg32(s, argreg, datalo);
195
- break;
196
- case MO_64:
197
- argreg = tcg_out_arg_reg64(s, argreg, datalo, datahi);
198
- break;
199
- }
200
-
201
- argreg = tcg_out_arg_imm32(s, argreg, oi);
202
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
203
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
204
205
/* Tail-call to the helper, which will return to the fast path. */
206
tcg_out_goto(s, COND_AL, qemu_st_helpers[opc & MO_SIZE]);
207
--
208
2.34.1
209
210
Deleted patch
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args.
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/loongarch64/tcg-target.c.inc | 37 ++++++++++----------------------
8
1 file changed, 11 insertions(+), 26 deletions(-)
9
10
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/loongarch64/tcg-target.c.inc
13
+++ b/tcg/loongarch64/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
15
return reloc_br_sd10k16(s->code_ptr - 1, target);
16
}
17
18
+static const TCGLdstHelperParam ldst_helper_param = {
19
+ .ntmp = 1, .tmp = { TCG_REG_TMP0 }
20
+};
21
+
22
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
23
{
24
- MemOpIdx oi = l->oi;
25
- MemOp opc = get_memop(oi);
26
- MemOp size = opc & MO_SIZE;
27
+ MemOp opc = get_memop(l->oi);
28
29
/* resolve label address */
30
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
31
return false;
32
}
33
34
- /* call load helper */
35
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
36
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
37
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A2, oi);
38
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, (tcg_target_long)l->raddr);
39
-
40
- tcg_out_call_int(s, qemu_ld_helpers[size], false);
41
-
42
- tcg_out_movext(s, l->type, l->datalo_reg,
43
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_A0);
44
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
45
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE], false);
46
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
47
return tcg_out_goto(s, l->raddr);
48
}
49
50
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
51
{
52
- MemOpIdx oi = l->oi;
53
- MemOp opc = get_memop(oi);
54
- MemOp size = opc & MO_SIZE;
55
+ MemOp opc = get_memop(l->oi);
56
57
/* resolve label address */
58
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
59
return false;
60
}
61
62
- /* call store helper */
63
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
64
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
65
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I32 : TCG_TYPE_I32, TCG_REG_A2,
66
- l->type, size, l->datalo_reg);
67
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, oi);
68
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A4, (tcg_target_long)l->raddr);
69
-
70
- tcg_out_call_int(s, qemu_st_helpers[size], false);
71
-
72
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
73
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
74
return tcg_out_goto(s, l->raddr);
75
}
76
#else
77
--
78
2.34.1
79
80
Deleted patch
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args.
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/ppc/tcg-target.c.inc | 88 ++++++++++++----------------------------
9
1 file changed, 26 insertions(+), 62 deletions(-)
10
11
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target.c.inc
14
+++ b/tcg/ppc/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
16
[MO_BEUQ] = helper_be_stq_mmu,
17
};
18
19
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
20
+{
21
+ if (arg < 0) {
22
+ arg = TCG_REG_TMP1;
23
+ }
24
+ tcg_out32(s, MFSPR | RT(arg) | LR);
25
+ return arg;
26
+}
27
+
28
+/*
29
+ * For the purposes of ppc32 sorting 4 input registers into 4 argument
30
+ * registers, there is an outside chance we would require 3 temps.
31
+ * Because of constraints, no inputs are in r3, and env will not be
32
+ * placed into r3 until after the sorting is done, and is thus free.
33
+ */
34
+static const TCGLdstHelperParam ldst_helper_param = {
35
+ .ra_gen = ldst_ra_gen,
36
+ .ntmp = 3,
37
+ .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
38
+};
39
+
40
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
41
{
42
- MemOpIdx oi = lb->oi;
43
- MemOp opc = get_memop(oi);
44
- TCGReg hi, lo, arg = TCG_REG_R3;
45
+ MemOp opc = get_memop(lb->oi);
46
47
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
48
return false;
49
}
50
51
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
52
-
53
- lo = lb->addrlo_reg;
54
- hi = lb->addrhi_reg;
55
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
56
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
57
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
58
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
59
- } else {
60
- /* If the address needed to be zero-extended, we'll have already
61
- placed it in R4. The only remaining case is 64-bit guest. */
62
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
63
- }
64
-
65
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
66
- tcg_out32(s, MFSPR | RT(arg) | LR);
67
-
68
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
69
tcg_out_call_int(s, LK, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
70
-
71
- lo = lb->datalo_reg;
72
- hi = lb->datahi_reg;
73
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
74
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_REG_R4);
75
- tcg_out_mov(s, TCG_TYPE_I32, hi, TCG_REG_R3);
76
- } else {
77
- tcg_out_movext(s, lb->type, lo,
78
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_R3);
79
- }
80
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
81
82
tcg_out_b(s, 0, lb->raddr);
83
return true;
84
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
85
86
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
87
{
88
- MemOpIdx oi = lb->oi;
89
- MemOp opc = get_memop(oi);
90
- MemOp s_bits = opc & MO_SIZE;
91
- TCGReg hi, lo, arg = TCG_REG_R3;
92
+ MemOp opc = get_memop(lb->oi);
93
94
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
95
return false;
96
}
97
98
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
99
-
100
- lo = lb->addrlo_reg;
101
- hi = lb->addrhi_reg;
102
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
103
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
104
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
105
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
106
- } else {
107
- /* If the address needed to be zero-extended, we'll have already
108
- placed it in R4. The only remaining case is 64-bit guest. */
109
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
110
- }
111
-
112
- lo = lb->datalo_reg;
113
- hi = lb->datahi_reg;
114
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
115
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
116
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
117
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
118
- } else {
119
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
120
- arg++, lb->type, s_bits, lo);
121
- }
122
-
123
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
124
- tcg_out32(s, MFSPR | RT(arg) | LR);
125
-
126
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
127
tcg_out_call_int(s, LK, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
128
129
tcg_out_b(s, 0, lb->raddr);
130
--
131
2.34.1
132
133
Deleted patch
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args.
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/riscv/tcg-target.c.inc | 37 ++++++++++---------------------------
9
1 file changed, 10 insertions(+), 27 deletions(-)
10
11
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/riscv/tcg-target.c.inc
14
+++ b/tcg/riscv/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
16
tcg_debug_assert(ok);
17
}
18
19
+/* We have three temps, we might as well expose them. */
20
+static const TCGLdstHelperParam ldst_helper_param = {
21
+ .ntmp = 3, .tmp = { TCG_REG_TMP0, TCG_REG_TMP1, TCG_REG_TMP2 }
22
+};
23
+
24
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
25
{
26
- MemOpIdx oi = l->oi;
27
- MemOp opc = get_memop(oi);
28
- TCGReg a0 = tcg_target_call_iarg_regs[0];
29
- TCGReg a1 = tcg_target_call_iarg_regs[1];
30
- TCGReg a2 = tcg_target_call_iarg_regs[2];
31
- TCGReg a3 = tcg_target_call_iarg_regs[3];
32
+ MemOp opc = get_memop(l->oi);
33
34
/* resolve label address */
35
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
36
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
37
}
38
39
/* call load helper */
40
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
41
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
42
- tcg_out_movi(s, TCG_TYPE_PTR, a2, oi);
43
- tcg_out_movi(s, TCG_TYPE_PTR, a3, (tcg_target_long)l->raddr);
44
-
45
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
46
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
47
- tcg_out_mov(s, (opc & MO_SIZE) == MO_64, l->datalo_reg, a0);
48
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
49
50
tcg_out_goto(s, l->raddr);
51
return true;
52
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
53
54
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
55
{
56
- MemOpIdx oi = l->oi;
57
- MemOp opc = get_memop(oi);
58
- MemOp s_bits = opc & MO_SIZE;
59
- TCGReg a0 = tcg_target_call_iarg_regs[0];
60
- TCGReg a1 = tcg_target_call_iarg_regs[1];
61
- TCGReg a2 = tcg_target_call_iarg_regs[2];
62
- TCGReg a3 = tcg_target_call_iarg_regs[3];
63
- TCGReg a4 = tcg_target_call_iarg_regs[4];
64
+ MemOp opc = get_memop(l->oi);
65
66
/* resolve label address */
67
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
68
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
69
}
70
71
/* call store helper */
72
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
73
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
74
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, a2,
75
- l->type, s_bits, l->datalo_reg);
76
- tcg_out_movi(s, TCG_TYPE_PTR, a3, oi);
77
- tcg_out_movi(s, TCG_TYPE_PTR, a4, (tcg_target_long)l->raddr);
78
-
79
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
80
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
81
82
tcg_out_goto(s, l->raddr);
83
--
84
2.34.1
85
86
Deleted patch
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args.
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/s390x/tcg-target.c.inc | 35 ++++++++++-------------------------
8
1 file changed, 10 insertions(+), 25 deletions(-)
9
10
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/s390x/tcg-target.c.inc
13
+++ b/tcg/s390x/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
15
}
16
17
#if defined(CONFIG_SOFTMMU)
18
+static const TCGLdstHelperParam ldst_helper_param = {
19
+ .ntmp = 1, .tmp = { TCG_TMP0 }
20
+};
21
+
22
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
23
{
24
- TCGReg addr_reg = lb->addrlo_reg;
25
- TCGReg data_reg = lb->datalo_reg;
26
- MemOpIdx oi = lb->oi;
27
- MemOp opc = get_memop(oi);
28
+ MemOp opc = get_memop(lb->oi);
29
30
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
31
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
32
return false;
33
}
34
35
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
36
- if (TARGET_LONG_BITS == 64) {
37
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
38
- }
39
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, oi);
40
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R5, (uintptr_t)lb->raddr);
41
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]);
42
- tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
43
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
44
+ tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
45
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
46
47
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
48
return true;
49
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
50
51
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
52
{
53
- TCGReg addr_reg = lb->addrlo_reg;
54
- TCGReg data_reg = lb->datalo_reg;
55
- MemOpIdx oi = lb->oi;
56
- MemOp opc = get_memop(oi);
57
- MemOp size = opc & MO_SIZE;
58
+ MemOp opc = get_memop(lb->oi);
59
60
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
61
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
62
return false;
63
}
64
65
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
66
- if (TARGET_LONG_BITS == 64) {
67
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
68
- }
69
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
70
- TCG_REG_R4, lb->type, size, data_reg);
71
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R5, oi);
72
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R6, (uintptr_t)lb->raddr);
73
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
74
tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
75
76
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
77
--
78
2.34.1
79
80
Deleted patch
1
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
2
registers. Now that we handle overlap between inputs and helper arguments,
3
we can allow any allocatable reg.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/loongarch64/tcg-target-con-set.h | 2 --
9
tcg/loongarch64/tcg-target-con-str.h | 1 -
10
tcg/loongarch64/tcg-target.c.inc | 23 ++++-------------------
11
3 files changed, 4 insertions(+), 22 deletions(-)
12
13
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/loongarch64/tcg-target-con-set.h
16
+++ b/tcg/loongarch64/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
C_O0_I1(r)
19
C_O0_I2(rZ, r)
20
C_O0_I2(rZ, rZ)
21
-C_O0_I2(LZ, L)
22
C_O1_I1(r, r)
23
-C_O1_I1(r, L)
24
C_O1_I2(r, r, rC)
25
C_O1_I2(r, r, ri)
26
C_O1_I2(r, r, rI)
27
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
28
index XXXXXXX..XXXXXXX 100644
29
--- a/tcg/loongarch64/tcg-target-con-str.h
30
+++ b/tcg/loongarch64/tcg-target-con-str.h
31
@@ -XXX,XX +XXX,XX @@
32
* REGS(letter, register_mask)
33
*/
34
REGS('r', ALL_GENERAL_REGS)
35
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
36
37
/*
38
* Define constraint letters for constants:
39
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
40
index XXXXXXX..XXXXXXX 100644
41
--- a/tcg/loongarch64/tcg-target.c.inc
42
+++ b/tcg/loongarch64/tcg-target.c.inc
43
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
44
#define TCG_CT_CONST_C12 0x1000
45
#define TCG_CT_CONST_WSZ 0x2000
46
47
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
48
-/*
49
- * For softmmu, we need to avoid conflicts with the first 5
50
- * argument registers to call the helper. Some of these are
51
- * also used for the tlb lookup.
52
- */
53
-#ifdef CONFIG_SOFTMMU
54
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
55
-#else
56
-#define SOFTMMU_RESERVE_REGS 0
57
-#endif
58
-
59
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
60
61
static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
62
{
63
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
64
case INDEX_op_st32_i64:
65
case INDEX_op_st_i32:
66
case INDEX_op_st_i64:
67
+ case INDEX_op_qemu_st_i32:
68
+ case INDEX_op_qemu_st_i64:
69
return C_O0_I2(rZ, r);
70
71
case INDEX_op_brcond_i32:
72
case INDEX_op_brcond_i64:
73
return C_O0_I2(rZ, rZ);
74
75
- case INDEX_op_qemu_st_i32:
76
- case INDEX_op_qemu_st_i64:
77
- return C_O0_I2(LZ, L);
78
-
79
case INDEX_op_ext8s_i32:
80
case INDEX_op_ext8s_i64:
81
case INDEX_op_ext8u_i32:
82
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
83
case INDEX_op_ld32u_i64:
84
case INDEX_op_ld_i32:
85
case INDEX_op_ld_i64:
86
- return C_O1_I1(r, r);
87
-
88
case INDEX_op_qemu_ld_i32:
89
case INDEX_op_qemu_ld_i64:
90
- return C_O1_I1(r, L);
91
+ return C_O1_I1(r, r);
92
93
case INDEX_op_andc_i32:
94
case INDEX_op_andc_i64:
95
--
96
2.34.1
97
98
Deleted patch
1
Compare the address vs the tlb entry with sign-extended values.
2
This simplifies the page+alignment mask constant, and the
3
generation of the last byte address for the misaligned test.
4
1
5
Move the tlb addend load up, and the zero-extension down.
6
7
This frees up a register, which allows us to use TMP3 as the returned base
8
address register instead of A0, which we were using as a 5th temporary.
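
A rough standalone illustration of the mask-constant point above, assuming
4 KiB pages and a 4-byte alignment check; the values are chosen for the
example and are not taken from the patch.

    #include <inttypes.h>
    #include <stdio.h>

    /* Assumed parameters for illustration only: 4 KiB pages, 4-byte alignment. */
    #define PAGE_BITS 12
    #define PAGE_MASK ((int64_t)-(1 << PAGE_BITS))

    int main(void)
    {
        unsigned a_mask = 4 - 1;

        /* Sign-extended: a small negative value (-4093 here), which fits in
           a single 16-bit signed immediate. */
        int64_t signed_mask = PAGE_MASK | a_mask;

        /* Zero-extended 32-bit view: 0xfffff003, which a 64-bit host would
           typically need a multi-instruction sequence to materialize. */
        uint64_t zero_ext_mask = (uint32_t)PAGE_MASK | a_mask;

        printf("sign-extended: %" PRId64 " (%#" PRIx64 ")\n",
               signed_mask, (uint64_t)signed_mask);
        printf("zero-extended: %#" PRIx64 "\n", zero_ext_mask);
        return 0;
    }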
9
10
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
---
13
tcg/mips/tcg-target.c.inc | 38 ++++++++++++++++++--------------------
14
1 file changed, 18 insertions(+), 20 deletions(-)
15
16
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
17
index XXXXXXX..XXXXXXX 100644
18
--- a/tcg/mips/tcg-target.c.inc
19
+++ b/tcg/mips/tcg-target.c.inc
20
@@ -XXX,XX +XXX,XX @@ typedef enum {
21
ALIAS_PADDI = sizeof(void *) == 4 ? OPC_ADDIU : OPC_DADDIU,
22
ALIAS_TSRL = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
23
? OPC_SRL : OPC_DSRL,
24
+ ALIAS_TADDI = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
25
+ ? OPC_ADDIU : OPC_DADDIU,
26
} MIPSInsn;
27
28
/*
29
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
30
int add_off = offsetof(CPUTLBEntry, addend);
31
int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
32
: offsetof(CPUTLBEntry, addr_write);
33
- target_ulong tlb_mask;
34
35
ldst = new_ldst_label(s);
36
ldst->is_ld = is_ld;
37
ldst->oi = oi;
38
ldst->addrlo_reg = addrlo;
39
ldst->addrhi_reg = addrhi;
40
- base = TCG_REG_A0;
41
42
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
43
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
44
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
45
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
46
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
47
} else {
48
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
49
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
50
- TCG_TMP0, TCG_TMP3, cmp_off);
51
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_TMP0, TCG_TMP3, cmp_off);
52
}
53
54
- /* Zero extend a 32-bit guest address for a 64-bit host. */
55
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
56
- tcg_out_ext32u(s, base, addrlo);
57
- addrlo = base;
58
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
59
+ /* Load the tlb addend for the fast path. */
60
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
61
}
62
63
/*
64
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
65
* For unaligned accesses, compare against the end of the access to
66
* verify that it does not cross a page boundary.
67
*/
68
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
69
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
70
- if (a_mask >= s_mask) {
71
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
72
- } else {
73
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
74
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_TMP1, TARGET_PAGE_MASK | a_mask);
75
+ if (a_mask < s_mask) {
76
+ tcg_out_opc_imm(s, ALIAS_TADDI, TCG_TMP2, addrlo, s_mask - a_mask);
77
tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
78
+ } else {
79
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
80
}
81
82
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
83
- /* Load the tlb addend for the fast path. */
84
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
85
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
86
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
87
+ tcg_out_ext32u(s, TCG_TMP2, addrlo);
88
+ addrlo = TCG_TMP2;
89
}
90
91
ldst->label_ptr[0] = s->code_ptr;
92
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
93
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
94
95
/* Load the tlb addend for the fast path. */
96
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
97
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
98
99
ldst->label_ptr[1] = s->code_ptr;
100
tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
101
}
102
103
/* delay slot */
104
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
105
+ base = TCG_TMP3;
106
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP3, addrlo);
107
#else
108
if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
109
ldst = new_ldst_label(s);
110
--
111
2.34.1
112
113
Deleted patch
1
The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available
2
registers. Now that we handle overlap between inputs and helper arguments,
3
and have eliminated use of A0, we can allow any allocatable reg.
4
1
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/mips/tcg-target-con-set.h | 13 +++++--------
9
tcg/mips/tcg-target-con-str.h | 2 --
10
tcg/mips/tcg-target.c.inc | 30 ++++++++----------------------
11
3 files changed, 13 insertions(+), 32 deletions(-)
12
13
diff --git a/tcg/mips/tcg-target-con-set.h b/tcg/mips/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/mips/tcg-target-con-set.h
16
+++ b/tcg/mips/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
C_O0_I1(r)
19
C_O0_I2(rZ, r)
20
C_O0_I2(rZ, rZ)
21
-C_O0_I2(SZ, S)
22
-C_O0_I3(SZ, S, S)
23
-C_O0_I3(SZ, SZ, S)
24
+C_O0_I3(rZ, r, r)
25
+C_O0_I3(rZ, rZ, r)
26
C_O0_I4(rZ, rZ, rZ, rZ)
27
-C_O0_I4(SZ, SZ, S, S)
28
-C_O1_I1(r, L)
29
+C_O0_I4(rZ, rZ, r, r)
30
C_O1_I1(r, r)
31
C_O1_I2(r, 0, rZ)
32
-C_O1_I2(r, L, L)
33
+C_O1_I2(r, r, r)
34
C_O1_I2(r, r, ri)
35
C_O1_I2(r, r, rI)
36
C_O1_I2(r, r, rIK)
37
@@ -XXX,XX +XXX,XX @@ C_O1_I2(r, rZ, rN)
38
C_O1_I2(r, rZ, rZ)
39
C_O1_I4(r, rZ, rZ, rZ, 0)
40
C_O1_I4(r, rZ, rZ, rZ, rZ)
41
-C_O2_I1(r, r, L)
42
-C_O2_I2(r, r, L, L)
43
+C_O2_I1(r, r, r)
44
C_O2_I2(r, r, r, r)
45
C_O2_I4(r, r, rZ, rZ, rN, rN)
46
diff --git a/tcg/mips/tcg-target-con-str.h b/tcg/mips/tcg-target-con-str.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/tcg/mips/tcg-target-con-str.h
49
+++ b/tcg/mips/tcg-target-con-str.h
50
@@ -XXX,XX +XXX,XX @@
51
* REGS(letter, register_mask)
52
*/
53
REGS('r', ALL_GENERAL_REGS)
54
-REGS('L', ALL_QLOAD_REGS)
55
-REGS('S', ALL_QSTORE_REGS)
56
57
/*
58
* Define constraint letters for constants:
59
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
60
index XXXXXXX..XXXXXXX 100644
61
--- a/tcg/mips/tcg-target.c.inc
62
+++ b/tcg/mips/tcg-target.c.inc
63
@@ -XXX,XX +XXX,XX @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
64
#define TCG_CT_CONST_WSZ 0x2000 /* word size */
65
66
#define ALL_GENERAL_REGS 0xffffffffu
67
-#define NOA0_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_A0))
68
-
69
-#ifdef CONFIG_SOFTMMU
70
-#define ALL_QLOAD_REGS \
71
- (NOA0_REGS & ~((TCG_TARGET_REG_BITS < TARGET_LONG_BITS) << TCG_REG_A2))
72
-#define ALL_QSTORE_REGS \
73
- (NOA0_REGS & ~(TCG_TARGET_REG_BITS < TARGET_LONG_BITS \
74
- ? (1 << TCG_REG_A2) | (1 << TCG_REG_A3) \
75
- : (1 << TCG_REG_A1)))
76
-#else
77
-#define ALL_QLOAD_REGS NOA0_REGS
78
-#define ALL_QSTORE_REGS NOA0_REGS
79
-#endif
80
-
81
82
static bool is_p2m1(tcg_target_long val)
83
{
84
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
85
86
case INDEX_op_qemu_ld_i32:
87
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
88
- ? C_O1_I1(r, L) : C_O1_I2(r, L, L));
89
+ ? C_O1_I1(r, r) : C_O1_I2(r, r, r));
90
case INDEX_op_qemu_st_i32:
91
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
92
- ? C_O0_I2(SZ, S) : C_O0_I3(SZ, S, S));
93
+ ? C_O0_I2(rZ, r) : C_O0_I3(rZ, r, r));
94
case INDEX_op_qemu_ld_i64:
95
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
96
- : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, L)
97
- : C_O2_I2(r, r, L, L));
98
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
99
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
100
+ : C_O2_I2(r, r, r, r));
101
case INDEX_op_qemu_st_i64:
102
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(SZ, S)
103
- : TARGET_LONG_BITS == 32 ? C_O0_I3(SZ, SZ, S)
104
- : C_O0_I4(SZ, SZ, S, S));
105
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(rZ, r)
106
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(rZ, rZ, r)
107
+ : C_O0_I4(rZ, rZ, r, r));
108
109
default:
110
g_assert_not_reached();
111
--
112
2.34.1
113
114
Deleted patch
1
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of
2
the normally allocated registers for the tlb load.
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/ppc/tcg-target.c.inc | 78 ++++++++++++++++++++++++----------------
9
1 file changed, 47 insertions(+), 31 deletions(-)
10
11
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target.c.inc
14
+++ b/tcg/ppc/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@
16
#else
17
# define TCG_REG_TMP1 TCG_REG_R12
18
#endif
19
+#define TCG_REG_TMP2 TCG_REG_R11
20
21
#define TCG_VEC_TMP1 TCG_REG_V0
22
#define TCG_VEC_TMP2 TCG_REG_V1
23
@@ -XXX,XX +XXX,XX @@ static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
24
/*
25
* For the purposes of ppc32 sorting 4 input registers into 4 argument
26
* registers, there is an outside chance we would require 3 temps.
27
- * Because of constraints, no inputs are in r3, and env will not be
28
- * placed into r3 until after the sorting is done, and is thus free.
29
*/
30
static const TCGLdstHelperParam ldst_helper_param = {
31
.ra_gen = ldst_ra_gen,
32
.ntmp = 3,
33
- .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
34
+ .tmp = { TCG_REG_TMP1, TCG_REG_TMP2, TCG_REG_R0 }
35
};
36
37
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
38
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
39
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
40
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
41
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
42
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
44
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, mask_off);
45
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_AREG0, table_off);
46
47
/* Extract the page index, shifted into place for tlb index. */
48
if (TCG_TARGET_REG_BITS == 32) {
49
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
50
+ tcg_out_shri32(s, TCG_REG_R0, addrlo,
51
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
52
} else {
53
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
54
+ tcg_out_shri64(s, TCG_REG_R0, addrlo,
55
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
56
}
57
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
58
+ tcg_out32(s, AND | SAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_R0));
59
60
- /* Load the TLB comparator. */
61
+ /* Load the (low part) TLB comparator into TMP2. */
62
if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
63
uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
64
? LWZUX : LDUX);
65
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
66
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP2, TCG_REG_TMP1, TCG_REG_TMP2));
67
} else {
68
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
69
+ tcg_out32(s, ADD | TAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_TMP2));
70
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
71
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
72
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
73
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2,
74
+ TCG_REG_TMP1, cmp_off + 4 * HOST_BIG_ENDIAN);
75
} else {
76
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
77
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP2, TCG_REG_TMP1, cmp_off);
78
}
79
}
80
81
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
82
* Load the TLB addend for use on the fast path.
83
* Do this asap to minimize any load use delay.
84
*/
85
- h->base = TCG_REG_R3;
86
- tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
87
- offsetof(CPUTLBEntry, addend));
88
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
89
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
90
+ offsetof(CPUTLBEntry, addend));
91
+ }
92
93
- /* Clear the non-page, non-alignment bits from the address */
94
+ /* Clear the non-page, non-alignment bits from the address in R0. */
95
if (TCG_TARGET_REG_BITS == 32) {
96
/*
97
* We don't support unaligned accesses on 32-bits.
98
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
99
if (TARGET_LONG_BITS == 32) {
100
tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
101
(32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
102
- /* Zero-extend the address for use in the final address. */
103
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
104
- addrlo = TCG_REG_R4;
105
} else if (a_bits == 0) {
106
tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
107
} else {
108
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
109
tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
110
}
111
}
112
- h->index = addrlo;
113
114
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
115
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
116
+ /* Low part comparison into cr7. */
117
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
118
0, 7, TCG_TYPE_I32);
119
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
120
+
121
+ /* Load the high part TLB comparator into TMP2. */
122
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2, TCG_REG_TMP1,
123
+ cmp_off + 4 * !HOST_BIG_ENDIAN);
124
+
125
+ /* Load addend, deferred for this case. */
126
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
127
+ offsetof(CPUTLBEntry, addend));
128
+
129
+ /* High part comparison into cr6. */
130
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_TMP2, 0, 6, TCG_TYPE_I32);
131
+
132
+ /* Combine comparisons into cr7. */
133
tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
134
} else {
135
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
136
+ /* Full comparison into cr7. */
137
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
138
0, 7, TCG_TYPE_TL);
139
}
140
141
/* Load a pointer into the current opcode w/conditional branch-link. */
142
ldst->label_ptr[0] = s->code_ptr;
143
tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
144
+
145
+ h->base = TCG_REG_TMP1;
146
#else
147
if (a_bits) {
148
ldst = new_ldst_label(s);
149
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
150
}
151
152
h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
153
- h->index = addrlo;
154
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
155
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
156
- h->index = TCG_REG_TMP1;
157
- }
158
#endif
159
160
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
161
+ /* Zero-extend the guest address for use in the host address. */
162
+ tcg_out_ext32u(s, TCG_REG_R0, addrlo);
163
+ h->index = TCG_REG_R0;
164
+ } else {
165
+ h->index = addrlo;
166
+ }
167
+
168
return ldst;
169
}
170
171
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
172
#if defined(_CALL_SYSV) || TCG_TARGET_REG_BITS == 64
173
tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); /* thread pointer */
174
#endif
175
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1); /* mem temp */
176
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
177
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
178
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP1);
179
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP2);
180
if (USE_REG_TB) {
181
--
182
2.34.1
183
184
Deleted patch
1
These constraints have not been used for quite some time.
2
1
3
Fixes: 77b73de67632 ("Use rem/div[u]_i32 drop div[u]2_i32")
4
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
5
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/ppc/tcg-target-con-str.h | 4 ----
10
1 file changed, 4 deletions(-)
11
12
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/ppc/tcg-target-con-str.h
15
+++ b/tcg/ppc/tcg-target-con-str.h
16
@@ -XXX,XX +XXX,XX @@
17
*/
18
REGS('r', ALL_GENERAL_REGS)
19
REGS('v', ALL_VECTOR_REGS)
20
-REGS('A', 1u << TCG_REG_R3)
21
-REGS('B', 1u << TCG_REG_R4)
22
-REGS('C', 1u << TCG_REG_R5)
23
-REGS('D', 1u << TCG_REG_R6)
24
25
/*
26
* Define constraint letters for constants:
27
--
28
2.34.1
29
30
Deleted patch
1
Never used since its introduction.
2
1
3
Fixes: 3d582c6179c ("tcg-ppc64: Rearrange integer constant constraints")
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/ppc/tcg-target-con-str.h | 1 -
8
tcg/ppc/tcg-target.c.inc | 3 ---
9
2 files changed, 4 deletions(-)
10
11
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target-con-str.h
14
+++ b/tcg/ppc/tcg-target-con-str.h
15
@@ -XXX,XX +XXX,XX @@ REGS('v', ALL_VECTOR_REGS)
16
* CONST(letter, TCG_CT_CONST_* bit set)
17
*/
18
CONST('I', TCG_CT_CONST_S16)
19
-CONST('J', TCG_CT_CONST_U16)
20
CONST('M', TCG_CT_CONST_MONE)
21
CONST('T', TCG_CT_CONST_S32)
22
CONST('U', TCG_CT_CONST_U32)
23
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
24
index XXXXXXX..XXXXXXX 100644
25
--- a/tcg/ppc/tcg-target.c.inc
26
+++ b/tcg/ppc/tcg-target.c.inc
27
@@ -XXX,XX +XXX,XX @@
28
#define SZR (TCG_TARGET_REG_BITS / 8)
29
30
#define TCG_CT_CONST_S16 0x100
31
-#define TCG_CT_CONST_U16 0x200
32
#define TCG_CT_CONST_S32 0x400
33
#define TCG_CT_CONST_U32 0x800
34
#define TCG_CT_CONST_ZERO 0x1000
35
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
36
37
if ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) {
38
return 1;
39
- } else if ((ct & TCG_CT_CONST_U16) && val == (uint16_t)val) {
40
- return 1;
41
} else if ((ct & TCG_CT_CONST_S32) && val == (int32_t)val) {
42
return 1;
43
} else if ((ct & TCG_CT_CONST_U32) && val == (uint32_t)val) {
44
--
45
2.34.1
46
47
Deleted patch
1
Rather than zero-extend the guest address into a register,
2
use an add instruction which zero-extends the second input.
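
A rough C model of the ALGFR (add logical, 64 <- 32) semantics relied on
here, as I understand them: the 32-bit second operand is zero-extended as
part of the add, which is what makes a separate zero-extension unnecessary.

    #include <inttypes.h>
    #include <stdio.h>

    /*
     * Rough model of s390x ALGFR r1,r2: the 32-bit second operand is
     * zero-extended and added to the 64-bit first operand.  (Condition
     * code effects are ignored here.)
     */
    static uint64_t algfr(uint64_t r1, uint32_t r2)
    {
        return r1 + (uint64_t)r2;
    }

    int main(void)
    {
        uint64_t tlb_addend = 0x7f0000000000ull;
        uint32_t guest_addr = 0xfffff000u;  /* would sign-extend if treated as int */

        printf("host address = %#" PRIx64 "\n", algfr(tlb_addend, guest_addr));
        return 0;
    }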
3
1
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/s390x/tcg-target.c.inc | 8 +++++---
8
1 file changed, 5 insertions(+), 3 deletions(-)
9
10
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/s390x/tcg-target.c.inc
13
+++ b/tcg/s390x/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
15
RRE_ALGR = 0xb90a,
16
RRE_ALCR = 0xb998,
17
RRE_ALCGR = 0xb988,
18
+ RRE_ALGFR = 0xb91a,
19
RRE_CGR = 0xb920,
20
RRE_CLGR = 0xb921,
21
RRE_DLGR = 0xb987,
22
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
23
tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
24
offsetof(CPUTLBEntry, addend));
25
26
- h->base = addr_reg;
27
if (TARGET_LONG_BITS == 32) {
28
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
29
- h->base = TCG_REG_R3;
30
+ tcg_out_insn(s, RRE, ALGFR, h->index, addr_reg);
31
+ h->base = TCG_REG_NONE;
32
+ } else {
33
+ h->base = addr_reg;
34
}
35
h->disp = 0;
36
#else
37
--
38
2.34.1
39
40
Deleted patch

The opposite of MO_UNALN is MO_ALIGN.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/mips/tcg/nanomips_translate.c.inc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/target/mips/tcg/nanomips_translate.c.inc
+++ b/target/mips/tcg/nanomips_translate.c.inc
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
TCGv va = tcg_temp_new();
TCGv t1 = tcg_temp_new();
MemOp memop = (extract32(ctx->opcode, 8, 3)) ==
- NM_P_LS_UAWM ? MO_UNALN : 0;
+ NM_P_LS_UAWM ? MO_UNALN : MO_ALIGN;

count = (count == 0) ? 8 : count;
while (counter != count) {
--
2.34.1
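
The one-liner above is the whole point of the change: the aligned alternative to MO_UNALN is spelled MO_ALIGN rather than 0, so the alignment requirement is stated explicitly in the MemOp. A small illustrative sketch of that flag selection (invented flag values and function name, not QEMU's MemOp encoding):

    #include <stdio.h>

    /* Invented values for illustration; QEMU's MemOp encoding differs. */
    enum {
        MO_ALIGN = 1 << 0,   /* access must be naturally aligned */
        MO_UNALN = 1 << 1    /* access may be unaligned */
    };

    /* Choose the alignment flag for an (un)aligned opcode pair:
     * returning 0 would leave the requirement implicit, while
     * MO_ALIGN states it explicitly. */
    static int pick_align_flag(int is_unaligned_variant)
    {
        return is_unaligned_variant ? MO_UNALN : MO_ALIGN;
    }

    int main(void)
    {
        printf("aligned form:   flag=%d\n", pick_align_flag(0));
        printf("unaligned form: flag=%d\n", pick_align_flag(1));
        return 0;
    }
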
Deleted patch

Mark with MO_ALIGN all memory operations that are not already marked
with MO_UNALN.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sh4/translate.c | 102 ++++++++++++++++++++++++++---------------
1 file changed, 66 insertions(+), 36 deletions(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
case 0x9000:        /* mov.w @(disp,PC),Rn */
    {
TCGv addr = tcg_constant_i32(ctx->base.pc_next + 4 + B7_0 * 2);
- tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESW);
+ tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
+ MO_TESW | MO_ALIGN);
    }
    return;
case 0xd000:        /* mov.l @(disp,PC),Rn */
    {
TCGv addr = tcg_constant_i32((ctx->base.pc_next + 4 + B7_0 * 4) & ~3);
- tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
+ MO_TESL | MO_ALIGN);
    }
    return;
case 0x7000:        /* add #imm,Rn */
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    {
     TCGv arg0, arg1;
     arg0 = tcg_temp_new();
- tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
+ MO_TESL | MO_ALIGN);
     arg1 = tcg_temp_new();
- tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
gen_helper_macl(cpu_env, arg0, arg1);
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    {
     TCGv arg0, arg1;
     arg0 = tcg_temp_new();
- tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
+ MO_TESL | MO_ALIGN);
     arg1 = tcg_temp_new();
- tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
gen_helper_macw(cpu_env, arg0, arg1);
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2);
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp, XHACK(B7_4));
- tcg_gen_qemu_st_i64(fp, REG(B11_8), ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_st_i64(fp, REG(B11_8), ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
    } else {
- tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
    }
    return;
case 0xf008: /* fmov @Rm,{F,D,X}Rn - FPSCR: Nothing */
    CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
- tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
gen_store_fpr64(ctx, fp, XHACK(B11_8));
    } else {
- tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
    }
    return;
case 0xf009: /* fmov @Rm+,{F,D,X}Rn - FPSCR: Nothing */
    CHECK_FPU_ENABLED
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
- tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
gen_store_fpr64(ctx, fp, XHACK(B11_8));
tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
    } else {
- tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
    }
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
TCGv_i64 fp = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp, XHACK(B7_4));
tcg_gen_subi_i32(addr, REG(B11_8), 8);
- tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
} else {
tcg_gen_subi_i32(addr, REG(B11_8), 4);
- tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx,
+ MO_TEUL | MO_ALIGN);
}
tcg_gen_mov_i32(REG(B11_8), addr);
}
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
     tcg_gen_add_i32(addr, REG(B7_4), REG(0));
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
- tcg_gen_qemu_ld_i64(fp, addr, ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_ld_i64(fp, addr, ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
gen_store_fpr64(ctx, fp, XHACK(B11_8));
     } else {
- tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx,
+ MO_TEUL | MO_ALIGN);
     }
    }
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
if (ctx->tbflags & FPSCR_SZ) {
TCGv_i64 fp = tcg_temp_new_i64();
gen_load_fpr64(ctx, fp, XHACK(B7_4));
- tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEUQ);
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx,
+ MO_TEUQ | MO_ALIGN);
     } else {
- tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx,
+ MO_TEUL | MO_ALIGN);
     }
    }
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    {
     TCGv addr = tcg_temp_new();
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 2);
- tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESW);
+ tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESW | MO_ALIGN);
    }
    return;
case 0xc600:        /* mov.l @(disp,GBR),R0 */
    {
     TCGv addr = tcg_temp_new();
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 4);
- tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL | MO_ALIGN);
    }
    return;
case 0xc000:        /* mov.b R0,@(disp,GBR) */
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    {
     TCGv addr = tcg_temp_new();
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 2);
- tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUW);
+ tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUW | MO_ALIGN);
    }
    return;
case 0xc200:        /* mov.l R0,@(disp,GBR) */
    {
     TCGv addr = tcg_temp_new();
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 4);
- tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUL | MO_ALIGN);
    }
    return;
case 0x8000:        /* mov.b R0,@(disp,Rn) */
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    return;
case 0x4087:        /* ldc.l @Rm+,Rn_BANK */
    CHECK_PRIVILEGED
- tcg_gen_qemu_ld_i32(ALTREG(B6_4), REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(ALTREG(B6_4), REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
    tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
    return;
case 0x0082:        /* stc Rm_BANK,Rn */
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    {
     TCGv addr = tcg_temp_new();
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
- tcg_gen_qemu_st_i32(ALTREG(B6_4), addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(ALTREG(B6_4), addr, ctx->memidx,
+ MO_TEUL | MO_ALIGN);
     tcg_gen_mov_i32(REG(B11_8), addr);
    }
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    CHECK_PRIVILEGED
    {
     TCGv val = tcg_temp_new();
- tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
tcg_gen_andi_i32(val, val, 0x700083f3);
gen_write_sr(val);
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
TCGv val = tcg_temp_new();
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
gen_read_sr(val);
- tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL | MO_ALIGN);
     tcg_gen_mov_i32(REG(B11_8), addr);
    }
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
return;                            \
case ldpnum:                            \
prechk                             \
- tcg_gen_qemu_ld_i32(cpu_##reg, REG(B11_8), ctx->memidx, MO_TESL); \
+ tcg_gen_qemu_ld_i32(cpu_##reg, REG(B11_8), ctx->memidx, \
+ MO_TESL | MO_ALIGN); \
tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);        \
return;
#define ST(reg,stnum,stpnum,prechk)        \
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
{                                \
    TCGv addr = tcg_temp_new();                \
    tcg_gen_subi_i32(addr, REG(B11_8), 4);            \
- tcg_gen_qemu_st_i32(cpu_##reg, addr, ctx->memidx, MO_TEUL); \
+ tcg_gen_qemu_st_i32(cpu_##reg, addr, ctx->memidx, \
+ MO_TEUL | MO_ALIGN); \
    tcg_gen_mov_i32(REG(B11_8), addr);            \
}                                \
return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
    CHECK_FPU_ENABLED
    {
     TCGv addr = tcg_temp_new();
- tcg_gen_qemu_ld_i32(addr, REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(addr, REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
gen_helper_ld_fpscr(cpu_env, addr);
ctx->base.is_jmp = DISAS_STOP;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
     tcg_gen_andi_i32(val, cpu_fpscr, 0x003fffff);
     addr = tcg_temp_new();
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
- tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL | MO_ALIGN);
     tcg_gen_mov_i32(REG(B11_8), addr);
    }
    return;
case 0x00c3:        /* movca.l R0,@Rm */
{
TCGv val = tcg_temp_new();
- tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
gen_helper_movcal(cpu_env, REG(B11_8), val);
- tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
}
ctx->has_movcal = 1;
    return;
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
cpu_lock_addr, fail);
tmp = tcg_temp_new();
tcg_gen_atomic_cmpxchg_i32(tmp, REG(B11_8), cpu_lock_value,
- REG(0), ctx->memidx, MO_TEUL);
+ REG(0), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
tcg_gen_setcond_i32(TCG_COND_EQ, cpu_sr_t, tmp, cpu_lock_value);
} else {
tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_lock_addr, -1, fail);
- tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx, MO_TEUL);
+ tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TEUL | MO_ALIGN);
tcg_gen_movi_i32(cpu_sr_t, 1);
}
tcg_gen_br(done);
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
if ((tb_cflags(ctx->base.tb) & CF_PARALLEL)) {
TCGv tmp = tcg_temp_new();
tcg_gen_mov_i32(tmp, REG(B11_8));
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
tcg_gen_mov_i32(cpu_lock_value, REG(0));
tcg_gen_mov_i32(cpu_lock_addr, tmp);
} else {
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
+ MO_TESL | MO_ALIGN);
tcg_gen_movi_i32(cpu_lock_addr, 0);
}
return;
--
2.34.1
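
With every SH4 load and store above now carrying MO_ALIGN (or MO_UNALN where unaligned access is intended), the alignment requirement lives in each access's MemOp rather than in a target-wide build flag, which is what allows the next patch to drop TARGET_ALIGNED_ONLY for sh4. A self-contained sketch of what an MO_ALIGN check amounts to (hand-rolled toy model with invented values, not QEMU's implementation):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Toy model of a per-access alignment requirement.  With MO_ALIGN,
     * an address that is not a multiple of the access size would raise
     * the guest's alignment exception; with MO_UNALN it is accepted.
     * The flag values here are invented for the example. */
    typedef enum { MO_UNALN = 0, MO_ALIGN = 1 } AlignReq;

    static bool access_allowed(uint32_t addr, unsigned size, AlignReq req)
    {
        return req == MO_UNALN || (addr % size) == 0;
    }

    int main(void)
    {
        printf("%d\n", access_allowed(0x8c001000, 4, MO_ALIGN)); /* 1: aligned */
        printf("%d\n", access_allowed(0x8c001002, 4, MO_ALIGN)); /* 0: would fault */
        printf("%d\n", access_allowed(0x8c001002, 4, MO_UNALN)); /* 1: allowed */
        return 0;
    }
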
Deleted patch

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
configs/targets/sh4-linux-user.mak | 1 -
configs/targets/sh4-softmmu.mak | 1 -
configs/targets/sh4eb-linux-user.mak | 1 -
configs/targets/sh4eb-softmmu.mak | 1 -
4 files changed, 4 deletions(-)

diff --git a/configs/targets/sh4-linux-user.mak b/configs/targets/sh4-linux-user.mak
index XXXXXXX..XXXXXXX 100644
--- a/configs/targets/sh4-linux-user.mak
+++ b/configs/targets/sh4-linux-user.mak
@@ -XXX,XX +XXX,XX @@
TARGET_ARCH=sh4
TARGET_SYSTBL_ABI=common
TARGET_SYSTBL=syscall.tbl
-TARGET_ALIGNED_ONLY=y
TARGET_HAS_BFLT=y
diff --git a/configs/targets/sh4-softmmu.mak b/configs/targets/sh4-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/configs/targets/sh4-softmmu.mak
+++ b/configs/targets/sh4-softmmu.mak
@@ -1,2 +1 @@
TARGET_ARCH=sh4
-TARGET_ALIGNED_ONLY=y
diff --git a/configs/targets/sh4eb-linux-user.mak b/configs/targets/sh4eb-linux-user.mak
index XXXXXXX..XXXXXXX 100644
--- a/configs/targets/sh4eb-linux-user.mak
+++ b/configs/targets/sh4eb-linux-user.mak
@@ -XXX,XX +XXX,XX @@
TARGET_ARCH=sh4
TARGET_SYSTBL_ABI=common
TARGET_SYSTBL=syscall.tbl
-TARGET_ALIGNED_ONLY=y
TARGET_BIG_ENDIAN=y
TARGET_HAS_BFLT=y
diff --git a/configs/targets/sh4eb-softmmu.mak b/configs/targets/sh4eb-softmmu.mak
index XXXXXXX..XXXXXXX 100644
--- a/configs/targets/sh4eb-softmmu.mak
+++ b/configs/targets/sh4eb-softmmu.mak
@@ -XXX,XX +XXX,XX @@
TARGET_ARCH=sh4
-TARGET_ALIGNED_ONLY=y
TARGET_BIG_ENDIAN=y
--
2.34.1