The following changes since commit d530697ca20e19f7a626f4c1c8b26fccd0dc4470:

  Merge tag 'pull-testing-updates-100523-1' of https://gitlab.com/stsquad/qemu into staging (2023-05-10 16:43:01 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230511

for you to fetch changes up to b2d4d6616c22325dff802e0a35092167f2dc2268:

  target/loongarch: Do not include tcg-ldst.h (2023-05-11 06:06:04 +0100)

----------------------------------------------------------------
target/m68k: Fix gen_load_fp regression
accel/tcg: Ensure fairness with icount
disas: Move disas.c into the target-independent source sets
tcg: Use common routines for calling slow path helpers
tcg/*: Cleanups to qemu_ld/st constraints
tcg: Remove TARGET_ALIGNED_ONLY
accel/tcg: Reorg system mode load/store helpers

----------------------------------------------------------------
Jamie Iles (2):
      cpu: expose qemu_cpu_list_lock for lock-guard use
      accel/tcg/tcg-accel-ops-rr: ensure fairness with icount

Richard Henderson (49):
      target/m68k: Fix gen_load_fp for OS_LONG
      accel/tcg: Fix atomic_mmu_lookup for reads
      disas: Fix tabs and braces in disas.c
      disas: Move disas.c to disas/
      disas: Remove target_ulong from the interface
      disas: Remove target-specific headers
      tcg/i386: Introduce prepare_host_addr
      tcg/i386: Use indexed addressing for softmmu fast path
      tcg/aarch64: Introduce prepare_host_addr
      tcg/arm: Introduce prepare_host_addr
      tcg/loongarch64: Introduce prepare_host_addr
      tcg/mips: Introduce prepare_host_addr
      tcg/ppc: Introduce prepare_host_addr
      tcg/riscv: Introduce prepare_host_addr
      tcg/s390x: Introduce prepare_host_addr
      tcg: Add routines for calling slow-path helpers
      tcg/i386: Convert tcg_out_qemu_ld_slow_path
      tcg/i386: Convert tcg_out_qemu_st_slow_path
      tcg/aarch64: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/arm: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/loongarch64: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/ppc: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/riscv: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/s390x: Convert tcg_out_qemu_{ld,st}_slow_path
      tcg/loongarch64: Simplify constraints on qemu_ld/st
      tcg/mips: Remove MO_BSWAP handling
      tcg/mips: Reorg tlb load within prepare_host_addr
      tcg/mips: Simplify constraints on qemu_ld/st
      tcg/ppc: Reorg tcg_out_tlb_read
      tcg/ppc: Adjust constraints on qemu_ld/st
      tcg/ppc: Remove unused constraints A, B, C, D
      tcg/ppc: Remove unused constraint J
      tcg/riscv: Simplify constraints on qemu_ld/st
      tcg/s390x: Use ALGFR in constructing softmmu host address
      tcg/s390x: Simplify constraints on qemu_ld/st
      target/mips: Add MO_ALIGN to gen_llwp, gen_scwp
      target/mips: Add missing default_tcg_memop_mask
      target/mips: Use MO_ALIGN instead of 0
      target/mips: Remove TARGET_ALIGNED_ONLY
      target/nios2: Remove TARGET_ALIGNED_ONLY
      target/sh4: Use MO_ALIGN where required
      target/sh4: Remove TARGET_ALIGNED_ONLY
      tcg: Remove TARGET_ALIGNED_ONLY
      accel/tcg: Add cpu_in_serial_context
      accel/tcg: Introduce tlb_read_idx
      accel/tcg: Reorg system mode load helpers
      accel/tcg: Reorg system mode store helpers
      target/loongarch: Do not include tcg-ldst.h

Thomas Huth (2):
      disas: Move softmmu specific code to separate file
      disas: Move disas.c into the target-independent source set

 configs/targets/mips-linux-user.mak | 1 -
 configs/targets/mips-softmmu.mak | 1 -
 configs/targets/mips64-linux-user.mak | 1 -
 configs/targets/mips64-softmmu.mak | 1 -
 configs/targets/mips64el-linux-user.mak | 1 -
 configs/targets/mips64el-softmmu.mak | 1 -
 configs/targets/mipsel-linux-user.mak | 1 -
 configs/targets/mipsel-softmmu.mak | 1 -
 configs/targets/mipsn32-linux-user.mak | 1 -
 configs/targets/mipsn32el-linux-user.mak | 1 -
 configs/targets/nios2-softmmu.mak | 1 -
 configs/targets/sh4-linux-user.mak | 1 -
 configs/targets/sh4-softmmu.mak | 1 -
 configs/targets/sh4eb-linux-user.mak | 1 -
 configs/targets/sh4eb-softmmu.mak | 1 -
 meson.build | 3 -
 accel/tcg/internal.h | 9 +
 accel/tcg/tcg-accel-ops-icount.h | 3 +-
 disas/disas-internal.h | 21 +
 include/disas/disas.h | 23 +-
 include/exec/cpu-common.h | 1 +
 include/exec/cpu-defs.h | 7 +-
 include/exec/cpu_ldst.h | 26 +-
 include/exec/memop.h | 13 +-
 include/exec/poison.h | 1 -
 tcg/loongarch64/tcg-target-con-set.h | 2 -
 tcg/loongarch64/tcg-target-con-str.h | 1 -
 tcg/mips/tcg-target-con-set.h | 13 +-
 tcg/mips/tcg-target-con-str.h | 2 -
 tcg/mips/tcg-target.h | 4 +-
 tcg/ppc/tcg-target-con-set.h | 11 +-
 tcg/ppc/tcg-target-con-str.h | 7 -
 tcg/riscv/tcg-target-con-set.h | 2 -
 tcg/riscv/tcg-target-con-str.h | 1 -
 tcg/s390x/tcg-target-con-set.h | 2 -
 tcg/s390x/tcg-target-con-str.h | 1 -
 accel/tcg/cpu-exec-common.c | 3 +
 accel/tcg/cputlb.c | 1113 ++++++++++++++++-------
 accel/tcg/tb-maint.c | 2 +-
 accel/tcg/tcg-accel-ops-icount.c | 21 +-
 accel/tcg/tcg-accel-ops-rr.c | 37 +-
 bsd-user/elfload.c | 5 +-
 cpus-common.c | 2 +-
 disas/disas-mon.c | 65 ++
 disas.c => disas/disas.c | 109 +--
 linux-user/elfload.c | 18 +-
 migration/dirtyrate.c | 26 +-
 replay/replay.c | 3 +-
 target/loongarch/csr_helper.c | 1 -
 target/loongarch/iocsr_helper.c | 1 -
 target/m68k/translate.c | 1 +
 target/mips/tcg/mxu_translate.c | 3 +-
 target/nios2/translate.c | 10 +
 target/sh4/translate.c | 102 ++-
 tcg/tcg.c | 480 ++++++++++++-
 trace/control-target.c | 9 +-
 target/mips/tcg/micromips_translate.c.inc | 24 +-
 target/mips/tcg/mips16e_translate.c.inc | 18 +-
 target/mips/tcg/nanomips_translate.c.inc | 32 +-
 tcg/aarch64/tcg-target.c.inc | 347 ++++-----
 tcg/arm/tcg-target.c.inc | 455 +++++-------
 tcg/i386/tcg-target.c.inc | 453 +++++-------
 tcg/loongarch64/tcg-target.c.inc | 313 +++-----
 tcg/mips/tcg-target.c.inc | 870 +++++++---------------
 tcg/ppc/tcg-target.c.inc | 512 ++++++-------
 tcg/riscv/tcg-target.c.inc | 304 ++++----
 tcg/s390x/tcg-target.c.inc | 314 ++++----
 disas/meson.build | 6 +-
 68 files changed, 2788 insertions(+), 3039 deletions(-)
 create mode 100644 disas/disas-internal.h
 create mode 100644 disas/disas-mon.c
 rename disas.c => disas/disas.c (79%)

Case was accidentally dropped in b7a94da9550b.

Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/m68k/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index XXXXXXX..XXXXXXX 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -XXX,XX +XXX,XX @@ static void gen_load_fp(DisasContext *s, int opsize, TCGv addr, TCGv_ptr fp,
     switch (opsize) {
     case OS_BYTE:
     case OS_WORD:
+    case OS_LONG:
         tcg_gen_qemu_ld_tl(tmp, addr, index, opsize | MO_SIGN | MO_TE);
         gen_helper_exts32(cpu_env, fp, tmp);
         break;
--
2.34.1

A copy-paste bug had us looking at the victim cache for writes.

Cc: qemu-stable@nongnu.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     } else /* if (prot & PAGE_READ) */ {
         tlb_addr = tlbe->addr_read;
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!VICTIM_TLB_HIT(addr_read, addr)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_LOAD, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
--
2.34.1

Fix these before moving the file, for checkpatch.pl.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230510170812.663149-1-richard.henderson@linaro.org>
---
 disas.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/disas.c b/disas.c
index XXXXXXX..XXXXXXX 100644
--- a/disas.c
+++ b/disas.c
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, target_ulong code,
     }
 
     for (pc = code; size > 0; pc += count, size -= count) {
-    fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
-    count = s.info.print_insn(pc, &s.info);
-    fprintf(out, "\n");
-    if (count < 0)
-     break;
+        fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
+        count = s.info.print_insn(pc, &s.info);
+        fprintf(out, "\n");
+        if (count < 0) {
+            break;
+        }
         if (size < count) {
             fprintf(out,
                     "Disassembler disagrees with translator over instruction "
--
2.34.1

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230503072331.1747057-80-richard.henderson@linaro.org>
---
 meson.build | 3 ---
 disas.c => disas/disas.c | 0
 disas/meson.build | 4 +++-
 3 files changed, 3 insertions(+), 4 deletions(-)
 rename disas.c => disas/disas.c (100%)

diff --git a/meson.build b/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/meson.build
+++ b/meson.build
@@ -XXX,XX +XXX,XX @@ specific_ss.add(files('cpu.c'))
 
 subdir('softmmu')
 
-common_ss.add(capstone)
-specific_ss.add(files('disas.c'), capstone)
-
 # Work around a gcc bug/misfeature wherein constant propagation looks
 # through an alias:
 #   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696
diff --git a/disas.c b/disas/disas.c
similarity index 100%
rename from disas.c
rename to disas/disas.c
diff --git a/disas/meson.build b/disas/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/disas/meson.build
+++ b/disas/meson.build
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_RISCV_DIS', if_true: files('riscv.c'))
 common_ss.add(when: 'CONFIG_SH4_DIS', if_true: files('sh4.c'))
 common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
 common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
-common_ss.add(when: capstone, if_true: files('capstone.c'))
+common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
+
+specific_ss.add(files('disas.c'), capstone)
--
2.34.1

Use uint64_t for the pc, and size_t for the size.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230503072331.1747057-81-richard.henderson@linaro.org>
---
 include/disas/disas.h | 17 ++++++-----------
 bsd-user/elfload.c | 5 +++--
 disas/disas.c | 19 +++++++++----------
 linux-user/elfload.c | 5 +++--
 4 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/include/disas/disas.h b/include/disas/disas.h
index XXXXXXX..XXXXXXX 100644
--- a/include/disas/disas.h
+++ b/include/disas/disas.h
@@ -XXX,XX +XXX,XX @@
 #include "cpu.h"
 
 /* Disassemble this for me please... (debugging). */
-void disas(FILE *out, const void *code, unsigned long size);
-void target_disas(FILE *out, CPUState *cpu, target_ulong code,
-                  target_ulong size);
+void disas(FILE *out, const void *code, size_t size);
+void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size);
 
-void monitor_disas(Monitor *mon, CPUState *cpu,
-                   target_ulong pc, int nb_insn, int is_physical);
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
+                   int nb_insn, bool is_physical);
 
 char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size);
 
 /* Look up symbol for debugging purpose.  Returns "" if unknown. */
-const char *lookup_symbol(target_ulong orig_addr);
+const char *lookup_symbol(uint64_t orig_addr);
 #endif
 
 struct syminfo;
 struct elf32_sym;
 struct elf64_sym;
 
-#if defined(CONFIG_USER_ONLY)
-typedef const char *(*lookup_symbol_t)(struct syminfo *s, target_ulong orig_addr);
-#else
-typedef const char *(*lookup_symbol_t)(struct syminfo *s, hwaddr orig_addr);
-#endif
+typedef const char *(*lookup_symbol_t)(struct syminfo *s, uint64_t orig_addr);
 
 struct syminfo {
     lookup_symbol_t lookup_symbol;
diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/bsd-user/elfload.c
+++ b/bsd-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static abi_ulong load_elf_interp(struct elfhdr *interp_elf_ex,
 
 static int symfind(const void *s0, const void *s1)
 {
-    target_ulong addr = *(target_ulong *)s0;
+    __typeof(sym->st_value) addr = *(uint64_t *)s0;
     struct elf_sym *sym = (struct elf_sym *)s1;
     int result = 0;
+
     if (addr < sym->st_value) {
         result = -1;
     } else if (addr >= sym->st_value + sym->st_size) {
@@ -XXX,XX +XXX,XX @@ static int symfind(const void *s0, const void *s1)
     return result;
 }
 
-static const char *lookup_symbolxx(struct syminfo *s, target_ulong orig_addr)
+static const char *lookup_symbolxx(struct syminfo *s, uint64_t orig_addr)
 {
 #if ELF_CLASS == ELFCLASS32
     struct elf_sym *syms = s->disas_symtab.elf32;
diff --git a/disas/disas.c b/disas/disas.c
index XXXXXXX..XXXXXXX 100644
--- a/disas/disas.c
+++ b/disas/disas.c
@@ -XXX,XX +XXX,XX @@ static void initialize_debug_host(CPUDebug *s)
 }
 
 /* Disassemble this for me please... (debugging). */
-void target_disas(FILE *out, CPUState *cpu, target_ulong code,
-                  target_ulong size)
+void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
 {
-    target_ulong pc;
+    uint64_t pc;
     int count;
     CPUDebug s;
 
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, target_ulong code,
     }
 
     for (pc = code; size > 0; pc += count, size -= count) {
-        fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
+        fprintf(out, "0x%08" PRIx64 ": ", pc);
         count = s.info.print_insn(pc, &s.info);
         fprintf(out, "\n");
         if (count < 0) {
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size)
 }
 
 /* Disassemble this for me please... (debugging). */
-void disas(FILE *out, const void *code, unsigned long size)
+void disas(FILE *out, const void *code, size_t size)
 {
     uintptr_t pc;
     int count;
@@ -XXX,XX +XXX,XX @@ void disas(FILE *out, const void *code, unsigned long size)
 }
 
 /* Look up symbol for debugging purpose.  Returns "" if unknown. */
-const char *lookup_symbol(target_ulong orig_addr)
+const char *lookup_symbol(uint64_t orig_addr)
 {
     const char *symbol = "";
     struct syminfo *s;
@@ -XXX,XX +XXX,XX @@ physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
 }
 
 /* Disassembler for the monitor. */
-void monitor_disas(Monitor *mon, CPUState *cpu,
-                   target_ulong pc, int nb_insn, int is_physical)
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
+                   int nb_insn, bool is_physical)
 {
     int count, i;
     CPUDebug s;
@@ -XXX,XX +XXX,XX @@ void monitor_disas(Monitor *mon, CPUState *cpu,
     }
 
     if (!s.info.print_insn) {
-        monitor_printf(mon, "0x" TARGET_FMT_lx
+        monitor_printf(mon, "0x%08" PRIx64
                        ": Asm output not supported on this arch\n", pc);
         return;
     }
 
     for (i = 0; i < nb_insn; i++) {
-        g_string_append_printf(ds, "0x" TARGET_FMT_lx ": ", pc);
+        g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
         count = s.info.print_insn(pc, &s.info);
         g_string_append_c(ds, '\n');
         if (count < 0) {
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@ static void load_elf_interp(const char *filename, struct image_info *info,
 
 static int symfind(const void *s0, const void *s1)
 {
-    target_ulong addr = *(target_ulong *)s0;
     struct elf_sym *sym = (struct elf_sym *)s1;
+    __typeof(sym->st_value) addr = *(uint64_t *)s0;
     int result = 0;
+
     if (addr < sym->st_value) {
         result = -1;
     } else if (addr >= sym->st_value + sym->st_size) {
@@ -XXX,XX +XXX,XX @@ static int symfind(const void *s0, const void *s1)
     return result;
 }
 
-static const char *lookup_symbolxx(struct syminfo *s, target_ulong orig_addr)
+static const char *lookup_symbolxx(struct syminfo *s, uint64_t orig_addr)
 {
 #if ELF_CLASS == ELFCLASS32
     struct elf_sym *syms = s->disas_symtab.elf32;
--
2.34.1

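[Aside, not part of the patch: a quick illustration of the retyped interface above. The helper name, its arguments and the use of stderr are made up for the example; only the target_disas()/lookup_symbol() prototypes come from the patch.]

    #include "qemu/osdep.h"
    #include "disas/disas.h"

    /* Hypothetical debug helper: dump 'len' bytes of guest code at 'pc'. */
    static void dump_guest_code(CPUState *cpu, uint64_t pc, size_t len)
    {
        /* pc is a fixed-width uint64_t regardless of TARGET_LONG_BITS. */
        fprintf(stderr, "guest code at 0x%08" PRIx64 " (%s):\n",
                pc, lookup_symbol(pc));
        target_disas(stderr, cpu, pc, len);
    }
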
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230503072331.1747057-83-richard.henderson@linaro.org>
---
 include/disas/disas.h | 6 ------
 disas/disas.c | 3 ++-
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/disas/disas.h b/include/disas/disas.h
index XXXXXXX..XXXXXXX 100644
--- a/include/disas/disas.h
+++ b/include/disas/disas.h
@@ -XXX,XX +XXX,XX @@
 #ifndef QEMU_DISAS_H
 #define QEMU_DISAS_H
 
-#include "exec/hwaddr.h"
-
-#ifdef NEED_CPU_H
-#include "cpu.h"
-
 /* Disassemble this for me please... (debugging). */
 void disas(FILE *out, const void *code, size_t size);
 void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size);
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size);
 
 /* Look up symbol for debugging purpose.  Returns "" if unknown. */
 const char *lookup_symbol(uint64_t orig_addr);
-#endif
 
 struct syminfo;
 struct elf32_sym;
diff --git a/disas/disas.c b/disas/disas.c
index XXXXXXX..XXXXXXX 100644
--- a/disas/disas.c
+++ b/disas/disas.c
@@ -XXX,XX +XXX,XX @@
 #include "disas/dis-asm.h"
 #include "elf.h"
 #include "qemu/qemu-print.h"
-
 #include "disas/disas.h"
 #include "disas/capstone.h"
+#include "hw/core/cpu.h"
+#include "exec/memory.h"
 
 typedef struct CPUDebug {
     struct disassemble_info info;
--
2.34.1

From: Thomas Huth <thuth@redhat.com>

We'd like to move disas.c into the common code source set, where
CONFIG_USER_ONLY is not available anymore. So we have to move
the related code into a separate file instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230508133745.109463-2-thuth@redhat.com>
[rth: Type change done in a separate patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 disas/disas-internal.h | 21 ++++++++++++
 disas/disas-mon.c | 65 ++++++++++++++++++++++++++++++++++++
 disas/disas.c | 76 ++++--------------------------------------
 disas/meson.build | 1 +
 4 files changed, 93 insertions(+), 70 deletions(-)
 create mode 100644 disas/disas-internal.h
 create mode 100644 disas/disas-mon.c

diff --git a/disas/disas-internal.h b/disas/disas-internal.h
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/disas/disas-internal.h
@@ -XXX,XX +XXX,XX @@
+/*
+ * Definitions used internally in the disassembly code
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef DISAS_INTERNAL_H
+#define DISAS_INTERNAL_H
+
+#include "disas/dis-asm.h"
+
+typedef struct CPUDebug {
+    struct disassemble_info info;
+    CPUState *cpu;
+} CPUDebug;
+
+void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu);
+int disas_gstring_printf(FILE *stream, const char *fmt, ...)
+    G_GNUC_PRINTF(2, 3);
+
+#endif
diff --git a/disas/disas-mon.c b/disas/disas-mon.c
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/disas/disas-mon.c
@@ -XXX,XX +XXX,XX @@
+/*
+ * Functions related to disassembly from the monitor
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "disas-internal.h"
+#include "disas/disas.h"
+#include "exec/memory.h"
+#include "hw/core/cpu.h"
+#include "monitor/monitor.h"
+
+static int
+physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
+                     struct disassemble_info *info)
+{
+    CPUDebug *s = container_of(info, CPUDebug, info);
+    MemTxResult res;
+
+    res = address_space_read(s->cpu->as, memaddr, MEMTXATTRS_UNSPECIFIED,
+                             myaddr, length);
+    return res == MEMTX_OK ? 0 : EIO;
+}
+
+/* Disassembler for the monitor. */
+void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
+                   int nb_insn, bool is_physical)
+{
+    int count, i;
+    CPUDebug s;
+    g_autoptr(GString) ds = g_string_new("");
+
+    disas_initialize_debug_target(&s, cpu);
+    s.info.fprintf_func = disas_gstring_printf;
+    s.info.stream = (FILE *)ds; /* abuse this slot */
+
+    if (is_physical) {
+        s.info.read_memory_func = physical_read_memory;
+    }
+    s.info.buffer_vma = pc;
+
+    if (s.info.cap_arch >= 0 && cap_disas_monitor(&s.info, pc, nb_insn)) {
+        monitor_puts(mon, ds->str);
+        return;
+    }
+
+    if (!s.info.print_insn) {
+        monitor_printf(mon, "0x%08" PRIx64
+                       ": Asm output not supported on this arch\n", pc);
+        return;
+    }
+
+    for (i = 0; i < nb_insn; i++) {
+        g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
+        count = s.info.print_insn(pc, &s.info);
+        g_string_append_c(ds, '\n');
+        if (count < 0) {
+            break;
+        }
+        pc += count;
+    }
+
+    monitor_puts(mon, ds->str);
+}
diff --git a/disas/disas.c b/disas/disas.c
index XXXXXXX..XXXXXXX 100644
--- a/disas/disas.c
+++ b/disas/disas.c
@@ -XXX,XX +XXX,XX @@
 /* General "disassemble this chunk" code.  Used for debugging. */
 #include "qemu/osdep.h"
-#include "disas/dis-asm.h"
+#include "disas/disas-internal.h"
 #include "elf.h"
 #include "qemu/qemu-print.h"
 #include "disas/disas.h"
@@ -XXX,XX +XXX,XX @@
 #include "hw/core/cpu.h"
 #include "exec/memory.h"
 
-typedef struct CPUDebug {
-    struct disassemble_info info;
-    CPUState *cpu;
-} CPUDebug;
-
 /* Filled in by elfload.c.  Simplistic, but will do for now. */
 struct syminfo *syminfos = NULL;
 
@@ -XXX,XX +XXX,XX @@ static void initialize_debug(CPUDebug *s)
     s->info.symbol_at_address_func = symbol_at_address;
 }
 
-static void initialize_debug_target(CPUDebug *s, CPUState *cpu)
+void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu)
 {
     initialize_debug(s);
 
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
     int count;
     CPUDebug s;
 
-    initialize_debug_target(&s, cpu);
+    disas_initialize_debug_target(&s, cpu);
     s.info.fprintf_func = fprintf;
     s.info.stream = out;
     s.info.buffer_vma = code;
@@ -XXX,XX +XXX,XX @@ void target_disas(FILE *out, CPUState *cpu, uint64_t code, size_t size)
     }
 }
 
-static int G_GNUC_PRINTF(2, 3)
-gstring_printf(FILE *stream, const char *fmt, ...)
+int disas_gstring_printf(FILE *stream, const char *fmt, ...)
 {
     /* We abuse the FILE parameter to pass a GString. */
     GString *s = (GString *)stream;
@@ -XXX,XX +XXX,XX @@ char *plugin_disas(CPUState *cpu, uint64_t addr, size_t size)
     CPUDebug s;
     GString *ds = g_string_new(NULL);
 
-    initialize_debug_target(&s, cpu);
-    s.info.fprintf_func = gstring_printf;
+    disas_initialize_debug_target(&s, cpu);
+    s.info.fprintf_func = disas_gstring_printf;
     s.info.stream = (FILE *)ds; /* abuse this slot */
     s.info.buffer_vma = addr;
     s.info.buffer_length = size;
@@ -XXX,XX +XXX,XX @@ const char *lookup_symbol(uint64_t orig_addr)
 
     return symbol;
 }
-
-#if !defined(CONFIG_USER_ONLY)
-
-#include "monitor/monitor.h"
-
-static int
-physical_read_memory(bfd_vma memaddr, bfd_byte *myaddr, int length,
-                     struct disassemble_info *info)
-{
-    CPUDebug *s = container_of(info, CPUDebug, info);
-    MemTxResult res;
-
-    res = address_space_read(s->cpu->as, memaddr, MEMTXATTRS_UNSPECIFIED,
-                             myaddr, length);
-    return res == MEMTX_OK ? 0 : EIO;
-}
-
-/* Disassembler for the monitor. */
-void monitor_disas(Monitor *mon, CPUState *cpu, uint64_t pc,
-                   int nb_insn, bool is_physical)
-{
-    int count, i;
-    CPUDebug s;
-    g_autoptr(GString) ds = g_string_new("");
-
-    initialize_debug_target(&s, cpu);
-    s.info.fprintf_func = gstring_printf;
-    s.info.stream = (FILE *)ds; /* abuse this slot */
-
-    if (is_physical) {
-        s.info.read_memory_func = physical_read_memory;
-    }
-    s.info.buffer_vma = pc;
-
-    if (s.info.cap_arch >= 0 && cap_disas_monitor(&s.info, pc, nb_insn)) {
-        monitor_puts(mon, ds->str);
-        return;
-    }
-
-    if (!s.info.print_insn) {
-        monitor_printf(mon, "0x%08" PRIx64
-                       ": Asm output not supported on this arch\n", pc);
-        return;
-    }
-
-    for (i = 0; i < nb_insn; i++) {
-        g_string_append_printf(ds, "0x%08" PRIx64 ": ", pc);
-        count = s.info.print_insn(pc, &s.info);
-        g_string_append_c(ds, '\n');
-        if (count < 0) {
-            break;
-        }
-        pc += count;
-    }
-
-    monitor_puts(mon, ds->str);
-}
-#endif
diff --git a/disas/meson.build b/disas/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/disas/meson.build
+++ b/disas/meson.build
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
 common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
 common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
 
+softmmu_ss.add(files('disas-mon.c'))
 specific_ss.add(files('disas.c'), capstone)
--
2.34.1

From: Thomas Huth <thuth@redhat.com>

By using target_words_bigendian() instead of an ifdef,
we can build this code once.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230508133745.109463-3-thuth@redhat.com>
[rth: Type change done in a separate patch]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 disas/disas.c | 10 +++++-----
 disas/meson.build | 3 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/disas/disas.c b/disas/disas.c
index XXXXXXX..XXXXXXX 100644
--- a/disas/disas.c
+++ b/disas/disas.c
@@ -XXX,XX +XXX,XX @@ void disas_initialize_debug_target(CPUDebug *s, CPUState *cpu)
     s->cpu = cpu;
     s->info.read_memory_func = target_read_memory;
     s->info.print_address_func = print_address;
-#if TARGET_BIG_ENDIAN
-    s->info.endian = BFD_ENDIAN_BIG;
-#else
-    s->info.endian = BFD_ENDIAN_LITTLE;
-#endif
+    if (target_words_bigendian()) {
+        s->info.endian = BFD_ENDIAN_BIG;
+    } else {
+        s->info.endian = BFD_ENDIAN_LITTLE;
+    }
 
     CPUClass *cc = CPU_GET_CLASS(cpu);
     if (cc->disas_set_info) {
diff --git a/disas/meson.build b/disas/meson.build
index XXXXXXX..XXXXXXX 100644
--- a/disas/meson.build
+++ b/disas/meson.build
@@ -XXX,XX +XXX,XX @@ common_ss.add(when: 'CONFIG_SH4_DIS', if_true: files('sh4.c'))
 common_ss.add(when: 'CONFIG_SPARC_DIS', if_true: files('sparc.c'))
 common_ss.add(when: 'CONFIG_XTENSA_DIS', if_true: files('xtensa.c'))
 common_ss.add(when: capstone, if_true: [files('capstone.c'), capstone])
+common_ss.add(files('disas.c'))
 
 softmmu_ss.add(files('disas-mon.c'))
-specific_ss.add(files('disas.c'), capstone)
+specific_ss.add(capstone)
--
2.34.1

From: Jamie Iles <quic_jiles@quicinc.com>

Expose qemu_cpu_list_lock globally so that we can use
WITH_QEMU_LOCK_GUARD and QEMU_LOCK_GUARD to simplify a few code paths
now and in future.

Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427020925.51003-2-quic_jiles@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-common.h | 1 +
 cpus-common.c | 2 +-
 linux-user/elfload.c | 13 +++++++------
 migration/dirtyrate.c | 26 +++++++++++++-------------
 trace/control-target.c | 9 ++++-----
 5 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index XXXXXXX..XXXXXXX 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -XXX,XX +XXX,XX @@ extern intptr_t qemu_host_page_mask;
 #define REAL_HOST_PAGE_ALIGN(addr) ROUND_UP((addr), qemu_real_host_page_size())
 
 /* The CPU list lock nests outside page_(un)lock or mmap_(un)lock */
+extern QemuMutex qemu_cpu_list_lock;
 void qemu_init_cpu_list(void);
 void cpu_list_lock(void);
 void cpu_list_unlock(void);
diff --git a/cpus-common.c b/cpus-common.c
index XXXXXXX..XXXXXXX 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/lockable.h"
 #include "trace/trace-root.h"
 
-static QemuMutex qemu_cpu_list_lock;
+QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
 static QemuCond exclusive_resume;
 static QemuCond qemu_work_cond;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index XXXXXXX..XXXXXXX 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -XXX,XX +XXX,XX @@
 #include "qemu/guest-random.h"
 #include "qemu/units.h"
 #include "qemu/selfmap.h"
+#include "qemu/lockable.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "target_signal.h"
@@ -XXX,XX +XXX,XX @@ static int fill_note_info(struct elf_note_info *info,
         info->notes_size += note_size(&info->notes[i]);
 
     /* read and fill status of all threads */
-    cpu_list_lock();
-    CPU_FOREACH(cpu) {
-        if (cpu == thread_cpu) {
-            continue;
+    WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
+        CPU_FOREACH(cpu) {
+            if (cpu == thread_cpu) {
+                continue;
+            }
+            fill_thread_info(info, cpu->env_ptr);
         }
-        fill_thread_info(info, cpu->env_ptr);
     }
-    cpu_list_unlock();
 
     return (0);
 }
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -XXX,XX +XXX,XX @@ int64_t vcpu_calculate_dirtyrate(int64_t calc_time_ms,
 retry:
     init_time_ms = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 
-    cpu_list_lock();
-    gen_id = cpu_list_generation_id_get();
-    records = vcpu_dirty_stat_alloc(stat);
-    vcpu_dirty_stat_collect(stat, records, true);
-    cpu_list_unlock();
+    WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
+        gen_id = cpu_list_generation_id_get();
+        records = vcpu_dirty_stat_alloc(stat);
+        vcpu_dirty_stat_collect(stat, records, true);
+    }
 
     duration = dirty_stat_wait(calc_time_ms, init_time_ms);
 
     global_dirty_log_sync(flag, one_shot);
 
-    cpu_list_lock();
-    if (gen_id != cpu_list_generation_id_get()) {
-        g_free(records);
-        g_free(stat->rates);
-        cpu_list_unlock();
-        goto retry;
+    WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
+        if (gen_id != cpu_list_generation_id_get()) {
+            g_free(records);
+            g_free(stat->rates);
+            cpu_list_unlock();
+            goto retry;
+        }
+        vcpu_dirty_stat_collect(stat, records, false);
     }
-    vcpu_dirty_stat_collect(stat, records, false);
-    cpu_list_unlock();
 
     for (i = 0; i < stat->nvcpu; i++) {
         dirtyrate = do_calculate_dirtyrate(records[i], duration);
diff --git a/trace/control-target.c b/trace/control-target.c
index XXXXXXX..XXXXXXX 100644
--- a/trace/control-target.c
+++ b/trace/control-target.c
@@ -XXX,XX +XXX,XX @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/lockable.h"
 #include "cpu.h"
 #include "trace/trace-root.h"
 #include "trace/control.h"
@@ -XXX,XX +XXX,XX @@ static bool adding_first_cpu1(void)
 
 static bool adding_first_cpu(void)
 {
-    bool res;
-    cpu_list_lock();
-    res = adding_first_cpu1();
-    cpu_list_unlock();
-    return res;
+    QEMU_LOCK_GUARD(&qemu_cpu_list_lock);
+
+    return adding_first_cpu1();
 }
 
 void trace_init_vcpu(CPUState *vcpu)
--
2.34.1

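[Aside, not part of the patch: a minimal before/after sketch of the lock-guard pattern this change enables, for readers who have not used QEMU's guards. The helper functions here are hypothetical; the real conversions are the elfload.c, dirtyrate.c and control-target.c hunks above.]

    #include "qemu/osdep.h"
    #include "qemu/lockable.h"
    #include "exec/cpu-common.h"
    #include "hw/core/cpu.h"

    /* Before: explicit lock/unlock around a CPU-list walk. */
    static int count_cpus_explicit(void)
    {
        CPUState *cpu;
        int n = 0;

        cpu_list_lock();
        CPU_FOREACH(cpu) {
            n++;
        }
        cpu_list_unlock();
        return n;
    }

    /* After: the guard releases qemu_cpu_list_lock automatically when the
     * block is left, on every exit path. */
    static int count_cpus_guarded(void)
    {
        CPUState *cpu;
        int n = 0;

        WITH_QEMU_LOCK_GUARD(&qemu_cpu_list_lock) {
            CPU_FOREACH(cpu) {
                n++;
            }
        }
        return n;
    }
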
From: Jamie Iles <quic_jiles@quicinc.com>

The round-robin scheduler will iterate over the CPU list with an
assigned budget until the next timer expiry and may exit early because
of a TB exit.  This is fine under normal operation but with icount
enabled and SMP it is possible for a CPU to be starved of run time and
the system live-locks.

For example, booting a riscv64 platform with '-icount
shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel
has timers enabled and starts performing TLB shootdowns.  In this case
we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU
1.  As we enter the TCG loop, we assign the icount budget to next timer
interrupt to CPU 0 and begin executing where the guest is sat in a busy
loop exhausting all of the budget before we try to execute CPU 1 which
is the target of the IPI but CPU 1 is left with no budget with which to
execute and the process repeats.

We try here to add some fairness by splitting the budget across all of
the CPUs on the thread fairly before entering each one.  The CPU count
is cached on CPU list generation ID to avoid iterating the list on each
loop iteration.  With this change it is possible to boot an SMP rv64
guest with icount enabled and no hangs.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tcg-accel-ops-icount.h | 3 ++-
 accel/tcg/tcg-accel-ops-icount.c | 21 ++++++++++++++----
 accel/tcg/tcg-accel-ops-rr.c | 37 +++++++++++++++++++++++++++++++-
 replay/replay.c | 3 +--
 4 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/tcg-accel-ops-icount.h b/accel/tcg/tcg-accel-ops-icount.h
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/tcg-accel-ops-icount.h
+++ b/accel/tcg/tcg-accel-ops-icount.h
@@ -XXX,XX +XXX,XX @@
 #define TCG_ACCEL_OPS_ICOUNT_H
 
 void icount_handle_deadline(void);
-void icount_prepare_for_run(CPUState *cpu);
+void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget);
+int64_t icount_percpu_budget(int cpu_count);
 void icount_process_data(CPUState *cpu);
 
 void icount_handle_interrupt(CPUState *cpu, int mask);
diff --git a/accel/tcg/tcg-accel-ops-icount.c b/accel/tcg/tcg-accel-ops-icount.c
index XXXXXXX..XXXXXXX 100644
--- a/accel/tcg/tcg-accel-ops-icount.c
+++ b/accel/tcg/tcg-accel-ops-icount.c
@@ -XXX,XX +XXX,XX @@ void icount_handle_deadline(void)
     }
 }
 
-void icount_prepare_for_run(CPUState *cpu)
+/* Distribute the budget evenly across all CPUs */
+int64_t icount_percpu_budget(int cpu_count)
+{
+    int64_t limit = icount_get_limit();
+    int64_t timeslice = limit / cpu_count;
+
+    if (timeslice == 0) {
+        timeslice = limit;
+    }
+
+    return timeslice;
+}
+
+void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget)
 {
     int insns_left;
 
@@ -XXX,XX +XXX,XX @@ void icount_prepare_for_run(CPUState *cpu)
     g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0);
     g_assert(cpu->icount_extra == 0);
 
-    cpu->icount_budget = icount_get_limit();
+    replay_mutex_lock();
+
+    cpu->icount_budget = MIN(icount_get_limit(), cpu_budget);
     insns_left = MIN(0xffff, cpu->icount_budget);
     cpu_neg(cpu)->icount_decr.u16.low = insns_left;
140
+++ b/target/avr/cpu.c
88
cpu->icount_extra = cpu->icount_budget - insns_left;
141
@@ -XXX,XX +XXX,XX @@ static void avr_cpu_set_pc(CPUState *cs, vaddr value)
89
142
cpu->env.pc_w = value / 2; /* internally PC points to words */
90
- replay_mutex_lock();
91
-
92
if (cpu->icount_budget == 0) {
93
/*
94
* We're called without the iothread lock, so must take it while
95
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
96
index XXXXXXX..XXXXXXX 100644
97
--- a/accel/tcg/tcg-accel-ops-rr.c
98
+++ b/accel/tcg/tcg-accel-ops-rr.c
99
@@ -XXX,XX +XXX,XX @@
100
*/
101
102
#include "qemu/osdep.h"
103
+#include "qemu/lockable.h"
104
#include "sysemu/tcg.h"
105
#include "sysemu/replay.h"
106
#include "sysemu/cpu-timers.h"
107
@@ -XXX,XX +XXX,XX @@ static void rr_force_rcu(Notifier *notify, void *data)
108
rr_kick_next_cpu();
143
}
109
}
144
110
145
+static vaddr avr_cpu_get_pc(CPUState *cs)
111
+/*
112
+ * Calculate the number of CPUs that we will process in a single iteration of
113
+ * the main CPU thread loop so that we can fairly distribute the instruction
114
+ * count across CPUs.
115
+ *
116
+ * The CPU count is cached based on the CPU list generation ID to avoid
117
+ * iterating the list every time.
118
+ */
119
+static int rr_cpu_count(void)
146
+{
120
+{
147
+ AVRCPU *cpu = AVR_CPU(cs);
121
+ static unsigned int last_gen_id = ~0;
148
+
122
+ static int cpu_count;
149
+ return cpu->env.pc_w * 2;
123
+ CPUState *cpu;
124
+
125
+ QEMU_LOCK_GUARD(&qemu_cpu_list_lock);
126
+
127
+ if (cpu_list_generation_id_get() != last_gen_id) {
128
+ cpu_count = 0;
129
+ CPU_FOREACH(cpu) {
130
+ ++cpu_count;
131
+ }
132
+ last_gen_id = cpu_list_generation_id_get();
133
+ }
134
+
135
+ return cpu_count;
150
+}
136
+}
151
+
137
+
152
static bool avr_cpu_has_work(CPUState *cs)
138
/*
139
* In the single-threaded case each vCPU is simulated in turn. If
140
* there is more than a single vCPU we create a simple timer to kick
141
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
142
cpu->exit_request = 1;
143
144
while (1) {
145
+ /* Only used for icount_enabled() */
146
+ int64_t cpu_budget = 0;
147
+
148
qemu_mutex_unlock_iothread();
149
replay_mutex_lock();
150
qemu_mutex_lock_iothread();
151
152
if (icount_enabled()) {
153
+ int cpu_count = rr_cpu_count();
154
+
155
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
156
icount_account_warp_timer();
157
/*
158
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
159
* waking up the I/O thread and waiting for completion.
160
*/
161
icount_handle_deadline();
162
+
163
+ cpu_budget = icount_percpu_budget(cpu_count);
164
}
165
166
replay_mutex_unlock();
167
@@ -XXX,XX +XXX,XX @@ static void *rr_cpu_thread_fn(void *arg)
168
169
qemu_mutex_unlock_iothread();
170
if (icount_enabled()) {
171
- icount_prepare_for_run(cpu);
172
+ icount_prepare_for_run(cpu, cpu_budget);
173
}
174
r = tcg_cpus_exec(cpu);
175
if (icount_enabled()) {
176
diff --git a/replay/replay.c b/replay/replay.c
177
index XXXXXXX..XXXXXXX 100644
178
--- a/replay/replay.c
179
+++ b/replay/replay.c
180
@@ -XXX,XX +XXX,XX @@ uint64_t replay_get_current_icount(void)
181
int replay_get_instructions(void)
153
{
182
{
154
AVRCPU *cpu = AVR_CPU(cs);
183
int res = 0;
155
@@ -XXX,XX +XXX,XX @@ static void avr_cpu_class_init(ObjectClass *oc, void *data)
184
- replay_mutex_lock();
156
cc->has_work = avr_cpu_has_work;
185
+ g_assert(replay_mutex_locked());
157
cc->dump_state = avr_cpu_dump_state;
186
if (replay_next_event_is(EVENT_INSTRUCTION)) {
158
cc->set_pc = avr_cpu_set_pc;
187
res = replay_state.instruction_count;
159
+ cc->get_pc = avr_cpu_get_pc;
188
if (replay_break_icount != -1LL) {
160
dc->vmsd = &vms_avr_cpu;
189
@@ -XXX,XX +XXX,XX @@ int replay_get_instructions(void)
161
cc->sysemu_ops = &avr_sysemu_ops;
190
}
162
cc->disas_set_info = avr_cpu_disas_set_info;
191
}
163
diff --git a/target/cris/cpu.c b/target/cris/cpu.c
192
}
164
index XXXXXXX..XXXXXXX 100644
193
- replay_mutex_unlock();
165
--- a/target/cris/cpu.c
194
return res;
166
+++ b/target/cris/cpu.c
167
@@ -XXX,XX +XXX,XX @@ static void cris_cpu_set_pc(CPUState *cs, vaddr value)
168
cpu->env.pc = value;
169
}
195
}
170
196
171
+static vaddr cris_cpu_get_pc(CPUState *cs)
172
+{
173
+ CRISCPU *cpu = CRIS_CPU(cs);
174
+
175
+ return cpu->env.pc;
176
+}
177
+
178
static bool cris_cpu_has_work(CPUState *cs)
179
{
180
return cs->interrupt_request & (CPU_INTERRUPT_HARD | CPU_INTERRUPT_NMI);
181
@@ -XXX,XX +XXX,XX @@ static void cris_cpu_class_init(ObjectClass *oc, void *data)
182
cc->has_work = cris_cpu_has_work;
183
cc->dump_state = cris_cpu_dump_state;
184
cc->set_pc = cris_cpu_set_pc;
185
+ cc->get_pc = cris_cpu_get_pc;
186
cc->gdb_read_register = cris_cpu_gdb_read_register;
187
cc->gdb_write_register = cris_cpu_gdb_write_register;
188
#ifndef CONFIG_USER_ONLY
189
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
190
index XXXXXXX..XXXXXXX 100644
191
--- a/target/hexagon/cpu.c
192
+++ b/target/hexagon/cpu.c
193
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_set_pc(CPUState *cs, vaddr value)
194
env->gpr[HEX_REG_PC] = value;
195
}
196
197
+static vaddr hexagon_cpu_get_pc(CPUState *cs)
198
+{
199
+ HexagonCPU *cpu = HEXAGON_CPU(cs);
200
+ CPUHexagonState *env = &cpu->env;
201
+ return env->gpr[HEX_REG_PC];
202
+}
203
+
204
static void hexagon_cpu_synchronize_from_tb(CPUState *cs,
205
const TranslationBlock *tb)
206
{
207
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_class_init(ObjectClass *c, void *data)
208
cc->has_work = hexagon_cpu_has_work;
209
cc->dump_state = hexagon_dump_state;
210
cc->set_pc = hexagon_cpu_set_pc;
211
+ cc->get_pc = hexagon_cpu_get_pc;
212
cc->gdb_read_register = hexagon_gdb_read_register;
213
cc->gdb_write_register = hexagon_gdb_write_register;
214
cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS + NUM_VREGS + NUM_QREGS;
215
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
216
index XXXXXXX..XXXXXXX 100644
217
--- a/target/hppa/cpu.c
218
+++ b/target/hppa/cpu.c
219
@@ -XXX,XX +XXX,XX @@ static void hppa_cpu_set_pc(CPUState *cs, vaddr value)
220
cpu->env.iaoq_b = value + 4;
221
}
222
223
+static vaddr hppa_cpu_get_pc(CPUState *cs)
224
+{
225
+ HPPACPU *cpu = HPPA_CPU(cs);
226
+
227
+ return cpu->env.iaoq_f;
228
+}
229
+
230
static void hppa_cpu_synchronize_from_tb(CPUState *cs,
231
const TranslationBlock *tb)
232
{
233
@@ -XXX,XX +XXX,XX @@ static void hppa_cpu_class_init(ObjectClass *oc, void *data)
234
cc->has_work = hppa_cpu_has_work;
235
cc->dump_state = hppa_cpu_dump_state;
236
cc->set_pc = hppa_cpu_set_pc;
237
+ cc->get_pc = hppa_cpu_get_pc;
238
cc->gdb_read_register = hppa_cpu_gdb_read_register;
239
cc->gdb_write_register = hppa_cpu_gdb_write_register;
240
#ifndef CONFIG_USER_ONLY
241
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
242
index XXXXXXX..XXXXXXX 100644
243
--- a/target/i386/cpu.c
244
+++ b/target/i386/cpu.c
245
@@ -XXX,XX +XXX,XX @@ static void x86_cpu_set_pc(CPUState *cs, vaddr value)
246
cpu->env.eip = value;
247
}
248
249
+static vaddr x86_cpu_get_pc(CPUState *cs)
250
+{
251
+ X86CPU *cpu = X86_CPU(cs);
252
+
253
+ /* Match cpu_get_tb_cpu_state. */
254
+ return cpu->env.eip + cpu->env.segs[R_CS].base;
255
+}
256
+
257
int x86_cpu_pending_interrupt(CPUState *cs, int interrupt_request)
258
{
259
X86CPU *cpu = X86_CPU(cs);
260
@@ -XXX,XX +XXX,XX @@ static void x86_cpu_common_class_init(ObjectClass *oc, void *data)
261
cc->has_work = x86_cpu_has_work;
262
cc->dump_state = x86_cpu_dump_state;
263
cc->set_pc = x86_cpu_set_pc;
264
+ cc->get_pc = x86_cpu_get_pc;
265
cc->gdb_read_register = x86_cpu_gdb_read_register;
266
cc->gdb_write_register = x86_cpu_gdb_write_register;
267
cc->get_arch_id = x86_cpu_get_arch_id;
268
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
269
index XXXXXXX..XXXXXXX 100644
270
--- a/target/loongarch/cpu.c
271
+++ b/target/loongarch/cpu.c
272
@@ -XXX,XX +XXX,XX @@ static void loongarch_cpu_set_pc(CPUState *cs, vaddr value)
273
env->pc = value;
274
}
275
276
+static vaddr loongarch_cpu_get_pc(CPUState *cs)
277
+{
278
+ LoongArchCPU *cpu = LOONGARCH_CPU(cs);
279
+ CPULoongArchState *env = &cpu->env;
280
+
281
+ return env->pc;
282
+}
283
+
284
#ifndef CONFIG_USER_ONLY
285
#include "hw/loongarch/virt.h"
286
287
@@ -XXX,XX +XXX,XX @@ static void loongarch_cpu_class_init(ObjectClass *c, void *data)
288
cc->has_work = loongarch_cpu_has_work;
289
cc->dump_state = loongarch_cpu_dump_state;
290
cc->set_pc = loongarch_cpu_set_pc;
291
+ cc->get_pc = loongarch_cpu_get_pc;
292
#ifndef CONFIG_USER_ONLY
293
dc->vmsd = &vmstate_loongarch_cpu;
294
cc->sysemu_ops = &loongarch_sysemu_ops;
295
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
296
index XXXXXXX..XXXXXXX 100644
297
--- a/target/m68k/cpu.c
298
+++ b/target/m68k/cpu.c
299
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_set_pc(CPUState *cs, vaddr value)
300
cpu->env.pc = value;
301
}
302
303
+static vaddr m68k_cpu_get_pc(CPUState *cs)
304
+{
305
+ M68kCPU *cpu = M68K_CPU(cs);
306
+
307
+ return cpu->env.pc;
308
+}
309
+
310
static bool m68k_cpu_has_work(CPUState *cs)
311
{
312
return cs->interrupt_request & CPU_INTERRUPT_HARD;
313
@@ -XXX,XX +XXX,XX @@ static void m68k_cpu_class_init(ObjectClass *c, void *data)
314
cc->has_work = m68k_cpu_has_work;
315
cc->dump_state = m68k_cpu_dump_state;
316
cc->set_pc = m68k_cpu_set_pc;
317
+ cc->get_pc = m68k_cpu_get_pc;
318
cc->gdb_read_register = m68k_cpu_gdb_read_register;
319
cc->gdb_write_register = m68k_cpu_gdb_write_register;
320
#if defined(CONFIG_SOFTMMU)
321
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
322
index XXXXXXX..XXXXXXX 100644
323
--- a/target/microblaze/cpu.c
324
+++ b/target/microblaze/cpu.c
325
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_set_pc(CPUState *cs, vaddr value)
326
cpu->env.iflags = 0;
327
}
328
329
+static vaddr mb_cpu_get_pc(CPUState *cs)
330
+{
331
+ MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
332
+
333
+ return cpu->env.pc;
334
+}
335
+
336
static void mb_cpu_synchronize_from_tb(CPUState *cs,
337
const TranslationBlock *tb)
338
{
339
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_class_init(ObjectClass *oc, void *data)
340
341
cc->dump_state = mb_cpu_dump_state;
342
cc->set_pc = mb_cpu_set_pc;
343
+ cc->get_pc = mb_cpu_get_pc;
344
cc->gdb_read_register = mb_cpu_gdb_read_register;
345
cc->gdb_write_register = mb_cpu_gdb_write_register;
346
347
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
348
index XXXXXXX..XXXXXXX 100644
349
--- a/target/mips/cpu.c
350
+++ b/target/mips/cpu.c
351
@@ -XXX,XX +XXX,XX @@ static void mips_cpu_set_pc(CPUState *cs, vaddr value)
352
mips_env_set_pc(&cpu->env, value);
353
}
354
355
+static vaddr mips_cpu_get_pc(CPUState *cs)
356
+{
357
+ MIPSCPU *cpu = MIPS_CPU(cs);
358
+
359
+ return cpu->env.active_tc.PC;
360
+}
361
+
362
static bool mips_cpu_has_work(CPUState *cs)
363
{
364
MIPSCPU *cpu = MIPS_CPU(cs);
365
@@ -XXX,XX +XXX,XX @@ static void mips_cpu_class_init(ObjectClass *c, void *data)
366
cc->has_work = mips_cpu_has_work;
367
cc->dump_state = mips_cpu_dump_state;
368
cc->set_pc = mips_cpu_set_pc;
369
+ cc->get_pc = mips_cpu_get_pc;
370
cc->gdb_read_register = mips_cpu_gdb_read_register;
371
cc->gdb_write_register = mips_cpu_gdb_write_register;
372
#ifndef CONFIG_USER_ONLY
373
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
374
index XXXXXXX..XXXXXXX 100644
375
--- a/target/nios2/cpu.c
376
+++ b/target/nios2/cpu.c
377
@@ -XXX,XX +XXX,XX @@ static void nios2_cpu_set_pc(CPUState *cs, vaddr value)
378
env->pc = value;
379
}
380
381
+static vaddr nios2_cpu_get_pc(CPUState *cs)
382
+{
383
+ Nios2CPU *cpu = NIOS2_CPU(cs);
384
+ CPUNios2State *env = &cpu->env;
385
+
386
+ return env->pc;
387
+}
388
+
389
static bool nios2_cpu_has_work(CPUState *cs)
390
{
391
return cs->interrupt_request & CPU_INTERRUPT_HARD;
392
@@ -XXX,XX +XXX,XX @@ static void nios2_cpu_class_init(ObjectClass *oc, void *data)
393
cc->has_work = nios2_cpu_has_work;
394
cc->dump_state = nios2_cpu_dump_state;
395
cc->set_pc = nios2_cpu_set_pc;
396
+ cc->get_pc = nios2_cpu_get_pc;
397
cc->disas_set_info = nios2_cpu_disas_set_info;
398
#ifndef CONFIG_USER_ONLY
399
cc->sysemu_ops = &nios2_sysemu_ops;
400
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
401
index XXXXXXX..XXXXXXX 100644
402
--- a/target/openrisc/cpu.c
403
+++ b/target/openrisc/cpu.c
404
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_set_pc(CPUState *cs, vaddr value)
405
cpu->env.dflag = 0;
406
}
407
408
+static vaddr openrisc_cpu_get_pc(CPUState *cs)
409
+{
410
+ OpenRISCCPU *cpu = OPENRISC_CPU(cs);
411
+
412
+ return cpu->env.pc;
413
+}
414
+
415
static void openrisc_cpu_synchronize_from_tb(CPUState *cs,
416
const TranslationBlock *tb)
417
{
418
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_class_init(ObjectClass *oc, void *data)
419
cc->has_work = openrisc_cpu_has_work;
420
cc->dump_state = openrisc_cpu_dump_state;
421
cc->set_pc = openrisc_cpu_set_pc;
422
+ cc->get_pc = openrisc_cpu_get_pc;
423
cc->gdb_read_register = openrisc_cpu_gdb_read_register;
424
cc->gdb_write_register = openrisc_cpu_gdb_write_register;
425
#ifndef CONFIG_USER_ONLY
426
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
427
index XXXXXXX..XXXXXXX 100644
428
--- a/target/ppc/cpu_init.c
429
+++ b/target/ppc/cpu_init.c
430
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_set_pc(CPUState *cs, vaddr value)
431
cpu->env.nip = value;
432
}
433
434
+static vaddr ppc_cpu_get_pc(CPUState *cs)
435
+{
436
+ PowerPCCPU *cpu = POWERPC_CPU(cs);
437
+
438
+ return cpu->env.nip;
439
+}
440
+
441
static bool ppc_cpu_has_work(CPUState *cs)
442
{
443
PowerPCCPU *cpu = POWERPC_CPU(cs);
444
@@ -XXX,XX +XXX,XX @@ static void ppc_cpu_class_init(ObjectClass *oc, void *data)
445
cc->has_work = ppc_cpu_has_work;
446
cc->dump_state = ppc_cpu_dump_state;
447
cc->set_pc = ppc_cpu_set_pc;
448
+ cc->get_pc = ppc_cpu_get_pc;
449
cc->gdb_read_register = ppc_cpu_gdb_read_register;
450
cc->gdb_write_register = ppc_cpu_gdb_write_register;
451
#ifndef CONFIG_USER_ONLY
452
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
453
index XXXXXXX..XXXXXXX 100644
454
--- a/target/riscv/cpu.c
455
+++ b/target/riscv/cpu.c
456
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_set_pc(CPUState *cs, vaddr value)
457
}
458
}
459
460
+static vaddr riscv_cpu_get_pc(CPUState *cs)
461
+{
462
+ RISCVCPU *cpu = RISCV_CPU(cs);
463
+ CPURISCVState *env = &cpu->env;
464
+
465
+ /* Match cpu_get_tb_cpu_state. */
466
+ if (env->xl == MXL_RV32) {
467
+ return env->pc & UINT32_MAX;
468
+ }
469
+ return env->pc;
470
+}
471
+
472
static void riscv_cpu_synchronize_from_tb(CPUState *cs,
473
const TranslationBlock *tb)
474
{
475
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_class_init(ObjectClass *c, void *data)
476
cc->has_work = riscv_cpu_has_work;
477
cc->dump_state = riscv_cpu_dump_state;
478
cc->set_pc = riscv_cpu_set_pc;
479
+ cc->get_pc = riscv_cpu_get_pc;
480
cc->gdb_read_register = riscv_cpu_gdb_read_register;
481
cc->gdb_write_register = riscv_cpu_gdb_write_register;
482
cc->gdb_num_core_regs = 33;
483
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
484
index XXXXXXX..XXXXXXX 100644
485
--- a/target/rx/cpu.c
486
+++ b/target/rx/cpu.c
487
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_set_pc(CPUState *cs, vaddr value)
488
cpu->env.pc = value;
489
}
490
491
+static vaddr rx_cpu_get_pc(CPUState *cs)
492
+{
493
+ RXCPU *cpu = RX_CPU(cs);
494
+
495
+ return cpu->env.pc;
496
+}
497
+
498
static void rx_cpu_synchronize_from_tb(CPUState *cs,
499
const TranslationBlock *tb)
500
{
501
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_class_init(ObjectClass *klass, void *data)
502
cc->has_work = rx_cpu_has_work;
503
cc->dump_state = rx_cpu_dump_state;
504
cc->set_pc = rx_cpu_set_pc;
505
+ cc->get_pc = rx_cpu_get_pc;
506
507
#ifndef CONFIG_USER_ONLY
508
cc->sysemu_ops = &rx_sysemu_ops;
509
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
510
index XXXXXXX..XXXXXXX 100644
511
--- a/target/s390x/cpu.c
512
+++ b/target/s390x/cpu.c
513
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_set_pc(CPUState *cs, vaddr value)
514
cpu->env.psw.addr = value;
515
}
516
517
+static vaddr s390_cpu_get_pc(CPUState *cs)
518
+{
519
+ S390CPU *cpu = S390_CPU(cs);
520
+
521
+ return cpu->env.psw.addr;
522
+}
523
+
524
static bool s390_cpu_has_work(CPUState *cs)
525
{
526
S390CPU *cpu = S390_CPU(cs);
527
@@ -XXX,XX +XXX,XX @@ static void s390_cpu_class_init(ObjectClass *oc, void *data)
528
cc->has_work = s390_cpu_has_work;
529
cc->dump_state = s390_cpu_dump_state;
530
cc->set_pc = s390_cpu_set_pc;
531
+ cc->get_pc = s390_cpu_get_pc;
532
cc->gdb_read_register = s390_cpu_gdb_read_register;
533
cc->gdb_write_register = s390_cpu_gdb_write_register;
534
#ifndef CONFIG_USER_ONLY
535
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
536
index XXXXXXX..XXXXXXX 100644
537
--- a/target/sh4/cpu.c
538
+++ b/target/sh4/cpu.c
539
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_set_pc(CPUState *cs, vaddr value)
540
cpu->env.pc = value;
541
}
542
543
+static vaddr superh_cpu_get_pc(CPUState *cs)
544
+{
545
+ SuperHCPU *cpu = SUPERH_CPU(cs);
546
+
547
+ return cpu->env.pc;
548
+}
549
+
550
static void superh_cpu_synchronize_from_tb(CPUState *cs,
551
const TranslationBlock *tb)
552
{
553
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_class_init(ObjectClass *oc, void *data)
554
cc->has_work = superh_cpu_has_work;
555
cc->dump_state = superh_cpu_dump_state;
556
cc->set_pc = superh_cpu_set_pc;
557
+ cc->get_pc = superh_cpu_get_pc;
558
cc->gdb_read_register = superh_cpu_gdb_read_register;
559
cc->gdb_write_register = superh_cpu_gdb_write_register;
560
#ifndef CONFIG_USER_ONLY
561
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
562
index XXXXXXX..XXXXXXX 100644
563
--- a/target/sparc/cpu.c
564
+++ b/target/sparc/cpu.c
565
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_set_pc(CPUState *cs, vaddr value)
566
cpu->env.npc = value + 4;
567
}
568
569
+static vaddr sparc_cpu_get_pc(CPUState *cs)
570
+{
571
+ SPARCCPU *cpu = SPARC_CPU(cs);
572
+
573
+ return cpu->env.pc;
574
+}
575
+
576
static void sparc_cpu_synchronize_from_tb(CPUState *cs,
577
const TranslationBlock *tb)
578
{
579
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_class_init(ObjectClass *oc, void *data)
580
cc->memory_rw_debug = sparc_cpu_memory_rw_debug;
581
#endif
582
cc->set_pc = sparc_cpu_set_pc;
583
+ cc->get_pc = sparc_cpu_get_pc;
584
cc->gdb_read_register = sparc_cpu_gdb_read_register;
585
cc->gdb_write_register = sparc_cpu_gdb_write_register;
586
#ifndef CONFIG_USER_ONLY
587
diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
588
index XXXXXXX..XXXXXXX 100644
589
--- a/target/tricore/cpu.c
590
+++ b/target/tricore/cpu.c
591
@@ -XXX,XX +XXX,XX @@ static void tricore_cpu_set_pc(CPUState *cs, vaddr value)
592
env->PC = value & ~(target_ulong)1;
593
}
594
595
+static vaddr tricore_cpu_get_pc(CPUState *cs)
596
+{
597
+ TriCoreCPU *cpu = TRICORE_CPU(cs);
598
+ CPUTriCoreState *env = &cpu->env;
599
+
600
+ return env->PC;
601
+}
602
+
603
static void tricore_cpu_synchronize_from_tb(CPUState *cs,
604
const TranslationBlock *tb)
605
{
606
@@ -XXX,XX +XXX,XX @@ static void tricore_cpu_class_init(ObjectClass *c, void *data)
607
608
cc->dump_state = tricore_cpu_dump_state;
609
cc->set_pc = tricore_cpu_set_pc;
610
+ cc->get_pc = tricore_cpu_get_pc;
611
cc->sysemu_ops = &tricore_sysemu_ops;
612
cc->tcg_ops = &tricore_tcg_ops;
613
}
614
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
615
index XXXXXXX..XXXXXXX 100644
616
--- a/target/xtensa/cpu.c
617
+++ b/target/xtensa/cpu.c
618
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_set_pc(CPUState *cs, vaddr value)
619
cpu->env.pc = value;
620
}
621
622
+static vaddr xtensa_cpu_get_pc(CPUState *cs)
623
+{
624
+ XtensaCPU *cpu = XTENSA_CPU(cs);
625
+
626
+ return cpu->env.pc;
627
+}
628
+
629
static bool xtensa_cpu_has_work(CPUState *cs)
630
{
631
#ifndef CONFIG_USER_ONLY
632
@@ -XXX,XX +XXX,XX @@ static void xtensa_cpu_class_init(ObjectClass *oc, void *data)
633
cc->has_work = xtensa_cpu_has_work;
634
cc->dump_state = xtensa_cpu_dump_state;
635
cc->set_pc = xtensa_cpu_set_pc;
636
+ cc->get_pc = xtensa_cpu_get_pc;
637
cc->gdb_read_register = xtensa_cpu_gdb_read_register;
638
cc->gdb_write_register = xtensa_cpu_gdb_write_register;
639
cc->gdb_stop_before_watchpoint = true;
640
--
2.34.1

New patch
Merge tcg_out_tlb_load, add_qemu_ldst_label,
tcg_out_test_alignment, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function
that returns HostAddress and TCGLabelQemuLdst structures.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 346 ++++++++++++++++----------------------
 1 file changed, 145 insertions(+), 201 deletions(-)

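For orientation before reading the hunks, the shape of the refactored
fast path is roughly the following. This is condensed from the diff
below rather than new code: prepare_host_addr() emits the TLB check
(softmmu) or alignment test (user-only), fills in the HostAddress for
the fast path, and returns a TCGLabelQemuLdst only when a slow path
still has to be patched in. The store side is symmetric.

    static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                TCGReg addrlo, TCGReg addrhi,
                                MemOpIdx oi, TCGType data_type)
    {
        TCGLabelQemuLdst *ldst;
        HostAddress h;

        /* TLB or alignment check; *h is valid for the fast path on return. */
        ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
        tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));

        if (ldst) {
            /* Record data registers and return address for the slow path. */
            ldst->type = data_type;
            ldst->datalo_reg = datalo;
            ldst->datahi_reg = datahi;
            ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
        }
    }
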
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/i386/tcg-target.c.inc
15
+++ b/tcg/i386/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
17
[MO_BEUQ] = helper_be_stq_mmu,
18
};
19
20
-/* Perform the TLB load and compare.
21
-
22
- Inputs:
23
- ADDRLO and ADDRHI contain the low and high part of the address.
24
-
25
- MEM_INDEX and S_BITS are the memory context and log2 size of the load.
26
-
27
- WHICH is the offset into the CPUTLBEntry structure of the slot to read.
28
- This should be offsetof addr_read or addr_write.
29
-
30
- Outputs:
31
- LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses)
32
- positions of the displacements of forward jumps to the TLB miss case.
33
-
34
- Second argument register is loaded with the low part of the address.
35
- In the TLB hit case, it has been adjusted as indicated by the TLB
36
- and so is a host address. In the TLB miss case, it continues to
37
- hold a guest address.
38
-
39
- First argument register is clobbered. */
40
-
41
-static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
42
- int mem_index, MemOp opc,
43
- tcg_insn_unit **label_ptr, int which)
44
-{
45
- TCGType ttype = TCG_TYPE_I32;
46
- TCGType tlbtype = TCG_TYPE_I32;
47
- int trexw = 0, hrexw = 0, tlbrexw = 0;
48
- unsigned a_bits = get_alignment_bits(opc);
49
- unsigned s_bits = opc & MO_SIZE;
50
- unsigned a_mask = (1 << a_bits) - 1;
51
- unsigned s_mask = (1 << s_bits) - 1;
52
- target_ulong tlb_mask;
53
-
54
- if (TCG_TARGET_REG_BITS == 64) {
55
- if (TARGET_LONG_BITS == 64) {
56
- ttype = TCG_TYPE_I64;
57
- trexw = P_REXW;
58
- }
59
- if (TCG_TYPE_PTR == TCG_TYPE_I64) {
60
- hrexw = P_REXW;
61
- if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
62
- tlbtype = TCG_TYPE_I64;
63
- tlbrexw = P_REXW;
64
- }
65
- }
66
- }
67
-
68
- tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
69
- tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
70
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
71
-
72
- tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
73
- TLB_MASK_TABLE_OFS(mem_index) +
74
- offsetof(CPUTLBDescFast, mask));
75
-
76
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
77
- TLB_MASK_TABLE_OFS(mem_index) +
78
- offsetof(CPUTLBDescFast, table));
79
-
80
- /* If the required alignment is at least as large as the access, simply
81
- copy the address and mask. For lesser alignments, check that we don't
82
- cross pages for the complete access. */
83
- if (a_bits >= s_bits) {
84
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
85
- } else {
86
- tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
87
- addrlo, s_mask - a_mask);
88
- }
89
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
90
- tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
91
-
92
- /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
93
- tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
94
- TCG_REG_L1, TCG_REG_L0, which);
95
-
96
- /* Prepare for both the fast path add of the tlb addend, and the slow
97
- path function argument setup. */
98
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
99
-
100
- /* jne slow_path */
101
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
102
- label_ptr[0] = s->code_ptr;
103
- s->code_ptr += 4;
104
-
105
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
106
- /* cmp 4(TCG_REG_L0), addrhi */
107
- tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, which + 4);
108
-
109
- /* jne slow_path */
110
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
111
- label_ptr[1] = s->code_ptr;
112
- s->code_ptr += 4;
113
- }
114
-
115
- /* TLB Hit. */
116
-
117
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
118
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L1, TCG_REG_L0,
119
- offsetof(CPUTLBEntry, addend));
120
-}
121
-
122
-/*
123
- * Record the context of a call to the out of line helper code for the slow path
124
- * for a load or store, so that we can later generate the correct helper code
125
- */
126
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
127
- TCGType type, MemOpIdx oi,
128
- TCGReg datalo, TCGReg datahi,
129
- TCGReg addrlo, TCGReg addrhi,
130
- tcg_insn_unit *raddr,
131
- tcg_insn_unit **label_ptr)
132
-{
133
- TCGLabelQemuLdst *label = new_ldst_label(s);
134
-
135
- label->is_ld = is_ld;
136
- label->oi = oi;
137
- label->type = type;
138
- label->datalo_reg = datalo;
139
- label->datahi_reg = datahi;
140
- label->addrlo_reg = addrlo;
141
- label->addrhi_reg = addrhi;
142
- label->raddr = tcg_splitwx_to_rx(raddr);
143
- label->label_ptr[0] = label_ptr[0];
144
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
145
- label->label_ptr[1] = label_ptr[1];
146
- }
147
-}
148
-
149
/*
150
* Generate code for the slow path for a load at the end of block
151
*/
152
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
153
return true;
154
}
155
#else
156
-
157
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
158
- TCGReg addrhi, unsigned a_bits)
159
-{
160
- unsigned a_mask = (1 << a_bits) - 1;
161
- TCGLabelQemuLdst *label;
162
-
163
- tcg_out_testi(s, addrlo, a_mask);
164
- /* jne slow_path */
165
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
166
-
167
- label = new_ldst_label(s);
168
- label->is_ld = is_ld;
169
- label->addrlo_reg = addrlo;
170
- label->addrhi_reg = addrhi;
171
- label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
172
- label->label_ptr[0] = s->code_ptr;
173
-
174
- s->code_ptr += 4;
175
-}
176
-
177
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
178
{
179
/* resolve label address */
180
@@ -XXX,XX +XXX,XX @@ static inline int setup_guest_base_seg(void)
181
#endif /* setup_guest_base_seg */
182
#endif /* SOFTMMU */
183
184
+/*
185
+ * For softmmu, perform the TLB load and compare.
186
+ * For useronly, perform any required alignment tests.
187
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
188
+ * is required and fill in @h with the host address for the fast path.
189
+ */
190
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
191
+ TCGReg addrlo, TCGReg addrhi,
192
+ MemOpIdx oi, bool is_ld)
193
+{
194
+ TCGLabelQemuLdst *ldst = NULL;
195
+ MemOp opc = get_memop(oi);
196
+ unsigned a_bits = get_alignment_bits(opc);
197
+ unsigned a_mask = (1 << a_bits) - 1;
198
+
199
+#ifdef CONFIG_SOFTMMU
200
+ int cmp_ofs = is_ld ? offsetof(CPUTLBEntry, addr_read)
201
+ : offsetof(CPUTLBEntry, addr_write);
202
+ TCGType ttype = TCG_TYPE_I32;
203
+ TCGType tlbtype = TCG_TYPE_I32;
204
+ int trexw = 0, hrexw = 0, tlbrexw = 0;
205
+ unsigned mem_index = get_mmuidx(oi);
206
+ unsigned s_bits = opc & MO_SIZE;
207
+ unsigned s_mask = (1 << s_bits) - 1;
208
+ target_ulong tlb_mask;
209
+
210
+ ldst = new_ldst_label(s);
211
+ ldst->is_ld = is_ld;
212
+ ldst->oi = oi;
213
+ ldst->addrlo_reg = addrlo;
214
+ ldst->addrhi_reg = addrhi;
215
+
216
+ if (TCG_TARGET_REG_BITS == 64) {
217
+ if (TARGET_LONG_BITS == 64) {
218
+ ttype = TCG_TYPE_I64;
219
+ trexw = P_REXW;
220
+ }
221
+ if (TCG_TYPE_PTR == TCG_TYPE_I64) {
222
+ hrexw = P_REXW;
223
+ if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
224
+ tlbtype = TCG_TYPE_I64;
225
+ tlbrexw = P_REXW;
226
+ }
227
+ }
228
+ }
229
+
230
+ tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
231
+ tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
232
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
233
+
234
+ tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
235
+ TLB_MASK_TABLE_OFS(mem_index) +
236
+ offsetof(CPUTLBDescFast, mask));
237
+
238
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
239
+ TLB_MASK_TABLE_OFS(mem_index) +
240
+ offsetof(CPUTLBDescFast, table));
241
+
242
+ /*
243
+ * If the required alignment is at least as large as the access, simply
244
+ * copy the address and mask. For lesser alignments, check that we don't
245
+ * cross pages for the complete access.
246
+ */
247
+ if (a_bits >= s_bits) {
248
+ tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
249
+ } else {
250
+ tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
251
+ addrlo, s_mask - a_mask);
252
+ }
253
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
254
+ tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
255
+
256
+ /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
257
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
258
+ TCG_REG_L1, TCG_REG_L0, cmp_ofs);
259
+
260
+ /*
261
+ * Prepare for both the fast path add of the tlb addend, and the slow
262
+ * path function argument setup.
263
+ */
264
+ *h = (HostAddress) {
265
+ .base = TCG_REG_L1,
266
+ .index = -1
267
+ };
268
+ tcg_out_mov(s, ttype, h->base, addrlo);
269
+
270
+ /* jne slow_path */
271
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
272
+ ldst->label_ptr[0] = s->code_ptr;
273
+ s->code_ptr += 4;
274
+
275
+ if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
276
+ /* cmp 4(TCG_REG_L0), addrhi */
277
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, cmp_ofs + 4);
278
+
279
+ /* jne slow_path */
280
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
281
+ ldst->label_ptr[1] = s->code_ptr;
282
+ s->code_ptr += 4;
283
+ }
284
+
285
+ /* TLB Hit. */
286
+
287
+ /* add addend(TCG_REG_L0), TCG_REG_L1 */
288
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
289
+ offsetof(CPUTLBEntry, addend));
290
+#else
291
+ if (a_bits) {
292
+ ldst = new_ldst_label(s);
293
+
294
+ ldst->is_ld = is_ld;
295
+ ldst->oi = oi;
296
+ ldst->addrlo_reg = addrlo;
297
+ ldst->addrhi_reg = addrhi;
298
+
299
+ tcg_out_testi(s, addrlo, a_mask);
300
+ /* jne slow_path */
301
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
302
+ ldst->label_ptr[0] = s->code_ptr;
303
+ s->code_ptr += 4;
304
+ }
305
+
306
+ *h = x86_guest_base;
307
+ h->base = addrlo;
308
+#endif
309
+
310
+ return ldst;
311
+}
312
+
313
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
314
HostAddress h, TCGType type, MemOp memop)
315
{
316
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
317
TCGReg addrlo, TCGReg addrhi,
318
MemOpIdx oi, TCGType data_type)
319
{
320
- MemOp opc = get_memop(oi);
321
+ TCGLabelQemuLdst *ldst;
322
HostAddress h;
323
324
-#if defined(CONFIG_SOFTMMU)
325
- tcg_insn_unit *label_ptr[2];
326
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
327
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));
328
329
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
330
- label_ptr, offsetof(CPUTLBEntry, addr_read));
331
-
332
- /* TLB Hit. */
333
- h.base = TCG_REG_L1;
334
- h.index = -1;
335
- h.ofs = 0;
336
- h.seg = 0;
337
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
338
-
339
- /* Record the current context of a load into ldst label */
340
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
341
- addrlo, addrhi, s->code_ptr, label_ptr);
342
-#else
343
- unsigned a_bits = get_alignment_bits(opc);
344
- if (a_bits) {
345
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
346
+ if (ldst) {
347
+ ldst->type = data_type;
348
+ ldst->datalo_reg = datalo;
349
+ ldst->datahi_reg = datahi;
350
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
351
}
352
-
353
- h = x86_guest_base;
354
- h.base = addrlo;
355
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
356
-#endif
357
}
358
359
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
360
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
361
TCGReg addrlo, TCGReg addrhi,
362
MemOpIdx oi, TCGType data_type)
363
{
364
- MemOp opc = get_memop(oi);
365
+ TCGLabelQemuLdst *ldst;
366
HostAddress h;
367
368
-#if defined(CONFIG_SOFTMMU)
369
- tcg_insn_unit *label_ptr[2];
370
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
371
+ tcg_out_qemu_st_direct(s, datalo, datahi, h, get_memop(oi));
372
373
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
374
- label_ptr, offsetof(CPUTLBEntry, addr_write));
375
-
376
- /* TLB Hit. */
377
- h.base = TCG_REG_L1;
378
- h.index = -1;
379
- h.ofs = 0;
380
- h.seg = 0;
381
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
382
-
383
- /* Record the current context of a store into ldst label */
384
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
385
- addrlo, addrhi, s->code_ptr, label_ptr);
386
-#else
387
- unsigned a_bits = get_alignment_bits(opc);
388
- if (a_bits) {
389
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
390
+ if (ldst) {
391
+ ldst->type = data_type;
392
+ ldst->datalo_reg = datalo;
393
+ ldst->datahi_reg = datahi;
394
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
395
}
396
-
397
- h = x86_guest_base;
398
- h.base = addrlo;
399
-
400
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
401
-#endif
402
}
403
404
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
405
--
2.34.1

New patch
Since the introduction of tcg_out_{ld,st}_helper_args, the slow path
no longer requires the address argument to be set up by the tlb load
sequence. Use a plain load for the addend and indexed addressing
with the original input address register.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

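The effect on the TLB-hit tail is small but easier to see in
isolation; the fragment below is condensed from the hunks that
follow, not an addition. Instead of copying the guest address into
TCG_REG_L1 and adding the addend to it, the addend is loaded into
TCG_REG_L0 and the access uses base+index addressing, so the slow
path can simply reload the address from l->addrlo_reg.

    /* TLB hit: load the addend and leave the guest address untouched. */
    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_L0,
               offsetof(CPUTLBEntry, addend));
    *h = (HostAddress) {
        .base = addrlo,       /* original guest address register */
        .index = TCG_REG_L0,  /* TLB addend */
    };
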
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/i386/tcg-target.c.inc
15
+++ b/tcg/i386/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
17
tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
18
} else {
19
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
20
- /* The second argument is already loaded with addrlo. */
21
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
22
+ l->addrlo_reg);
23
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
24
tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
25
(uintptr_t)l->raddr);
26
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
27
tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
28
} else {
29
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
30
- /* The second argument is already loaded with addrlo. */
31
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
32
+ l->addrlo_reg);
33
tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
34
tcg_target_call_iarg_regs[2], l->datalo_reg);
35
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
36
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
37
tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
38
TCG_REG_L1, TCG_REG_L0, cmp_ofs);
39
40
- /*
41
- * Prepare for both the fast path add of the tlb addend, and the slow
42
- * path function argument setup.
43
- */
44
- *h = (HostAddress) {
45
- .base = TCG_REG_L1,
46
- .index = -1
47
- };
48
- tcg_out_mov(s, ttype, h->base, addrlo);
49
-
50
/* jne slow_path */
51
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
52
ldst->label_ptr[0] = s->code_ptr;
53
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
54
}
55
56
/* TLB Hit. */
57
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_L0,
58
+ offsetof(CPUTLBEntry, addend));
59
60
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
61
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
62
- offsetof(CPUTLBEntry, addend));
63
+ *h = (HostAddress) {
64
+ .base = addrlo,
65
+ .index = TCG_REG_L0,
66
+ };
67
#else
68
if (a_bits) {
69
ldst = new_ldst_label(s);
70
--
71
2.34.1
72
73
1
Let tb->page_addr[0] contain the address of the first byte of the
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
translated block, rather than the address of the page containing the
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
start of the translated block. We need to recover this value anyway
3
into one function that returns HostAddress and TCGLabelQemuLdst structures.
4
at various points, and it is easier to discard a page offset when it
5
is not needed, which happens naturally via the existing find_page shift.
6
4
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
7
---
10
accel/tcg/cpu-exec.c | 16 ++++++++--------
8
tcg/aarch64/tcg-target.c.inc | 313 +++++++++++++++--------------------
11
accel/tcg/cputlb.c | 3 ++-
9
1 file changed, 133 insertions(+), 180 deletions(-)
12
accel/tcg/translate-all.c | 9 +++++----
13
3 files changed, 15 insertions(+), 13 deletions(-)
14
10
15
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
11
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
16
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
17
--- a/accel/tcg/cpu-exec.c
13
--- a/tcg/aarch64/tcg-target.c.inc
18
+++ b/accel/tcg/cpu-exec.c
14
+++ b/tcg/aarch64/tcg-target.c.inc
19
@@ -XXX,XX +XXX,XX @@ struct tb_desc {
15
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
20
target_ulong pc;
16
tcg_out_goto(s, lb->raddr);
21
target_ulong cs_base;
17
return true;
22
CPUArchState *env;
18
}
23
- tb_page_addr_t phys_page1;
19
-
24
+ tb_page_addr_t page_addr0;
20
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
25
uint32_t flags;
21
- TCGType ext, TCGReg data_reg, TCGReg addr_reg,
26
uint32_t cflags;
22
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
27
uint32_t trace_vcpu_dstate;
23
-{
28
@@ -XXX,XX +XXX,XX @@ static bool tb_lookup_cmp(const void *p, const void *d)
24
- TCGLabelQemuLdst *label = new_ldst_label(s);
29
const struct tb_desc *desc = d;
25
-
30
26
- label->is_ld = is_ld;
31
if (tb->pc == desc->pc &&
27
- label->oi = oi;
32
- tb->page_addr[0] == desc->phys_page1 &&
28
- label->type = ext;
33
+ tb->page_addr[0] == desc->page_addr0 &&
29
- label->datalo_reg = data_reg;
34
tb->cs_base == desc->cs_base &&
30
- label->addrlo_reg = addr_reg;
35
tb->flags == desc->flags &&
31
- label->raddr = tcg_splitwx_to_rx(raddr);
36
tb->trace_vcpu_dstate == desc->trace_vcpu_dstate &&
32
- label->label_ptr[0] = label_ptr;
37
@@ -XXX,XX +XXX,XX @@ static bool tb_lookup_cmp(const void *p, const void *d)
33
-}
38
if (tb->page_addr[1] == -1) {
34
-
39
return true;
35
-/* We expect to use a 7-bit scaled negative offset from ENV. */
40
} else {
36
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
41
- tb_page_addr_t phys_page2;
37
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
42
- target_ulong virt_page2;
38
-
43
+ tb_page_addr_t phys_page1;
39
-/* These offsets are built into the LDP below. */
44
+ target_ulong virt_page1;
40
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
45
41
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
46
/*
42
-
47
* We know that the first page matched, and an otherwise valid TB
43
-/* Load and compare a TLB entry, emitting the conditional jump to the
48
@@ -XXX,XX +XXX,XX @@ static bool tb_lookup_cmp(const void *p, const void *d)
44
- slow path for the failure case, which will be patched later when finalizing
49
* is different for the new TB. Therefore any exception raised
45
- the slow path. Generated code returns the host addend in X1,
50
* here by the faulting lookup is not premature.
46
- clobbers X0,X2,X3,TMP. */
51
*/
47
-static void tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
52
- virt_page2 = TARGET_PAGE_ALIGN(desc->pc);
48
- tcg_insn_unit **label_ptr, int mem_index,
53
- phys_page2 = get_page_addr_code(desc->env, virt_page2);
49
- bool is_read)
54
- if (tb->page_addr[1] == phys_page2) {
50
-{
55
+ virt_page1 = TARGET_PAGE_ALIGN(desc->pc);
51
- unsigned a_bits = get_alignment_bits(opc);
56
+ phys_page1 = get_page_addr_code(desc->env, virt_page1);
52
- unsigned s_bits = opc & MO_SIZE;
57
+ if (tb->page_addr[1] == phys_page1) {
53
- unsigned a_mask = (1u << a_bits) - 1;
58
return true;
54
- unsigned s_mask = (1u << s_bits) - 1;
59
}
55
- TCGReg x3;
60
}
56
- TCGType mask_type;
61
@@ -XXX,XX +XXX,XX @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
57
- uint64_t compare_mask;
62
if (phys_pc == -1) {
58
-
63
return NULL;
59
- mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
60
- ? TCG_TYPE_I64 : TCG_TYPE_I32);
61
-
62
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
63
- tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
64
- TLB_MASK_TABLE_OFS(mem_index), 1, 0);
65
-
66
- /* Extract the TLB index from the address into X0. */
67
- tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
68
- TCG_REG_X0, TCG_REG_X0, addr_reg,
69
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
70
-
71
- /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
72
- tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
73
-
74
- /* Load the tlb comparator into X0, and the fast path addend into X1. */
75
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1, is_read
76
- ? offsetof(CPUTLBEntry, addr_read)
77
- : offsetof(CPUTLBEntry, addr_write));
78
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
79
- offsetof(CPUTLBEntry, addend));
80
-
81
- /* For aligned accesses, we check the first byte and include the alignment
82
- bits within the address. For unaligned access, we check that we don't
83
- cross pages using the address of the last byte of the access. */
84
- if (a_bits >= s_bits) {
85
- x3 = addr_reg;
86
- } else {
87
- tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
88
- TCG_REG_X3, addr_reg, s_mask - a_mask);
89
- x3 = TCG_REG_X3;
90
- }
91
- compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
92
-
93
- /* Store the page mask part of the address into X3. */
94
- tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
95
- TCG_REG_X3, x3, compare_mask);
96
-
97
- /* Perform the address comparison. */
98
- tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
99
-
100
- /* If not equal, we jump to the slow path. */
101
- *label_ptr = s->code_ptr;
102
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
103
-}
104
-
105
#else
106
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
107
- unsigned a_bits)
108
-{
109
- unsigned a_mask = (1 << a_bits) - 1;
110
- TCGLabelQemuLdst *label = new_ldst_label(s);
111
-
112
- label->is_ld = is_ld;
113
- label->addrlo_reg = addr_reg;
114
-
115
- /* tst addr, #mask */
116
- tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
117
-
118
- label->label_ptr[0] = s->code_ptr;
119
-
120
- /* b.ne slow_path */
121
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
122
-
123
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
124
-}
125
-
126
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
127
{
128
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
129
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
130
}
131
#endif /* CONFIG_SOFTMMU */
132
133
+/*
134
+ * For softmmu, perform the TLB load and compare.
135
+ * For useronly, perform any required alignment tests.
136
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
137
+ * is required and fill in @h with the host address for the fast path.
138
+ */
139
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
140
+ TCGReg addr_reg, MemOpIdx oi,
141
+ bool is_ld)
142
+{
143
+ TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
144
+ TCGLabelQemuLdst *ldst = NULL;
145
+ MemOp opc = get_memop(oi);
146
+ unsigned a_bits = get_alignment_bits(opc);
147
+ unsigned a_mask = (1u << a_bits) - 1;
148
+
149
+#ifdef CONFIG_SOFTMMU
150
+ unsigned s_bits = opc & MO_SIZE;
151
+ unsigned s_mask = (1u << s_bits) - 1;
152
+ unsigned mem_index = get_mmuidx(oi);
153
+ TCGReg x3;
154
+ TCGType mask_type;
155
+ uint64_t compare_mask;
156
+
157
+ ldst = new_ldst_label(s);
158
+ ldst->is_ld = is_ld;
159
+ ldst->oi = oi;
160
+ ldst->addrlo_reg = addr_reg;
161
+
162
+ mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
163
+ ? TCG_TYPE_I64 : TCG_TYPE_I32);
164
+
165
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
166
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
167
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
168
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
169
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
170
+ tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
171
+ TLB_MASK_TABLE_OFS(mem_index), 1, 0);
172
+
173
+ /* Extract the TLB index from the address into X0. */
174
+ tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
175
+ TCG_REG_X0, TCG_REG_X0, addr_reg,
176
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
177
+
178
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
179
+ tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
180
+
181
+ /* Load the tlb comparator into X0, and the fast path addend into X1. */
182
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1,
183
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
184
+ : offsetof(CPUTLBEntry, addr_write));
185
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
186
+ offsetof(CPUTLBEntry, addend));
187
+
188
+ /*
189
+ * For aligned accesses, we check the first byte and include the alignment
190
+ * bits within the address. For unaligned access, we check that we don't
191
+ * cross pages using the address of the last byte of the access.
192
+ */
193
+ if (a_bits >= s_bits) {
194
+ x3 = addr_reg;
195
+ } else {
196
+ tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
197
+ TCG_REG_X3, addr_reg, s_mask - a_mask);
198
+ x3 = TCG_REG_X3;
199
+ }
200
+ compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
201
+
202
+ /* Store the page mask part of the address into X3. */
203
+ tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
204
+ TCG_REG_X3, x3, compare_mask);
205
+
206
+ /* Perform the address comparison. */
207
+ tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
208
+
209
+ /* If not equal, we jump to the slow path. */
210
+ ldst->label_ptr[0] = s->code_ptr;
211
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
212
+
213
+ *h = (HostAddress){
214
+ .base = TCG_REG_X1,
215
+ .index = addr_reg,
216
+ .index_ext = addr_type
217
+ };
218
+#else
219
+ if (a_mask) {
220
+ ldst = new_ldst_label(s);
221
+
222
+ ldst->is_ld = is_ld;
223
+ ldst->oi = oi;
224
+ ldst->addrlo_reg = addr_reg;
225
+
226
+ /* tst addr, #mask */
227
+ tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
228
+
229
+ /* b.ne slow_path */
230
+ ldst->label_ptr[0] = s->code_ptr;
231
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
232
+ }
233
+
234
+ if (USE_GUEST_BASE) {
235
+ *h = (HostAddress){
236
+ .base = TCG_REG_GUEST_BASE,
237
+ .index = addr_reg,
238
+ .index_ext = addr_type
239
+ };
240
+ } else {
241
+ *h = (HostAddress){
242
+ .base = addr_reg,
243
+ .index = TCG_REG_XZR,
244
+ .index_ext = TCG_TYPE_I64
245
+ };
246
+ }
247
+#endif
248
+
249
+ return ldst;
250
+}
251
+
252
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
253
TCGReg data_r, HostAddress h)
254
{
255
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp memop,
256
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
257
MemOpIdx oi, TCGType data_type)
258
{
259
- MemOp memop = get_memop(oi);
260
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
261
+ TCGLabelQemuLdst *ldst;
262
HostAddress h;
263
264
- /* Byte swapping is left to middle-end expansion. */
265
- tcg_debug_assert((memop & MO_BSWAP) == 0);
266
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
267
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_type, data_reg, h);
268
269
-#ifdef CONFIG_SOFTMMU
270
- tcg_insn_unit *label_ptr;
271
-
272
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 1);
273
-
274
- h = (HostAddress){
275
- .base = TCG_REG_X1,
276
- .index = addr_reg,
277
- .index_ext = addr_type
278
- };
279
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
280
-
281
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
282
- s->code_ptr, label_ptr);
283
-#else /* !CONFIG_SOFTMMU */
284
- unsigned a_bits = get_alignment_bits(memop);
285
- if (a_bits) {
286
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
287
+ if (ldst) {
288
+ ldst->type = data_type;
289
+ ldst->datalo_reg = data_reg;
290
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
64
}
291
}
65
- desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
292
- if (USE_GUEST_BASE) {
66
+ desc.page_addr0 = phys_pc;
293
- h = (HostAddress){
67
h = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
294
- .base = TCG_REG_GUEST_BASE,
68
return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
295
- .index = addr_reg,
296
- .index_ext = addr_type
297
- };
298
- } else {
299
- h = (HostAddress){
300
- .base = addr_reg,
301
- .index = TCG_REG_XZR,
302
- .index_ext = TCG_TYPE_I64
303
- };
304
- }
305
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
306
-#endif /* CONFIG_SOFTMMU */
69
}
307
}
70
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
308
71
index XXXXXXX..XXXXXXX 100644
309
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
72
--- a/accel/tcg/cputlb.c
310
MemOpIdx oi, TCGType data_type)
73
+++ b/accel/tcg/cputlb.c
74
@@ -XXX,XX +XXX,XX @@ void tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
75
can be detected */
76
void tlb_protect_code(ram_addr_t ram_addr)
77
{
311
{
78
- cpu_physical_memory_test_and_clear_dirty(ram_addr, TARGET_PAGE_SIZE,
312
- MemOp memop = get_memop(oi);
79
+ cpu_physical_memory_test_and_clear_dirty(ram_addr & TARGET_PAGE_MASK,
313
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
80
+ TARGET_PAGE_SIZE,
314
+ TCGLabelQemuLdst *ldst;
81
DIRTY_MEMORY_CODE);
315
HostAddress h;
316
317
- /* Byte swapping is left to middle-end expansion. */
318
- tcg_debug_assert((memop & MO_BSWAP) == 0);
319
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
320
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
321
322
-#ifdef CONFIG_SOFTMMU
323
- tcg_insn_unit *label_ptr;
324
-
325
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 0);
326
-
327
- h = (HostAddress){
328
- .base = TCG_REG_X1,
329
- .index = addr_reg,
330
- .index_ext = addr_type
331
- };
332
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
333
-
334
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
335
- s->code_ptr, label_ptr);
336
-#else /* !CONFIG_SOFTMMU */
337
- unsigned a_bits = get_alignment_bits(memop);
338
- if (a_bits) {
339
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
340
+ if (ldst) {
341
+ ldst->type = data_type;
342
+ ldst->datalo_reg = data_reg;
343
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
344
}
345
- if (USE_GUEST_BASE) {
346
- h = (HostAddress){
347
- .base = TCG_REG_GUEST_BASE,
348
- .index = addr_reg,
349
- .index_ext = addr_type
350
- };
351
- } else {
352
- h = (HostAddress){
353
- .base = addr_reg,
354
- .index = TCG_REG_XZR,
355
- .index_ext = TCG_TYPE_I64
356
- };
357
- }
358
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
359
-#endif /* CONFIG_SOFTMMU */
82
}
360
}
83
361
84
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
362
static const tcg_insn_unit *tb_ret_addr;
85
index XXXXXXX..XXXXXXX 100644
86
--- a/accel/tcg/translate-all.c
87
+++ b/accel/tcg/translate-all.c
88
@@ -XXX,XX +XXX,XX @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
89
qemu_spin_unlock(&tb->jmp_lock);
90
91
/* remove the TB from the hash list */
92
- phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
93
+ phys_pc = tb->page_addr[0];
94
h = tb_hash_func(phys_pc, tb->pc, tb->flags, orig_cflags,
95
tb->trace_vcpu_dstate);
96
if (!qht_remove(&tb_ctx.htable, tb, h)) {
97
@@ -XXX,XX +XXX,XX @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
98
* we can only insert TBs that are fully initialized.
99
*/
100
page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
101
- tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
102
+ tb_page_add(p, tb, 0, phys_pc);
103
if (p2) {
104
tb_page_add(p2, tb, 1, phys_page2);
105
} else {
106
@@ -XXX,XX +XXX,XX @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
107
if (n == 0) {
108
/* NOTE: tb_end may be after the end of the page, but
109
it is not a problem */
110
- tb_start = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
111
+ tb_start = tb->page_addr[0];
112
tb_end = tb_start + tb->size;
113
} else {
114
tb_start = tb->page_addr[1];
115
- tb_end = tb_start + ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
116
+ tb_end = tb_start + ((tb->page_addr[0] + tb->size)
117
+ & ~TARGET_PAGE_MASK);
118
}
119
if (!(tb_end <= start || tb_start >= end)) {
120
#ifdef TARGET_HAS_PRECISE_SMC
121
--
363
--
122
2.34.1
364
2.34.1
123
365
124
366
1
This bitmap is created and discarded immediately.
We gain nothing by its existence.

Merge tcg_out_tlb_load, add_qemu_ldst_label, and some code that lived
in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that
returns HostAddress and TCGLabelQemuLdst structures.
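
As a rough illustration of where a converted backend ends up (a
simplified sketch, not the exact code in the diff below: the
slow-path branch emission and per-target register details are
omitted), the qemu_ld expansion takes this shape:

    static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                TCGReg addrlo, TCGReg addrhi,
                                MemOpIdx oi, TCGType data_type)
    {
        TCGLabelQemuLdst *ldst;
        HostAddress h;

        /* TLB lookup (softmmu) or alignment check (user-only). */
        ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);

        /* Fast path: load via the host address described by @h. */
        tcg_out_qemu_ld_direct(s, get_memop(oi), datalo, datahi, h);

        if (ldst) {
            /* Complete the descriptor consumed by the slow-path helper. */
            ldst->type = data_type;
            ldst->datalo_reg = datalo;
            ldst->datahi_reg = datahi;
            ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
        }
    }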
3
4
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Message-Id: <20220822232338.1727934-2-richard.henderson@linaro.org>
7
---
7
---
8
accel/tcg/translate-all.c | 78 ++-------------------------------------
8
tcg/arm/tcg-target.c.inc | 351 ++++++++++++++++++---------------------
9
1 file changed, 4 insertions(+), 74 deletions(-)
9
1 file changed, 159 insertions(+), 192 deletions(-)
10
10
11
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
11
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
13
--- a/accel/tcg/translate-all.c
13
--- a/tcg/arm/tcg-target.c.inc
14
+++ b/accel/tcg/translate-all.c
14
+++ b/tcg/arm/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@
15
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
16
#define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
17
#endif
18
19
-#define SMC_BITMAP_USE_THRESHOLD 10
20
-
21
typedef struct PageDesc {
22
/* list of TBs intersecting this ram page */
23
uintptr_t first_tb;
24
-#ifdef CONFIG_SOFTMMU
25
- /* in order to optimize self modifying code, we count the number
26
- of lookups we do to a given page to use a bitmap */
27
- unsigned long *code_bitmap;
28
- unsigned int code_write_count;
29
-#else
30
+#ifdef CONFIG_USER_ONLY
31
unsigned long flags;
32
void *target_data;
33
#endif
34
-#ifndef CONFIG_USER_ONLY
35
+#ifdef CONFIG_SOFTMMU
36
QemuSpin lock;
37
#endif
38
} PageDesc;
39
@@ -XXX,XX +XXX,XX @@ void tb_htable_init(void)
40
qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
41
}
42
43
-/* call with @p->lock held */
44
-static inline void invalidate_page_bitmap(PageDesc *p)
45
-{
46
- assert_page_locked(p);
47
-#ifdef CONFIG_SOFTMMU
48
- g_free(p->code_bitmap);
49
- p->code_bitmap = NULL;
50
- p->code_write_count = 0;
51
-#endif
52
-}
53
-
54
/* Set to NULL all the 'first_tb' fields in all PageDescs. */
55
static void page_flush_tb_1(int level, void **lp)
56
{
57
@@ -XXX,XX +XXX,XX @@ static void page_flush_tb_1(int level, void **lp)
58
for (i = 0; i < V_L2_SIZE; ++i) {
59
page_lock(&pd[i]);
60
pd[i].first_tb = (uintptr_t)NULL;
61
- invalidate_page_bitmap(pd + i);
62
page_unlock(&pd[i]);
63
}
64
} else {
65
@@ -XXX,XX +XXX,XX @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
66
if (rm_from_page_list) {
67
p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
68
tb_page_remove(p, tb);
69
- invalidate_page_bitmap(p);
70
if (tb->page_addr[1] != -1) {
71
p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
72
tb_page_remove(p, tb);
73
- invalidate_page_bitmap(p);
74
}
75
}
76
77
@@ -XXX,XX +XXX,XX @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
78
}
16
}
79
}
17
}
80
18
81
-#ifdef CONFIG_SOFTMMU
19
-#define TLB_SHIFT    (CPU_TLB_ENTRY_BITS + CPU_TLB_BITS)
82
-/* call with @p->lock held */
20
-
83
-static void build_page_bitmap(PageDesc *p)
21
-/* We expect to use an 9-bit sign-magnitude negative offset from ENV. */
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
23
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
24
-
25
-/* These offsets are built into the LDRD below. */
26
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
27
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
28
-
29
-/* Load and compare a TLB entry, leaving the flags set. Returns the register
30
- containing the addend of the tlb entry. Clobbers R0, R1, R2, TMP. */
31
-
32
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
33
- MemOp opc, int mem_index, bool is_load)
84
-{
34
-{
85
- int n, tb_start, tb_end;
35
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
86
- TranslationBlock *tb;
36
- : offsetof(CPUTLBEntry, addr_write));
87
-
37
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
88
- assert_page_locked(p);
38
- unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
89
- p->code_bitmap = bitmap_new(TARGET_PAGE_SIZE);
39
- unsigned a_mask = (1 << get_alignment_bits(opc)) - 1;
90
-
40
- TCGReg t_addr;
91
- PAGE_FOR_EACH_TB(p, tb, n) {
41
-
92
- /* NOTE: this is subtle as a TB may span two physical pages */
42
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
93
- if (n == 0) {
43
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
94
- /* NOTE: tb_end may be after the end of the page, but
44
-
95
- it is not a problem */
45
- /* Extract the tlb index from the address into R0. */
96
- tb_start = tb->pc & ~TARGET_PAGE_MASK;
46
- tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
97
- tb_end = tb_start + tb->size;
47
- SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
98
- if (tb_end > TARGET_PAGE_SIZE) {
48
-
99
- tb_end = TARGET_PAGE_SIZE;
49
- /*
100
- }
50
- * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
51
- * Load the tlb comparator into R2/R3 and the fast path addend into R1.
52
- */
53
- if (cmp_off == 0) {
54
- if (TARGET_LONG_BITS == 64) {
55
- tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
101
- } else {
56
- } else {
102
- tb_start = 0;
57
- tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
103
- tb_end = ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
104
- }
105
- bitmap_set(p->code_bitmap, tb_start, tb_end - tb_start);
106
- }
107
-}
108
-#endif
109
-
110
/* add the tb in the target page and protect it if necessary
111
*
112
* Called with mmap_lock held for user-mode emulation.
113
@@ -XXX,XX +XXX,XX @@ static inline void tb_page_add(PageDesc *p, TranslationBlock *tb,
114
page_already_protected = p->first_tb != (uintptr_t)NULL;
115
#endif
116
p->first_tb = (uintptr_t)tb | n;
117
- invalidate_page_bitmap(p);
118
119
#if defined(CONFIG_USER_ONLY)
120
/* translator_loop() must have made all TB pages non-writable */
121
@@ -XXX,XX +XXX,XX @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
122
/* remove TB from the page(s) if we couldn't insert it */
123
if (unlikely(existing_tb)) {
124
tb_page_remove(p, tb);
125
- invalidate_page_bitmap(p);
126
if (p2) {
127
tb_page_remove(p2, tb);
128
- invalidate_page_bitmap(p2);
129
}
130
tb = existing_tb;
131
}
132
@@ -XXX,XX +XXX,XX @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
133
#if !defined(CONFIG_USER_ONLY)
134
/* if no code remaining, no need to continue to use slow writes */
135
if (!p->first_tb) {
136
- invalidate_page_bitmap(p);
137
tlb_unprotect_code(start);
138
}
139
#endif
140
@@ -XXX,XX +XXX,XX @@ void tb_invalidate_phys_page_fast(struct page_collection *pages,
141
}
142
143
assert_page_locked(p);
144
- if (!p->code_bitmap &&
145
- ++p->code_write_count >= SMC_BITMAP_USE_THRESHOLD) {
146
- build_page_bitmap(p);
147
- }
148
- if (p->code_bitmap) {
149
- unsigned int nr;
150
- unsigned long b;
151
-
152
- nr = start & ~TARGET_PAGE_MASK;
153
- b = p->code_bitmap[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG - 1));
154
- if (b & ((1 << len) - 1)) {
155
- goto do_invalidate;
156
- }
58
- }
157
- } else {
59
- } else {
158
- do_invalidate:
60
- tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
159
- tb_invalidate_phys_page_range__locked(pages, p, start, start + len,
61
- TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
160
- retaddr);
62
- if (TARGET_LONG_BITS == 64) {
63
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
64
- } else {
65
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
66
- }
161
- }
67
- }
162
+ tb_invalidate_phys_page_range__locked(pages, p, start, start + len,
68
-
163
+ retaddr);
69
- /* Load the tlb addend. */
70
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
71
- offsetof(CPUTLBEntry, addend));
72
-
73
- /*
74
- * Check alignment, check comparators.
75
- * Do this in 2-4 insns. Use MOVW for v7, if possible,
76
- * to reduce the number of sequential conditional instructions.
77
- * Almost all guests have at least 4k pages, which means that we need
78
- * to clear at least 9 bits even for an 8-byte memory, which means it
79
- * isn't worth checking for an immediate operand for BIC.
80
- *
81
- * For unaligned accesses, test the page of the last unit of alignment.
82
- * This leaves the least significant alignment bits unchanged, and of
83
- * course must be zero.
84
- */
85
- t_addr = addrlo;
86
- if (a_mask < s_mask) {
87
- t_addr = TCG_REG_R0;
88
- tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
89
- addrlo, s_mask - a_mask);
90
- }
91
- if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
92
- tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
93
- tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
94
- t_addr, TCG_REG_TMP, 0);
95
- tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
96
- } else {
97
- if (a_mask) {
98
- tcg_debug_assert(a_mask <= 0xff);
99
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
100
- }
101
- tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
102
- SHIFT_IMM_LSR(TARGET_PAGE_BITS));
103
- tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
104
- 0, TCG_REG_R2, TCG_REG_TMP,
105
- SHIFT_IMM_LSL(TARGET_PAGE_BITS));
106
- }
107
-
108
- if (TARGET_LONG_BITS == 64) {
109
- tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
110
- }
111
-
112
- return TCG_REG_R1;
113
-}
114
-
115
-/* Record the context of a call to the out of line helper code for the slow
116
- path for a load or store, so that we can later generate the correct
117
- helper code. */
118
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
119
- MemOpIdx oi, TCGType type,
120
- TCGReg datalo, TCGReg datahi,
121
- TCGReg addrlo, TCGReg addrhi,
122
- tcg_insn_unit *raddr,
123
- tcg_insn_unit *label_ptr)
124
-{
125
- TCGLabelQemuLdst *label = new_ldst_label(s);
126
-
127
- label->is_ld = is_ld;
128
- label->oi = oi;
129
- label->type = type;
130
- label->datalo_reg = datalo;
131
- label->datahi_reg = datahi;
132
- label->addrlo_reg = addrlo;
133
- label->addrhi_reg = addrhi;
134
- label->raddr = tcg_splitwx_to_rx(raddr);
135
- label->label_ptr[0] = label_ptr;
136
-}
137
-
138
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
139
{
140
TCGReg argreg;
141
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
142
return true;
164
}
143
}
165
#else
144
#else
166
/* Called with mmap_lock held. If pc is not 0 then it indicates the
145
-
146
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
147
- TCGReg addrhi, unsigned a_bits)
148
-{
149
- unsigned a_mask = (1 << a_bits) - 1;
150
- TCGLabelQemuLdst *label = new_ldst_label(s);
151
-
152
- label->is_ld = is_ld;
153
- label->addrlo_reg = addrlo;
154
- label->addrhi_reg = addrhi;
155
-
156
- /* We are expecting a_bits to max out at 7, and can easily support 8. */
157
- tcg_debug_assert(a_mask <= 0xff);
158
- /* tst addr, #mask */
159
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
160
-
161
- /* blne slow_path */
162
- label->label_ptr[0] = s->code_ptr;
163
- tcg_out_bl_imm(s, COND_NE, 0);
164
-
165
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
166
-}
167
-
168
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
169
{
170
if (!reloc_pc24(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
171
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
172
}
173
#endif /* SOFTMMU */
174
175
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
176
+ TCGReg addrlo, TCGReg addrhi,
177
+ MemOpIdx oi, bool is_ld)
178
+{
179
+ TCGLabelQemuLdst *ldst = NULL;
180
+ MemOp opc = get_memop(oi);
181
+ MemOp a_bits = get_alignment_bits(opc);
182
+ unsigned a_mask = (1 << a_bits) - 1;
183
+
184
+#ifdef CONFIG_SOFTMMU
185
+ int mem_index = get_mmuidx(oi);
186
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
187
+ : offsetof(CPUTLBEntry, addr_write);
188
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
189
+ unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
190
+ TCGReg t_addr;
191
+
192
+ ldst = new_ldst_label(s);
193
+ ldst->is_ld = is_ld;
194
+ ldst->oi = oi;
195
+ ldst->addrlo_reg = addrlo;
196
+ ldst->addrhi_reg = addrhi;
197
+
198
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
199
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
200
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
201
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
202
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
203
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
204
+
205
+ /* Extract the tlb index from the address into R0. */
206
+ tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
207
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
208
+
209
+ /*
210
+ * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
211
+ * Load the tlb comparator into R2/R3 and the fast path addend into R1.
212
+ */
213
+ if (cmp_off == 0) {
214
+ if (TARGET_LONG_BITS == 64) {
215
+ tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
216
+ } else {
217
+ tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
218
+ }
219
+ } else {
220
+ tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
221
+ TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
222
+ if (TARGET_LONG_BITS == 64) {
223
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
224
+ } else {
225
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
226
+ }
227
+ }
228
+
229
+ /* Load the tlb addend. */
230
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
231
+ offsetof(CPUTLBEntry, addend));
232
+
233
+ /*
234
+ * Check alignment, check comparators.
235
+ * Do this in 2-4 insns. Use MOVW for v7, if possible,
236
+ * to reduce the number of sequential conditional instructions.
237
+ * Almost all guests have at least 4k pages, which means that we need
238
+ * to clear at least 9 bits even for an 8-byte memory, which means it
239
+ * isn't worth checking for an immediate operand for BIC.
240
+ *
241
+ * For unaligned accesses, test the page of the last unit of alignment.
242
+ * This leaves the least significant alignment bits unchanged, and of
243
+ * course must be zero.
244
+ */
245
+ t_addr = addrlo;
246
+ if (a_mask < s_mask) {
247
+ t_addr = TCG_REG_R0;
248
+ tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
249
+ addrlo, s_mask - a_mask);
250
+ }
251
+ if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
252
+ tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
253
+ tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
254
+ t_addr, TCG_REG_TMP, 0);
255
+ tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
256
+ } else {
257
+ if (a_mask) {
258
+ tcg_debug_assert(a_mask <= 0xff);
259
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
260
+ }
261
+ tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
262
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS));
263
+ tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
264
+ 0, TCG_REG_R2, TCG_REG_TMP,
265
+ SHIFT_IMM_LSL(TARGET_PAGE_BITS));
266
+ }
267
+
268
+ if (TARGET_LONG_BITS == 64) {
269
+ tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
270
+ }
271
+
272
+ *h = (HostAddress){
273
+ .cond = COND_AL,
274
+ .base = addrlo,
275
+ .index = TCG_REG_R1,
276
+ .index_scratch = true,
277
+ };
278
+#else
279
+ if (a_mask) {
280
+ ldst = new_ldst_label(s);
281
+ ldst->is_ld = is_ld;
282
+ ldst->oi = oi;
283
+ ldst->addrlo_reg = addrlo;
284
+ ldst->addrhi_reg = addrhi;
285
+
286
+ /* We are expecting a_bits to max out at 7 */
287
+ tcg_debug_assert(a_mask <= 0xff);
288
+ /* tst addr, #mask */
289
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
290
+ }
291
+
292
+ *h = (HostAddress){
293
+ .cond = COND_AL,
294
+ .base = addrlo,
295
+ .index = guest_base ? TCG_REG_GUEST_BASE : -1,
296
+ .index_scratch = false,
297
+ };
298
+#endif
299
+
300
+ return ldst;
301
+}
302
+
303
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
304
TCGReg datahi, HostAddress h)
305
{
306
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
307
MemOpIdx oi, TCGType data_type)
308
{
309
MemOp opc = get_memop(oi);
310
+ TCGLabelQemuLdst *ldst;
311
HostAddress h;
312
313
-#ifdef CONFIG_SOFTMMU
314
- h.cond = COND_AL;
315
- h.base = addrlo;
316
- h.index_scratch = true;
317
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
318
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
319
+ if (ldst) {
320
+ ldst->type = data_type;
321
+ ldst->datalo_reg = datalo;
322
+ ldst->datahi_reg = datahi;
323
324
- /*
325
- * This a conditional BL only to load a pointer within this opcode into
326
- * LR for the slow path. We will not be using the value for a tail call.
327
- */
328
- tcg_insn_unit *label_ptr = s->code_ptr;
329
- tcg_out_bl_imm(s, COND_NE, 0);
330
+ /*
331
+ * This a conditional BL only to load a pointer within this
332
+ * opcode into LR for the slow path. We will not be using
333
+ * the value for a tail call.
334
+ */
335
+ ldst->label_ptr[0] = s->code_ptr;
336
+ tcg_out_bl_imm(s, COND_NE, 0);
337
338
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
339
-
340
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
341
- addrlo, addrhi, s->code_ptr, label_ptr);
342
-#else
343
- unsigned a_bits = get_alignment_bits(opc);
344
- if (a_bits) {
345
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
346
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
347
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
348
+ } else {
349
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
350
}
351
-
352
- h.cond = COND_AL;
353
- h.base = addrlo;
354
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
355
- h.index_scratch = false;
356
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
357
-#endif
358
}
359
360
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
361
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
362
MemOpIdx oi, TCGType data_type)
363
{
364
MemOp opc = get_memop(oi);
365
+ TCGLabelQemuLdst *ldst;
366
HostAddress h;
367
368
-#ifdef CONFIG_SOFTMMU
369
- h.cond = COND_EQ;
370
- h.base = addrlo;
371
- h.index_scratch = true;
372
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
373
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
374
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
375
+ if (ldst) {
376
+ ldst->type = data_type;
377
+ ldst->datalo_reg = datalo;
378
+ ldst->datahi_reg = datahi;
379
380
- /* The conditional call must come last, as we're going to return here. */
381
- tcg_insn_unit *label_ptr = s->code_ptr;
382
- tcg_out_bl_imm(s, COND_NE, 0);
383
-
384
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
385
- addrlo, addrhi, s->code_ptr, label_ptr);
386
-#else
387
- unsigned a_bits = get_alignment_bits(opc);
388
-
389
- h.cond = COND_AL;
390
- if (a_bits) {
391
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
392
h.cond = COND_EQ;
393
- }
394
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
395
396
- h.base = addrlo;
397
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
398
- h.index_scratch = false;
399
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
400
-#endif
401
+ /* The conditional call is last, as we're going to return here. */
402
+ ldst->label_ptr[0] = s->code_ptr;
403
+ tcg_out_bl_imm(s, COND_NE, 0);
404
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
405
+ } else {
406
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
407
+ }
408
}
409
410
static void tcg_out_epilogue(TCGContext *s);
167
--
411
--
168
2.34.1
412
2.34.1
169
413
170
414
1
Allow the target to cache items from the guest page tables.

Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_out_zext_addr_if_32_bit, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns
HostAddress and TCGLabelQemuLdst structures.
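
To illustrate the cpu-defs.h hook from the first of these changes: a
target that wants to carry extra per-page data in the TLB defines
TARGET_PAGE_ENTRY_EXTRA (e.g. in its cpu-param.h) so that it expands
to additional CPUTLBEntryFull members. The member names below are
invented for the example:

    /* Hypothetical target: cache two bits of the guest PTE. */
    #define TARGET_PAGE_ENTRY_EXTRA \
        uint8_t pte_attrs;          \
        bool guarded;

The target fills these when it installs a TLB entry and can read them
back later from the CPUTLBEntryFull that cputlb keeps for the page.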
2
5
3
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
---
8
include/exec/cpu-defs.h | 9 +++++++++
9
tcg/loongarch64/tcg-target.c.inc | 255 +++++++++++++------------------
9
1 file changed, 9 insertions(+)
10
1 file changed, 105 insertions(+), 150 deletions(-)
10
11
11
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
12
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
13
--- a/include/exec/cpu-defs.h
14
--- a/tcg/loongarch64/tcg-target.c.inc
14
+++ b/include/exec/cpu-defs.h
15
+++ b/tcg/loongarch64/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBEntryFull {
16
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[4] = {
16
17
[MO_64] = helper_le_stq_mmu,
17
/* @lg_page_size contains the log2 of the page size. */
18
};
18
uint8_t lg_page_size;
19
19
+
20
-/* We expect to use a 12-bit negative offset from ENV. */
20
+ /*
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
+ * Allow target-specific additions to this structure.
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
22
+ * This may be used to cache items from the guest cpu
23
-
23
+ * page tables for later use by the implementation.
24
static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
24
+ */
25
{
25
+#ifdef TARGET_PAGE_ENTRY_EXTRA
26
tcg_out_opc_b(s, 0);
26
+ TARGET_PAGE_ENTRY_EXTRA
27
return reloc_br_sd10k16(s->code_ptr - 1, target);
28
}
29
30
-/*
31
- * Emits common code for TLB addend lookup, that eventually loads the
32
- * addend in TCG_REG_TMP2.
33
- */
34
-static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl, MemOpIdx oi,
35
- tcg_insn_unit **label_ptr, bool is_load)
36
-{
37
- MemOp opc = get_memop(oi);
38
- unsigned s_bits = opc & MO_SIZE;
39
- unsigned a_bits = get_alignment_bits(opc);
40
- tcg_target_long compare_mask;
41
- int mem_index = get_mmuidx(oi);
42
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
43
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
44
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
45
-
46
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
47
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
48
-
49
- tcg_out_opc_srli_d(s, TCG_REG_TMP2, addrl,
50
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
51
- tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
52
- tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
53
-
54
- /* Load the tlb comparator and the addend. */
55
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
56
- is_load ? offsetof(CPUTLBEntry, addr_read)
57
- : offsetof(CPUTLBEntry, addr_write));
58
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
59
- offsetof(CPUTLBEntry, addend));
60
-
61
- /* We don't support unaligned accesses. */
62
- if (a_bits < s_bits) {
63
- a_bits = s_bits;
64
- }
65
- /* Clear the non-page, non-alignment bits from the address. */
66
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
67
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
68
- tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addrl);
69
-
70
- /* Compare masked address with the TLB entry. */
71
- label_ptr[0] = s->code_ptr;
72
- tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
73
-
74
- /* TLB Hit - addend in TCG_REG_TMP2, ready for use. */
75
-}
76
-
77
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
78
- TCGType type,
79
- TCGReg datalo, TCGReg addrlo,
80
- void *raddr, tcg_insn_unit **label_ptr)
81
-{
82
- TCGLabelQemuLdst *label = new_ldst_label(s);
83
-
84
- label->is_ld = is_ld;
85
- label->oi = oi;
86
- label->type = type;
87
- label->datalo_reg = datalo;
88
- label->datahi_reg = 0; /* unused */
89
- label->addrlo_reg = addrlo;
90
- label->addrhi_reg = 0; /* unused */
91
- label->raddr = tcg_splitwx_to_rx(raddr);
92
- label->label_ptr[0] = label_ptr[0];
93
-}
94
-
95
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
96
{
97
MemOpIdx oi = l->oi;
98
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
99
return tcg_out_goto(s, l->raddr);
100
}
101
#else
102
-
103
-/*
104
- * Alignment helpers for user-mode emulation
105
- */
106
-
107
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
108
- unsigned a_bits)
109
-{
110
- TCGLabelQemuLdst *l = new_ldst_label(s);
111
-
112
- l->is_ld = is_ld;
113
- l->addrlo_reg = addr_reg;
114
-
115
- /*
116
- * Without micro-architecture details, we don't know which of bstrpick or
117
- * andi is faster, so use bstrpick as it's not constrained by imm field
118
- * width. (Not to say alignments >= 2^12 are going to happen any time
119
- * soon, though)
120
- */
121
- tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
122
-
123
- l->label_ptr[0] = s->code_ptr;
124
- tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
125
-
126
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
127
-}
128
-
129
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
130
{
131
/* resolve label address */
132
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
133
134
#endif /* CONFIG_SOFTMMU */
135
136
-/*
137
- * `ext32u` the address register into the temp register given,
138
- * if target is 32-bit, no-op otherwise.
139
- *
140
- * Returns the address register ready for use with TLB addend.
141
- */
142
-static TCGReg tcg_out_zext_addr_if_32_bit(TCGContext *s,
143
- TCGReg addr, TCGReg tmp)
144
-{
145
- if (TARGET_LONG_BITS == 32) {
146
- tcg_out_ext32u(s, tmp, addr);
147
- return tmp;
148
- }
149
- return addr;
150
-}
151
-
152
typedef struct {
153
TCGReg base;
154
TCGReg index;
155
} HostAddress;
156
157
+/*
158
+ * For softmmu, perform the TLB load and compare.
159
+ * For useronly, perform any required alignment tests.
160
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
161
+ * is required and fill in @h with the host address for the fast path.
162
+ */
163
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
164
+ TCGReg addr_reg, MemOpIdx oi,
165
+ bool is_ld)
166
+{
167
+ TCGLabelQemuLdst *ldst = NULL;
168
+ MemOp opc = get_memop(oi);
169
+ unsigned a_bits = get_alignment_bits(opc);
170
+
171
+#ifdef CONFIG_SOFTMMU
172
+ unsigned s_bits = opc & MO_SIZE;
173
+ int mem_index = get_mmuidx(oi);
174
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
175
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
176
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
177
+ tcg_target_long compare_mask;
178
+
179
+ ldst = new_ldst_label(s);
180
+ ldst->is_ld = is_ld;
181
+ ldst->oi = oi;
182
+ ldst->addrlo_reg = addr_reg;
183
+
184
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
185
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
186
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
187
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
188
+
189
+ tcg_out_opc_srli_d(s, TCG_REG_TMP2, addr_reg,
190
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
191
+ tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
192
+ tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
193
+
194
+ /* Load the tlb comparator and the addend. */
195
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
196
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
197
+ : offsetof(CPUTLBEntry, addr_write));
198
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
199
+ offsetof(CPUTLBEntry, addend));
200
+
201
+ /* We don't support unaligned accesses. */
202
+ if (a_bits < s_bits) {
203
+ a_bits = s_bits;
204
+ }
205
+ /* Clear the non-page, non-alignment bits from the address. */
206
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
207
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
208
+ tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
209
+
210
+ /* Compare masked address with the TLB entry. */
211
+ ldst->label_ptr[0] = s->code_ptr;
212
+ tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
213
+
214
+ h->index = TCG_REG_TMP2;
215
+#else
216
+ if (a_bits) {
217
+ ldst = new_ldst_label(s);
218
+
219
+ ldst->is_ld = is_ld;
220
+ ldst->oi = oi;
221
+ ldst->addrlo_reg = addr_reg;
222
+
223
+ /*
224
+ * Without micro-architecture details, we don't know which of
225
+ * bstrpick or andi is faster, so use bstrpick as it's not
226
+ * constrained by imm field width. Not to say alignments >= 2^12
227
+ * are going to happen any time soon.
228
+ */
229
+ tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
230
+
231
+ ldst->label_ptr[0] = s->code_ptr;
232
+ tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
233
+ }
234
+
235
+ h->index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
27
+#endif
236
+#endif
28
} CPUTLBEntryFull;
237
+
238
+ if (TARGET_LONG_BITS == 32) {
239
+ h->base = TCG_REG_TMP0;
240
+ tcg_out_ext32u(s, h->base, addr_reg);
241
+ } else {
242
+ h->base = addr_reg;
243
+ }
244
+
245
+ return ldst;
246
+}
247
+
248
static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
249
TCGReg rd, HostAddress h)
250
{
251
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
252
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
253
MemOpIdx oi, TCGType data_type)
254
{
255
- MemOp opc = get_memop(oi);
256
+ TCGLabelQemuLdst *ldst;
257
HostAddress h;
258
259
-#ifdef CONFIG_SOFTMMU
260
- tcg_insn_unit *label_ptr[1];
261
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
262
+ tcg_out_qemu_ld_indexed(s, get_memop(oi), data_type, data_reg, h);
263
264
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
265
- h.index = TCG_REG_TMP2;
266
-#else
267
- unsigned a_bits = get_alignment_bits(opc);
268
- if (a_bits) {
269
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
270
+ if (ldst) {
271
+ ldst->type = data_type;
272
+ ldst->datalo_reg = data_reg;
273
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
274
}
275
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
276
-#endif
277
-
278
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
279
- tcg_out_qemu_ld_indexed(s, opc, data_type, data_reg, h);
280
-
281
-#ifdef CONFIG_SOFTMMU
282
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
283
- s->code_ptr, label_ptr);
284
-#endif
285
}
286
287
static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
288
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
289
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
290
MemOpIdx oi, TCGType data_type)
291
{
292
- MemOp opc = get_memop(oi);
293
+ TCGLabelQemuLdst *ldst;
294
HostAddress h;
295
296
-#ifdef CONFIG_SOFTMMU
297
- tcg_insn_unit *label_ptr[1];
298
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
299
+ tcg_out_qemu_st_indexed(s, get_memop(oi), data_reg, h);
300
301
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
302
- h.index = TCG_REG_TMP2;
303
-#else
304
- unsigned a_bits = get_alignment_bits(opc);
305
- if (a_bits) {
306
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
307
+ if (ldst) {
308
+ ldst->type = data_type;
309
+ ldst->datalo_reg = data_reg;
310
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
311
}
312
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
313
-#endif
314
-
315
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
316
- tcg_out_qemu_st_indexed(s, opc, data_reg, h);
317
-
318
-#ifdef CONFIG_SOFTMMU
319
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
320
- s->code_ptr, label_ptr);
321
-#endif
322
}
29
323
30
/*
324
/*
31
--
325
--
32
2.34.1
326
2.34.1
33
327
34
328
1
Now that we have collected all of the page data into
CPUTLBEntryFull, provide an interface to record that
all in one go, instead of using 4 arguments. This interface
allows CPUTLBEntryFull to be extended without having to
change the number of arguments.

Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
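
For a feel of the new interface, the existing 4-argument entry point
becomes a thin wrapper; this mirrors the cputlb.c hunk further down:

    void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
                                 hwaddr paddr, MemTxAttrs attrs, int prot,
                                 int mmu_idx, target_ulong size)
    {
        CPUTLBEntryFull full = {
            .phys_addr = paddr,
            .attrs = attrs,
            .prot = prot,
            .lg_page_size = ctz64(size)
        };

        assert(is_power_of_2(size));
        tlb_set_page_full(cpu, mmu_idx, vaddr, &full);
    }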
6
4
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
---
7
---
12
include/exec/cpu-defs.h | 14 +++++++++++
8
tcg/mips/tcg-target.c.inc | 404 ++++++++++++++++----------------------
13
include/exec/exec-all.h | 22 ++++++++++++++++++
9
1 file changed, 172 insertions(+), 232 deletions(-)
14
accel/tcg/cputlb.c | 51 ++++++++++++++++++++++++++---------------
15
3 files changed, 69 insertions(+), 18 deletions(-)
16
10
17
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
11
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
18
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
19
--- a/include/exec/cpu-defs.h
13
--- a/tcg/mips/tcg-target.c.inc
20
+++ b/include/exec/cpu-defs.h
14
+++ b/tcg/mips/tcg-target.c.inc
21
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBEntryFull {
15
@@ -XXX,XX +XXX,XX @@ static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
22
* + the offset within the target MemoryRegion (otherwise)
16
return i;
23
*/
17
}
24
hwaddr xlat_section;
18
19
-/* We expect to use a 16-bit negative offset from ENV. */
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
22
-
23
-/*
24
- * Perform the tlb comparison operation.
25
- * The complete host address is placed in BASE.
26
- * Clobbers TMP0, TMP1, TMP2, TMP3.
27
- */
28
-static void tcg_out_tlb_load(TCGContext *s, TCGReg base, TCGReg addrl,
29
- TCGReg addrh, MemOpIdx oi,
30
- tcg_insn_unit *label_ptr[2], bool is_load)
31
-{
32
- MemOp opc = get_memop(oi);
33
- unsigned a_bits = get_alignment_bits(opc);
34
- unsigned s_bits = opc & MO_SIZE;
35
- unsigned a_mask = (1 << a_bits) - 1;
36
- unsigned s_mask = (1 << s_bits) - 1;
37
- int mem_index = get_mmuidx(oi);
38
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
39
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
40
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
41
- int add_off = offsetof(CPUTLBEntry, addend);
42
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
43
- : offsetof(CPUTLBEntry, addr_write));
44
- target_ulong tlb_mask;
45
-
46
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
47
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
48
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
49
-
50
- /* Extract the TLB index from the address into TMP3. */
51
- tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrl,
52
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
53
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
54
-
55
- /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
56
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
57
-
58
- /* Load the (low-half) tlb comparator. */
59
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
60
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
61
- } else {
62
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
63
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
64
- TCG_TMP0, TCG_TMP3, cmp_off);
65
- }
66
-
67
- /* Zero extend a 32-bit guest address for a 64-bit host. */
68
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
69
- tcg_out_ext32u(s, base, addrl);
70
- addrl = base;
71
- }
72
-
73
- /*
74
- * Mask the page bits, keeping the alignment bits to compare against.
75
- * For unaligned accesses, compare against the end of the access to
76
- * verify that it does not cross a page boundary.
77
- */
78
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
79
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
80
- if (a_mask >= s_mask) {
81
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrl);
82
- } else {
83
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrl, s_mask - a_mask);
84
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
85
- }
86
-
87
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
88
- /* Load the tlb addend for the fast path. */
89
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
90
- }
91
-
92
- label_ptr[0] = s->code_ptr;
93
- tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
94
-
95
- /* Load and test the high half tlb comparator. */
96
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
97
- /* delay slot */
98
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
99
-
100
- /* Load the tlb addend for the fast path. */
101
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
102
-
103
- label_ptr[1] = s->code_ptr;
104
- tcg_out_opc_br(s, OPC_BNE, addrh, TCG_TMP0);
105
- }
106
-
107
- /* delay slot */
108
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrl);
109
-}
110
-
111
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
112
- TCGType ext,
113
- TCGReg datalo, TCGReg datahi,
114
- TCGReg addrlo, TCGReg addrhi,
115
- void *raddr, tcg_insn_unit *label_ptr[2])
116
-{
117
- TCGLabelQemuLdst *label = new_ldst_label(s);
118
-
119
- label->is_ld = is_ld;
120
- label->oi = oi;
121
- label->type = ext;
122
- label->datalo_reg = datalo;
123
- label->datahi_reg = datahi;
124
- label->addrlo_reg = addrlo;
125
- label->addrhi_reg = addrhi;
126
- label->raddr = tcg_splitwx_to_rx(raddr);
127
- label->label_ptr[0] = label_ptr[0];
128
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
129
- label->label_ptr[1] = label_ptr[1];
130
- }
131
-}
132
-
133
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
134
{
135
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
136
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
137
}
138
139
#else
140
-
141
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
142
- TCGReg addrhi, unsigned a_bits)
143
-{
144
- unsigned a_mask = (1 << a_bits) - 1;
145
- TCGLabelQemuLdst *l = new_ldst_label(s);
146
-
147
- l->is_ld = is_ld;
148
- l->addrlo_reg = addrlo;
149
- l->addrhi_reg = addrhi;
150
-
151
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
152
- tcg_debug_assert(a_bits < 16);
153
- tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
154
-
155
- l->label_ptr[0] = s->code_ptr;
156
- if (use_mips32r6_instructions) {
157
- tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
158
- } else {
159
- tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
160
- tcg_out_nop(s);
161
- }
162
-
163
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
164
-}
165
-
166
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
167
{
168
void *target;
169
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
170
}
171
#endif /* SOFTMMU */
172
173
+typedef struct {
174
+ TCGReg base;
175
+ MemOp align;
176
+} HostAddress;
177
+
178
+/*
179
+ * For softmmu, perform the TLB load and compare.
180
+ * For useronly, perform any required alignment tests.
181
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
182
+ * is required and fill in @h with the host address for the fast path.
183
+ */
184
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
185
+ TCGReg addrlo, TCGReg addrhi,
186
+ MemOpIdx oi, bool is_ld)
187
+{
188
+ TCGLabelQemuLdst *ldst = NULL;
189
+ MemOp opc = get_memop(oi);
190
+ unsigned a_bits = get_alignment_bits(opc);
191
+ unsigned s_bits = opc & MO_SIZE;
192
+ unsigned a_mask = (1 << a_bits) - 1;
193
+ TCGReg base;
194
+
195
+#ifdef CONFIG_SOFTMMU
196
+ unsigned s_mask = (1 << s_bits) - 1;
197
+ int mem_index = get_mmuidx(oi);
198
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
199
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
200
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
201
+ int add_off = offsetof(CPUTLBEntry, addend);
202
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
203
+ : offsetof(CPUTLBEntry, addr_write);
204
+ target_ulong tlb_mask;
205
+
206
+ ldst = new_ldst_label(s);
207
+ ldst->is_ld = is_ld;
208
+ ldst->oi = oi;
209
+ ldst->addrlo_reg = addrlo;
210
+ ldst->addrhi_reg = addrhi;
211
+ base = TCG_REG_A0;
212
+
213
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
214
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
215
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
216
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
217
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
218
+
219
+ /* Extract the TLB index from the address into TMP3. */
220
+ tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrlo,
221
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
222
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
223
+
224
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
225
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
226
+
227
+ /* Load the (low-half) tlb comparator. */
228
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
229
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
230
+ } else {
231
+ tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
232
+ : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
233
+ TCG_TMP0, TCG_TMP3, cmp_off);
234
+ }
235
+
236
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
237
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
238
+ tcg_out_ext32u(s, base, addrlo);
239
+ addrlo = base;
240
+ }
25
+
241
+
26
+ /*
242
+ /*
27
+ * @phys_addr contains the physical address in the address space
243
+ * Mask the page bits, keeping the alignment bits to compare against.
28
+ * given by cpu_asidx_from_attrs(cpu, @attrs).
244
+ * For unaligned accesses, compare against the end of the access to
245
+ * verify that it does not cross a page boundary.
29
+ */
246
+ */
30
+ hwaddr phys_addr;
247
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
31
+
248
+ tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
32
+ /* @attrs contains the memory transaction attributes for the page. */
249
+ if (a_mask >= s_mask) {
33
MemTxAttrs attrs;
250
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
34
+
251
+ } else {
35
+ /* @prot contains the complete protections for the page. */
252
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
36
+ uint8_t prot;
253
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
37
+
254
+ }
38
+ /* @lg_page_size contains the log2 of the page size. */
255
+
39
+ uint8_t lg_page_size;
256
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
40
} CPUTLBEntryFull;
257
+ /* Load the tlb addend for the fast path. */
41
258
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
42
/*
259
+ }
43
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
260
+
44
index XXXXXXX..XXXXXXX 100644
261
+ ldst->label_ptr[0] = s->code_ptr;
45
--- a/include/exec/exec-all.h
262
+ tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
46
+++ b/include/exec/exec-all.h
263
+
47
@@ -XXX,XX +XXX,XX @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *cpu,
264
+ /* Load and test the high half tlb comparator. */
48
uint16_t idxmap,
265
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
49
unsigned bits);
266
+ /* delay slot */
50
267
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
51
+/**
268
+
52
+ * tlb_set_page_full:
269
+ /* Load the tlb addend for the fast path. */
53
+ * @cpu: CPU context
270
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
54
+ * @mmu_idx: mmu index of the tlb to modify
271
+
55
+ * @vaddr: virtual address of the entry to add
272
+ ldst->label_ptr[1] = s->code_ptr;
56
+ * @full: the details of the tlb entry
273
+ tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
57
+ *
274
+ }
58
+ * Add an entry to @cpu tlb index @mmu_idx. All of the fields of
275
+
59
+ * @full must be filled, except for xlat_section, and constitute
276
+ /* delay slot */
60
+ * the complete description of the translated page.
277
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
61
+ *
278
+#else
62
+ * This is generally called by the target tlb_fill function after
279
+ if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
63
+ * having performed a successful page table walk to find the physical
280
+ ldst = new_ldst_label(s);
64
+ * address and attributes for the translation.
281
+
65
+ *
282
+ ldst->is_ld = is_ld;
66
+ * At most one entry for a given virtual address is permitted. Only a
283
+ ldst->oi = oi;
67
+ * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
284
+ ldst->addrlo_reg = addrlo;
68
+ * used by tlb_flush_page.
285
+ ldst->addrhi_reg = addrhi;
69
+ */
286
+
70
+void tlb_set_page_full(CPUState *cpu, int mmu_idx, target_ulong vaddr,
287
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
71
+ CPUTLBEntryFull *full);
288
+ tcg_debug_assert(a_bits < 16);
72
+
289
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
73
/**
290
+
74
* tlb_set_page_with_attrs:
291
+ ldst->label_ptr[0] = s->code_ptr;
75
* @cpu: CPU to add this TLB entry for
292
+ if (use_mips32r6_instructions) {
76
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
293
+ tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
77
index XXXXXXX..XXXXXXX 100644
294
+ } else {
78
--- a/accel/tcg/cputlb.c
295
+ tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
79
+++ b/accel/tcg/cputlb.c
296
+ tcg_out_nop(s);
80
@@ -XXX,XX +XXX,XX @@ static void tlb_add_large_page(CPUArchState *env, int mmu_idx,
297
+ }
81
env_tlb(env)->d[mmu_idx].large_page_mask = lp_mask;
298
+ }
299
+
300
+ base = addrlo;
301
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
302
+ tcg_out_ext32u(s, TCG_REG_A0, base);
303
+ base = TCG_REG_A0;
304
+ }
305
+ if (guest_base) {
306
+ if (guest_base == (int16_t)guest_base) {
307
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
308
+ } else {
309
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
310
+ TCG_GUEST_BASE_REG);
311
+ }
312
+ base = TCG_REG_A0;
313
+ }
314
+#endif
315
+
316
+ h->base = base;
317
+ h->align = a_bits;
318
+ return ldst;
319
+}
320
+
321
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
322
TCGReg base, MemOp opc, TCGType type)
323
{
324
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
325
MemOpIdx oi, TCGType data_type)
326
{
327
MemOp opc = get_memop(oi);
328
- unsigned a_bits = get_alignment_bits(opc);
329
- unsigned s_bits = opc & MO_SIZE;
330
- TCGReg base;
331
+ TCGLabelQemuLdst *ldst;
332
+ HostAddress h;
333
334
- /*
335
- * R6 removes the left/right instructions but requires the
336
- * system to support misaligned memory accesses.
337
- */
338
-#if defined(CONFIG_SOFTMMU)
339
- tcg_insn_unit *label_ptr[2];
340
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
341
342
- base = TCG_REG_A0;
343
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 1);
344
- if (use_mips32r6_instructions || a_bits >= s_bits) {
345
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
346
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
347
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h.base, opc, data_type);
348
} else {
349
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
350
+ tcg_out_qemu_ld_unalign(s, datalo, datahi, h.base, opc, data_type);
351
}
352
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
353
- addrlo, addrhi, s->code_ptr, label_ptr);
354
-#else
355
- base = addrlo;
356
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
357
- tcg_out_ext32u(s, TCG_REG_A0, base);
358
- base = TCG_REG_A0;
359
+
360
+ if (ldst) {
361
+ ldst->type = data_type;
362
+ ldst->datalo_reg = datalo;
363
+ ldst->datahi_reg = datahi;
364
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
365
}
366
- if (guest_base) {
367
- if (guest_base == (int16_t)guest_base) {
368
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
369
- } else {
370
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
371
- TCG_GUEST_BASE_REG);
372
- }
373
- base = TCG_REG_A0;
374
- }
375
- if (use_mips32r6_instructions) {
376
- if (a_bits) {
377
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
378
- }
379
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
380
- } else {
381
- if (a_bits && a_bits != s_bits) {
382
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
383
- }
384
- if (a_bits >= s_bits) {
385
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
386
- } else {
387
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
388
- }
389
- }
390
-#endif
82
}
391
}
83
392
84
-/* Add a new TLB entry. At most one entry for a given virtual address
393
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
85
+/*
394
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
86
+ * Add a new TLB entry. At most one entry for a given virtual address
395
MemOpIdx oi, TCGType data_type)
87
* is permitted. Only a single TARGET_PAGE_SIZE region is mapped, the
88
* supplied size is only used by tlb_flush_page.
89
*
90
* Called from TCG-generated code, which is under an RCU read-side
91
* critical section.
92
*/
93
-void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
94
- hwaddr paddr, MemTxAttrs attrs, int prot,
95
- int mmu_idx, target_ulong size)
96
+void tlb_set_page_full(CPUState *cpu, int mmu_idx,
97
+ target_ulong vaddr, CPUTLBEntryFull *full)
98
{
396
{
99
CPUArchState *env = cpu->env_ptr;
397
MemOp opc = get_memop(oi);
100
CPUTLB *tlb = env_tlb(env);
398
- unsigned a_bits = get_alignment_bits(opc);
101
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
399
- unsigned s_bits = opc & MO_SIZE;
102
CPUTLBEntry *te, tn;
400
- TCGReg base;
103
hwaddr iotlb, xlat, sz, paddr_page;
401
+ TCGLabelQemuLdst *ldst;
104
target_ulong vaddr_page;
402
+ HostAddress h;
105
- int asidx = cpu_asidx_from_attrs(cpu, attrs);
403
106
- int wp_flags;
404
- /*
107
+ int asidx, wp_flags, prot;
405
- * R6 removes the left/right instructions but requires the
108
bool is_ram, is_romd;
406
- * system to support misaligned memory accesses.
109
407
- */
110
assert_cpu_is_self(cpu);
408
-#if defined(CONFIG_SOFTMMU)
111
409
- tcg_insn_unit *label_ptr[2];
112
- if (size <= TARGET_PAGE_SIZE) {
410
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
113
+ if (full->lg_page_size <= TARGET_PAGE_BITS) {
411
114
sz = TARGET_PAGE_SIZE;
412
- base = TCG_REG_A0;
413
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 0);
414
- if (use_mips32r6_instructions || a_bits >= s_bits) {
415
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
416
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
417
+ tcg_out_qemu_st_direct(s, datalo, datahi, h.base, opc);
115
} else {
418
} else {
116
- tlb_add_large_page(env, mmu_idx, vaddr, size);
419
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
117
- sz = size;
420
+ tcg_out_qemu_st_unalign(s, datalo, datahi, h.base, opc);
118
+ sz = (hwaddr)1 << full->lg_page_size;
119
+ tlb_add_large_page(env, mmu_idx, vaddr, sz);
120
}
421
}
121
vaddr_page = vaddr & TARGET_PAGE_MASK;
422
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
122
- paddr_page = paddr & TARGET_PAGE_MASK;
423
- addrlo, addrhi, s->code_ptr, label_ptr);
123
+ paddr_page = full->phys_addr & TARGET_PAGE_MASK;
424
-#else
124
425
- base = addrlo;
125
+ prot = full->prot;
426
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
126
+ asidx = cpu_asidx_from_attrs(cpu, full->attrs);
427
- tcg_out_ext32u(s, TCG_REG_A0, base);
127
section = address_space_translate_for_iotlb(cpu, asidx, paddr_page,
428
- base = TCG_REG_A0;
128
- &xlat, &sz, attrs, &prot);
429
+
129
+ &xlat, &sz, full->attrs, &prot);
430
+ if (ldst) {
130
assert(sz >= TARGET_PAGE_SIZE);
431
+ ldst->type = data_type;
131
432
+ ldst->datalo_reg = datalo;
132
tlb_debug("vaddr=" TARGET_FMT_lx " paddr=0x" TARGET_FMT_plx
433
+ ldst->datahi_reg = datahi;
133
" prot=%x idx=%d\n",
434
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
134
- vaddr, paddr, prot, mmu_idx);
135
+ vaddr, full->phys_addr, prot, mmu_idx);
136
137
address = vaddr_page;
138
- if (size < TARGET_PAGE_SIZE) {
139
+ if (full->lg_page_size < TARGET_PAGE_BITS) {
140
/* Repeat the MMU check and TLB fill on every access. */
141
address |= TLB_INVALID_MASK;
142
}
435
}
143
- if (attrs.byte_swap) {
436
- if (guest_base) {
144
+ if (full->attrs.byte_swap) {
437
- if (guest_base == (int16_t)guest_base) {
145
address |= TLB_BSWAP;
438
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
146
}
439
- } else {
147
440
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
148
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
441
- TCG_GUEST_BASE_REG);
149
* subtract here is that of the page base, and not the same as the
442
- }
150
* vaddr we add back in io_readx()/io_writex()/get_page_addr_code().
443
- base = TCG_REG_A0;
151
*/
444
- }
152
+ desc->fulltlb[index] = *full;
445
- if (use_mips32r6_instructions) {
153
desc->fulltlb[index].xlat_section = iotlb - vaddr_page;
446
- if (a_bits) {
154
- desc->fulltlb[index].attrs = attrs;
447
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
155
+ desc->fulltlb[index].phys_addr = paddr_page;
448
- }
156
+ desc->fulltlb[index].prot = prot;
449
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
157
450
- } else {
158
/* Now calculate the new entry */
451
- if (a_bits && a_bits != s_bits) {
159
tn.addend = addend - vaddr_page;
452
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
160
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
453
- }
161
qemu_spin_unlock(&tlb->c.lock);
454
- if (a_bits >= s_bits) {
455
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
456
- } else {
457
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
458
- }
459
- }
460
-#endif
162
}
461
}
163
462
164
-/* Add a new TLB entry, but without specifying the memory
463
static void tcg_out_mb(TCGContext *s, TCGArg a0)
165
- * transaction attributes to be used.
166
- */
167
+void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
168
+ hwaddr paddr, MemTxAttrs attrs, int prot,
169
+ int mmu_idx, target_ulong size)
170
+{
171
+ CPUTLBEntryFull full = {
172
+ .phys_addr = paddr,
173
+ .attrs = attrs,
174
+ .prot = prot,
175
+ .lg_page_size = ctz64(size)
176
+ };
177
+
178
+ assert(is_power_of_2(size));
179
+ tlb_set_page_full(cpu, mmu_idx, vaddr, &full);
180
+}
181
+
182
void tlb_set_page(CPUState *cpu, target_ulong vaddr,
183
hwaddr paddr, int prot,
184
int mmu_idx, target_ulong size)
185
--
2.34.1

1
The availability of tb->pc will shortly be conditional.
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
Introduce accessor functions to minimize ifdefs.
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
3
into one function that returns HostAddress and TCGLabelQemuLdst structures.
4
Pass around a known pc to places like tcg_gen_code,
5
where the caller must already have the value.
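The accessor idea reduces to funnelling every read of the pc field through a single inline helper, so a later TARGET_TB_PCREL conversion only has to touch that helper rather than every caller. A minimal standalone sketch of the pattern, using a simplified stand-in for TranslationBlock rather than the real structure:

#include <inttypes.h>
#include <stdio.h>

/* Simplified stand-in for TranslationBlock; only the pc field matters here. */
typedef struct TB {
    uint64_t pc;
} TB;

/* Funnel every read of pc through one helper: when the field becomes
 * conditional, only this function needs an ifdef, not every caller. */
static inline uint64_t tb_pc(const TB *tb)
{
    return tb->pc;
}

int main(void)
{
    TB tb = { .pc = 0x400080 };
    printf("pc=0x%" PRIx64 "\n", tb_pc(&tb));
    return 0;
}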
6
4
7
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
7
---
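The shape shared by the backend conversions in this series can be illustrated in isolation. The sketch below is not the real tcg code: HostAddress and the label structure are reduced to stand-ins, and the needs_slow_path flag stands in for the softmmu/alignment decision. It only shows the calling convention of one prepare function that fills in the fast-path address and returns NULL when no slow path is required:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for HostAddress and TCGLabelQemuLdst. */
typedef struct {
    int base;
    int index;
} HostAddress;

typedef struct {
    bool is_ld;
    int datalo_reg;
    void *raddr;
} LdstLabel;

static LdstLabel slow_path_slot;

/* Returns NULL when the access can stay entirely on the fast path,
 * otherwise returns a label to be completed by the caller. */
static LdstLabel *prepare_host_addr(HostAddress *h, int addr_reg,
                                    bool is_ld, bool needs_slow_path)
{
    h->base = 3;          /* e.g. the register holding the TLB addend */
    h->index = addr_reg;
    if (!needs_slow_path) {
        return NULL;
    }
    slow_path_slot.is_ld = is_ld;
    return &slow_path_slot;
}

static void emit_ld(int data_reg, int addr_reg, bool needs_slow_path)
{
    HostAddress h;
    LdstLabel *ldst = prepare_host_addr(&h, addr_reg, true, needs_slow_path);

    printf("fast path: load r%d <- [r%d + r%d]\n", data_reg, h.base, h.index);

    if (ldst) {              /* record what the slow path must reload */
        ldst->datalo_reg = data_reg;
        ldst->raddr = NULL;  /* would be the return address in real code */
    }
}

int main(void)
{
    emit_ld(10, 4, true);
    emit_ld(10, 4, false);
    return 0;
}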
10
accel/tcg/internal.h | 6 ++++
8
tcg/ppc/tcg-target.c.inc | 381 ++++++++++++++++++---------------------
11
include/exec/exec-all.h | 6 ++++
9
1 file changed, 172 insertions(+), 209 deletions(-)
12
include/tcg/tcg.h | 2 +-
13
accel/tcg/cpu-exec.c | 46 ++++++++++++++-----------
14
accel/tcg/translate-all.c | 37 +++++++++++---------
15
target/arm/cpu.c | 4 +--
16
target/avr/cpu.c | 2 +-
17
target/hexagon/cpu.c | 2 +-
18
target/hppa/cpu.c | 4 +--
19
target/i386/tcg/tcg-cpu.c | 2 +-
20
target/loongarch/cpu.c | 2 +-
21
target/microblaze/cpu.c | 2 +-
22
target/mips/tcg/exception.c | 2 +-
23
target/mips/tcg/sysemu/special_helper.c | 2 +-
24
target/openrisc/cpu.c | 2 +-
25
target/riscv/cpu.c | 4 +--
26
target/rx/cpu.c | 2 +-
27
target/sh4/cpu.c | 4 +--
28
target/sparc/cpu.c | 2 +-
29
target/tricore/cpu.c | 2 +-
30
tcg/tcg.c | 8 ++---
31
21 files changed, 82 insertions(+), 61 deletions(-)
32
10
33
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
11
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
34
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
35
--- a/accel/tcg/internal.h
13
--- a/tcg/ppc/tcg-target.c.inc
36
+++ b/accel/tcg/internal.h
14
+++ b/tcg/ppc/tcg-target.c.inc
37
@@ -XXX,XX +XXX,XX @@ G_NORETURN void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
38
void page_init(void);
16
[MO_BEUQ] = helper_be_stq_mmu,
39
void tb_htable_init(void);
17
};
40
18
41
+/* Return the current PC from CPU, which may be cached in TB. */
19
-/* We expect to use a 16-bit negative offset from ENV. */
42
+static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
22
-
23
-/* Perform the TLB load and compare. Places the result of the comparison
24
- in CR7, loads the addend of the TLB into R3, and returns the register
25
- containing the guest address (zero-extended into R4). Clobbers R0 and R2. */
26
-
27
-static TCGReg tcg_out_tlb_read(TCGContext *s, MemOp opc,
28
- TCGReg addrlo, TCGReg addrhi,
29
- int mem_index, bool is_read)
30
-{
31
- int cmp_off
32
- = (is_read
33
- ? offsetof(CPUTLBEntry, addr_read)
34
- : offsetof(CPUTLBEntry, addr_write));
35
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
36
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
37
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
38
- unsigned s_bits = opc & MO_SIZE;
39
- unsigned a_bits = get_alignment_bits(opc);
40
-
41
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
42
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
44
-
45
- /* Extract the page index, shifted into place for tlb index. */
46
- if (TCG_TARGET_REG_BITS == 32) {
47
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
48
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
49
- } else {
50
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
51
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
52
- }
53
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
54
-
55
- /* Load the TLB comparator. */
56
- if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
57
- uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
58
- ? LWZUX : LDUX);
59
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
60
- } else {
61
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
62
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
63
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
64
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
65
- } else {
66
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
67
- }
68
- }
69
-
70
- /* Load the TLB addend for use on the fast path. Do this asap
71
- to minimize any load use delay. */
72
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_REG_R3,
73
- offsetof(CPUTLBEntry, addend));
74
-
75
- /* Clear the non-page, non-alignment bits from the address */
76
- if (TCG_TARGET_REG_BITS == 32) {
77
- /* We don't support unaligned accesses on 32-bits.
78
- * Preserve the bottom bits and thus trigger a comparison
79
- * failure on unaligned accesses.
80
- */
81
- if (a_bits < s_bits) {
82
- a_bits = s_bits;
83
- }
84
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
85
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
86
- } else {
87
- TCGReg t = addrlo;
88
-
89
- /* If the access is unaligned, we need to make sure we fail if we
90
- * cross a page boundary. The trick is to add the access size-1
91
- * to the address before masking the low bits. That will make the
92
- * address overflow to the next page if we cross a page boundary,
93
- * which will then force a mismatch of the TLB compare.
94
- */
95
- if (a_bits < s_bits) {
96
- unsigned a_mask = (1 << a_bits) - 1;
97
- unsigned s_mask = (1 << s_bits) - 1;
98
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
99
- t = TCG_REG_R0;
100
- }
101
-
102
- /* Mask the address for the requested alignment. */
103
- if (TARGET_LONG_BITS == 32) {
104
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
105
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
106
- /* Zero-extend the address for use in the final address. */
107
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
108
- addrlo = TCG_REG_R4;
109
- } else if (a_bits == 0) {
110
- tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
111
- } else {
112
- tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
113
- 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
114
- tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
115
- }
116
- }
117
-
118
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
119
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
120
- 0, 7, TCG_TYPE_I32);
121
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
122
- tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
123
- } else {
124
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
125
- 0, 7, TCG_TYPE_TL);
126
- }
127
-
128
- return addrlo;
129
-}
130
-
131
-/* Record the context of a call to the out of line helper code for the slow
132
- path for a load or store, so that we can later generate the correct
133
- helper code. */
134
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
135
- TCGType type, MemOpIdx oi,
136
- TCGReg datalo_reg, TCGReg datahi_reg,
137
- TCGReg addrlo_reg, TCGReg addrhi_reg,
138
- tcg_insn_unit *raddr, tcg_insn_unit *lptr)
139
-{
140
- TCGLabelQemuLdst *label = new_ldst_label(s);
141
-
142
- label->is_ld = is_ld;
143
- label->type = type;
144
- label->oi = oi;
145
- label->datalo_reg = datalo_reg;
146
- label->datahi_reg = datahi_reg;
147
- label->addrlo_reg = addrlo_reg;
148
- label->addrhi_reg = addrhi_reg;
149
- label->raddr = tcg_splitwx_to_rx(raddr);
150
- label->label_ptr[0] = lptr;
151
-}
152
-
153
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
154
{
155
MemOpIdx oi = lb->oi;
156
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
157
return true;
158
}
159
#else
160
-
161
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
162
- TCGReg addrhi, unsigned a_bits)
163
-{
164
- unsigned a_mask = (1 << a_bits) - 1;
165
- TCGLabelQemuLdst *label = new_ldst_label(s);
166
-
167
- label->is_ld = is_ld;
168
- label->addrlo_reg = addrlo;
169
- label->addrhi_reg = addrhi;
170
-
171
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
172
- tcg_debug_assert(a_bits < 16);
173
- tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, a_mask));
174
-
175
- label->label_ptr[0] = s->code_ptr;
176
- tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
177
-
178
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
179
-}
180
-
181
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
182
{
183
if (!reloc_pc14(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
184
@@ -XXX,XX +XXX,XX @@ typedef struct {
185
TCGReg index;
186
} HostAddress;
187
188
+/*
189
+ * For softmmu, perform the TLB load and compare.
190
+ * For useronly, perform any required alignment tests.
191
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
192
+ * is required and fill in @h with the host address for the fast path.
193
+ */
194
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
195
+ TCGReg addrlo, TCGReg addrhi,
196
+ MemOpIdx oi, bool is_ld)
43
+{
197
+{
44
+ return tb_pc(tb);
198
+ TCGLabelQemuLdst *ldst = NULL;
199
+ MemOp opc = get_memop(oi);
200
+ unsigned a_bits = get_alignment_bits(opc);
201
+
202
+#ifdef CONFIG_SOFTMMU
203
+ int mem_index = get_mmuidx(oi);
204
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
205
+ : offsetof(CPUTLBEntry, addr_write);
206
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
207
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
208
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
209
+ unsigned s_bits = opc & MO_SIZE;
210
+
211
+ ldst = new_ldst_label(s);
212
+ ldst->is_ld = is_ld;
213
+ ldst->oi = oi;
214
+ ldst->addrlo_reg = addrlo;
215
+ ldst->addrhi_reg = addrhi;
216
+
217
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
218
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
219
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
220
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
221
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
222
+
223
+ /* Extract the page index, shifted into place for tlb index. */
224
+ if (TCG_TARGET_REG_BITS == 32) {
225
+ tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
226
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
227
+ } else {
228
+ tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
229
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
230
+ }
231
+ tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
232
+
233
+ /* Load the TLB comparator. */
234
+ if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
235
+ uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
236
+ ? LWZUX : LDUX);
237
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
238
+ } else {
239
+ tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
240
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
241
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
242
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
243
+ } else {
244
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
245
+ }
246
+ }
247
+
248
+ /*
249
+ * Load the TLB addend for use on the fast path.
250
+ * Do this asap to minimize any load use delay.
251
+ */
252
+ h->base = TCG_REG_R3;
253
+ tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
254
+ offsetof(CPUTLBEntry, addend));
255
+
256
+ /* Clear the non-page, non-alignment bits from the address */
257
+ if (TCG_TARGET_REG_BITS == 32) {
258
+ /*
259
+ * We don't support unaligned accesses on 32-bits.
260
+ * Preserve the bottom bits and thus trigger a comparison
261
+ * failure on unaligned accesses.
262
+ */
263
+ if (a_bits < s_bits) {
264
+ a_bits = s_bits;
265
+ }
266
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
267
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
268
+ } else {
269
+ TCGReg t = addrlo;
270
+
271
+ /*
272
+ * If the access is unaligned, we need to make sure we fail if we
273
+ * cross a page boundary. The trick is to add the access size-1
274
+ * to the address before masking the low bits. That will make the
275
+ * address overflow to the next page if we cross a page boundary,
276
+ * which will then force a mismatch of the TLB compare.
277
+ */
278
+ if (a_bits < s_bits) {
279
+ unsigned a_mask = (1 << a_bits) - 1;
280
+ unsigned s_mask = (1 << s_bits) - 1;
281
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
282
+ t = TCG_REG_R0;
283
+ }
284
+
285
+ /* Mask the address for the requested alignment. */
286
+ if (TARGET_LONG_BITS == 32) {
287
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
288
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
289
+ /* Zero-extend the address for use in the final address. */
290
+ tcg_out_ext32u(s, TCG_REG_R4, addrlo);
291
+ addrlo = TCG_REG_R4;
292
+ } else if (a_bits == 0) {
293
+ tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
294
+ } else {
295
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
296
+ 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
297
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
298
+ }
299
+ }
300
+ h->index = addrlo;
301
+
302
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
303
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
304
+ 0, 7, TCG_TYPE_I32);
305
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
306
+ tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
307
+ } else {
308
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
309
+ 0, 7, TCG_TYPE_TL);
310
+ }
311
+
312
+ /* Load a pointer into the current opcode w/conditional branch-link. */
313
+ ldst->label_ptr[0] = s->code_ptr;
314
+ tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
315
+#else
316
+ if (a_bits) {
317
+ ldst = new_ldst_label(s);
318
+ ldst->is_ld = is_ld;
319
+ ldst->oi = oi;
320
+ ldst->addrlo_reg = addrlo;
321
+ ldst->addrhi_reg = addrhi;
322
+
323
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
324
+ tcg_debug_assert(a_bits < 16);
325
+ tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, (1 << a_bits) - 1));
326
+
327
+ ldst->label_ptr[0] = s->code_ptr;
328
+ tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
329
+ }
330
+
331
+ h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
332
+ h->index = addrlo;
333
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
334
+ tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
335
+ h->index = TCG_REG_TMP1;
336
+ }
337
+#endif
338
+
339
+ return ldst;
45
+}
340
+}
46
+
341
+
47
#endif /* ACCEL_TCG_INTERNAL_H */
342
static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
48
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
343
TCGReg addrlo, TCGReg addrhi,
49
index XXXXXXX..XXXXXXX 100644
344
MemOpIdx oi, TCGType data_type)
50
--- a/include/exec/exec-all.h
51
+++ b/include/exec/exec-all.h
52
@@ -XXX,XX +XXX,XX @@ struct TranslationBlock {
53
uintptr_t jmp_dest[2];
54
};
55
56
+/* Hide the read to avoid ifdefs for TARGET_TB_PCREL. */
57
+static inline target_ulong tb_pc(const TranslationBlock *tb)
58
+{
59
+ return tb->pc;
60
+}
61
+
62
/* Hide the qatomic_read to make code a little easier on the eyes */
63
static inline uint32_t tb_cflags(const TranslationBlock *tb)
64
{
345
{
65
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
346
MemOp opc = get_memop(oi);
66
index XXXXXXX..XXXXXXX 100644
347
- MemOp s_bits = opc & MO_SIZE;
67
--- a/include/tcg/tcg.h
348
+ TCGLabelQemuLdst *ldst;
68
+++ b/include/tcg/tcg.h
349
HostAddress h;
69
@@ -XXX,XX +XXX,XX @@ void tcg_register_thread(void);
350
70
void tcg_prologue_init(TCGContext *s);
351
-#ifdef CONFIG_SOFTMMU
71
void tcg_func_start(TCGContext *s);
352
- tcg_insn_unit *label_ptr;
72
353
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
73
-int tcg_gen_code(TCGContext *s, TranslationBlock *tb);
354
74
+int tcg_gen_code(TCGContext *s, TranslationBlock *tb, target_ulong pc_start);
355
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
75
356
- h.base = TCG_REG_R3;
76
void tcg_set_frame(TCGContext *s, TCGReg reg, intptr_t start, intptr_t size);
357
-
77
358
- /* Load a pointer into the current opcode w/conditional branch-link. */
78
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
359
- label_ptr = s->code_ptr;
79
index XXXXXXX..XXXXXXX 100644
360
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
80
--- a/accel/tcg/cpu-exec.c
361
-#else /* !CONFIG_SOFTMMU */
81
+++ b/accel/tcg/cpu-exec.c
362
- unsigned a_bits = get_alignment_bits(opc);
82
@@ -XXX,XX +XXX,XX @@ static bool tb_lookup_cmp(const void *p, const void *d)
363
- if (a_bits) {
83
const TranslationBlock *tb = p;
364
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
84
const struct tb_desc *desc = d;
365
- }
85
366
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
86
- if (tb->pc == desc->pc &&
367
- h.index = addrlo;
87
+ if (tb_pc(tb) == desc->pc &&
368
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
88
tb->page_addr[0] == desc->page_addr0 &&
369
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
89
tb->cs_base == desc->cs_base &&
370
- h.index = TCG_REG_TMP1;
90
tb->flags == desc->flags &&
371
- }
91
@@ -XXX,XX +XXX,XX @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
372
-#endif
92
return tb;
373
-
93
}
374
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
94
375
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
95
-static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
376
if (opc & MO_BSWAP) {
96
- const TranslationBlock *tb)
377
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
97
+static void log_cpu_exec(target_ulong pc, CPUState *cpu,
378
tcg_out32(s, LWBRX | TAB(datalo, h.base, h.index));
98
+ const TranslationBlock *tb)
379
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
99
{
100
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC))
101
- && qemu_log_in_addr_range(pc)) {
102
-
103
+ if (qemu_log_in_addr_range(pc)) {
104
qemu_log_mask(CPU_LOG_EXEC,
105
"Trace %d: %p [" TARGET_FMT_lx
106
"/" TARGET_FMT_lx "/%08x/%08x] %s\n",
107
@@ -XXX,XX +XXX,XX @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
108
return tcg_code_gen_epilogue;
109
}
110
111
- log_cpu_exec(pc, cpu, tb);
112
+ if (qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC)) {
113
+ log_cpu_exec(pc, cpu, tb);
114
+ }
115
116
return tb->tc.ptr;
117
}
118
@@ -XXX,XX +XXX,XX @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int *tb_exit)
119
TranslationBlock *last_tb;
120
const void *tb_ptr = itb->tc.ptr;
121
122
- log_cpu_exec(itb->pc, cpu, itb);
123
+ if (qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC)) {
124
+ log_cpu_exec(log_pc(cpu, itb), cpu, itb);
125
+ }
126
127
qemu_thread_jit_execute();
128
ret = tcg_qemu_tb_exec(env, tb_ptr);
129
@@ -XXX,XX +XXX,XX @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int *tb_exit)
130
* of the start of the TB.
131
*/
132
CPUClass *cc = CPU_GET_CLASS(cpu);
133
- qemu_log_mask_and_addr(CPU_LOG_EXEC, last_tb->pc,
134
- "Stopped execution of TB chain before %p ["
135
- TARGET_FMT_lx "] %s\n",
136
- last_tb->tc.ptr, last_tb->pc,
137
- lookup_symbol(last_tb->pc));
138
+
139
if (cc->tcg_ops->synchronize_from_tb) {
140
cc->tcg_ops->synchronize_from_tb(cpu, last_tb);
141
} else {
142
assert(cc->set_pc);
143
- cc->set_pc(cpu, last_tb->pc);
144
+ cc->set_pc(cpu, tb_pc(last_tb));
145
+ }
146
+ if (qemu_loglevel_mask(CPU_LOG_EXEC)) {
147
+ target_ulong pc = log_pc(cpu, last_tb);
148
+ if (qemu_log_in_addr_range(pc)) {
149
+ qemu_log("Stopped execution of TB chain before %p ["
150
+ TARGET_FMT_lx "] %s\n",
151
+ last_tb->tc.ptr, pc, lookup_symbol(pc));
152
+ }
153
}
380
}
154
}
381
}
155
382
156
@@ -XXX,XX +XXX,XX @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
383
-#ifdef CONFIG_SOFTMMU
157
384
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
158
qemu_spin_unlock(&tb_next->jmp_lock);
385
- addrlo, addrhi, s->code_ptr, label_ptr);
159
386
-#endif
160
- qemu_log_mask_and_addr(CPU_LOG_EXEC, tb->pc,
387
+ if (ldst) {
161
- "Linking TBs %p [" TARGET_FMT_lx
388
+ ldst->type = data_type;
162
- "] index %d -> %p [" TARGET_FMT_lx "]\n",
389
+ ldst->datalo_reg = datalo;
163
- tb->tc.ptr, tb->pc, n,
390
+ ldst->datahi_reg = datahi;
164
- tb_next->tc.ptr, tb_next->pc);
391
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
165
+ qemu_log_mask(CPU_LOG_EXEC, "Linking TBs %p index %d -> %p\n",
392
+ }
166
+ tb->tc.ptr, n, tb_next->tc.ptr);
167
return;
168
169
out_unlock_next:
170
@@ -XXX,XX +XXX,XX @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
171
}
393
}
172
394
173
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
395
static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
174
+ target_ulong pc,
396
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
175
TranslationBlock **last_tb, int *tb_exit)
397
MemOpIdx oi, TCGType data_type)
176
{
398
{
177
int32_t insns_left;
399
MemOp opc = get_memop(oi);
178
400
- MemOp s_bits = opc & MO_SIZE;
179
- trace_exec_tb(tb, tb->pc);
401
+ TCGLabelQemuLdst *ldst;
180
+ trace_exec_tb(tb, pc);
402
HostAddress h;
181
tb = cpu_tb_exec(cpu, tb, tb_exit);
403
182
if (*tb_exit != TB_EXIT_REQUESTED) {
404
-#ifdef CONFIG_SOFTMMU
183
*last_tb = tb;
405
- tcg_insn_unit *label_ptr;
184
@@ -XXX,XX +XXX,XX @@ int cpu_exec(CPUState *cpu)
406
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
185
tb_add_jump(last_tb, tb_exit, tb);
407
186
}
408
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
187
409
- h.base = TCG_REG_R3;
188
- cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
410
-
189
+ cpu_loop_exec_tb(cpu, tb, pc, &last_tb, &tb_exit);
411
- /* Load a pointer into the current opcode w/conditional branch-link. */
190
412
- label_ptr = s->code_ptr;
191
/* Try to align the host and virtual clocks
413
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
192
if the guest is in advance */
414
-#else /* !CONFIG_SOFTMMU */
193
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
415
- unsigned a_bits = get_alignment_bits(opc);
194
index XXXXXXX..XXXXXXX 100644
416
- if (a_bits) {
195
--- a/accel/tcg/translate-all.c
417
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
196
+++ b/accel/tcg/translate-all.c
418
- }
197
@@ -XXX,XX +XXX,XX @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
419
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
198
420
- h.index = addrlo;
199
for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
421
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
200
if (i == 0) {
422
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
201
- prev = (j == 0 ? tb->pc : 0);
423
- h.index = TCG_REG_TMP1;
202
+ prev = (j == 0 ? tb_pc(tb) : 0);
424
- }
203
} else {
425
-#endif
204
prev = tcg_ctx->gen_insn_data[i - 1][j];
426
-
205
}
427
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
206
@@ -XXX,XX +XXX,XX @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
428
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
207
static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
429
if (opc & MO_BSWAP) {
208
uintptr_t searched_pc, bool reset_icount)
430
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
209
{
431
tcg_out32(s, STWBRX | SAB(datalo, h.base, h.index));
210
- target_ulong data[TARGET_INSN_START_WORDS] = { tb->pc };
432
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
211
+ target_ulong data[TARGET_INSN_START_WORDS] = { tb_pc(tb) };
433
}
212
uintptr_t host_pc = (uintptr_t)tb->tc.ptr;
213
CPUArchState *env = cpu->env_ptr;
214
const uint8_t *p = tb->tc.ptr + tb->tc.size;
215
@@ -XXX,XX +XXX,XX @@ static bool tb_cmp(const void *ap, const void *bp)
216
const TranslationBlock *a = ap;
217
const TranslationBlock *b = bp;
218
219
- return a->pc == b->pc &&
220
+ return tb_pc(a) == tb_pc(b) &&
221
a->cs_base == b->cs_base &&
222
a->flags == b->flags &&
223
(tb_cflags(a) & ~CF_INVALID) == (tb_cflags(b) & ~CF_INVALID) &&
224
@@ -XXX,XX +XXX,XX @@ static void do_tb_invalidate_check(void *p, uint32_t hash, void *userp)
225
TranslationBlock *tb = p;
226
target_ulong addr = *(target_ulong *)userp;
227
228
- if (!(addr + TARGET_PAGE_SIZE <= tb->pc || addr >= tb->pc + tb->size)) {
229
+ if (!(addr + TARGET_PAGE_SIZE <= tb_pc(tb) ||
230
+ addr >= tb_pc(tb) + tb->size)) {
231
printf("ERROR invalidate: address=" TARGET_FMT_lx
232
- " PC=%08lx size=%04x\n", addr, (long)tb->pc, tb->size);
233
+ " PC=%08lx size=%04x\n", addr, (long)tb_pc(tb), tb->size);
234
}
434
}
435
436
-#ifdef CONFIG_SOFTMMU
437
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
438
- addrlo, addrhi, s->code_ptr, label_ptr);
439
-#endif
440
+ if (ldst) {
441
+ ldst->type = data_type;
442
+ ldst->datalo_reg = datalo;
443
+ ldst->datahi_reg = datahi;
444
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
445
+ }
235
}
446
}
236
447
237
@@ -XXX,XX +XXX,XX @@ static void do_tb_page_check(void *p, uint32_t hash, void *userp)
448
static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
238
TranslationBlock *tb = p;
239
int flags1, flags2;
240
241
- flags1 = page_get_flags(tb->pc);
242
- flags2 = page_get_flags(tb->pc + tb->size - 1);
243
+ flags1 = page_get_flags(tb_pc(tb));
244
+ flags2 = page_get_flags(tb_pc(tb) + tb->size - 1);
245
if ((flags1 & PAGE_WRITE) || (flags2 & PAGE_WRITE)) {
246
printf("ERROR page flags: PC=%08lx size=%04x f1=%x f2=%x\n",
247
- (long)tb->pc, tb->size, flags1, flags2);
248
+ (long)tb_pc(tb), tb->size, flags1, flags2);
249
}
250
}
251
252
@@ -XXX,XX +XXX,XX @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
253
254
/* remove the TB from the hash list */
255
phys_pc = tb->page_addr[0];
256
- h = tb_hash_func(phys_pc, tb->pc, tb->flags, orig_cflags,
257
+ h = tb_hash_func(phys_pc, tb_pc(tb), tb->flags, orig_cflags,
258
tb->trace_vcpu_dstate);
259
if (!qht_remove(&tb_ctx.htable, tb, h)) {
260
return;
261
@@ -XXX,XX +XXX,XX @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
262
}
263
264
/* add in the hash table */
265
- h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->cflags,
266
+ h = tb_hash_func(phys_pc, tb_pc(tb), tb->flags, tb->cflags,
267
tb->trace_vcpu_dstate);
268
qht_insert(&tb_ctx.htable, tb, h, &existing_tb);
269
270
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tb_gen_code(CPUState *cpu,
271
tcg_ctx->cpu = NULL;
272
max_insns = tb->icount;
273
274
- trace_translate_block(tb, tb->pc, tb->tc.ptr);
275
+ trace_translate_block(tb, pc, tb->tc.ptr);
276
277
/* generate machine code */
278
tb->jmp_reset_offset[0] = TB_JMP_RESET_OFFSET_INVALID;
279
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tb_gen_code(CPUState *cpu,
280
ti = profile_getclock();
281
#endif
282
283
- gen_code_size = tcg_gen_code(tcg_ctx, tb);
284
+ gen_code_size = tcg_gen_code(tcg_ctx, tb, pc);
285
if (unlikely(gen_code_size < 0)) {
286
error_return:
287
switch (gen_code_size) {
288
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tb_gen_code(CPUState *cpu,
289
290
#ifdef DEBUG_DISAS
291
if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM) &&
292
- qemu_log_in_addr_range(tb->pc)) {
293
+ qemu_log_in_addr_range(pc)) {
294
FILE *logfile = qemu_log_trylock();
295
if (logfile) {
296
int code_size, data_size;
297
@@ -XXX,XX +XXX,XX @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
298
*/
299
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_LAST_IO | n;
300
301
- qemu_log_mask_and_addr(CPU_LOG_EXEC, tb->pc,
302
- "cpu_io_recompile: rewound execution of TB to "
303
- TARGET_FMT_lx "\n", tb->pc);
304
+ if (qemu_loglevel_mask(CPU_LOG_EXEC)) {
305
+ target_ulong pc = log_pc(cpu, tb);
306
+ if (qemu_log_in_addr_range(pc)) {
307
+ qemu_log("cpu_io_recompile: rewound execution of TB to "
308
+ TARGET_FMT_lx "\n", pc);
309
+ }
310
+ }
311
312
cpu_loop_exit_noexc(cpu);
313
}
314
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
315
index XXXXXXX..XXXXXXX 100644
316
--- a/target/arm/cpu.c
317
+++ b/target/arm/cpu.c
318
@@ -XXX,XX +XXX,XX @@ void arm_cpu_synchronize_from_tb(CPUState *cs,
319
* never possible for an AArch64 TB to chain to an AArch32 TB.
320
*/
321
if (is_a64(env)) {
322
- env->pc = tb->pc;
323
+ env->pc = tb_pc(tb);
324
} else {
325
- env->regs[15] = tb->pc;
326
+ env->regs[15] = tb_pc(tb);
327
}
328
}
329
#endif /* CONFIG_TCG */
330
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
331
index XXXXXXX..XXXXXXX 100644
332
--- a/target/avr/cpu.c
333
+++ b/target/avr/cpu.c
334
@@ -XXX,XX +XXX,XX @@ static void avr_cpu_synchronize_from_tb(CPUState *cs,
335
AVRCPU *cpu = AVR_CPU(cs);
336
CPUAVRState *env = &cpu->env;
337
338
- env->pc_w = tb->pc / 2; /* internally PC points to words */
339
+ env->pc_w = tb_pc(tb) / 2; /* internally PC points to words */
340
}
341
342
static void avr_cpu_reset(DeviceState *ds)
343
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
344
index XXXXXXX..XXXXXXX 100644
345
--- a/target/hexagon/cpu.c
346
+++ b/target/hexagon/cpu.c
347
@@ -XXX,XX +XXX,XX @@ static void hexagon_cpu_synchronize_from_tb(CPUState *cs,
348
{
349
HexagonCPU *cpu = HEXAGON_CPU(cs);
350
CPUHexagonState *env = &cpu->env;
351
- env->gpr[HEX_REG_PC] = tb->pc;
352
+ env->gpr[HEX_REG_PC] = tb_pc(tb);
353
}
354
355
static bool hexagon_cpu_has_work(CPUState *cs)
356
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
357
index XXXXXXX..XXXXXXX 100644
358
--- a/target/hppa/cpu.c
359
+++ b/target/hppa/cpu.c
360
@@ -XXX,XX +XXX,XX @@ static void hppa_cpu_synchronize_from_tb(CPUState *cs,
361
HPPACPU *cpu = HPPA_CPU(cs);
362
363
#ifdef CONFIG_USER_ONLY
364
- cpu->env.iaoq_f = tb->pc;
365
+ cpu->env.iaoq_f = tb_pc(tb);
366
cpu->env.iaoq_b = tb->cs_base;
367
#else
368
/* Recover the IAOQ values from the GVA + PRIV. */
369
@@ -XXX,XX +XXX,XX @@ static void hppa_cpu_synchronize_from_tb(CPUState *cs,
370
int32_t diff = cs_base;
371
372
cpu->env.iasq_f = iasq_f;
373
- cpu->env.iaoq_f = (tb->pc & ~iasq_f) + priv;
374
+ cpu->env.iaoq_f = (tb_pc(tb) & ~iasq_f) + priv;
375
if (diff) {
376
cpu->env.iaoq_b = cpu->env.iaoq_f + diff;
377
}
378
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
379
index XXXXXXX..XXXXXXX 100644
380
--- a/target/i386/tcg/tcg-cpu.c
381
+++ b/target/i386/tcg/tcg-cpu.c
382
@@ -XXX,XX +XXX,XX @@ static void x86_cpu_synchronize_from_tb(CPUState *cs,
383
{
384
X86CPU *cpu = X86_CPU(cs);
385
386
- cpu->env.eip = tb->pc - tb->cs_base;
387
+ cpu->env.eip = tb_pc(tb) - tb->cs_base;
388
}
389
390
#ifndef CONFIG_USER_ONLY
391
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
392
index XXXXXXX..XXXXXXX 100644
393
--- a/target/loongarch/cpu.c
394
+++ b/target/loongarch/cpu.c
395
@@ -XXX,XX +XXX,XX @@ static void loongarch_cpu_synchronize_from_tb(CPUState *cs,
396
LoongArchCPU *cpu = LOONGARCH_CPU(cs);
397
CPULoongArchState *env = &cpu->env;
398
399
- env->pc = tb->pc;
400
+ env->pc = tb_pc(tb);
401
}
402
#endif /* CONFIG_TCG */
403
404
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
405
index XXXXXXX..XXXXXXX 100644
406
--- a/target/microblaze/cpu.c
407
+++ b/target/microblaze/cpu.c
408
@@ -XXX,XX +XXX,XX @@ static void mb_cpu_synchronize_from_tb(CPUState *cs,
409
{
410
MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
411
412
- cpu->env.pc = tb->pc;
413
+ cpu->env.pc = tb_pc(tb);
414
cpu->env.iflags = tb->flags & IFLAGS_TB_MASK;
415
}
416
417
diff --git a/target/mips/tcg/exception.c b/target/mips/tcg/exception.c
418
index XXXXXXX..XXXXXXX 100644
419
--- a/target/mips/tcg/exception.c
420
+++ b/target/mips/tcg/exception.c
421
@@ -XXX,XX +XXX,XX @@ void mips_cpu_synchronize_from_tb(CPUState *cs, const TranslationBlock *tb)
422
MIPSCPU *cpu = MIPS_CPU(cs);
423
CPUMIPSState *env = &cpu->env;
424
425
- env->active_tc.PC = tb->pc;
426
+ env->active_tc.PC = tb_pc(tb);
427
env->hflags &= ~MIPS_HFLAG_BMASK;
428
env->hflags |= tb->flags & MIPS_HFLAG_BMASK;
429
}
430
diff --git a/target/mips/tcg/sysemu/special_helper.c b/target/mips/tcg/sysemu/special_helper.c
431
index XXXXXXX..XXXXXXX 100644
432
--- a/target/mips/tcg/sysemu/special_helper.c
433
+++ b/target/mips/tcg/sysemu/special_helper.c
434
@@ -XXX,XX +XXX,XX @@ bool mips_io_recompile_replay_branch(CPUState *cs, const TranslationBlock *tb)
435
CPUMIPSState *env = &cpu->env;
436
437
if ((env->hflags & MIPS_HFLAG_BMASK) != 0
438
- && env->active_tc.PC != tb->pc) {
439
+ && env->active_tc.PC != tb_pc(tb)) {
440
env->active_tc.PC -= (env->hflags & MIPS_HFLAG_B16 ? 2 : 4);
441
env->hflags &= ~MIPS_HFLAG_BMASK;
442
return true;
443
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
444
index XXXXXXX..XXXXXXX 100644
445
--- a/target/openrisc/cpu.c
446
+++ b/target/openrisc/cpu.c
447
@@ -XXX,XX +XXX,XX @@ static void openrisc_cpu_synchronize_from_tb(CPUState *cs,
448
{
449
OpenRISCCPU *cpu = OPENRISC_CPU(cs);
450
451
- cpu->env.pc = tb->pc;
452
+ cpu->env.pc = tb_pc(tb);
453
}
454
455
456
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
457
index XXXXXXX..XXXXXXX 100644
458
--- a/target/riscv/cpu.c
459
+++ b/target/riscv/cpu.c
460
@@ -XXX,XX +XXX,XX @@ static void riscv_cpu_synchronize_from_tb(CPUState *cs,
461
RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL);
462
463
if (xl == MXL_RV32) {
464
- env->pc = (int32_t)tb->pc;
465
+ env->pc = (int32_t)tb_pc(tb);
466
} else {
467
- env->pc = tb->pc;
468
+ env->pc = tb_pc(tb);
469
}
470
}
471
472
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
473
index XXXXXXX..XXXXXXX 100644
474
--- a/target/rx/cpu.c
475
+++ b/target/rx/cpu.c
476
@@ -XXX,XX +XXX,XX @@ static void rx_cpu_synchronize_from_tb(CPUState *cs,
477
{
478
RXCPU *cpu = RX_CPU(cs);
479
480
- cpu->env.pc = tb->pc;
481
+ cpu->env.pc = tb_pc(tb);
482
}
483
484
static bool rx_cpu_has_work(CPUState *cs)
485
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
486
index XXXXXXX..XXXXXXX 100644
487
--- a/target/sh4/cpu.c
488
+++ b/target/sh4/cpu.c
489
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_synchronize_from_tb(CPUState *cs,
490
{
491
SuperHCPU *cpu = SUPERH_CPU(cs);
492
493
- cpu->env.pc = tb->pc;
494
+ cpu->env.pc = tb_pc(tb);
495
cpu->env.flags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
496
}
497
498
@@ -XXX,XX +XXX,XX @@ static bool superh_io_recompile_replay_branch(CPUState *cs,
499
CPUSH4State *env = &cpu->env;
500
501
if ((env->flags & ((DELAY_SLOT | DELAY_SLOT_CONDITIONAL))) != 0
502
- && env->pc != tb->pc) {
503
+ && env->pc != tb_pc(tb)) {
504
env->pc -= 2;
505
env->flags &= ~(DELAY_SLOT | DELAY_SLOT_CONDITIONAL);
506
return true;
507
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
508
index XXXXXXX..XXXXXXX 100644
509
--- a/target/sparc/cpu.c
510
+++ b/target/sparc/cpu.c
511
@@ -XXX,XX +XXX,XX @@ static void sparc_cpu_synchronize_from_tb(CPUState *cs,
512
{
513
SPARCCPU *cpu = SPARC_CPU(cs);
514
515
- cpu->env.pc = tb->pc;
516
+ cpu->env.pc = tb_pc(tb);
517
cpu->env.npc = tb->cs_base;
518
}
519
520
diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
521
index XXXXXXX..XXXXXXX 100644
522
--- a/target/tricore/cpu.c
523
+++ b/target/tricore/cpu.c
524
@@ -XXX,XX +XXX,XX @@ static void tricore_cpu_synchronize_from_tb(CPUState *cs,
525
TriCoreCPU *cpu = TRICORE_CPU(cs);
526
CPUTriCoreState *env = &cpu->env;
527
528
- env->PC = tb->pc;
529
+ env->PC = tb_pc(tb);
530
}
531
532
static void tricore_cpu_reset(DeviceState *dev)
533
diff --git a/tcg/tcg.c b/tcg/tcg.c
534
index XXXXXXX..XXXXXXX 100644
535
--- a/tcg/tcg.c
536
+++ b/tcg/tcg.c
537
@@ -XXX,XX +XXX,XX @@ int64_t tcg_cpu_exec_time(void)
538
#endif
539
540
541
-int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
542
+int tcg_gen_code(TCGContext *s, TranslationBlock *tb, target_ulong pc_start)
543
{
544
#ifdef CONFIG_PROFILER
545
TCGProfile *prof = &s->prof;
546
@@ -XXX,XX +XXX,XX @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
547
548
#ifdef DEBUG_DISAS
549
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)
550
- && qemu_log_in_addr_range(tb->pc))) {
551
+ && qemu_log_in_addr_range(pc_start))) {
552
FILE *logfile = qemu_log_trylock();
553
if (logfile) {
554
fprintf(logfile, "OP:\n");
555
@@ -XXX,XX +XXX,XX @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
556
if (s->nb_indirects > 0) {
557
#ifdef DEBUG_DISAS
558
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP_IND)
559
- && qemu_log_in_addr_range(tb->pc))) {
560
+ && qemu_log_in_addr_range(pc_start))) {
561
FILE *logfile = qemu_log_trylock();
562
if (logfile) {
563
fprintf(logfile, "OP before indirect lowering:\n");
564
@@ -XXX,XX +XXX,XX @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
565
566
#ifdef DEBUG_DISAS
567
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP_OPT)
568
- && qemu_log_in_addr_range(tb->pc))) {
569
+ && qemu_log_in_addr_range(pc_start))) {
570
FILE *logfile = qemu_log_trylock();
571
if (logfile) {
572
fprintf(logfile, "OP after optimization and liveness analysis:\n");
573
--
2.34.1
New patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
3
into one function that returns TCGReg and TCGLabelQemuLdst.
1
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
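One detail in the conversion below that is easy to read past is the compare mask: the page number and the low alignment bits are both kept, so a single comparison against the TLB entry catches a TLB miss and a misaligned access at once, and the mask is emitted as a single andi only when it fits a 12-bit signed immediate. A toy standalone calculation of that mask, with the page size and alignment chosen arbitrarily just to drive the printout:

#include <inttypes.h>
#include <stdio.h>

/* Illustrative values only: 4 KiB target pages, 8-byte (2^3) alignment. */
#define PAGE_BITS 12
#define PAGE_MASK (~(int64_t)((1 << PAGE_BITS) - 1))

int main(void)
{
    unsigned a_bits = 3;
    int64_t a_mask = (1 << a_bits) - 1;

    /* Keep the page number plus the low alignment bits; a TLB compare
     * against this masked address fails on a miss or a misaligned access. */
    int64_t compare_mask = PAGE_MASK | a_mask;

    /* RISC-V ANDI takes a 12-bit signed immediate. */
    if (compare_mask >= -0x800 && compare_mask <= 0x7ff) {
        printf("mask 0x%" PRIx64 ": single andi\n", (uint64_t)compare_mask);
    } else {
        printf("mask 0x%" PRIx64 ": movi + and\n", (uint64_t)compare_mask);
    }
    return 0;
}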
8
tcg/riscv/tcg-target.c.inc | 253 +++++++++++++++++--------------------
9
1 file changed, 114 insertions(+), 139 deletions(-)
10
11
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/riscv/tcg-target.c.inc
14
+++ b/tcg/riscv/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
16
#endif
17
};
18
19
-/* We expect to use a 12-bit negative offset from ENV. */
20
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
22
-
23
static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
24
{
25
tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, 0);
26
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
27
tcg_debug_assert(ok);
28
}
29
30
-static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addr, MemOpIdx oi,
31
- tcg_insn_unit **label_ptr, bool is_load)
32
-{
33
- MemOp opc = get_memop(oi);
34
- unsigned s_bits = opc & MO_SIZE;
35
- unsigned a_bits = get_alignment_bits(opc);
36
- tcg_target_long compare_mask;
37
- int mem_index = get_mmuidx(oi);
38
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
39
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
40
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
41
- TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
42
-
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
44
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
45
-
46
- tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr,
47
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
48
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
49
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
50
-
51
- /* Load the tlb comparator and the addend. */
52
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
53
- is_load ? offsetof(CPUTLBEntry, addr_read)
54
- : offsetof(CPUTLBEntry, addr_write));
55
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
56
- offsetof(CPUTLBEntry, addend));
57
-
58
- /* We don't support unaligned accesses. */
59
- if (a_bits < s_bits) {
60
- a_bits = s_bits;
61
- }
62
- /* Clear the non-page, non-alignment bits from the address. */
63
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
64
- if (compare_mask == sextreg(compare_mask, 0, 12)) {
65
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr, compare_mask);
66
- } else {
67
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
68
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr);
69
- }
70
-
71
- /* Compare masked address with the TLB entry. */
72
- label_ptr[0] = s->code_ptr;
73
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
74
-
75
- /* TLB Hit - translate address using addend. */
76
- if (TARGET_LONG_BITS == 32) {
77
- tcg_out_ext32u(s, TCG_REG_TMP0, addr);
78
- addr = TCG_REG_TMP0;
79
- }
80
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr);
81
- return TCG_REG_TMP0;
82
-}
83
-
84
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
85
- TCGType data_type, TCGReg data_reg,
86
- TCGReg addr_reg, void *raddr,
87
- tcg_insn_unit **label_ptr)
88
-{
89
- TCGLabelQemuLdst *label = new_ldst_label(s);
90
-
91
- label->is_ld = is_ld;
92
- label->oi = oi;
93
- label->type = data_type;
94
- label->datalo_reg = data_reg;
95
- label->addrlo_reg = addr_reg;
96
- label->raddr = tcg_splitwx_to_rx(raddr);
97
- label->label_ptr[0] = label_ptr[0];
98
-}
99
-
100
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
101
{
102
MemOpIdx oi = l->oi;
103
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
104
return true;
105
}
106
#else
107
-
108
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
109
- unsigned a_bits)
110
-{
111
- unsigned a_mask = (1 << a_bits) - 1;
112
- TCGLabelQemuLdst *l = new_ldst_label(s);
113
-
114
- l->is_ld = is_ld;
115
- l->addrlo_reg = addr_reg;
116
-
117
- /* We are expecting a_bits to max out at 7, so we can always use andi. */
118
- tcg_debug_assert(a_bits < 12);
119
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
120
-
121
- l->label_ptr[0] = s->code_ptr;
122
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
123
-
124
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
125
-}
126
-
127
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
128
{
129
/* resolve label address */
130
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
131
{
132
return tcg_out_fail_alignment(s, l);
133
}
134
-
135
#endif /* CONFIG_SOFTMMU */
136
137
+/*
138
+ * For softmmu, perform the TLB load and compare.
139
+ * For useronly, perform any required alignment tests.
140
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
141
+ * is required and fill in @h with the host address for the fast path.
142
+ */
143
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, TCGReg *pbase,
144
+ TCGReg addr_reg, MemOpIdx oi,
145
+ bool is_ld)
146
+{
147
+ TCGLabelQemuLdst *ldst = NULL;
148
+ MemOp opc = get_memop(oi);
149
+ unsigned a_bits = get_alignment_bits(opc);
150
+ unsigned a_mask = (1u << a_bits) - 1;
151
+
152
+#ifdef CONFIG_SOFTMMU
153
+ unsigned s_bits = opc & MO_SIZE;
154
+ int mem_index = get_mmuidx(oi);
155
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
156
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
157
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
158
+ TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
159
+ tcg_target_long compare_mask;
160
+
161
+ ldst = new_ldst_label(s);
162
+ ldst->is_ld = is_ld;
163
+ ldst->oi = oi;
164
+ ldst->addrlo_reg = addr_reg;
165
+
166
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
167
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
168
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
169
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
170
+
171
+ tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr_reg,
172
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
173
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
174
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
175
+
176
+ /* Load the tlb comparator and the addend. */
177
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
178
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
179
+ : offsetof(CPUTLBEntry, addr_write));
180
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
181
+ offsetof(CPUTLBEntry, addend));
182
+
183
+ /* We don't support unaligned accesses. */
184
+ if (a_bits < s_bits) {
185
+ a_bits = s_bits;
186
+ }
187
+ /* Clear the non-page, non-alignment bits from the address. */
188
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | a_mask;
189
+ if (compare_mask == sextreg(compare_mask, 0, 12)) {
190
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, compare_mask);
191
+ } else {
192
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
193
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
194
+ }
195
+
196
+ /* Compare masked address with the TLB entry. */
197
+ ldst->label_ptr[0] = s->code_ptr;
198
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
199
+
200
+ /* TLB Hit - translate address using addend. */
201
+ if (TARGET_LONG_BITS == 32) {
202
+ tcg_out_ext32u(s, TCG_REG_TMP0, addr_reg);
203
+ addr_reg = TCG_REG_TMP0;
204
+ }
205
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_reg);
206
+ *pbase = TCG_REG_TMP0;
207
+#else
208
+ if (a_mask) {
209
+ ldst = new_ldst_label(s);
210
+ ldst->is_ld = is_ld;
211
+ ldst->oi = oi;
212
+ ldst->addrlo_reg = addr_reg;
213
+
214
+ /* We are expecting a_bits max 7, so we can always use andi. */
215
+ tcg_debug_assert(a_bits < 12);
216
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
217
+
218
+ ldst->label_ptr[0] = s->code_ptr;
219
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
220
+ }
221
+
222
+ TCGReg base = addr_reg;
223
+ if (TARGET_LONG_BITS == 32) {
224
+ tcg_out_ext32u(s, TCG_REG_TMP0, base);
225
+ base = TCG_REG_TMP0;
226
+ }
227
+ if (guest_base != 0) {
228
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
229
+ base = TCG_REG_TMP0;
230
+ }
231
+ *pbase = base;
232
+#endif
233
+
234
+ return ldst;
235
+}
236
+
237
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
238
TCGReg base, MemOp opc, TCGType type)
239
{
240
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
241
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
242
MemOpIdx oi, TCGType data_type)
243
{
244
- MemOp opc = get_memop(oi);
245
+ TCGLabelQemuLdst *ldst;
246
TCGReg base;
247
248
-#if defined(CONFIG_SOFTMMU)
249
- tcg_insn_unit *label_ptr[1];
250
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, true);
251
+ tcg_out_qemu_ld_direct(s, data_reg, base, get_memop(oi), data_type);
252
253
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
254
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
255
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
256
- s->code_ptr, label_ptr);
257
-#else
258
- unsigned a_bits = get_alignment_bits(opc);
259
- if (a_bits) {
260
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
261
+ if (ldst) {
262
+ ldst->type = data_type;
263
+ ldst->datalo_reg = data_reg;
264
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
265
}
266
- base = addr_reg;
267
- if (TARGET_LONG_BITS == 32) {
268
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
269
- base = TCG_REG_TMP0;
270
- }
271
- if (guest_base != 0) {
272
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
273
- base = TCG_REG_TMP0;
274
- }
275
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
276
-#endif
277
}
278
279
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
280
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
281
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
282
MemOpIdx oi, TCGType data_type)
283
{
284
- MemOp opc = get_memop(oi);
285
+ TCGLabelQemuLdst *ldst;
286
TCGReg base;
287
288
-#if defined(CONFIG_SOFTMMU)
289
- tcg_insn_unit *label_ptr[1];
290
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, false);
291
+ tcg_out_qemu_st_direct(s, data_reg, base, get_memop(oi));
292
293
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
294
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
295
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
296
- s->code_ptr, label_ptr);
297
-#else
298
- unsigned a_bits = get_alignment_bits(opc);
299
- if (a_bits) {
300
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
301
+ if (ldst) {
302
+ ldst->type = data_type;
303
+ ldst->datalo_reg = data_reg;
304
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
305
}
306
- base = addr_reg;
307
- if (TARGET_LONG_BITS == 32) {
308
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
309
- base = TCG_REG_TMP0;
310
- }
311
- if (guest_base != 0) {
312
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
313
- base = TCG_REG_TMP0;
314
- }
315
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
316
-#endif
317
}
318
319
static const tcg_insn_unit *tb_ret_addr;
320
--
321
2.34.1
322
323
diff view generated by jsdifflib
New patch
1
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
2
tcg_prepare_user_ldst, and some code that lived in both tcg_out_qemu_ld
3
and tcg_out_qemu_st into one function that returns HostAddress and
4
TCGLabelQemuLdst structures.
1
5
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
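A side observation on the HostAddress used below, not something stated in the commit message: s390x can carry a displacement as well as a base and index, so in the user-only path a small enough guest_base is folded straight into the displacement instead of occupying a register. A toy version of that decision, with the 0x80000 bound taken from the hunk and the register numbers made up:

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the backend's HostAddress. */
typedef struct {
    int base;
    int index;      /* REG_NONE for "no index register" */
    int64_t disp;
} HostAddress;

enum { REG_NONE = -1, GUEST_BASE_REG = 13 };   /* made-up register numbers */

/* Fold guest_base into the displacement when it fits the addressing mode,
 * otherwise fall back to a dedicated index register. */
static void pick_guest_base(HostAddress *h, int addr_reg, uint64_t guest_base)
{
    h->base = addr_reg;
    if (guest_base < 0x80000) {        /* fits the long-displacement field */
        h->index = REG_NONE;
        h->disp = (int64_t)guest_base;
    } else {
        h->index = GUEST_BASE_REG;
        h->disp = 0;
    }
}

int main(void)
{
    HostAddress h;

    pick_guest_base(&h, 2, 0x1000);
    printf("small base: index=%d disp=%#llx\n", h.index, (unsigned long long)h.disp);

    pick_guest_base(&h, 2, 0x7f0000000000ull);
    printf("large base: index=%d disp=%#llx\n", h.index, (unsigned long long)h.disp);
    return 0;
}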
9
tcg/s390x/tcg-target.c.inc | 263 ++++++++++++++++---------------------
10
1 file changed, 113 insertions(+), 150 deletions(-)
11
12
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/s390x/tcg-target.c.inc
15
+++ b/tcg/s390x/tcg-target.c.inc
16
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
17
}
18
19
#if defined(CONFIG_SOFTMMU)
20
-/* We're expecting to use a 20-bit negative offset on the tlb memory ops. */
21
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
22
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
23
-
24
-/* Load and compare a TLB entry, leaving the flags set. Loads the TLB
25
- addend into R2. Returns a register with the santitized guest address. */
26
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
27
- int mem_index, bool is_ld)
28
-{
29
- unsigned s_bits = opc & MO_SIZE;
30
- unsigned a_bits = get_alignment_bits(opc);
31
- unsigned s_mask = (1 << s_bits) - 1;
32
- unsigned a_mask = (1 << a_bits) - 1;
33
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
34
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
35
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
36
- int ofs, a_off;
37
- uint64_t tlb_mask;
38
-
39
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
40
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
41
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
42
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
43
-
44
- /* For aligned accesses, we check the first byte and include the alignment
45
- bits within the address. For unaligned access, we check that we don't
46
- cross pages using the address of the last byte of the access. */
47
- a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
48
- tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
49
- if (a_off == 0) {
50
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
51
- } else {
52
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
53
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
54
- }
55
-
56
- if (is_ld) {
57
- ofs = offsetof(CPUTLBEntry, addr_read);
58
- } else {
59
- ofs = offsetof(CPUTLBEntry, addr_write);
60
- }
61
- if (TARGET_LONG_BITS == 32) {
62
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
63
- } else {
64
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
65
- }
66
-
67
- tcg_out_insn(s, RXY, LG, TCG_REG_R2, TCG_REG_R2, TCG_REG_NONE,
68
- offsetof(CPUTLBEntry, addend));
69
-
70
- if (TARGET_LONG_BITS == 32) {
71
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
72
- return TCG_REG_R3;
73
- }
74
- return addr_reg;
75
-}
76
-
77
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
78
- TCGType type, TCGReg data, TCGReg addr,
79
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
80
-{
81
- TCGLabelQemuLdst *label = new_ldst_label(s);
82
-
83
- label->is_ld = is_ld;
84
- label->oi = oi;
85
- label->type = type;
86
- label->datalo_reg = data;
87
- label->addrlo_reg = addr;
88
- label->raddr = tcg_splitwx_to_rx(raddr);
89
- label->label_ptr[0] = label_ptr;
90
-}
91
-
92
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
93
{
94
TCGReg addr_reg = lb->addrlo_reg;
95
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
96
return true;
97
}
98
#else
99
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld,
100
- TCGReg addrlo, unsigned a_bits)
101
-{
102
- unsigned a_mask = (1 << a_bits) - 1;
103
- TCGLabelQemuLdst *l = new_ldst_label(s);
104
-
105
- l->is_ld = is_ld;
106
- l->addrlo_reg = addrlo;
107
-
108
- /* We are expecting a_bits to max out at 7, much lower than TMLL. */
109
- tcg_debug_assert(a_bits < 16);
110
- tcg_out_insn(s, RI, TMLL, addrlo, a_mask);
111
-
112
- tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
113
- l->label_ptr[0] = s->code_ptr;
114
- s->code_ptr += 1;
115
-
116
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
117
-}
118
-
119
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
120
{
121
if (!patch_reloc(l->label_ptr[0], R_390_PC16DBL,
122
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
123
{
124
return tcg_out_fail_alignment(s, l);
125
}
126
+#endif /* CONFIG_SOFTMMU */
127
128
-static HostAddress tcg_prepare_user_ldst(TCGContext *s, TCGReg addr_reg)
129
+/*
130
+ * For softmmu, perform the TLB load and compare.
131
+ * For useronly, perform any required alignment tests.
132
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
133
+ * is required and fill in @h with the host address for the fast path.
134
+ */
135
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
136
+ TCGReg addr_reg, MemOpIdx oi,
137
+ bool is_ld)
138
{
139
- TCGReg index;
140
- int disp;
141
+ TCGLabelQemuLdst *ldst = NULL;
142
+ MemOp opc = get_memop(oi);
143
+ unsigned a_bits = get_alignment_bits(opc);
144
+ unsigned a_mask = (1u << a_bits) - 1;
145
146
+#ifdef CONFIG_SOFTMMU
147
+ unsigned s_bits = opc & MO_SIZE;
148
+ unsigned s_mask = (1 << s_bits) - 1;
149
+ int mem_index = get_mmuidx(oi);
150
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
151
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
152
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
153
+ int ofs, a_off;
154
+ uint64_t tlb_mask;
155
+
156
+ ldst = new_ldst_label(s);
157
+ ldst->is_ld = is_ld;
158
+ ldst->oi = oi;
159
+ ldst->addrlo_reg = addr_reg;
160
+
161
+ tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
162
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
163
+
164
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
165
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
166
+ tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
167
+ tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
168
+
169
+ /*
170
+ * For aligned accesses, we check the first byte and include the alignment
171
+ * bits within the address. For unaligned access, we check that we don't
172
+ * cross pages using the address of the last byte of the access.
173
+ */
174
+ a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
175
+ tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
176
+ if (a_off == 0) {
177
+ tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
178
+ } else {
179
+ tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
180
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
181
+ }
182
+
183
+ if (is_ld) {
184
+ ofs = offsetof(CPUTLBEntry, addr_read);
185
+ } else {
186
+ ofs = offsetof(CPUTLBEntry, addr_write);
187
+ }
188
+ if (TARGET_LONG_BITS == 32) {
189
+ tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
190
+ } else {
191
+ tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
192
+ }
193
+
194
+ tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
195
+ ldst->label_ptr[0] = s->code_ptr++;
196
+
197
+ h->index = TCG_REG_R2;
198
+ tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
199
+ offsetof(CPUTLBEntry, addend));
200
+
201
+ h->base = addr_reg;
202
+ if (TARGET_LONG_BITS == 32) {
203
+ tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
204
+ h->base = TCG_REG_R3;
205
+ }
206
+ h->disp = 0;
207
+#else
208
+ if (a_mask) {
209
+ ldst = new_ldst_label(s);
210
+ ldst->is_ld = is_ld;
211
+ ldst->oi = oi;
212
+ ldst->addrlo_reg = addr_reg;
213
+
214
+ /* We are expecting a_bits to max out at 7, much lower than TMLL. */
215
+ tcg_debug_assert(a_bits < 16);
216
+ tcg_out_insn(s, RI, TMLL, addr_reg, a_mask);
217
+
218
+ tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
219
+ ldst->label_ptr[0] = s->code_ptr++;
220
+ }
221
+
222
+ h->base = addr_reg;
223
if (TARGET_LONG_BITS == 32) {
224
tcg_out_ext32u(s, TCG_TMP0, addr_reg);
225
- addr_reg = TCG_TMP0;
226
+ h->base = TCG_TMP0;
227
}
228
if (guest_base < 0x80000) {
229
- index = TCG_REG_NONE;
230
- disp = guest_base;
231
+ h->index = TCG_REG_NONE;
232
+ h->disp = guest_base;
233
} else {
234
- index = TCG_GUEST_BASE_REG;
235
- disp = 0;
236
+ h->index = TCG_GUEST_BASE_REG;
237
+ h->disp = 0;
238
}
239
- return (HostAddress){ .base = addr_reg, .index = index, .disp = disp };
240
+#endif
241
+
242
+ return ldst;
243
}
244
-#endif /* CONFIG_SOFTMMU */
245
246
static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
247
MemOpIdx oi, TCGType data_type)
248
{
249
- MemOp opc = get_memop(oi);
250
+ TCGLabelQemuLdst *ldst;
251
HostAddress h;
252
253
-#ifdef CONFIG_SOFTMMU
254
- unsigned mem_index = get_mmuidx(oi);
255
- tcg_insn_unit *label_ptr;
256
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
257
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_reg, h);
258
259
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 1);
260
- h.index = TCG_REG_R2;
261
- h.disp = 0;
262
-
263
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
264
- label_ptr = s->code_ptr;
265
- s->code_ptr += 1;
266
-
267
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
268
-
269
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
270
- s->code_ptr, label_ptr);
271
-#else
272
- unsigned a_bits = get_alignment_bits(opc);
273
-
274
- if (a_bits) {
275
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
276
+ if (ldst) {
277
+ ldst->type = data_type;
278
+ ldst->datalo_reg = data_reg;
279
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
280
}
281
- h = tcg_prepare_user_ldst(s, addr_reg);
282
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
283
-#endif
284
}
285
286
static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
287
MemOpIdx oi, TCGType data_type)
288
{
289
- MemOp opc = get_memop(oi);
290
+ TCGLabelQemuLdst *ldst;
291
HostAddress h;
292
293
-#ifdef CONFIG_SOFTMMU
294
- unsigned mem_index = get_mmuidx(oi);
295
- tcg_insn_unit *label_ptr;
296
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
297
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
298
299
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 0);
300
- h.index = TCG_REG_R2;
301
- h.disp = 0;
302
-
303
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
304
- label_ptr = s->code_ptr;
305
- s->code_ptr += 1;
306
-
307
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
308
-
309
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
310
- s->code_ptr, label_ptr);
311
-#else
312
- unsigned a_bits = get_alignment_bits(opc);
313
-
314
- if (a_bits) {
315
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
316
+ if (ldst) {
317
+ ldst->type = data_type;
318
+ ldst->datalo_reg = data_reg;
319
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
320
}
321
- h = tcg_prepare_user_ldst(s, addr_reg);
322
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
323
-#endif
324
}
325
326
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
327
--
328
2.34.1
329
330
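In plain C, the inline TLB check that prepare_host_addr above emits (the SRLG/NG/AG index computation, the masked compare and the addend load) amounts to roughly the following. This is an illustrative sketch only: it borrows the CPUTLBDescFast/CPUTLBEntry field names used in the diff and is not meant to compile stand-alone.

```c
/*
 * Rough model of the fast-path TLB check generated by prepare_host_addr.
 * "fast" is the CPUTLBDescFast for the selected mmu index; names follow
 * the fields referenced in the diff above.  Simplified, not real code.
 */
static inline bool softmmu_fast_hit(CPUTLBDescFast *fast, uint64_t addr,
                                    unsigned a_mask, unsigned s_mask,
                                    bool is_ld, uintptr_t *host)
{
    /* SRLG + NG + AG: locate the CPUTLBEntry for this guest page. */
    uintptr_t ofs = (addr >> (TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS)) & fast->mask;
    CPUTLBEntry *entry = (CPUTLBEntry *)(fast->table + ofs);

    /*
     * LA + RISBG/NG: for aligned accesses keep the alignment bits in the
     * comparison; for unaligned accesses compare the page of the last byte.
     */
    uint64_t a_off = (a_mask >= s_mask ? 0 : s_mask - a_mask);
    uint64_t cmp = (addr + a_off) & ((uint64_t)TARGET_PAGE_MASK | a_mask);

    /* C/CG + BRC: a mismatch branches to the out-of-line slow path. */
    if (cmp != (is_ld ? entry->addr_read : entry->addr_write)) {
        return false;
    }
    /* LG addend: the host address is the guest address plus the addend. */
    *host = (uintptr_t)addr + entry->addend;
    return true;
}
```

On a hit, the fast path addresses guest memory directly through the returned host address; on a miss, the TCGLabelQemuLdst filled in by prepare_host_addr routes execution to the slow-path helpers that the following patches consolidate.
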
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. These and their subroutines
use the existing knowledge of the host function call abi
to load the function call arguments and return results.

These will be used to simplify the backends in turn.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 475 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 471 insertions(+), 4 deletions(-)
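As a sketch of the intended usage — modeled on the aarch64 conversion later in this series, with the scratch register and the call/branch emitters being backend-specific assumptions — a converted qemu_ld slow path reduces to:

```c
/*
 * Sketch only: the post-conversion shape of a backend's qemu_ld slow path.
 * The TCGLdstHelperParam supplies backend scratch registers (and optionally
 * an ra_gen hook); the three helper routines do all argument marshalling.
 */
static const TCGLdstHelperParam ldst_helper_param = {
    .ntmp = 1, .tmp = { TCG_REG_TMP },
};

static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
    MemOp opc = get_memop(lb->oi);

    /* (patching of the fast-path branch to this label is unchanged) */

    /* Load env, addr, oi and the return address into the call slots. */
    tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
    tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
    /* Move/extend the helper result into lb->datalo_reg (and datahi). */
    tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);

    tcg_out_goto(s, lb->raddr);
    return true;
}
```

Stores use tcg_out_st_helper_args in the same way and have no result to move back.
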

diff --git a/tcg/tcg.c b/tcg/tcg.c
index XXXXXXX..XXXXXXX 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct);
21
19
static int tcg_out_ldst_finalize(TCGContext *s);
22
#ifdef CONFIG_PLUGIN
20
#endif
23
21
24
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress);
22
+typedef struct TCGLdstHelperParam {
25
+bool plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db,
23
+ TCGReg (*ra_gen)(TCGContext *s, const TCGLabelQemuLdst *l, int arg_reg);
26
+ bool supress);
24
+ unsigned ntmp;
27
void plugin_gen_tb_end(CPUState *cpu);
25
+ int tmp[3];
28
void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
26
+} TCGLdstHelperParam;
29
void plugin_gen_insn_end(void);
27
+
30
@@ -XXX,XX +XXX,XX @@ static inline void plugin_insn_append(abi_ptr pc, const void *from, size_t size)
28
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
31
29
+ const TCGLdstHelperParam *p)
32
#else /* !CONFIG_PLUGIN */
30
+ __attribute__((unused));
33
31
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *l,
34
-static inline
32
+ bool load_sign, const TCGLdstHelperParam *p)
35
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress)
33
+ __attribute__((unused));
36
+static inline bool
34
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
37
+plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db, bool sup)
35
+ const TCGLdstHelperParam *p)
36
+ __attribute__((unused));
37
+
38
TCGContext tcg_init_ctx;
39
__thread TCGContext *tcg_ctx;
40
41
@@ -XXX,XX +XXX,XX @@ void tcg_raise_tb_overflow(TCGContext *s)
42
siglongjmp(s->jmp_trans, -2);
43
}
44
45
+/*
46
+ * Used by tcg_out_movext{1,2} to hold the arguments for tcg_out_movext.
47
+ * By the time we arrive at tcg_out_movext1, @dst is always a TCGReg.
48
+ *
49
+ * However, tcg_out_helper_load_slots reuses this field to hold an
50
+ * argument slot number (which may designate a argument register or an
51
+ * argument stack slot), converting to TCGReg once all arguments that
52
+ * are destined for the stack are processed.
53
+ */
54
typedef struct TCGMovExtend {
55
- TCGReg dst;
56
+ unsigned dst;
57
TCGReg src;
58
TCGType dst_type;
59
TCGType src_type;
60
@@ -XXX,XX +XXX,XX @@ static void tcg_out_movext1(TCGContext *s, const TCGMovExtend *i)
61
* between the sources and destinations.
62
*/
63
64
-static void __attribute__((unused))
65
-tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
66
- const TCGMovExtend *i2, int scratch)
67
+static void tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
68
+ const TCGMovExtend *i2, int scratch)
38
{
69
{
39
return false;
70
TCGReg src1 = i1->src;
40
}
71
TCGReg src2 = i2->src;
41
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
72
@@ -XXX,XX +XXX,XX @@ static TCGHelperInfo all_helpers[] = {
42
index XXXXXXX..XXXXXXX 100644
73
};
43
--- a/accel/tcg/plugin-gen.c
74
static GHashTable *helper_table;
44
+++ b/accel/tcg/plugin-gen.c
75
45
@@ -XXX,XX +XXX,XX @@ static void plugin_gen_inject(const struct qemu_plugin_tb *plugin_tb)
76
+/*
46
pr_ops();
77
+ * Create TCGHelperInfo structures for "tcg/tcg-ldst.h" functions,
47
}
78
+ * akin to what "exec/helper-tcg.h" does with DEF_HELPER_FLAGS_N.
48
79
+ * We only use these for layout in tcg_out_ld_helper_ret and
49
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_only)
80
+ * tcg_out_st_helper_args, and share them between several of
50
+bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
81
+ * the helpers, with the end result that it's easier to build manually.
51
+ bool mem_only)
82
+ */
83
+
84
+#if TCG_TARGET_REG_BITS == 32
85
+# define dh_typecode_ttl dh_typecode_i32
86
+#else
87
+# define dh_typecode_ttl dh_typecode_i64
88
+#endif
89
+
90
+static TCGHelperInfo info_helper_ld32_mmu = {
91
+ .flags = TCG_CALL_NO_WG,
92
+ .typemask = dh_typemask(ttl, 0) /* return tcg_target_ulong */
93
+ | dh_typemask(env, 1)
94
+ | dh_typemask(tl, 2) /* target_ulong addr */
95
+ | dh_typemask(i32, 3) /* unsigned oi */
96
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
97
+};
98
+
99
+static TCGHelperInfo info_helper_ld64_mmu = {
100
+ .flags = TCG_CALL_NO_WG,
101
+ .typemask = dh_typemask(i64, 0) /* return uint64_t */
102
+ | dh_typemask(env, 1)
103
+ | dh_typemask(tl, 2) /* target_ulong addr */
104
+ | dh_typemask(i32, 3) /* unsigned oi */
105
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
106
+};
107
+
108
+static TCGHelperInfo info_helper_st32_mmu = {
109
+ .flags = TCG_CALL_NO_WG,
110
+ .typemask = dh_typemask(void, 0)
111
+ | dh_typemask(env, 1)
112
+ | dh_typemask(tl, 2) /* target_ulong addr */
113
+ | dh_typemask(i32, 3) /* uint32_t data */
114
+ | dh_typemask(i32, 4) /* unsigned oi */
115
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
116
+};
117
+
118
+static TCGHelperInfo info_helper_st64_mmu = {
119
+ .flags = TCG_CALL_NO_WG,
120
+ .typemask = dh_typemask(void, 0)
121
+ | dh_typemask(env, 1)
122
+ | dh_typemask(tl, 2) /* target_ulong addr */
123
+ | dh_typemask(i64, 3) /* uint64_t data */
124
+ | dh_typemask(i32, 4) /* unsigned oi */
125
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
126
+};
127
+
128
#ifdef CONFIG_TCG_INTERPRETER
129
static ffi_type *typecode_to_ffi(int argmask)
52
{
130
{
53
bool ret = false;
131
@@ -XXX,XX +XXX,XX @@ static void tcg_context_init(unsigned max_cpus)
54
132
(gpointer)&all_helpers[i]);
55
@@ -XXX,XX +XXX,XX @@ bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_onl
133
}
56
134
57
ret = true;
135
+ init_call_layout(&info_helper_ld32_mmu);
58
136
+ init_call_layout(&info_helper_ld64_mmu);
59
- ptb->vaddr = tb->pc;
137
+ init_call_layout(&info_helper_st32_mmu);
60
+ ptb->vaddr = db->pc_first;
138
+ init_call_layout(&info_helper_st64_mmu);
61
ptb->vaddr2 = -1;
139
+
62
- get_page_addr_code_hostp(cpu->env_ptr, tb->pc, &ptb->haddr1);
140
#ifdef CONFIG_TCG_INTERPRETER
63
+ ptb->haddr1 = db->host_addr[0];
141
init_ffi_layouts();
64
ptb->haddr2 = NULL;
142
#endif
65
ptb->mem_only = mem_only;
143
@@ -XXX,XX +XXX,XX @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
66
67
@@ -XXX,XX +XXX,XX @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
68
* Note that we skip this when haddr1 == NULL, e.g. when we're
69
* fetching instructions from a region not backed by RAM.
70
*/
71
- if (likely(ptb->haddr1 != NULL && ptb->vaddr2 == -1) &&
72
- unlikely((db->pc_next & TARGET_PAGE_MASK) !=
73
- (db->pc_first & TARGET_PAGE_MASK))) {
74
- get_page_addr_code_hostp(cpu->env_ptr, db->pc_next,
75
- &ptb->haddr2);
76
- ptb->vaddr2 = db->pc_next;
77
- }
78
- if (likely(ptb->vaddr2 == -1)) {
79
+ if (ptb->haddr1 == NULL) {
80
+ pinsn->haddr = NULL;
81
+ } else if (is_same_page(db, db->pc_next)) {
82
pinsn->haddr = ptb->haddr1 + pinsn->vaddr - ptb->vaddr;
83
} else {
84
+ if (ptb->vaddr2 == -1) {
85
+ ptb->vaddr2 = TARGET_PAGE_ALIGN(db->pc_first);
86
+ get_page_addr_code_hostp(cpu->env_ptr, ptb->vaddr2, &ptb->haddr2);
87
+ }
88
pinsn->haddr = ptb->haddr2 + pinsn->vaddr - ptb->vaddr2;
89
}
144
}
90
}
145
}
91
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
146
92
index XXXXXXX..XXXXXXX 100644
147
+/*
93
--- a/accel/tcg/translator.c
148
+ * Similarly for qemu_ld/st slow path helpers.
94
+++ b/accel/tcg/translator.c
149
+ * We must re-implement tcg_gen_callN and tcg_reg_alloc_call simultaneously,
95
@@ -XXX,XX +XXX,XX @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
150
+ * using only the provided backend tcg_out_* functions.
96
ops->tb_start(db, cpu);
151
+ */
97
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
152
+
98
153
+static int tcg_out_helper_stk_ofs(TCGType type, unsigned slot)
99
- plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
154
+{
100
+ plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
155
+ int ofs = arg_slot_stk_ofs(slot);
101
156
+
102
while (true) {
157
+ /*
103
db->num_insns++;
158
+ * Each stack slot is TCG_TARGET_LONG_BITS. If the host does not
159
+ * require extension to uint64_t, adjust the address for uint32_t.
160
+ */
161
+ if (HOST_BIG_ENDIAN &&
162
+ TCG_TARGET_REG_BITS == 64 &&
163
+ type == TCG_TYPE_I32) {
164
+ ofs += 4;
165
+ }
166
+ return ofs;
167
+}
168
+
169
+static void tcg_out_helper_load_regs(TCGContext *s,
170
+ unsigned nmov, TCGMovExtend *mov,
171
+ unsigned ntmp, const int *tmp)
172
+{
173
+ switch (nmov) {
174
+ default:
175
+ /* The backend must have provided enough temps for the worst case. */
176
+ tcg_debug_assert(ntmp + 1 >= nmov);
177
+
178
+ for (unsigned i = nmov - 1; i >= 2; --i) {
179
+ TCGReg dst = mov[i].dst;
180
+
181
+ for (unsigned j = 0; j < i; ++j) {
182
+ if (dst == mov[j].src) {
183
+ /*
184
+ * Conflict.
185
+ * Copy the source to a temporary, recurse for the
186
+ * remaining moves, perform the extension from our
187
+ * scratch on the way out.
188
+ */
189
+ TCGReg scratch = tmp[--ntmp];
190
+ tcg_out_mov(s, mov[i].src_type, scratch, mov[i].src);
191
+ mov[i].src = scratch;
192
+
193
+ tcg_out_helper_load_regs(s, i, mov, ntmp, tmp);
194
+ tcg_out_movext1(s, &mov[i]);
195
+ return;
196
+ }
197
+ }
198
+
199
+ /* No conflicts: perform this move and continue. */
200
+ tcg_out_movext1(s, &mov[i]);
201
+ }
202
+ /* fall through for the final two moves */
203
+
204
+ case 2:
205
+ tcg_out_movext2(s, mov, mov + 1, ntmp ? tmp[0] : -1);
206
+ return;
207
+ case 1:
208
+ tcg_out_movext1(s, mov);
209
+ return;
210
+ case 0:
211
+ g_assert_not_reached();
212
+ }
213
+}
214
+
215
+static void tcg_out_helper_load_slots(TCGContext *s,
216
+ unsigned nmov, TCGMovExtend *mov,
217
+ const TCGLdstHelperParam *parm)
218
+{
219
+ unsigned i;
220
+
221
+ /*
222
+ * Start from the end, storing to the stack first.
223
+ * This frees those registers, so we need not consider overlap.
224
+ */
225
+ for (i = nmov; i-- > 0; ) {
226
+ unsigned slot = mov[i].dst;
227
+
228
+ if (arg_slot_reg_p(slot)) {
229
+ goto found_reg;
230
+ }
231
+
232
+ TCGReg src = mov[i].src;
233
+ TCGType dst_type = mov[i].dst_type;
234
+ MemOp dst_mo = dst_type == TCG_TYPE_I32 ? MO_32 : MO_64;
235
+
236
+ /* The argument is going onto the stack; extend into scratch. */
237
+ if ((mov[i].src_ext & MO_SIZE) != dst_mo) {
238
+ tcg_debug_assert(parm->ntmp != 0);
239
+ mov[i].dst = src = parm->tmp[0];
240
+ tcg_out_movext1(s, &mov[i]);
241
+ }
242
+
243
+ tcg_out_st(s, dst_type, src, TCG_REG_CALL_STACK,
244
+ tcg_out_helper_stk_ofs(dst_type, slot));
245
+ }
246
+ return;
247
+
248
+ found_reg:
249
+ /*
250
+ * The remaining arguments are in registers.
251
+ * Convert slot numbers to argument registers.
252
+ */
253
+ nmov = i + 1;
254
+ for (i = 0; i < nmov; ++i) {
255
+ mov[i].dst = tcg_target_call_iarg_regs[mov[i].dst];
256
+ }
257
+ tcg_out_helper_load_regs(s, nmov, mov, parm->ntmp, parm->tmp);
258
+}
259
+
260
+static void tcg_out_helper_load_imm(TCGContext *s, unsigned slot,
261
+ TCGType type, tcg_target_long imm,
262
+ const TCGLdstHelperParam *parm)
263
+{
264
+ if (arg_slot_reg_p(slot)) {
265
+ tcg_out_movi(s, type, tcg_target_call_iarg_regs[slot], imm);
266
+ } else {
267
+ int ofs = tcg_out_helper_stk_ofs(type, slot);
268
+ if (!tcg_out_sti(s, type, imm, TCG_REG_CALL_STACK, ofs)) {
269
+ tcg_debug_assert(parm->ntmp != 0);
270
+ tcg_out_movi(s, type, parm->tmp[0], imm);
271
+ tcg_out_st(s, type, parm->tmp[0], TCG_REG_CALL_STACK, ofs);
272
+ }
273
+ }
274
+}
275
+
276
+static void tcg_out_helper_load_common_args(TCGContext *s,
277
+ const TCGLabelQemuLdst *ldst,
278
+ const TCGLdstHelperParam *parm,
279
+ const TCGHelperInfo *info,
280
+ unsigned next_arg)
281
+{
282
+ TCGMovExtend ptr_mov = {
283
+ .dst_type = TCG_TYPE_PTR,
284
+ .src_type = TCG_TYPE_PTR,
285
+ .src_ext = sizeof(void *) == 4 ? MO_32 : MO_64
286
+ };
287
+ const TCGCallArgumentLoc *loc = &info->in[0];
288
+ TCGType type;
289
+ unsigned slot;
290
+ tcg_target_ulong imm;
291
+
292
+ /*
293
+ * Handle env, which is always first.
294
+ */
295
+ ptr_mov.dst = loc->arg_slot;
296
+ ptr_mov.src = TCG_AREG0;
297
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
298
+
299
+ /*
300
+ * Handle oi.
301
+ */
302
+ imm = ldst->oi;
303
+ loc = &info->in[next_arg];
304
+ type = TCG_TYPE_I32;
305
+ switch (loc->kind) {
306
+ case TCG_CALL_ARG_NORMAL:
307
+ break;
308
+ case TCG_CALL_ARG_EXTEND_U:
309
+ case TCG_CALL_ARG_EXTEND_S:
310
+ /* No extension required for MemOpIdx. */
311
+ tcg_debug_assert(imm <= INT32_MAX);
312
+ type = TCG_TYPE_REG;
313
+ break;
314
+ default:
315
+ g_assert_not_reached();
316
+ }
317
+ tcg_out_helper_load_imm(s, loc->arg_slot, type, imm, parm);
318
+ next_arg++;
319
+
320
+ /*
321
+ * Handle ra.
322
+ */
323
+ loc = &info->in[next_arg];
324
+ slot = loc->arg_slot;
325
+ if (parm->ra_gen) {
326
+ int arg_reg = -1;
327
+ TCGReg ra_reg;
328
+
329
+ if (arg_slot_reg_p(slot)) {
330
+ arg_reg = tcg_target_call_iarg_regs[slot];
331
+ }
332
+ ra_reg = parm->ra_gen(s, ldst, arg_reg);
333
+
334
+ ptr_mov.dst = slot;
335
+ ptr_mov.src = ra_reg;
336
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
337
+ } else {
338
+ imm = (uintptr_t)ldst->raddr;
339
+ tcg_out_helper_load_imm(s, slot, TCG_TYPE_PTR, imm, parm);
340
+ }
341
+}
342
+
343
+static unsigned tcg_out_helper_add_mov(TCGMovExtend *mov,
344
+ const TCGCallArgumentLoc *loc,
345
+ TCGType dst_type, TCGType src_type,
346
+ TCGReg lo, TCGReg hi)
347
+{
348
+ if (dst_type <= TCG_TYPE_REG) {
349
+ MemOp src_ext;
350
+
351
+ switch (loc->kind) {
352
+ case TCG_CALL_ARG_NORMAL:
353
+ src_ext = src_type == TCG_TYPE_I32 ? MO_32 : MO_64;
354
+ break;
355
+ case TCG_CALL_ARG_EXTEND_U:
356
+ dst_type = TCG_TYPE_REG;
357
+ src_ext = MO_UL;
358
+ break;
359
+ case TCG_CALL_ARG_EXTEND_S:
360
+ dst_type = TCG_TYPE_REG;
361
+ src_ext = MO_SL;
362
+ break;
363
+ default:
364
+ g_assert_not_reached();
365
+ }
366
+
367
+ mov[0].dst = loc->arg_slot;
368
+ mov[0].dst_type = dst_type;
369
+ mov[0].src = lo;
370
+ mov[0].src_type = src_type;
371
+ mov[0].src_ext = src_ext;
372
+ return 1;
373
+ }
374
+
375
+ assert(TCG_TARGET_REG_BITS == 32);
376
+
377
+ mov[0].dst = loc[HOST_BIG_ENDIAN].arg_slot;
378
+ mov[0].src = lo;
379
+ mov[0].dst_type = TCG_TYPE_I32;
380
+ mov[0].src_type = TCG_TYPE_I32;
381
+ mov[0].src_ext = MO_32;
382
+
383
+ mov[1].dst = loc[!HOST_BIG_ENDIAN].arg_slot;
384
+ mov[1].src = hi;
385
+ mov[1].dst_type = TCG_TYPE_I32;
386
+ mov[1].src_type = TCG_TYPE_I32;
387
+ mov[1].src_ext = MO_32;
388
+
389
+ return 2;
390
+}
391
+
392
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
393
+ const TCGLdstHelperParam *parm)
394
+{
395
+ const TCGHelperInfo *info;
396
+ const TCGCallArgumentLoc *loc;
397
+ TCGMovExtend mov[2];
398
+ unsigned next_arg, nmov;
399
+ MemOp mop = get_memop(ldst->oi);
400
+
401
+ switch (mop & MO_SIZE) {
402
+ case MO_8:
403
+ case MO_16:
404
+ case MO_32:
405
+ info = &info_helper_ld32_mmu;
406
+ break;
407
+ case MO_64:
408
+ info = &info_helper_ld64_mmu;
409
+ break;
410
+ default:
411
+ g_assert_not_reached();
412
+ }
413
+
414
+ /* Defer env argument. */
415
+ next_arg = 1;
416
+
417
+ loc = &info->in[next_arg];
418
+ nmov = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
419
+ ldst->addrlo_reg, ldst->addrhi_reg);
420
+ next_arg += nmov;
421
+
422
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
423
+
424
+ /* No special attention for 32 and 64-bit return values. */
425
+ tcg_debug_assert(info->out_kind == TCG_CALL_RET_NORMAL);
426
+
427
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
428
+}
429
+
430
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *ldst,
431
+ bool load_sign,
432
+ const TCGLdstHelperParam *parm)
433
+{
434
+ TCGMovExtend mov[2];
435
+
436
+ if (ldst->type <= TCG_TYPE_REG) {
437
+ MemOp mop = get_memop(ldst->oi);
438
+
439
+ mov[0].dst = ldst->datalo_reg;
440
+ mov[0].src = tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, 0);
441
+ mov[0].dst_type = ldst->type;
442
+ mov[0].src_type = TCG_TYPE_REG;
443
+
444
+ /*
445
+ * If load_sign, then we allowed the helper to perform the
446
+ * appropriate sign extension to tcg_target_ulong, and all
447
+ * we need now is a plain move.
448
+ *
449
+ * If they do not, then we expect the relevant extension
450
+ * instruction to be no more expensive than a move, and
451
+ * we thus save the icache etc by only using one of two
452
+ * helper functions.
453
+ */
454
+ if (load_sign || !(mop & MO_SIGN)) {
455
+ if (TCG_TARGET_REG_BITS == 32 || ldst->type == TCG_TYPE_I32) {
456
+ mov[0].src_ext = MO_32;
457
+ } else {
458
+ mov[0].src_ext = MO_64;
459
+ }
460
+ } else {
461
+ mov[0].src_ext = mop & MO_SSIZE;
462
+ }
463
+ tcg_out_movext1(s, mov);
464
+ } else {
465
+ assert(TCG_TARGET_REG_BITS == 32);
466
+
467
+ mov[0].dst = ldst->datalo_reg;
468
+ mov[0].src =
469
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, HOST_BIG_ENDIAN);
470
+ mov[0].dst_type = TCG_TYPE_I32;
471
+ mov[0].src_type = TCG_TYPE_I32;
472
+ mov[0].src_ext = MO_32;
473
+
474
+ mov[1].dst = ldst->datahi_reg;
475
+ mov[1].src =
476
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, !HOST_BIG_ENDIAN);
477
+ mov[1].dst_type = TCG_TYPE_REG;
478
+ mov[1].src_type = TCG_TYPE_REG;
479
+ mov[1].src_ext = MO_32;
480
+
481
+ tcg_out_movext2(s, mov, mov + 1, parm->ntmp ? parm->tmp[0] : -1);
482
+ }
483
+}
484
+
485
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
486
+ const TCGLdstHelperParam *parm)
487
+{
488
+ const TCGHelperInfo *info;
489
+ const TCGCallArgumentLoc *loc;
490
+ TCGMovExtend mov[4];
491
+ TCGType data_type;
492
+ unsigned next_arg, nmov, n;
493
+ MemOp mop = get_memop(ldst->oi);
494
+
495
+ switch (mop & MO_SIZE) {
496
+ case MO_8:
497
+ case MO_16:
498
+ case MO_32:
499
+ info = &info_helper_st32_mmu;
500
+ data_type = TCG_TYPE_I32;
501
+ break;
502
+ case MO_64:
503
+ info = &info_helper_st64_mmu;
504
+ data_type = TCG_TYPE_I64;
505
+ break;
506
+ default:
507
+ g_assert_not_reached();
508
+ }
509
+
510
+ /* Defer env argument. */
511
+ next_arg = 1;
512
+ nmov = 0;
513
+
514
+ /* Handle addr argument. */
515
+ loc = &info->in[next_arg];
516
+ n = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
517
+ ldst->addrlo_reg, ldst->addrhi_reg);
518
+ next_arg += n;
519
+ nmov += n;
520
+
521
+ /* Handle data argument. */
522
+ loc = &info->in[next_arg];
523
+ n = tcg_out_helper_add_mov(mov + nmov, loc, data_type, ldst->type,
524
+ ldst->datalo_reg, ldst->datahi_reg);
525
+ next_arg += n;
526
+ nmov += n;
527
+ tcg_debug_assert(nmov <= ARRAY_SIZE(mov));
528
+
529
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
530
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
531
+}
532
+
533
#ifdef CONFIG_PROFILER
534
535
/* avoid copy/paste errors */
104
--
536
--
105
2.34.1
537
2.34.1
106
538
107
539
Use tcg_out_ld_helper_args and tcg_out_ld_helper_ret.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 71 +++++++++++++++------------------------
 1 file changed, 28 insertions(+), 43 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
14
[MO_BEUQ] = helper_be_stq_mmu,
15
};
16
17
+/*
18
+ * Because i686 has no register parameters and because x86_64 has xchg
19
+ * to handle addr/data register overlap, we have placed all input arguments
20
+ * before we need might need a scratch reg.
21
+ *
22
+ * Even then, a scratch is only needed for l->raddr. Rather than expose
23
+ * a general-purpose scratch when we don't actually know it's available,
24
+ * use the ra_gen hook to load into RAX if needed.
25
+ */
26
+#if TCG_TARGET_REG_BITS == 64
27
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
28
+{
29
+ if (arg < 0) {
30
+ arg = TCG_REG_RAX;
31
+ }
32
+ tcg_out_movi(s, TCG_TYPE_PTR, arg, (uintptr_t)l->raddr);
33
+ return arg;
34
+}
35
+static const TCGLdstHelperParam ldst_helper_param = {
36
+ .ra_gen = ldst_ra_gen
37
+};
38
+#else
39
+static const TCGLdstHelperParam ldst_helper_param = { };
40
+#endif
41
+
42
/*
43
* Generate code for the slow path for a load at the end of block
24
*/
44
*/
25
#define CPU(obj) ((CPUState *)(obj))
45
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
26
27
+/*
28
+ * The class checkers bring in CPU_GET_CLASS() which is potentially
29
+ * expensive given the eventual call to
30
+ * object_class_dynamic_cast_assert(). Because of this the CPUState
31
+ * has a cached value for the class in cs->cc which is set up in
32
+ * cpu_exec_realizefn() for use in hot code paths.
33
+ */
34
typedef struct CPUClass CPUClass;
35
DECLARE_CLASS_CHECKERS(CPUClass, CPU,
36
TYPE_CPU)
37
@@ -XXX,XX +XXX,XX @@ struct qemu_work_item;
38
struct CPUState {
39
/*< private >*/
40
DeviceState parent_obj;
41
+ /* cache to avoid expensive CPU_GET_CLASS */
42
+ CPUClass *cc;
43
/*< public >*/
44
45
int nr_cores;
46
diff --git a/cpu.c b/cpu.c
47
index XXXXXXX..XXXXXXX 100644
48
--- a/cpu.c
49
+++ b/cpu.c
50
@@ -XXX,XX +XXX,XX @@ const VMStateDescription vmstate_cpu_common = {
51
52
void cpu_exec_realizefn(CPUState *cpu, Error **errp)
53
{
46
{
54
-#ifndef CONFIG_USER_ONLY
47
- MemOpIdx oi = l->oi;
55
- CPUClass *cc = CPU_GET_CLASS(cpu);
48
- MemOp opc = get_memop(oi);
56
-#endif
49
+ MemOp opc = get_memop(l->oi);
57
+ /* cache the cpu class for the hotpath */
50
tcg_insn_unit **label_ptr = &l->label_ptr[0];
58
+ cpu->cc = CPU_GET_CLASS(cpu);
51
59
52
/* resolve label address */
60
cpu_list_add(cpu);
53
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
61
if (!accel_cpu_realizefn(cpu, errp)) {
54
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
62
@@ -XXX,XX +XXX,XX @@ void cpu_exec_realizefn(CPUState *cpu, Error **errp)
63
if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
64
vmstate_register(NULL, cpu->cpu_index, &vmstate_cpu_common, cpu);
65
}
55
}
66
- if (cc->sysemu_ops->legacy_vmsd != NULL) {
56
67
- vmstate_register(NULL, cpu->cpu_index, cc->sysemu_ops->legacy_vmsd, cpu);
57
- if (TCG_TARGET_REG_BITS == 32) {
68
+ if (cpu->cc->sysemu_ops->legacy_vmsd != NULL) {
58
- int ofs = 0;
69
+ vmstate_register(NULL, cpu->cpu_index, cpu->cc->sysemu_ops->legacy_vmsd, cpu);
59
-
70
}
60
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
71
#endif /* CONFIG_USER_ONLY */
61
- ofs += 4;
62
-
63
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
64
- ofs += 4;
65
-
66
- if (TARGET_LONG_BITS == 64) {
67
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
68
- ofs += 4;
69
- }
70
-
71
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
72
- ofs += 4;
73
-
74
- tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
75
- } else {
76
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
77
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
78
- l->addrlo_reg);
79
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
80
- tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
81
- (uintptr_t)l->raddr);
82
- }
83
-
84
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
85
tcg_out_branch(s, 1, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
86
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
87
88
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
89
- TCGMovExtend ext[2] = {
90
- { .dst = l->datalo_reg, .dst_type = TCG_TYPE_I32,
91
- .src = TCG_REG_EAX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
92
- { .dst = l->datahi_reg, .dst_type = TCG_TYPE_I32,
93
- .src = TCG_REG_EDX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
94
- };
95
- tcg_out_movext2(s, &ext[0], &ext[1], -1);
96
- } else {
97
- tcg_out_movext(s, l->type, l->datalo_reg,
98
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_EAX);
99
- }
100
-
101
- /* Jump to the code corresponding to next IR of qemu_st */
102
tcg_out_jmp(s, l->raddr);
103
return true;
72
}
104
}
73
--
105
--
74
2.34.1
106
2.34.1
75
107
76
108
Use tcg_out_st_helper_args. This eliminates the use of a tail call to
the store helper. This may or may not be an improvement, depending on
the call/return branch prediction of the host microarchitecture.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 57 +++------------------------------------
 1 file changed, 4 insertions(+), 53 deletions(-)
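Consolidated from the hunk below for readability (only calls that appear in this patch are used; the unchanged label patching at the top of the function is elided), the converted i386 store slow path becomes:

```c
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
    MemOp opc = get_memop(l->oi);

    /* (patching of label_ptr[0]/[1] is unchanged and omitted here) */

    /* Marshal env, addr, data, oi and the return address. */
    tcg_out_st_helper_args(s, l, &ldst_helper_param);
    /* Plain call to the store helper ... */
    tcg_out_branch(s, 1, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
    /* ... then return to the fast path inline, rather than tail-calling. */
    tcg_out_jmp(s, l->raddr);
    return true;
}
```
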

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
16
*/
17
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
18
{
19
- MemOpIdx oi = l->oi;
20
- MemOp opc = get_memop(oi);
21
- MemOp s_bits = opc & MO_SIZE;
22
+ MemOp opc = get_memop(l->oi);
23
tcg_insn_unit **label_ptr = &l->label_ptr[0];
24
- TCGReg retaddr;
25
26
/* resolve label address */
27
tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
28
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
29
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
30
}
31
32
- if (TCG_TARGET_REG_BITS == 32) {
33
- int ofs = 0;
34
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
35
+ tcg_out_branch(s, 1, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
36
37
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
38
- ofs += 4;
39
-
40
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
41
- ofs += 4;
42
-
43
- if (TARGET_LONG_BITS == 64) {
44
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
45
- ofs += 4;
46
- }
47
-
48
- tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs);
49
- ofs += 4;
50
-
51
- if (s_bits == MO_64) {
52
- tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs);
53
- ofs += 4;
54
- }
55
-
56
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
57
- ofs += 4;
58
-
59
- retaddr = TCG_REG_EAX;
60
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
61
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
62
- } else {
63
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
64
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
65
- l->addrlo_reg);
66
- tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
67
- tcg_target_call_iarg_regs[2], l->datalo_reg);
68
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
69
-
70
- if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
71
- retaddr = tcg_target_call_iarg_regs[4];
72
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
73
- } else {
74
- retaddr = TCG_REG_RAX;
75
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
76
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP,
77
- TCG_TARGET_CALL_STACK_OFFSET);
78
- }
79
- }
80
-
81
- /* "Tail call" to the helper, with the return address back inline. */
82
- tcg_out_push(s, retaddr);
83
- tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
84
+ tcg_out_jmp(s, l->raddr);
85
return true;
86
}
87
#else
88
--
89
2.34.1
90
91
diff view generated by jsdifflib
1
This function has two users, who use it incompatibly.
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
In tlb_flush_page_by_mmuidx_async_0, when flushing a
2
and tcg_out_st_helper_args.
3
single page, we need to flush exactly two pages.
4
In tlb_flush_range_by_mmuidx_async_0, when flushing a
5
range of pages, we need to flush N+1 pages.
6
7
This avoids double-flushing of jmp cache pages in a range.
8
3
9
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
---
6
---
12
accel/tcg/cputlb.c | 25 ++++++++++++++-----------
7
tcg/aarch64/tcg-target.c.inc | 40 +++++++++++++++---------------------
13
1 file changed, 14 insertions(+), 11 deletions(-)
8
1 file changed, 16 insertions(+), 24 deletions(-)
14
9
15
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
10
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
16
index XXXXXXX..XXXXXXX 100644
11
index XXXXXXX..XXXXXXX 100644
17
--- a/accel/tcg/cputlb.c
12
--- a/tcg/aarch64/tcg-target.c.inc
18
+++ b/accel/tcg/cputlb.c
13
+++ b/tcg/aarch64/tcg-target.c.inc
19
@@ -XXX,XX +XXX,XX @@ static void tb_jmp_cache_clear_page(CPUState *cpu, target_ulong page_addr)
14
@@ -XXX,XX +XXX,XX @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
20
}
15
}
21
}
16
}
22
17
23
-static void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
18
-static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
24
-{
19
-{
25
- /* Discard jump cache entries for any tb which might potentially
20
- ptrdiff_t offset = tcg_pcrel_diff(s, target);
26
- overlap the flushed page. */
21
- tcg_debug_assert(offset == sextract64(offset, 0, 21));
27
- tb_jmp_cache_clear_page(cpu, addr - TARGET_PAGE_SIZE);
22
- tcg_out_insn(s, 3406, ADR, rd, offset);
28
- tb_jmp_cache_clear_page(cpu, addr);
29
-}
23
-}
30
-
24
-
31
/**
25
typedef struct {
32
* tlb_mmu_resize_locked() - perform TLB resize bookkeeping; resize if necessary
26
TCGReg base;
33
* @desc: The CPUTLBDesc portion of the TLB
27
TCGReg index;
34
@@ -XXX,XX +XXX,XX @@ static void tlb_flush_page_by_mmuidx_async_0(CPUState *cpu,
28
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
29
#endif
30
};
31
32
+static const TCGLdstHelperParam ldst_helper_param = {
33
+ .ntmp = 1, .tmp = { TCG_REG_TMP }
34
+};
35
+
36
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
37
{
38
- MemOpIdx oi = lb->oi;
39
- MemOp opc = get_memop(oi);
40
+ MemOp opc = get_memop(lb->oi);
41
42
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
43
return false;
35
}
44
}
36
qemu_spin_unlock(&env_tlb(env)->c.lock);
45
37
46
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
38
- tb_flush_jmp_cache(cpu, addr);
47
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
39
+ /*
48
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X2, oi);
40
+ * Discard jump cache entries for any tb which might potentially
49
- tcg_out_adr(s, TCG_REG_X3, lb->raddr);
41
+ * overlap the flushed page, which includes the previous.
50
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
42
+ */
51
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
43
+ tb_jmp_cache_clear_page(cpu, addr - TARGET_PAGE_SIZE);
52
-
44
+ tb_jmp_cache_clear_page(cpu, addr);
53
- tcg_out_movext(s, lb->type, lb->datalo_reg,
54
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_X0);
55
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
56
tcg_out_goto(s, lb->raddr);
57
return true;
45
}
58
}
46
59
47
/**
60
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
48
@@ -XXX,XX +XXX,XX @@ static void tlb_flush_range_by_mmuidx_async_0(CPUState *cpu,
61
{
49
return;
62
- MemOpIdx oi = lb->oi;
63
- MemOp opc = get_memop(oi);
64
- MemOp size = opc & MO_SIZE;
65
+ MemOp opc = get_memop(lb->oi);
66
67
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
68
return false;
50
}
69
}
51
70
52
- for (target_ulong i = 0; i < d.len; i += TARGET_PAGE_SIZE) {
71
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
53
- tb_flush_jmp_cache(cpu, d.addr + i);
72
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
54
+ /*
73
- tcg_out_mov(s, size == MO_64, TCG_REG_X2, lb->datalo_reg);
55
+ * Discard jump cache entries for any tb which might potentially
74
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X3, oi);
56
+ * overlap the flushed pages, which includes the previous.
75
- tcg_out_adr(s, TCG_REG_X4, lb->raddr);
57
+ */
76
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
58
+ d.addr -= TARGET_PAGE_SIZE;
77
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE]);
59
+ for (target_ulong i = 0, n = d.len / TARGET_PAGE_SIZE + 1; i < n; i++) {
78
tcg_out_goto(s, lb->raddr);
60
+ tb_jmp_cache_clear_page(cpu, d.addr);
79
return true;
61
+ d.addr += TARGET_PAGE_SIZE;
62
}
63
}
80
}
64
81
#else
82
+static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
83
+{
84
+ ptrdiff_t offset = tcg_pcrel_diff(s, target);
85
+ tcg_debug_assert(offset == sextract64(offset, 0, 21));
86
+ tcg_out_insn(s, 3406, ADR, rd, offset);
87
+}
88
+
89
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
90
{
91
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
65
--
92
--
66
2.34.1
93
2.34.1
67
94
68
95
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.c.inc | 140 +++++----------------------------------
 1 file changed, 18 insertions(+), 122 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ tcg_out_ldrd_rwb(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, TCGReg rm)
16
tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1);
17
}
18
19
-static void tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt,
20
- TCGReg rn, int imm8)
21
+static void __attribute__((unused))
22
+tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, int imm8)
23
{
24
tcg_out_memop_8(s, cond, INSN_STRD_IMM, rt, rn, imm8, 1, 0);
25
}
26
@@ -XXX,XX +XXX,XX @@ static void tcg_out_ext8u(TCGContext *s, TCGReg rd, TCGReg rn)
27
tcg_out_dat_imm(s, COND_AL, ARITH_AND, rd, rn, 0xff);
28
}
29
30
-static void __attribute__((unused))
31
-tcg_out_ext8u_cond(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
32
-{
33
- tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
34
-}
35
-
36
static void tcg_out_ext16s(TCGContext *s, TCGType t, TCGReg rd, TCGReg rn)
37
{
38
/* sxth */
39
tcg_out32(s, 0x06bf0070 | (COND_AL << 28) | (rd << 12) | rn);
40
}
41
42
-static void tcg_out_ext16u_cond(TCGContext *s, ARMCond cond,
43
- TCGReg rd, TCGReg rn)
44
-{
45
- /* uxth */
46
- tcg_out32(s, 0x06ff0070 | (cond << 28) | (rd << 12) | rn);
47
-}
48
-
49
static void tcg_out_ext16u(TCGContext *s, TCGReg rd, TCGReg rn)
50
{
51
- tcg_out_ext16u_cond(s, COND_AL, rd, rn);
52
+ /* uxth */
53
+ tcg_out32(s, 0x06ff0070 | (COND_AL << 28) | (rd << 12) | rn);
54
}
55
56
static void tcg_out_ext32s(TCGContext *s, TCGReg rd, TCGReg rn)
57
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
58
#endif
59
};
60
61
-/* Helper routines for marshalling helper function arguments into
62
- * the correct registers and stack.
63
- * argreg is where we want to put this argument, arg is the argument itself.
64
- * Return value is the updated argreg ready for the next call.
65
- * Note that argreg 0..3 is real registers, 4+ on stack.
66
- *
67
- * We provide routines for arguments which are: immediate, 32 bit
68
- * value in register, 16 and 8 bit values in register (which must be zero
69
- * extended before use) and 64 bit value in a lo:hi register pair.
70
- */
71
-#define DEFINE_TCG_OUT_ARG(NAME, ARGTYPE, MOV_ARG, EXT_ARG) \
72
-static TCGReg NAME(TCGContext *s, TCGReg argreg, ARGTYPE arg) \
73
-{ \
74
- if (argreg < 4) { \
75
- MOV_ARG(s, COND_AL, argreg, arg); \
76
- } else { \
77
- int ofs = (argreg - 4) * 4; \
78
- EXT_ARG; \
79
- tcg_debug_assert(ofs + 4 <= TCG_STATIC_CALL_ARGS_SIZE); \
80
- tcg_out_st32_12(s, COND_AL, arg, TCG_REG_CALL_STACK, ofs); \
81
- } \
82
- return argreg + 1; \
83
-}
84
-
85
-DEFINE_TCG_OUT_ARG(tcg_out_arg_imm32, uint32_t, tcg_out_movi32,
86
- (tcg_out_movi32(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
87
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg8, TCGReg, tcg_out_ext8u_cond,
88
- (tcg_out_ext8u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
89
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg16, TCGReg, tcg_out_ext16u_cond,
90
- (tcg_out_ext16u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
91
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg32, TCGReg, tcg_out_mov_reg, )
92
-
93
-static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
94
- TCGReg arglo, TCGReg arghi)
95
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
96
{
97
- /* 64 bit arguments must go in even/odd register pairs
98
- * and in 8-aligned stack slots.
99
- */
100
- if (argreg & 1) {
101
- argreg++;
102
- }
103
- if (argreg >= 4 && (arglo & 1) == 0 && arghi == arglo + 1) {
104
- tcg_out_strd_8(s, COND_AL, arglo,
105
- TCG_REG_CALL_STACK, (argreg - 4) * 4);
106
- return argreg + 2;
107
- } else {
108
- argreg = tcg_out_arg_reg32(s, argreg, arglo);
109
- argreg = tcg_out_arg_reg32(s, argreg, arghi);
110
- return argreg;
111
- }
112
+ /* We arrive at the slow path via "BLNE", so R14 contains l->raddr. */
113
+ return TCG_REG_R14;
114
}
115
116
+static const TCGLdstHelperParam ldst_helper_param = {
117
+ .ra_gen = ldst_ra_gen,
118
+ .ntmp = 1,
119
+ .tmp = { TCG_REG_TMP },
120
+};
121
+
122
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
123
{
124
- TCGReg argreg;
125
- MemOpIdx oi = lb->oi;
126
- MemOp opc = get_memop(oi);
127
+ MemOp opc = get_memop(lb->oi);
128
129
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
130
return false;
131
}
132
133
- argreg = tcg_out_arg_reg32(s, TCG_REG_R0, TCG_AREG0);
134
- if (TARGET_LONG_BITS == 64) {
135
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
136
- } else {
137
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
138
- }
139
- argreg = tcg_out_arg_imm32(s, argreg, oi);
140
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
141
-
142
- /* Use the canonical unsigned helpers and minimize icache usage. */
143
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
144
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
145
-
146
- if ((opc & MO_SIZE) == MO_64) {
147
- TCGMovExtend ext[2] = {
148
- { .dst = lb->datalo_reg, .dst_type = TCG_TYPE_I32,
149
- .src = TCG_REG_R0, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
150
- { .dst = lb->datahi_reg, .dst_type = TCG_TYPE_I32,
151
- .src = TCG_REG_R1, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
152
- };
153
- tcg_out_movext2(s, &ext[0], &ext[1], TCG_REG_TMP);
154
- } else {
155
- tcg_out_movext(s, TCG_TYPE_I32, lb->datalo_reg,
156
- TCG_TYPE_I32, opc & MO_SSIZE, TCG_REG_R0);
157
- }
158
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
159
160
tcg_out_goto(s, COND_AL, lb->raddr);
161
return true;
162
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
163
164
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
165
{
166
- TCGReg argreg, datalo, datahi;
167
- MemOpIdx oi = lb->oi;
168
- MemOp opc = get_memop(oi);
169
+ MemOp opc = get_memop(lb->oi);
170
171
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
172
return false;
173
}
174
175
- argreg = TCG_REG_R0;
176
- argreg = tcg_out_arg_reg32(s, argreg, TCG_AREG0);
177
- if (TARGET_LONG_BITS == 64) {
178
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
179
- } else {
180
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
181
- }
182
-
183
- datalo = lb->datalo_reg;
184
- datahi = lb->datahi_reg;
185
- switch (opc & MO_SIZE) {
186
- case MO_8:
187
- argreg = tcg_out_arg_reg8(s, argreg, datalo);
188
- break;
189
- case MO_16:
190
- argreg = tcg_out_arg_reg16(s, argreg, datalo);
191
- break;
192
- case MO_32:
193
- default:
194
- argreg = tcg_out_arg_reg32(s, argreg, datalo);
195
- break;
196
- case MO_64:
197
- argreg = tcg_out_arg_reg64(s, argreg, datalo, datahi);
198
- break;
199
- }
200
-
201
- argreg = tcg_out_arg_imm32(s, argreg, oi);
202
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
203
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
204
205
/* Tail-call to the helper, which will return to the fast path. */
206
tcg_out_goto(s, COND_AL, qemu_st_helpers[opc & MO_SIZE]);
207
--
208
2.34.1
209
210
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 37 ++++++++++----------------------
 1 file changed, 11 insertions(+), 26 deletions(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index XXXXXXX..XXXXXXX 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
15
return reloc_br_sd10k16(s->code_ptr - 1, target);
16
}
17
18
+static const TCGLdstHelperParam ldst_helper_param = {
19
+ .ntmp = 1, .tmp = { TCG_REG_TMP0 }
20
+};
21
+
22
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
23
{
24
- MemOpIdx oi = l->oi;
25
- MemOp opc = get_memop(oi);
26
- MemOp size = opc & MO_SIZE;
27
+ MemOp opc = get_memop(l->oi);
28
29
/* resolve label address */
30
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
31
return false;
32
}
33
34
- /* call load helper */
35
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
36
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
37
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A2, oi);
38
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, (tcg_target_long)l->raddr);
39
-
40
- tcg_out_call_int(s, qemu_ld_helpers[size], false);
41
-
42
- tcg_out_movext(s, l->type, l->datalo_reg,
43
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_A0);
44
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
45
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE], false);
46
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
47
return tcg_out_goto(s, l->raddr);
48
}
49
50
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
51
{
52
- MemOpIdx oi = l->oi;
53
- MemOp opc = get_memop(oi);
54
- MemOp size = opc & MO_SIZE;
55
+ MemOp opc = get_memop(l->oi);
56
57
/* resolve label address */
58
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
59
return false;
60
}
61
62
- /* call store helper */
63
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
64
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
65
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I32 : TCG_TYPE_I32, TCG_REG_A2,
66
- l->type, size, l->datalo_reg);
67
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, oi);
68
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A4, (tcg_target_long)l->raddr);
69
-
70
- tcg_out_call_int(s, qemu_st_helpers[size], false);
71
-
72
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
73
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
74
return tcg_out_goto(s, l->raddr);
75
}
76
#else
77
--
78
2.34.1
79
80
diff view generated by jsdifflib
1
Bool is more appropriate type for the alloc parameter.
1
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
2
and tcg_out_st_helper_args. This allows our local
3
tcg_out_arg_* infrastructure to be removed.
4
5
We are no longer filling the call or return branch
6
delay slots, nor are we tail-calling for the store,
7
but this seems a small price to pay.
2
8
3
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
11
---
7
accel/tcg/translate-all.c | 14 +++++++-------
12
tcg/mips/tcg-target.c.inc | 154 ++++++--------------------------------
8
1 file changed, 7 insertions(+), 7 deletions(-)
13
1 file changed, 22 insertions(+), 132 deletions(-)
9
14
10
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
15
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
12
--- a/accel/tcg/translate-all.c
17
--- a/tcg/mips/tcg-target.c.inc
13
+++ b/accel/tcg/translate-all.c
18
+++ b/tcg/mips/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ void page_init(void)
19
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
15
#endif
20
[MO_BEUQ] = helper_be_stq_mmu,
21
};
22
23
-/* Helper routines for marshalling helper function arguments into
24
- * the correct registers and stack.
25
- * I is where we want to put this argument, and is updated and returned
26
- * for the next call. ARG is the argument itself.
27
- *
28
- * We provide routines for arguments which are: immediate, 32 bit
29
- * value in register, 16 and 8 bit values in register (which must be zero
30
- * extended before use) and 64 bit value in a lo:hi register pair.
31
- */
32
-
33
-static int tcg_out_call_iarg_reg(TCGContext *s, int i, TCGReg arg)
34
-{
35
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
36
- tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[i], arg);
37
- } else {
38
- /* For N32 and N64, the initial offset is different. But there
39
- we also have 8 argument register so we don't run out here. */
40
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
41
- tcg_out_st(s, TCG_TYPE_REG, arg, TCG_REG_SP, 4 * i);
42
- }
43
- return i + 1;
44
-}
45
-
46
-static int tcg_out_call_iarg_reg8(TCGContext *s, int i, TCGReg arg)
47
-{
48
- TCGReg tmp = TCG_TMP0;
49
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
50
- tmp = tcg_target_call_iarg_regs[i];
51
- }
52
- tcg_out_ext8u(s, tmp, arg);
53
- return tcg_out_call_iarg_reg(s, i, tmp);
54
-}
55
-
56
-static int tcg_out_call_iarg_reg16(TCGContext *s, int i, TCGReg arg)
57
-{
58
- TCGReg tmp = TCG_TMP0;
59
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
60
- tmp = tcg_target_call_iarg_regs[i];
61
- }
62
- tcg_out_opc_imm(s, OPC_ANDI, tmp, arg, 0xffff);
63
- return tcg_out_call_iarg_reg(s, i, tmp);
64
-}
65
-
66
-static int tcg_out_call_iarg_imm(TCGContext *s, int i, TCGArg arg)
67
-{
68
- TCGReg tmp = TCG_TMP0;
69
- if (arg == 0) {
70
- tmp = TCG_REG_ZERO;
71
- } else {
72
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
73
- tmp = tcg_target_call_iarg_regs[i];
74
- }
75
- tcg_out_movi(s, TCG_TYPE_REG, tmp, arg);
76
- }
77
- return tcg_out_call_iarg_reg(s, i, tmp);
78
-}
79
-
80
-static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
81
-{
82
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
83
- i = (i + 1) & ~1;
84
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? ah : al));
85
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? al : ah));
86
- return i;
87
-}
88
+/* We have four temps, we might as well expose three of them. */
89
+static const TCGLdstHelperParam ldst_helper_param = {
90
+ .ntmp = 3, .tmp = { TCG_TMP0, TCG_TMP1, TCG_TMP2 }
91
+};
92
93
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
94
{
95
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
96
- MemOpIdx oi = l->oi;
97
- MemOp opc = get_memop(oi);
98
- TCGReg v0;
99
- int i;
100
+ MemOp opc = get_memop(l->oi);
101
102
/* resolve label address */
103
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
104
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
105
return false;
106
}
107
108
- i = 1;
109
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
110
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
111
- } else {
112
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
113
- }
114
- i = tcg_out_call_iarg_imm(s, i, oi);
115
- i = tcg_out_call_iarg_imm(s, i, (intptr_t)l->raddr);
116
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
117
+
118
tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
119
/* delay slot */
120
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
121
+ tcg_out_nop(s);
122
123
- v0 = l->datalo_reg;
124
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
125
- /* We eliminated V0 from the possible output registers, so it
126
- cannot be clobbered here. So we must move V1 first. */
127
- if (MIPS_BE) {
128
- tcg_out_mov(s, TCG_TYPE_I32, v0, TCG_REG_V1);
129
- v0 = l->datahi_reg;
130
- } else {
131
- tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_V1);
132
- }
133
- }
134
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
135
136
tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
137
if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
138
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
139
}
140
141
/* delay slot */
142
- if (TCG_TARGET_REG_BITS == 64 && l->type == TCG_TYPE_I32) {
143
- /* we always sign-extend 32-bit loads */
144
- tcg_out_ext32s(s, v0, TCG_REG_V0);
145
- } else {
146
- tcg_out_opc_reg(s, OPC_OR, v0, TCG_REG_V0, TCG_REG_ZERO);
147
- }
148
+ tcg_out_nop(s);
149
return true;
16
}
150
}
17
151
18
-static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
152
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
19
+static PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc)
20
{
153
{
21
PageDesc *pd;
154
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
22
void **lp;
155
- MemOpIdx oi = l->oi;
23
@@ -XXX,XX +XXX,XX @@ static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
156
- MemOp opc = get_memop(oi);
24
157
- MemOp s_bits = opc & MO_SIZE;
25
static inline PageDesc *page_find(tb_page_addr_t index)
158
- int i;
26
{
159
+ MemOp opc = get_memop(l->oi);
27
- return page_find_alloc(index, 0);
160
28
+ return page_find_alloc(index, false);
161
/* resolve label address */
162
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
163
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
164
return false;
165
}
166
167
- i = 1;
168
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
169
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
170
- } else {
171
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
172
- }
173
- switch (s_bits) {
174
- case MO_8:
175
- i = tcg_out_call_iarg_reg8(s, i, l->datalo_reg);
176
- break;
177
- case MO_16:
178
- i = tcg_out_call_iarg_reg16(s, i, l->datalo_reg);
179
- break;
180
- case MO_32:
181
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
182
- break;
183
- case MO_64:
184
- if (TCG_TARGET_REG_BITS == 32) {
185
- i = tcg_out_call_iarg_reg2(s, i, l->datalo_reg, l->datahi_reg);
186
- } else {
187
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
188
- }
189
- break;
190
- default:
191
- g_assert_not_reached();
192
- }
193
- i = tcg_out_call_iarg_imm(s, i, oi);
194
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
195
196
- /* Tail call to the store helper. Thus force the return address
197
- computation to take place in the return address register. */
198
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (intptr_t)l->raddr);
199
- i = tcg_out_call_iarg_reg(s, i, TCG_REG_RA);
200
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], true);
201
+ tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
202
/* delay slot */
203
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
204
+ tcg_out_nop(s);
205
+
206
+ tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
207
+ if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
208
+ return false;
209
+ }
210
+
211
+ /* delay slot */
212
+ tcg_out_nop(s);
213
return true;
29
}
214
}
30
215
31
static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
32
- PageDesc **ret_p2, tb_page_addr_t phys2, int alloc);
33
+ PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc);
34
35
/* In user-mode page locks aren't used; mmap_lock is enough */
36
#ifdef CONFIG_USER_ONLY
37
@@ -XXX,XX +XXX,XX @@ static inline void page_unlock(PageDesc *pd)
38
/* lock the page(s) of a TB in the correct acquisition order */
39
static inline void page_lock_tb(const TranslationBlock *tb)
40
{
41
- page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], 0);
42
+ page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], false);
43
}
44
45
static inline void page_unlock_tb(const TranslationBlock *tb)
46
@@ -XXX,XX +XXX,XX @@ void page_collection_unlock(struct page_collection *set)
47
#endif /* !CONFIG_USER_ONLY */
48
49
static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
50
- PageDesc **ret_p2, tb_page_addr_t phys2, int alloc)
51
+ PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc)
52
{
53
PageDesc *p1, *p2;
54
tb_page_addr_t page1;
55
@@ -XXX,XX +XXX,XX @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
56
* Note that inserting into the hash table first isn't an option, since
57
* we can only insert TBs that are fully initialized.
58
*/
59
- page_lock_pair(&p, phys_pc, &p2, phys_page2, 1);
60
+ page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
61
tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
62
if (p2) {
63
tb_page_add(p2, tb, 1, phys_page2);
64
@@ -XXX,XX +XXX,XX @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
65
for (addr = start, len = end - start;
66
len != 0;
67
len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
68
- PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
69
+ PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, true);
70
71
/* If the write protection bit is set, then we invalidate
72
the code inside. */
73
--
2.34.1
New patch

Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/ppc/tcg-target.c.inc | 88 ++++++++++++----------------------------
 1 file changed, 26 insertions(+), 62 deletions(-)
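For readers following along, the three common routines replace the hand-rolled
argument marshalling each backend used to carry. A rough sketch of the helper
ABI they marshal for is below; the prototypes are paraphrased from
tcg/tcg-ldst.h and are illustrative rather than authoritative:

    /* Sketch of the softmmu slow-path helper ABI (paraphrased).
     * tcg_out_ld_helper_args() loads env/addr/oi/retaddr into the argument
     * registers, tcg_out_ld_helper_ret() moves the return value into the
     * ldst label's data register(s), and tcg_out_st_helper_args() also
     * marshals the data to be stored. */
    tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
                                        MemOpIdx oi, uintptr_t retaddr);
    uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
                               MemOpIdx oi, uintptr_t retaddr);
    void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
                           MemOpIdx oi, uintptr_t retaddr);
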
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target.c.inc
14
+++ b/tcg/ppc/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
16
[MO_BEUQ] = helper_be_stq_mmu,
17
};
18
19
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
20
+{
21
+ if (arg < 0) {
22
+ arg = TCG_REG_TMP1;
23
+ }
24
+ tcg_out32(s, MFSPR | RT(arg) | LR);
25
+ return arg;
26
+}
27
+
28
+/*
29
+ * For the purposes of ppc32 sorting 4 input registers into 4 argument
30
+ * registers, there is an outside chance we would require 3 temps.
31
+ * Because of constraints, no inputs are in r3, and env will not be
32
+ * placed into r3 until after the sorting is done, and is thus free.
33
+ */
34
+static const TCGLdstHelperParam ldst_helper_param = {
35
+ .ra_gen = ldst_ra_gen,
36
+ .ntmp = 3,
37
+ .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
38
+};
39
+
40
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
41
{
42
- MemOpIdx oi = lb->oi;
43
- MemOp opc = get_memop(oi);
44
- TCGReg hi, lo, arg = TCG_REG_R3;
45
+ MemOp opc = get_memop(lb->oi);
46
47
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
48
return false;
49
}
50
51
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
52
-
53
- lo = lb->addrlo_reg;
54
- hi = lb->addrhi_reg;
55
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
56
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
57
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
58
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
59
- } else {
60
- /* If the address needed to be zero-extended, we'll have already
61
- placed it in R4. The only remaining case is 64-bit guest. */
62
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
63
- }
64
-
65
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
66
- tcg_out32(s, MFSPR | RT(arg) | LR);
67
-
68
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
69
tcg_out_call_int(s, LK, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
70
-
71
- lo = lb->datalo_reg;
72
- hi = lb->datahi_reg;
73
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
74
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_REG_R4);
75
- tcg_out_mov(s, TCG_TYPE_I32, hi, TCG_REG_R3);
76
- } else {
77
- tcg_out_movext(s, lb->type, lo,
78
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_R3);
79
- }
80
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
81
82
tcg_out_b(s, 0, lb->raddr);
83
return true;
84
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
85
86
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
87
{
88
- MemOpIdx oi = lb->oi;
89
- MemOp opc = get_memop(oi);
90
- MemOp s_bits = opc & MO_SIZE;
91
- TCGReg hi, lo, arg = TCG_REG_R3;
92
+ MemOp opc = get_memop(lb->oi);
93
94
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
95
return false;
96
}
97
98
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
99
-
100
- lo = lb->addrlo_reg;
101
- hi = lb->addrhi_reg;
102
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
103
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
104
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
105
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
106
- } else {
107
- /* If the address needed to be zero-extended, we'll have already
108
- placed it in R4. The only remaining case is 64-bit guest. */
109
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
110
- }
111
-
112
- lo = lb->datalo_reg;
113
- hi = lb->datahi_reg;
114
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
115
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
116
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
117
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
118
- } else {
119
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
120
- arg++, lb->type, s_bits, lo);
121
- }
122
-
123
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
124
- tcg_out32(s, MFSPR | RT(arg) | LR);
125
-
126
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
127
tcg_out_call_int(s, LK, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
128
129
tcg_out_b(s, 0, lb->raddr);
130
--
2.34.1
New patch

Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/riscv/tcg-target.c.inc | 37 ++++++++++---------------------------
 1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/riscv/tcg-target.c.inc
14
+++ b/tcg/riscv/tcg-target.c.inc
15
@@ -XXX,XX +XXX,XX @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
16
tcg_debug_assert(ok);
17
}
18
19
+/* We have three temps, we might as well expose them. */
20
+static const TCGLdstHelperParam ldst_helper_param = {
21
+ .ntmp = 3, .tmp = { TCG_REG_TMP0, TCG_REG_TMP1, TCG_REG_TMP2 }
22
+};
23
+
24
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
25
{
26
- MemOpIdx oi = l->oi;
27
- MemOp opc = get_memop(oi);
28
- TCGReg a0 = tcg_target_call_iarg_regs[0];
29
- TCGReg a1 = tcg_target_call_iarg_regs[1];
30
- TCGReg a2 = tcg_target_call_iarg_regs[2];
31
- TCGReg a3 = tcg_target_call_iarg_regs[3];
32
+ MemOp opc = get_memop(l->oi);
33
34
/* resolve label address */
35
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
36
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
37
}
38
39
/* call load helper */
40
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
41
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
42
- tcg_out_movi(s, TCG_TYPE_PTR, a2, oi);
43
- tcg_out_movi(s, TCG_TYPE_PTR, a3, (tcg_target_long)l->raddr);
44
-
45
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
46
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
47
- tcg_out_mov(s, (opc & MO_SIZE) == MO_64, l->datalo_reg, a0);
48
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
49
50
tcg_out_goto(s, l->raddr);
51
return true;
52
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
53
54
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
55
{
56
- MemOpIdx oi = l->oi;
57
- MemOp opc = get_memop(oi);
58
- MemOp s_bits = opc & MO_SIZE;
59
- TCGReg a0 = tcg_target_call_iarg_regs[0];
60
- TCGReg a1 = tcg_target_call_iarg_regs[1];
61
- TCGReg a2 = tcg_target_call_iarg_regs[2];
62
- TCGReg a3 = tcg_target_call_iarg_regs[3];
63
- TCGReg a4 = tcg_target_call_iarg_regs[4];
64
+ MemOp opc = get_memop(l->oi);
65
66
/* resolve label address */
67
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
68
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
69
}
70
71
/* call store helper */
72
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
73
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
74
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, a2,
75
- l->type, s_bits, l->datalo_reg);
76
- tcg_out_movi(s, TCG_TYPE_PTR, a3, oi);
77
- tcg_out_movi(s, TCG_TYPE_PTR, a4, (tcg_target_long)l->raddr);
78
-
79
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
80
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
81
82
tcg_out_goto(s, l->raddr);
83
--
2.34.1
New patch

Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/s390x/tcg-target.c.inc | 35 ++++++++++-------------------------
 1 file changed, 10 insertions(+), 25 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/s390x/tcg-target.c.inc
13
+++ b/tcg/s390x/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
15
}
16
17
#if defined(CONFIG_SOFTMMU)
18
+static const TCGLdstHelperParam ldst_helper_param = {
19
+ .ntmp = 1, .tmp = { TCG_TMP0 }
20
+};
21
+
22
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
23
{
24
- TCGReg addr_reg = lb->addrlo_reg;
25
- TCGReg data_reg = lb->datalo_reg;
26
- MemOpIdx oi = lb->oi;
27
- MemOp opc = get_memop(oi);
28
+ MemOp opc = get_memop(lb->oi);
29
30
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
31
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
32
return false;
33
}
34
35
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
36
- if (TARGET_LONG_BITS == 64) {
37
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
38
- }
39
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, oi);
40
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R5, (uintptr_t)lb->raddr);
41
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]);
42
- tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
43
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
44
+ tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
45
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
46
47
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
48
return true;
49
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
50
51
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
52
{
53
- TCGReg addr_reg = lb->addrlo_reg;
54
- TCGReg data_reg = lb->datalo_reg;
55
- MemOpIdx oi = lb->oi;
56
- MemOp opc = get_memop(oi);
57
- MemOp size = opc & MO_SIZE;
58
+ MemOp opc = get_memop(lb->oi);
59
60
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
61
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
62
return false;
63
}
64
65
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
66
- if (TARGET_LONG_BITS == 64) {
67
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
68
- }
69
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
70
- TCG_REG_R4, lb->type, size, data_reg);
71
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R5, oi);
72
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R6, (uintptr_t)lb->raddr);
73
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
74
tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
75
76
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
77
--
2.34.1
New patch

The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h | 2 --
 tcg/loongarch64/tcg-target-con-str.h | 1 -
 tcg/loongarch64/tcg-target.c.inc | 23 ++++-------------------
 3 files changed, 4 insertions(+), 22 deletions(-)
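As a toy illustration of the overlap problem mentioned above (this is not
QEMU code, just a model): when an input value already sits in one of the
call's argument registers, a naive copy sequence can clobber it before it
has been read. That is what the old reserved 'L' constraint avoided, and
what the common helper-args code now handles by ordering the moves and
breaking cycles with a scratch register.

    #include <stdio.h>

    /* Toy model: the value that must end up in a0 currently lives in a1
     * and vice versa.  Copying naively (a0 = a1; a1 = a0;) loses one of
     * the values; using a scratch register does not. */
    int main(void)
    {
        int a0 = 7, a1 = 9;
        int tmp;

        tmp = a0;       /* scratch breaks the cycle */
        a0 = a1;
        a1 = tmp;

        printf("a0=%d a1=%d\n", a0, a1);    /* prints a0=9 a1=7 */
        return 0;
    }
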
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/loongarch64/tcg-target-con-set.h
16
+++ b/tcg/loongarch64/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
C_O0_I1(r)
19
C_O0_I2(rZ, r)
20
C_O0_I2(rZ, rZ)
21
-C_O0_I2(LZ, L)
22
C_O1_I1(r, r)
23
-C_O1_I1(r, L)
24
C_O1_I2(r, r, rC)
25
C_O1_I2(r, r, ri)
26
C_O1_I2(r, r, rI)
27
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
28
index XXXXXXX..XXXXXXX 100644
29
--- a/tcg/loongarch64/tcg-target-con-str.h
30
+++ b/tcg/loongarch64/tcg-target-con-str.h
31
@@ -XXX,XX +XXX,XX @@
32
* REGS(letter, register_mask)
33
*/
34
REGS('r', ALL_GENERAL_REGS)
35
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
36
37
/*
38
* Define constraint letters for constants:
39
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
40
index XXXXXXX..XXXXXXX 100644
41
--- a/tcg/loongarch64/tcg-target.c.inc
42
+++ b/tcg/loongarch64/tcg-target.c.inc
43
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
44
#define TCG_CT_CONST_C12 0x1000
45
#define TCG_CT_CONST_WSZ 0x2000
46
47
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
48
-/*
49
- * For softmmu, we need to avoid conflicts with the first 5
50
- * argument registers to call the helper. Some of these are
51
- * also used for the tlb lookup.
52
- */
53
-#ifdef CONFIG_SOFTMMU
54
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
55
-#else
56
-#define SOFTMMU_RESERVE_REGS 0
57
-#endif
58
-
59
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
60
61
static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
62
{
63
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
64
case INDEX_op_st32_i64:
65
case INDEX_op_st_i32:
66
case INDEX_op_st_i64:
67
+ case INDEX_op_qemu_st_i32:
68
+ case INDEX_op_qemu_st_i64:
69
return C_O0_I2(rZ, r);
70
71
case INDEX_op_brcond_i32:
72
case INDEX_op_brcond_i64:
73
return C_O0_I2(rZ, rZ);
74
75
- case INDEX_op_qemu_st_i32:
76
- case INDEX_op_qemu_st_i64:
77
- return C_O0_I2(LZ, L);
78
-
79
case INDEX_op_ext8s_i32:
80
case INDEX_op_ext8s_i64:
81
case INDEX_op_ext8u_i32:
82
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
83
case INDEX_op_ld32u_i64:
84
case INDEX_op_ld_i32:
85
case INDEX_op_ld_i64:
86
- return C_O1_I1(r, r);
87
-
88
case INDEX_op_qemu_ld_i32:
89
case INDEX_op_qemu_ld_i64:
90
- return C_O1_I1(r, L);
91
+ return C_O1_I1(r, r);
92
93
case INDEX_op_andc_i32:
94
case INDEX_op_andc_i64:
95
--
2.34.1
New patch

While performing the load in the delay slot of the call to the common
bswap helper function is cute, it is not worth the added complexity.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/mips/tcg-target.h | 4 +-
 tcg/mips/tcg-target.c.inc | 284 ++++++--------------------------------
 2 files changed, 48 insertions(+), 240 deletions(-)
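With TCG_TARGET_HAS_MEMORY_BSWAP set to 0, cross-endian accesses are split
by the generic code into a plain memory operation plus an explicit byte-swap
op. Conceptually, in host-side C (illustration only, not the code the
backend emits):

    #include <stdint.h>
    #include <string.h>

    /* A big-endian guest 32-bit load on a little-endian host becomes
     * "load in host order, then swap" instead of a single byte-swapping
     * load sequence in the backend. */
    static uint32_t guest_be_ldl(const void *host_addr)
    {
        uint32_t v;
        memcpy(&v, host_addr, sizeof(v));   /* host-order load */
        return __builtin_bswap32(v);        /* explicit swap */
    }
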
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/mips/tcg-target.h
14
+++ b/tcg/mips/tcg-target.h
15
@@ -XXX,XX +XXX,XX @@ extern bool use_mips32r2_instructions;
16
#define TCG_TARGET_HAS_ext16u_i64 0 /* andi rt, rs, 0xffff */
17
#endif
18
19
-#define TCG_TARGET_DEFAULT_MO (0)
20
-#define TCG_TARGET_HAS_MEMORY_BSWAP 1
21
+#define TCG_TARGET_DEFAULT_MO 0
22
+#define TCG_TARGET_HAS_MEMORY_BSWAP 0
23
24
#define TCG_TARGET_NEED_LDST_LABELS
25
26
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
27
index XXXXXXX..XXXXXXX 100644
28
--- a/tcg/mips/tcg-target.c.inc
29
+++ b/tcg/mips/tcg-target.c.inc
30
@@ -XXX,XX +XXX,XX @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg,
31
}
32
33
#if defined(CONFIG_SOFTMMU)
34
-static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = {
35
+static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
36
[MO_UB] = helper_ret_ldub_mmu,
37
[MO_SB] = helper_ret_ldsb_mmu,
38
- [MO_LEUW] = helper_le_lduw_mmu,
39
- [MO_LESW] = helper_le_ldsw_mmu,
40
- [MO_LEUL] = helper_le_ldul_mmu,
41
- [MO_LEUQ] = helper_le_ldq_mmu,
42
- [MO_BEUW] = helper_be_lduw_mmu,
43
- [MO_BESW] = helper_be_ldsw_mmu,
44
- [MO_BEUL] = helper_be_ldul_mmu,
45
- [MO_BEUQ] = helper_be_ldq_mmu,
46
-#if TCG_TARGET_REG_BITS == 64
47
- [MO_LESL] = helper_le_ldsl_mmu,
48
- [MO_BESL] = helper_be_ldsl_mmu,
49
+#if HOST_BIG_ENDIAN
50
+ [MO_UW] = helper_be_lduw_mmu,
51
+ [MO_SW] = helper_be_ldsw_mmu,
52
+ [MO_UL] = helper_be_ldul_mmu,
53
+ [MO_SL] = helper_be_ldsl_mmu,
54
+ [MO_UQ] = helper_be_ldq_mmu,
55
+#else
56
+ [MO_UW] = helper_le_lduw_mmu,
57
+ [MO_SW] = helper_le_ldsw_mmu,
58
+ [MO_UL] = helper_le_ldul_mmu,
59
+ [MO_UQ] = helper_le_ldq_mmu,
60
+ [MO_SL] = helper_le_ldsl_mmu,
61
#endif
62
};
63
64
-static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
65
+static void * const qemu_st_helpers[MO_SIZE + 1] = {
66
[MO_UB] = helper_ret_stb_mmu,
67
- [MO_LEUW] = helper_le_stw_mmu,
68
- [MO_LEUL] = helper_le_stl_mmu,
69
- [MO_LEUQ] = helper_le_stq_mmu,
70
- [MO_BEUW] = helper_be_stw_mmu,
71
- [MO_BEUL] = helper_be_stl_mmu,
72
- [MO_BEUQ] = helper_be_stq_mmu,
73
+#if HOST_BIG_ENDIAN
74
+ [MO_UW] = helper_be_stw_mmu,
75
+ [MO_UL] = helper_be_stl_mmu,
76
+ [MO_UQ] = helper_be_stq_mmu,
77
+#else
78
+ [MO_UW] = helper_le_stw_mmu,
79
+ [MO_UL] = helper_le_stl_mmu,
80
+ [MO_UQ] = helper_le_stq_mmu,
81
+#endif
82
};
83
84
/* We have four temps, we might as well expose three of them. */
85
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
86
87
tcg_out_ld_helper_args(s, l, &ldst_helper_param);
88
89
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
90
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
91
/* delay slot */
92
tcg_out_nop(s);
93
94
@@ -XXX,XX +XXX,XX @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
95
96
tcg_out_st_helper_args(s, l, &ldst_helper_param);
97
98
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
99
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
100
/* delay slot */
101
tcg_out_nop(s);
102
103
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
104
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
105
TCGReg base, MemOp opc, TCGType type)
106
{
107
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
108
+ switch (opc & MO_SSIZE) {
109
case MO_UB:
110
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
111
break;
112
case MO_SB:
113
tcg_out_opc_imm(s, OPC_LB, lo, base, 0);
114
break;
115
- case MO_UW | MO_BSWAP:
116
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
117
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
118
- break;
119
case MO_UW:
120
tcg_out_opc_imm(s, OPC_LHU, lo, base, 0);
121
break;
122
- case MO_SW | MO_BSWAP:
123
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
124
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OS);
125
- break;
126
case MO_SW:
127
tcg_out_opc_imm(s, OPC_LH, lo, base, 0);
128
break;
129
- case MO_UL | MO_BSWAP:
130
- if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
131
- if (use_mips32r2_instructions) {
132
- tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
133
- tcg_out_bswap32(s, lo, lo, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
134
- } else {
135
- tcg_out_bswap_subr(s, bswap32u_addr);
136
- /* delay slot */
137
- tcg_out_opc_imm(s, OPC_LWU, TCG_TMP0, base, 0);
138
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
139
- }
140
- break;
141
- }
142
- /* FALLTHRU */
143
- case MO_SL | MO_BSWAP:
144
- if (use_mips32r2_instructions) {
145
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
146
- tcg_out_bswap32(s, lo, lo, 0);
147
- } else {
148
- tcg_out_bswap_subr(s, bswap32_addr);
149
- /* delay slot */
150
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
151
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_TMP3);
152
- }
153
- break;
154
case MO_UL:
155
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
156
tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
157
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
158
case MO_SL:
159
tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
160
break;
161
- case MO_UQ | MO_BSWAP:
162
- if (TCG_TARGET_REG_BITS == 64) {
163
- if (use_mips32r2_instructions) {
164
- tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
165
- tcg_out_bswap64(s, lo, lo);
166
- } else {
167
- tcg_out_bswap_subr(s, bswap64_addr);
168
- /* delay slot */
169
- tcg_out_opc_imm(s, OPC_LD, TCG_TMP0, base, 0);
170
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
171
- }
172
- } else if (use_mips32r2_instructions) {
173
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
174
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP1, base, 4);
175
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
176
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
177
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
178
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
179
- } else {
180
- tcg_out_bswap_subr(s, bswap32_addr);
181
- /* delay slot */
182
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
183
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 4);
184
- tcg_out_bswap_subr(s, bswap32_addr);
185
- /* delay slot */
186
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
187
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
188
- }
189
- break;
190
case MO_UQ:
191
/* Prefer to load from offset 0 first, but allow for overlap. */
192
if (TCG_TARGET_REG_BITS == 64) {
193
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
194
const MIPSInsn lw2 = MIPS_BE ? OPC_LWR : OPC_LWL;
195
const MIPSInsn ld1 = MIPS_BE ? OPC_LDL : OPC_LDR;
196
const MIPSInsn ld2 = MIPS_BE ? OPC_LDR : OPC_LDL;
197
+ bool sgn = opc & MO_SIGN;
198
199
- bool sgn = (opc & MO_SIGN);
200
-
201
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
202
- case MO_SW | MO_BE:
203
- case MO_UW | MO_BE:
204
- tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
205
- tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
206
- if (use_mips32r2_instructions) {
207
- tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
208
- } else {
209
- tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
210
- tcg_out_opc_reg(s, OPC_OR, lo, TCG_TMP0, TCG_TMP1);
211
- }
212
- break;
213
-
214
- case MO_SW | MO_LE:
215
- case MO_UW | MO_LE:
216
- if (use_mips32r2_instructions && lo != base) {
217
+ switch (opc & MO_SIZE) {
218
+ case MO_16:
219
+ if (HOST_BIG_ENDIAN) {
220
+ tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
221
+ tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
222
+ if (use_mips32r2_instructions) {
223
+ tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
224
+ } else {
225
+ tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
226
+ tcg_out_opc_reg(s, OPC_OR, lo, lo, TCG_TMP0);
227
+ }
228
+ } else if (use_mips32r2_instructions && lo != base) {
229
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
230
tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 1);
231
tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
232
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
233
}
234
break;
235
236
- case MO_SL:
237
- case MO_UL:
238
+ case MO_32:
239
tcg_out_opc_imm(s, lw1, lo, base, 0);
240
tcg_out_opc_imm(s, lw2, lo, base, 3);
241
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn) {
242
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
243
}
244
break;
245
246
- case MO_UL | MO_BSWAP:
247
- case MO_SL | MO_BSWAP:
248
- if (use_mips32r2_instructions) {
249
- tcg_out_opc_imm(s, lw1, lo, base, 0);
250
- tcg_out_opc_imm(s, lw2, lo, base, 3);
251
- tcg_out_bswap32(s, lo, lo,
252
- TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64
253
- ? (sgn ? TCG_BSWAP_OS : TCG_BSWAP_OZ) : 0);
254
- } else {
255
- const tcg_insn_unit *subr =
256
- (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn
257
- ? bswap32u_addr : bswap32_addr);
258
-
259
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0);
260
- tcg_out_bswap_subr(s, subr);
261
- /* delay slot */
262
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 3);
263
- tcg_out_mov(s, type, lo, TCG_TMP3);
264
- }
265
- break;
266
-
267
- case MO_UQ:
268
+ case MO_64:
269
if (TCG_TARGET_REG_BITS == 64) {
270
tcg_out_opc_imm(s, ld1, lo, base, 0);
271
tcg_out_opc_imm(s, ld2, lo, base, 7);
272
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
273
}
274
break;
275
276
- case MO_UQ | MO_BSWAP:
277
- if (TCG_TARGET_REG_BITS == 64) {
278
- if (use_mips32r2_instructions) {
279
- tcg_out_opc_imm(s, ld1, lo, base, 0);
280
- tcg_out_opc_imm(s, ld2, lo, base, 7);
281
- tcg_out_bswap64(s, lo, lo);
282
- } else {
283
- tcg_out_opc_imm(s, ld1, TCG_TMP0, base, 0);
284
- tcg_out_bswap_subr(s, bswap64_addr);
285
- /* delay slot */
286
- tcg_out_opc_imm(s, ld2, TCG_TMP0, base, 7);
287
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
288
- }
289
- } else if (use_mips32r2_instructions) {
290
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
291
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
292
- tcg_out_opc_imm(s, lw1, TCG_TMP1, base, 4 + 0);
293
- tcg_out_opc_imm(s, lw2, TCG_TMP1, base, 4 + 3);
294
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
295
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
296
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
297
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
298
- } else {
299
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
300
- tcg_out_bswap_subr(s, bswap32_addr);
301
- /* delay slot */
302
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
303
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 4 + 0);
304
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
305
- tcg_out_bswap_subr(s, bswap32_addr);
306
- /* delay slot */
307
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 4 + 3);
308
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
309
- }
310
- break;
311
-
312
default:
313
g_assert_not_reached();
314
}
315
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
316
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
317
TCGReg base, MemOp opc)
318
{
319
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
320
- if ((lo | hi) == 0) {
321
- opc &= ~MO_BSWAP;
322
- }
323
-
324
- switch (opc & (MO_SIZE | MO_BSWAP)) {
325
+ switch (opc & MO_SIZE) {
326
case MO_8:
327
tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
328
break;
329
-
330
- case MO_16 | MO_BSWAP:
331
- tcg_out_bswap16(s, TCG_TMP1, lo, 0);
332
- lo = TCG_TMP1;
333
- /* FALLTHRU */
334
case MO_16:
335
tcg_out_opc_imm(s, OPC_SH, lo, base, 0);
336
break;
337
-
338
- case MO_32 | MO_BSWAP:
339
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
340
- lo = TCG_TMP3;
341
- /* FALLTHRU */
342
case MO_32:
343
tcg_out_opc_imm(s, OPC_SW, lo, base, 0);
344
break;
345
-
346
- case MO_64 | MO_BSWAP:
347
- if (TCG_TARGET_REG_BITS == 64) {
348
- tcg_out_bswap64(s, TCG_TMP3, lo);
349
- tcg_out_opc_imm(s, OPC_SD, TCG_TMP3, base, 0);
350
- } else if (use_mips32r2_instructions) {
351
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? lo : hi);
352
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? hi : lo);
353
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
354
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
355
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP0, base, 0);
356
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP1, base, 4);
357
- } else {
358
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
359
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 0);
360
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
361
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 4);
362
- }
363
- break;
364
case MO_64:
365
if (TCG_TARGET_REG_BITS == 64) {
366
tcg_out_opc_imm(s, OPC_SD, lo, base, 0);
367
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
368
tcg_out_opc_imm(s, OPC_SW, MIPS_BE ? lo : hi, base, 4);
369
}
370
break;
371
-
372
default:
373
g_assert_not_reached();
374
}
375
@@ -XXX,XX +XXX,XX @@ static void tcg_out_qemu_st_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
376
const MIPSInsn sd1 = MIPS_BE ? OPC_SDL : OPC_SDR;
377
const MIPSInsn sd2 = MIPS_BE ? OPC_SDR : OPC_SDL;
378
379
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
380
- if ((lo | hi) == 0) {
381
- opc &= ~MO_BSWAP;
382
- }
383
-
384
- switch (opc & (MO_SIZE | MO_BSWAP)) {
385
- case MO_16 | MO_BE:
386
+ switch (opc & MO_SIZE) {
387
+ case MO_16:
388
tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
389
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 0);
390
- tcg_out_opc_imm(s, OPC_SB, lo, base, 1);
391
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? TCG_TMP0 : lo, base, 0);
392
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? lo : TCG_TMP0, base, 1);
393
break;
394
395
- case MO_16 | MO_LE:
396
- tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
397
- tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
398
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 1);
399
- break;
400
-
401
- case MO_32 | MO_BSWAP:
402
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
403
- lo = TCG_TMP3;
404
- /* fall through */
405
case MO_32:
406
tcg_out_opc_imm(s, sw1, lo, base, 0);
407
tcg_out_opc_imm(s, sw2, lo, base, 3);
408
break;
409
410
- case MO_64 | MO_BSWAP:
411
- if (TCG_TARGET_REG_BITS == 64) {
412
- tcg_out_bswap64(s, TCG_TMP3, lo);
413
- lo = TCG_TMP3;
414
- } else if (use_mips32r2_instructions) {
415
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? hi : lo);
416
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? lo : hi);
417
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
418
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
419
- hi = MIPS_BE ? TCG_TMP0 : TCG_TMP1;
420
- lo = MIPS_BE ? TCG_TMP1 : TCG_TMP0;
421
- } else {
422
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
423
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 0 + 0);
424
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 0 + 3);
425
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
426
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 4 + 0);
427
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 4 + 3);
428
- break;
429
- }
430
- /* fall through */
431
case MO_64:
432
if (TCG_TARGET_REG_BITS == 64) {
433
tcg_out_opc_imm(s, sd1, lo, base, 0);
434
--
2.34.1
New patch

Compare the address vs the tlb entry with sign-extended values.
This simplifies the page+alignment mask constant, and the
generation of the last byte address for the misaligned test.

Move the tlb addend load up, and the zero-extension down.

This frees up a register, which allows us to use TMP3 as the returned base
address register instead of A0, which we were using as a 5th temporary.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/mips/tcg-target.c.inc | 38 ++++++++++++++++++--------------------
 1 file changed, 18 insertions(+), 20 deletions(-)
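The generated fast-path compare can be modelled in C roughly as follows.
Names and parameters here are illustrative, and the sketch assumes both the
address and the TLB comparator are kept sign-extended to register width, as
the patch arranges:

    #include <stdbool.h>
    #include <stdint.h>

    /* a_mask = alignment mask (a_bits low bits set), s_mask = size - 1.
     * Testing the last byte of a possibly unaligned access makes a
     * page-crossing access miss; keeping the a_mask bits in page_mask
     * makes an under-aligned access miss as well. */
    static bool tlb_fast_path_hit(int64_t addr, int64_t tlb_comparator,
                                  int64_t page_mask /* TARGET_PAGE_MASK | a_mask */,
                                  int64_t a_mask, int64_t s_mask)
    {
        int64_t test = addr;

        if (a_mask < s_mask) {
            test += s_mask - a_mask;    /* last byte, adjusted for alignment */
        }
        return (test & page_mask) == tlb_comparator;
    }
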
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
17
index XXXXXXX..XXXXXXX 100644
18
--- a/tcg/mips/tcg-target.c.inc
19
+++ b/tcg/mips/tcg-target.c.inc
20
@@ -XXX,XX +XXX,XX @@ typedef enum {
21
ALIAS_PADDI = sizeof(void *) == 4 ? OPC_ADDIU : OPC_DADDIU,
22
ALIAS_TSRL = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
23
? OPC_SRL : OPC_DSRL,
24
+ ALIAS_TADDI = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
25
+ ? OPC_ADDIU : OPC_DADDIU,
26
} MIPSInsn;
27
28
/*
29
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
30
int add_off = offsetof(CPUTLBEntry, addend);
31
int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
32
: offsetof(CPUTLBEntry, addr_write);
33
- target_ulong tlb_mask;
34
35
ldst = new_ldst_label(s);
36
ldst->is_ld = is_ld;
37
ldst->oi = oi;
38
ldst->addrlo_reg = addrlo;
39
ldst->addrhi_reg = addrhi;
40
- base = TCG_REG_A0;
41
42
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
43
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
44
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
45
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
46
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
47
} else {
48
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
49
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
50
- TCG_TMP0, TCG_TMP3, cmp_off);
51
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_TMP0, TCG_TMP3, cmp_off);
52
}
53
54
- /* Zero extend a 32-bit guest address for a 64-bit host. */
55
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
56
- tcg_out_ext32u(s, base, addrlo);
57
- addrlo = base;
58
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
59
+ /* Load the tlb addend for the fast path. */
60
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
61
}
62
63
/*
64
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
65
* For unaligned accesses, compare against the end of the access to
66
* verify that it does not cross a page boundary.
67
*/
68
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
69
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
70
- if (a_mask >= s_mask) {
71
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
72
- } else {
73
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
74
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_TMP1, TARGET_PAGE_MASK | a_mask);
75
+ if (a_mask < s_mask) {
76
+ tcg_out_opc_imm(s, ALIAS_TADDI, TCG_TMP2, addrlo, s_mask - a_mask);
77
tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
78
+ } else {
79
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
80
}
81
82
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
83
- /* Load the tlb addend for the fast path. */
84
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
85
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
86
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
87
+ tcg_out_ext32u(s, TCG_TMP2, addrlo);
88
+ addrlo = TCG_TMP2;
89
}
90
91
ldst->label_ptr[0] = s->code_ptr;
92
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
93
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
94
95
/* Load the tlb addend for the fast path. */
96
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
97
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
98
99
ldst->label_ptr[1] = s->code_ptr;
100
tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
101
}
102
103
/* delay slot */
104
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
105
+ base = TCG_TMP3;
106
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP3, addrlo);
107
#else
108
if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
109
ldst = new_ldst_label(s);
110
--
2.34.1
New patch

The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available
registers. Now that we handle overlap between inputs and helper arguments,
and have eliminated use of A0, we can allow any allocatable reg.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/mips/tcg-target-con-set.h | 13 +++++--------
 tcg/mips/tcg-target-con-str.h | 2 --
 tcg/mips/tcg-target.c.inc | 30 ++++++++----------------------
 3 files changed, 13 insertions(+), 32 deletions(-)
diff --git a/tcg/mips/tcg-target-con-set.h b/tcg/mips/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/mips/tcg-target-con-set.h
16
+++ b/tcg/mips/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
C_O0_I1(r)
19
C_O0_I2(rZ, r)
20
C_O0_I2(rZ, rZ)
21
-C_O0_I2(SZ, S)
22
-C_O0_I3(SZ, S, S)
23
-C_O0_I3(SZ, SZ, S)
24
+C_O0_I3(rZ, r, r)
25
+C_O0_I3(rZ, rZ, r)
26
C_O0_I4(rZ, rZ, rZ, rZ)
27
-C_O0_I4(SZ, SZ, S, S)
28
-C_O1_I1(r, L)
29
+C_O0_I4(rZ, rZ, r, r)
30
C_O1_I1(r, r)
31
C_O1_I2(r, 0, rZ)
32
-C_O1_I2(r, L, L)
33
+C_O1_I2(r, r, r)
34
C_O1_I2(r, r, ri)
35
C_O1_I2(r, r, rI)
36
C_O1_I2(r, r, rIK)
37
@@ -XXX,XX +XXX,XX @@ C_O1_I2(r, rZ, rN)
38
C_O1_I2(r, rZ, rZ)
39
C_O1_I4(r, rZ, rZ, rZ, 0)
40
C_O1_I4(r, rZ, rZ, rZ, rZ)
41
-C_O2_I1(r, r, L)
42
-C_O2_I2(r, r, L, L)
43
+C_O2_I1(r, r, r)
44
C_O2_I2(r, r, r, r)
45
C_O2_I4(r, r, rZ, rZ, rN, rN)
46
diff --git a/tcg/mips/tcg-target-con-str.h b/tcg/mips/tcg-target-con-str.h
47
index XXXXXXX..XXXXXXX 100644
48
--- a/tcg/mips/tcg-target-con-str.h
49
+++ b/tcg/mips/tcg-target-con-str.h
50
@@ -XXX,XX +XXX,XX @@
51
* REGS(letter, register_mask)
52
*/
53
REGS('r', ALL_GENERAL_REGS)
54
-REGS('L', ALL_QLOAD_REGS)
55
-REGS('S', ALL_QSTORE_REGS)
56
57
/*
58
* Define constraint letters for constants:
59
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
60
index XXXXXXX..XXXXXXX 100644
61
--- a/tcg/mips/tcg-target.c.inc
62
+++ b/tcg/mips/tcg-target.c.inc
63
@@ -XXX,XX +XXX,XX @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
64
#define TCG_CT_CONST_WSZ 0x2000 /* word size */
65
66
#define ALL_GENERAL_REGS 0xffffffffu
67
-#define NOA0_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_A0))
68
-
69
-#ifdef CONFIG_SOFTMMU
70
-#define ALL_QLOAD_REGS \
71
- (NOA0_REGS & ~((TCG_TARGET_REG_BITS < TARGET_LONG_BITS) << TCG_REG_A2))
72
-#define ALL_QSTORE_REGS \
73
- (NOA0_REGS & ~(TCG_TARGET_REG_BITS < TARGET_LONG_BITS \
74
- ? (1 << TCG_REG_A2) | (1 << TCG_REG_A3) \
75
- : (1 << TCG_REG_A1)))
76
-#else
77
-#define ALL_QLOAD_REGS NOA0_REGS
78
-#define ALL_QSTORE_REGS NOA0_REGS
79
-#endif
80
-
81
82
static bool is_p2m1(tcg_target_long val)
83
{
84
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
85
86
case INDEX_op_qemu_ld_i32:
87
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
88
- ? C_O1_I1(r, L) : C_O1_I2(r, L, L));
89
+ ? C_O1_I1(r, r) : C_O1_I2(r, r, r));
90
case INDEX_op_qemu_st_i32:
91
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
92
- ? C_O0_I2(SZ, S) : C_O0_I3(SZ, S, S));
93
+ ? C_O0_I2(rZ, r) : C_O0_I3(rZ, r, r));
94
case INDEX_op_qemu_ld_i64:
95
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
96
- : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, L)
97
- : C_O2_I2(r, r, L, L));
98
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
99
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
100
+ : C_O2_I2(r, r, r, r));
101
case INDEX_op_qemu_st_i64:
102
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(SZ, S)
103
- : TARGET_LONG_BITS == 32 ? C_O0_I3(SZ, SZ, S)
104
- : C_O0_I4(SZ, SZ, S, S));
105
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(rZ, r)
106
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(rZ, rZ, r)
107
+ : C_O0_I4(rZ, rZ, r, r));
108
109
default:
110
g_assert_not_reached();
111
--
2.34.1
1
This structure will shortly contain more than just
1
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of
2
data for accessing MMIO. Rename the 'addr' member
2
the normally allocated registers for the tlb load.
3
to 'xlat_section' to more clearly indicate its purpose.
4
3
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
7
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
9
---
7
---
10
include/exec/cpu-defs.h | 22 ++++----
8
tcg/ppc/tcg-target.c.inc | 78 ++++++++++++++++++++++++----------------
11
accel/tcg/cputlb.c | 102 +++++++++++++++++++------------------
9
1 file changed, 47 insertions(+), 31 deletions(-)
12
target/arm/mte_helper.c | 14 ++---
13
target/arm/sve_helper.c | 4 +-
14
target/arm/translate-a64.c | 2 +-
15
5 files changed, 73 insertions(+), 71 deletions(-)
16
10
17
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
11
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
18
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
19
--- a/include/exec/cpu-defs.h
13
--- a/tcg/ppc/tcg-target.c.inc
20
+++ b/include/exec/cpu-defs.h
14
+++ b/tcg/ppc/tcg-target.c.inc
21
@@ -XXX,XX +XXX,XX @@ typedef uint64_t target_ulong;
15
@@ -XXX,XX +XXX,XX @@
22
# endif
16
#else
23
# endif
17
# define TCG_REG_TMP1 TCG_REG_R12
24
18
#endif
25
+/* Minimalized TLB entry for use by TCG fast path. */
19
+#define TCG_REG_TMP2 TCG_REG_R11
26
typedef struct CPUTLBEntry {
20
27
/* bit TARGET_LONG_BITS to TARGET_PAGE_BITS : virtual address
21
#define TCG_VEC_TMP1 TCG_REG_V0
28
bit TARGET_PAGE_BITS-1..4 : Nonzero for accesses that should not
22
#define TCG_VEC_TMP2 TCG_REG_V1
29
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBEntry {
23
@@ -XXX,XX +XXX,XX @@ static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
30
24
/*
31
QEMU_BUILD_BUG_ON(sizeof(CPUTLBEntry) != (1 << CPU_TLB_ENTRY_BITS));
25
* For the purposes of ppc32 sorting 4 input registers into 4 argument
32
26
* registers, there is an outside chance we would require 3 temps.
33
-/* The IOTLB is not accessed directly inline by generated TCG code,
27
- * Because of constraints, no inputs are in r3, and env will not be
34
- * so the CPUIOTLBEntry layout is not as critical as that of the
28
- * placed into r3 until after the sorting is done, and is thus free.
35
- * CPUTLBEntry. (This is also why we don't want to combine the two
36
- * structs into one.)
37
+/*
38
+ * The full TLB entry, which is not accessed by generated TCG code,
39
+ * so the layout is not as critical as that of CPUTLBEntry. This is
40
+ * also why we don't want to combine the two structs.
41
*/
29
*/
42
-typedef struct CPUIOTLBEntry {
30
static const TCGLdstHelperParam ldst_helper_param = {
43
+typedef struct CPUTLBEntryFull {
31
.ra_gen = ldst_ra_gen,
44
/*
32
.ntmp = 3,
45
- * @addr contains:
33
- .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
46
+ * @xlat_section contains:
34
+ .tmp = { TCG_REG_TMP1, TCG_REG_TMP2, TCG_REG_R0 }
47
* - in the lower TARGET_PAGE_BITS, a physical section number
35
};
48
* - with the lower TARGET_PAGE_BITS masked off, an offset which
36
49
* must be added to the virtual address to obtain:
37
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
50
@@ -XXX,XX +XXX,XX @@ typedef struct CPUIOTLBEntry {
38
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
51
* number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
39
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
52
* + the offset within the target MemoryRegion (otherwise)
40
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
53
*/
41
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
54
- hwaddr addr;
42
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
55
+ hwaddr xlat_section;
43
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
56
MemTxAttrs attrs;
44
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, mask_off);
57
-} CPUIOTLBEntry;
45
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_AREG0, table_off);
58
+} CPUTLBEntryFull;
46
59
47
/* Extract the page index, shifted into place for tlb index. */
60
/*
48
if (TCG_TARGET_REG_BITS == 32) {
61
* Data elements that are per MMU mode, minus the bits accessed by
49
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
62
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBDesc {
50
+ tcg_out_shri32(s, TCG_REG_R0, addrlo,
63
size_t vindex;
51
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
64
/* The tlb victim table, in two parts. */
52
} else {
65
CPUTLBEntry vtable[CPU_VTLB_SIZE];
53
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
66
- CPUIOTLBEntry viotlb[CPU_VTLB_SIZE];
54
+ tcg_out_shri64(s, TCG_REG_R0, addrlo,
67
- /* The iotlb. */
55
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
68
- CPUIOTLBEntry *iotlb;
69
+ CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
70
+ CPUTLBEntryFull *fulltlb;
71
} CPUTLBDesc;
72
73
/*
74
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
75
index XXXXXXX..XXXXXXX 100644
76
--- a/accel/tcg/cputlb.c
77
+++ b/accel/tcg/cputlb.c
78
@@ -XXX,XX +XXX,XX @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
79
}
56
}
80
57
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
81
g_free(fast->table);
58
+ tcg_out32(s, AND | SAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_R0));
82
- g_free(desc->iotlb);
59
83
+ g_free(desc->fulltlb);
60
- /* Load the TLB comparator. */
84
61
+ /* Load the (low part) TLB comparator into TMP2. */
85
tlb_window_reset(desc, now, 0);
62
if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
86
/* desc->n_used_entries is cleared by the caller */
63
uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
87
fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
64
? LWZUX : LDUX);
88
fast->table = g_try_new(CPUTLBEntry, new_size);
65
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
89
- desc->iotlb = g_try_new(CPUIOTLBEntry, new_size);
66
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP2, TCG_REG_TMP1, TCG_REG_TMP2));
90
+ desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
67
} else {
91
68
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
92
/*
69
+ tcg_out32(s, ADD | TAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_TMP2));
93
* If the allocations fail, try smaller sizes. We just freed some
70
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
94
@@ -XXX,XX +XXX,XX @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
71
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
95
* allocations to fail though, so we progressively reduce the allocation
72
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
96
* size, aborting if we cannot even allocate the smallest TLB we support.
73
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2,
97
*/
74
+ TCG_REG_TMP1, cmp_off + 4 * HOST_BIG_ENDIAN);
98
- while (fast->table == NULL || desc->iotlb == NULL) {
75
} else {
99
+ while (fast->table == NULL || desc->fulltlb == NULL) {
76
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
100
if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
77
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP2, TCG_REG_TMP1, cmp_off);
101
error_report("%s: %s", __func__, strerror(errno));
102
abort();
103
@@ -XXX,XX +XXX,XX @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
104
fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
105
106
g_free(fast->table);
107
- g_free(desc->iotlb);
108
+ g_free(desc->fulltlb);
109
fast->table = g_try_new(CPUTLBEntry, new_size);
110
- desc->iotlb = g_try_new(CPUIOTLBEntry, new_size);
111
+ desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
112
}
113
}
114
115
@@ -XXX,XX +XXX,XX @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
116
desc->n_used_entries = 0;
117
fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
118
fast->table = g_new(CPUTLBEntry, n_entries);
119
- desc->iotlb = g_new(CPUIOTLBEntry, n_entries);
120
+ desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
121
tlb_mmu_flush_locked(desc, fast);
122
}
123
124
@@ -XXX,XX +XXX,XX @@ void tlb_destroy(CPUState *cpu)
125
CPUTLBDescFast *fast = &env_tlb(env)->f[i];
126
127
g_free(fast->table);
128
- g_free(desc->iotlb);
129
+ g_free(desc->fulltlb);
130
}
131
}
132
133
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
134
135
/* Evict the old entry into the victim tlb. */
136
copy_tlb_helper_locked(tv, te);
137
- desc->viotlb[vidx] = desc->iotlb[index];
138
+ desc->vfulltlb[vidx] = desc->fulltlb[index];
139
tlb_n_used_entries_dec(env, mmu_idx);
140
}
141
142
@@ -XXX,XX +XXX,XX @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
143
* subtract here is that of the page base, and not the same as the
144
* vaddr we add back in io_readx()/io_writex()/get_page_addr_code().
145
*/
146
- desc->iotlb[index].addr = iotlb - vaddr_page;
147
- desc->iotlb[index].attrs = attrs;
148
+ desc->fulltlb[index].xlat_section = iotlb - vaddr_page;
149
+ desc->fulltlb[index].attrs = attrs;
150
151
/* Now calculate the new entry */
152
tn.addend = addend - vaddr_page;
153
@@ -XXX,XX +XXX,XX @@ static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr,
154
}
155
}
156
157
-static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
158
+static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full,
159
int mmu_idx, target_ulong addr, uintptr_t retaddr,
160
MMUAccessType access_type, MemOp op)
161
{
162
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
163
bool locked = false;
164
MemTxResult r;
165
166
- section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
167
+ section = iotlb_to_section(cpu, full->xlat_section, full->attrs);
168
mr = section->mr;
169
- mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
170
+ mr_offset = (full->xlat_section & TARGET_PAGE_MASK) + addr;
171
cpu->mem_io_pc = retaddr;
172
if (!cpu->can_do_io) {
173
cpu_io_recompile(cpu, retaddr);
174
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
175
qemu_mutex_lock_iothread();
176
locked = true;
177
}
178
- r = memory_region_dispatch_read(mr, mr_offset, &val, op, iotlbentry->attrs);
179
+ r = memory_region_dispatch_read(mr, mr_offset, &val, op, full->attrs);
180
if (r != MEMTX_OK) {
181
hwaddr physaddr = mr_offset +
182
section->offset_within_address_space -
183
section->offset_within_region;
184
185
cpu_transaction_failed(cpu, physaddr, addr, memop_size(op), access_type,
186
- mmu_idx, iotlbentry->attrs, r, retaddr);
187
+ mmu_idx, full->attrs, r, retaddr);
188
}
189
if (locked) {
190
qemu_mutex_unlock_iothread();
191
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
192
}
193
194
/*
195
- * Save a potentially trashed IOTLB entry for later lookup by plugin.
196
- * This is read by tlb_plugin_lookup if the iotlb entry doesn't match
197
+ * Save a potentially trashed CPUTLBEntryFull for later lookup by plugin.
198
+ * This is read by tlb_plugin_lookup if the fulltlb entry doesn't match
199
* because of the side effect of io_writex changing memory layout.
200
*/
201
static void save_iotlb_data(CPUState *cs, hwaddr addr,
202
@@ -XXX,XX +XXX,XX @@ static void save_iotlb_data(CPUState *cs, hwaddr addr,
203
#endif
204
}
205
206
-static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
207
+static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
208
int mmu_idx, uint64_t val, target_ulong addr,
209
uintptr_t retaddr, MemOp op)
210
{
211
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
212
bool locked = false;
213
MemTxResult r;
214
215
- section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
216
+ section = iotlb_to_section(cpu, full->xlat_section, full->attrs);
217
mr = section->mr;
218
- mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
219
+ mr_offset = (full->xlat_section & TARGET_PAGE_MASK) + addr;
220
if (!cpu->can_do_io) {
221
cpu_io_recompile(cpu, retaddr);
222
}
223
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry *iotlbentry,
224
* The memory_region_dispatch may trigger a flush/resize
225
* so for plugins we save the iotlb_data just in case.
226
*/
227
- save_iotlb_data(cpu, iotlbentry->addr, section, mr_offset);
228
+ save_iotlb_data(cpu, full->xlat_section, section, mr_offset);
229
230
if (!qemu_mutex_iothread_locked()) {
231
qemu_mutex_lock_iothread();
232
locked = true;
233
}
234
- r = memory_region_dispatch_write(mr, mr_offset, val, op, iotlbentry->attrs);
235
+ r = memory_region_dispatch_write(mr, mr_offset, val, op, full->attrs);
236
if (r != MEMTX_OK) {
237
hwaddr physaddr = mr_offset +
238
section->offset_within_address_space -
239
section->offset_within_region;
240
241
cpu_transaction_failed(cpu, physaddr, addr, memop_size(op),
242
- MMU_DATA_STORE, mmu_idx, iotlbentry->attrs, r,
243
+ MMU_DATA_STORE, mmu_idx, full->attrs, r,
244
retaddr);
245
}
246
if (locked) {
247
@@ -XXX,XX +XXX,XX @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
248
copy_tlb_helper_locked(vtlb, &tmptlb);
249
qemu_spin_unlock(&env_tlb(env)->c.lock);
250
251
- CPUIOTLBEntry tmpio, *io = &env_tlb(env)->d[mmu_idx].iotlb[index];
252
- CPUIOTLBEntry *vio = &env_tlb(env)->d[mmu_idx].viotlb[vidx];
253
- tmpio = *io; *io = *vio; *vio = tmpio;
254
+ CPUTLBEntryFull *f1 = &env_tlb(env)->d[mmu_idx].fulltlb[index];
255
+ CPUTLBEntryFull *f2 = &env_tlb(env)->d[mmu_idx].vfulltlb[vidx];
256
+ CPUTLBEntryFull tmpf;
257
+ tmpf = *f1; *f1 = *f2; *f2 = tmpf;
258
return true;
259
}
78
}
260
}
79
}
261
@@ -XXX,XX +XXX,XX @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
80
262
(ADDR) & TARGET_PAGE_MASK)
81
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
263
82
* Load the TLB addend for use on the fast path.
264
static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
83
* Do this asap to minimize any load use delay.
265
- CPUIOTLBEntry *iotlbentry, uintptr_t retaddr)
84
*/
266
+ CPUTLBEntryFull *full, uintptr_t retaddr)
85
- h->base = TCG_REG_R3;
267
{
86
- tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
268
- ram_addr_t ram_addr = mem_vaddr + iotlbentry->addr;
87
- offsetof(CPUTLBEntry, addend));
269
+ ram_addr_t ram_addr = mem_vaddr + full->xlat_section;
88
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
270
89
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
271
trace_memory_notdirty_write_access(mem_vaddr, ram_addr, size);
90
+ offsetof(CPUTLBEntry, addend));
272
91
+ }
273
@@ -XXX,XX +XXX,XX @@ int probe_access_flags(CPUArchState *env, target_ulong addr,
92
274
/* Handle clean RAM pages. */
93
- /* Clear the non-page, non-alignment bits from the address */
275
if (unlikely(flags & TLB_NOTDIRTY)) {
94
+ /* Clear the non-page, non-alignment bits from the address in R0. */
276
uintptr_t index = tlb_index(env, mmu_idx, addr);
95
if (TCG_TARGET_REG_BITS == 32) {
277
- CPUIOTLBEntry *iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
96
/*
278
+ CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
97
* We don't support unaligned accesses on 32-bits.
279
98
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
280
- notdirty_write(env_cpu(env), addr, 1, iotlbentry, retaddr);
99
if (TARGET_LONG_BITS == 32) {
281
+ notdirty_write(env_cpu(env), addr, 1, full, retaddr);
100
tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
282
flags &= ~TLB_NOTDIRTY;
101
(32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
283
}
102
- /* Zero-extend the address for use in the final address. */
284
103
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
285
@@ -XXX,XX +XXX,XX @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
104
- addrlo = TCG_REG_R4;
286
105
} else if (a_bits == 0) {
287
if (unlikely(flags & (TLB_NOTDIRTY | TLB_WATCHPOINT))) {
106
tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
288
uintptr_t index = tlb_index(env, mmu_idx, addr);
107
} else {
289
- CPUIOTLBEntry *iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
108
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
290
+ CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
109
tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
291
292
/* Handle watchpoints. */
293
if (flags & TLB_WATCHPOINT) {
294
int wp_access = (access_type == MMU_DATA_STORE
295
? BP_MEM_WRITE : BP_MEM_READ);
296
cpu_check_watchpoint(env_cpu(env), addr, size,
297
- iotlbentry->attrs, wp_access, retaddr);
298
+ full->attrs, wp_access, retaddr);
299
}
300
301
/* Handle clean RAM pages. */
302
if (flags & TLB_NOTDIRTY) {
303
- notdirty_write(env_cpu(env), addr, 1, iotlbentry, retaddr);
304
+ notdirty_write(env_cpu(env), addr, 1, full, retaddr);
305
}
110
}
306
}
111
}
307
112
- h->index = addrlo;
308
@@ -XXX,XX +XXX,XX @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
113
309
* should have just filled the TLB. The one corner case is io_writex
114
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
310
* which can cause TLB flushes and potential resizing of the TLBs
115
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
311
* losing the information we need. In those cases we need to recover
116
+ /* Low part comparison into cr7. */
312
- * data from a copy of the iotlbentry. As long as this always occurs
117
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
313
+ * data from a copy of the CPUTLBEntryFull. As long as this always occurs
118
0, 7, TCG_TYPE_I32);
314
* from the same thread (which a mem callback will be) this is safe.
119
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
315
*/
120
+
316
121
+ /* Load the high part TLB comparator into TMP2. */
317
@@ -XXX,XX +XXX,XX @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong addr, int mmu_idx,
122
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2, TCG_REG_TMP1,
318
if (likely(tlb_hit(tlb_addr, addr))) {
123
+ cmp_off + 4 * !HOST_BIG_ENDIAN);
319
/* We must have an iotlb entry for MMIO */
124
+
320
if (tlb_addr & TLB_MMIO) {
125
+ /* Load addend, deferred for this case. */
321
- CPUIOTLBEntry *iotlbentry;
126
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
322
- iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
127
+ offsetof(CPUTLBEntry, addend));
323
+ CPUTLBEntryFull *full;
128
+
324
+ full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
129
+ /* High part comparison into cr6. */
325
data->is_io = true;
130
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_TMP2, 0, 6, TCG_TYPE_I32);
326
- data->v.io.section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
131
+
327
- data->v.io.offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
132
+ /* Combine comparisons into cr7. */
328
+ data->v.io.section =
133
tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
329
+ iotlb_to_section(cpu, full->xlat_section, full->attrs);
134
} else {
330
+ data->v.io.offset = (full->xlat_section & TARGET_PAGE_MASK) + addr;
135
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
331
} else {
136
+ /* Full comparison into cr7. */
332
data->is_io = false;
137
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
333
data->v.ram.hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
138
0, 7, TCG_TYPE_TL);
334
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
335
336
if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
337
notdirty_write(env_cpu(env), addr, size,
338
- &env_tlb(env)->d[mmu_idx].iotlb[index], retaddr);
339
+ &env_tlb(env)->d[mmu_idx].fulltlb[index], retaddr);
340
}
139
}
341
140
342
return hostaddr;
141
/* Load a pointer into the current opcode w/conditional branch-link. */
343
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
142
ldst->label_ptr[0] = s->code_ptr;
344
143
tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
345
/* Handle anything that isn't just a straight memory access. */
144
+
346
if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
145
+ h->base = TCG_REG_TMP1;
347
- CPUIOTLBEntry *iotlbentry;
146
#else
348
+ CPUTLBEntryFull *full;
147
if (a_bits) {
349
bool need_swap;
148
ldst = new_ldst_label(s);
350
149
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
351
/* For anything that is unaligned, recurse through full_load. */
352
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
353
goto do_unaligned_access;
354
}
355
356
- iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
357
+ full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
358
359
/* Handle watchpoints. */
360
if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
361
/* On watchpoint hit, this will longjmp out. */
362
cpu_check_watchpoint(env_cpu(env), addr, size,
363
- iotlbentry->attrs, BP_MEM_READ, retaddr);
364
+ full->attrs, BP_MEM_READ, retaddr);
365
}
366
367
need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
368
369
/* Handle I/O access. */
370
if (likely(tlb_addr & TLB_MMIO)) {
371
- return io_readx(env, iotlbentry, mmu_idx, addr, retaddr,
372
+ return io_readx(env, full, mmu_idx, addr, retaddr,
373
access_type, op ^ (need_swap * MO_BSWAP));
374
}
375
376
@@ -XXX,XX +XXX,XX @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
377
*/
378
if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
379
cpu_check_watchpoint(env_cpu(env), addr, size - size2,
380
- env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
381
+ env_tlb(env)->d[mmu_idx].fulltlb[index].attrs,
382
BP_MEM_WRITE, retaddr);
383
}
150
}
384
if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
151
385
cpu_check_watchpoint(env_cpu(env), page2, size2,
152
h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
386
- env_tlb(env)->d[mmu_idx].iotlb[index2].attrs,
153
- h->index = addrlo;
387
+ env_tlb(env)->d[mmu_idx].fulltlb[index2].attrs,
154
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
388
BP_MEM_WRITE, retaddr);
155
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
389
}
156
- h->index = TCG_REG_TMP1;
390
157
- }
391
@@ -XXX,XX +XXX,XX @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
392
393
/* Handle anything that isn't just a straight memory access. */
394
if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
395
- CPUIOTLBEntry *iotlbentry;
396
+ CPUTLBEntryFull *full;
397
bool need_swap;
398
399
/* For anything that is unaligned, recurse through byte stores. */
400
@@ -XXX,XX +XXX,XX @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
401
goto do_unaligned_access;
402
}
403
404
- iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
405
+ full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
406
407
/* Handle watchpoints. */
408
if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
409
/* On watchpoint hit, this will longjmp out. */
410
cpu_check_watchpoint(env_cpu(env), addr, size,
411
- iotlbentry->attrs, BP_MEM_WRITE, retaddr);
412
+ full->attrs, BP_MEM_WRITE, retaddr);
413
}
414
415
need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
416
417
/* Handle I/O access. */
418
if (tlb_addr & TLB_MMIO) {
419
- io_writex(env, iotlbentry, mmu_idx, val, addr, retaddr,
420
+ io_writex(env, full, mmu_idx, val, addr, retaddr,
421
op ^ (need_swap * MO_BSWAP));
422
return;
423
}
424
@@ -XXX,XX +XXX,XX @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
425
426
/* Handle clean RAM pages. */
427
if (tlb_addr & TLB_NOTDIRTY) {
428
- notdirty_write(env_cpu(env), addr, size, iotlbentry, retaddr);
429
+ notdirty_write(env_cpu(env), addr, size, full, retaddr);
430
}
431
432
haddr = (void *)((uintptr_t)addr + entry->addend);
433
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
434
index XXXXXXX..XXXXXXX 100644
435
--- a/target/arm/mte_helper.c
436
+++ b/target/arm/mte_helper.c
437
@@ -XXX,XX +XXX,XX @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
438
return tags + index;
439
#else
440
uintptr_t index;
441
- CPUIOTLBEntry *iotlbentry;
442
+ CPUTLBEntryFull *full;
443
int in_page, flags;
444
ram_addr_t ptr_ra;
445
hwaddr ptr_paddr, tag_paddr, xlat;
446
@@ -XXX,XX +XXX,XX @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
447
assert(!(flags & TLB_INVALID_MASK));
448
449
/*
450
- * Find the iotlbentry for ptr. This *must* be present in the TLB
451
+ * Find the CPUTLBEntryFull for ptr. This *must* be present in the TLB
452
* because we just found the mapping.
453
* TODO: Perhaps there should be a cputlb helper that returns a
454
* matching tlb entry + iotlb entry.
455
@@ -XXX,XX +XXX,XX @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
456
g_assert(tlb_hit(comparator, ptr));
457
}
458
# endif
459
- iotlbentry = &env_tlb(env)->d[ptr_mmu_idx].iotlb[index];
460
+ full = &env_tlb(env)->d[ptr_mmu_idx].fulltlb[index];
461
462
/* If the virtual page MemAttr != Tagged, access unchecked. */
463
- if (!arm_tlb_mte_tagged(&iotlbentry->attrs)) {
464
+ if (!arm_tlb_mte_tagged(&full->attrs)) {
465
return NULL;
466
}
467
468
@@ -XXX,XX +XXX,XX @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
469
int wp = ptr_access == MMU_DATA_LOAD ? BP_MEM_READ : BP_MEM_WRITE;
470
assert(ra != 0);
471
cpu_check_watchpoint(env_cpu(env), ptr, ptr_size,
472
- iotlbentry->attrs, wp, ra);
473
+ full->attrs, wp, ra);
474
}
475
476
/*
477
@@ -XXX,XX +XXX,XX @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
478
tag_paddr = ptr_paddr >> (LOG2_TAG_GRANULE + 1);
479
480
/* Look up the address in tag space. */
481
- tag_asi = iotlbentry->attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
482
+ tag_asi = full->attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
483
tag_as = cpu_get_address_space(env_cpu(env), tag_asi);
484
mr = address_space_translate(tag_as, tag_paddr, &xlat, NULL,
485
tag_access == MMU_DATA_STORE,
486
- iotlbentry->attrs);
487
+ full->attrs);
488
489
/*
490
* Note that @mr will never be NULL. If there is nothing in the address
491
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
492
index XXXXXXX..XXXXXXX 100644
493
--- a/target/arm/sve_helper.c
494
+++ b/target/arm/sve_helper.c
495
@@ -XXX,XX +XXX,XX @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
496
g_assert(tlb_hit(comparator, addr));
497
# endif
498
499
- CPUIOTLBEntry *iotlbentry = &env_tlb(env)->d[mmu_idx].iotlb[index];
500
- info->attrs = iotlbentry->attrs;
501
+ CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
502
+ info->attrs = full->attrs;
503
}
504
#endif
158
#endif
505
159
506
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
160
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
507
index XXXXXXX..XXXXXXX 100644
161
+ /* Zero-extend the guest address for use in the host address. */
508
--- a/target/arm/translate-a64.c
162
+ tcg_out_ext32u(s, TCG_REG_R0, addrlo);
509
+++ b/target/arm/translate-a64.c
163
+ h->index = TCG_REG_R0;
510
@@ -XXX,XX +XXX,XX @@ static bool is_guarded_page(CPUARMState *env, DisasContext *s)
164
+ } else {
511
* table entry even for that case.
165
+ h->index = addrlo;
512
*/
166
+ }
513
return (tlb_hit(entry->addr_code, addr) &&
167
+
514
- arm_tlb_bti_gp(&env_tlb(env)->d[mmu_idx].iotlb[index].attrs));
168
return ldst;
515
+ arm_tlb_bti_gp(&env_tlb(env)->d[mmu_idx].fulltlb[index].attrs));
169
}
170
171
@@ -XXX,XX +XXX,XX @@ static void tcg_target_init(TCGContext *s)
172
#if defined(_CALL_SYSV) || TCG_TARGET_REG_BITS == 64
173
tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); /* thread pointer */
516
#endif
174
#endif
517
}
175
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1); /* mem temp */
518
176
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
177
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
178
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP1);
179
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP2);
180
if (USE_REG_TB) {
519
--
181
--
520
2.34.1
182
2.34.1
521
183
522
184
New patch
1
The softmmu tlb uses TCG_REG_{TMP1,TMP2,R0}, not any of the normally
2
available registers. Now that we handle overlap between inputs and
3
helper arguments, we can allow any allocatable reg.
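For reference, a small standalone sketch (not part of the patch, names and main() are illustrative only) of what the old 'L'/'S' operand masks excluded; the register numbers are taken from the ALL_QLOAD_REGS/ALL_QSTORE_REGS definitions removed below:

    #include <stdio.h>

    /* Old qemu_ld ('L') and qemu_st ('S') operand masks: r3-r5 (plus r6 for
     * stores) were kept free for the tlb lookup and the slow-path helper
     * call, so guest operands could not use them.  The new constraint is
     * plain 'r', i.e. the full general-register mask. */
    int main(void)
    {
        unsigned all = 0xffffffffu;                                   /* ALL_GENERAL_REGS */
        unsigned qload  = all & ~((1u << 3) | (1u << 4) | (1u << 5)); /* old 'L' */
        unsigned qstore = qload & ~(1u << 6);                         /* old 'S' */
        printf("old L mask %08x, old S mask %08x, new mask %08x\n",
               qload, qstore, all);
        return 0;
    }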
1
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/ppc/tcg-target-con-set.h | 11 ++++-------
10
tcg/ppc/tcg-target-con-str.h | 2 --
11
tcg/ppc/tcg-target.c.inc | 32 ++++++++++----------------------
12
3 files changed, 14 insertions(+), 31 deletions(-)
13
14
diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/tcg/ppc/tcg-target-con-set.h
17
+++ b/tcg/ppc/tcg-target-con-set.h
18
@@ -XXX,XX +XXX,XX @@
19
C_O0_I1(r)
20
C_O0_I2(r, r)
21
C_O0_I2(r, ri)
22
-C_O0_I2(S, S)
23
C_O0_I2(v, r)
24
-C_O0_I3(S, S, S)
25
+C_O0_I3(r, r, r)
26
C_O0_I4(r, r, ri, ri)
27
-C_O0_I4(S, S, S, S)
28
-C_O1_I1(r, L)
29
+C_O0_I4(r, r, r, r)
30
C_O1_I1(r, r)
31
C_O1_I1(v, r)
32
C_O1_I1(v, v)
33
C_O1_I1(v, vr)
34
C_O1_I2(r, 0, rZ)
35
-C_O1_I2(r, L, L)
36
C_O1_I2(r, rI, ri)
37
C_O1_I2(r, rI, rT)
38
C_O1_I2(r, r, r)
39
@@ -XXX,XX +XXX,XX @@ C_O1_I2(v, v, v)
40
C_O1_I3(v, v, v, v)
41
C_O1_I4(r, r, ri, rZ, rZ)
42
C_O1_I4(r, r, r, ri, ri)
43
-C_O2_I1(L, L, L)
44
-C_O2_I2(L, L, L, L)
45
+C_O2_I1(r, r, r)
46
+C_O2_I2(r, r, r, r)
47
C_O2_I4(r, r, rI, rZM, r, r)
48
C_O2_I4(r, r, r, r, rI, rZM)
49
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
50
index XXXXXXX..XXXXXXX 100644
51
--- a/tcg/ppc/tcg-target-con-str.h
52
+++ b/tcg/ppc/tcg-target-con-str.h
53
@@ -XXX,XX +XXX,XX @@ REGS('A', 1u << TCG_REG_R3)
54
REGS('B', 1u << TCG_REG_R4)
55
REGS('C', 1u << TCG_REG_R5)
56
REGS('D', 1u << TCG_REG_R6)
57
-REGS('L', ALL_QLOAD_REGS)
58
-REGS('S', ALL_QSTORE_REGS)
59
60
/*
61
* Define constraint letters for constants:
62
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
63
index XXXXXXX..XXXXXXX 100644
64
--- a/tcg/ppc/tcg-target.c.inc
65
+++ b/tcg/ppc/tcg-target.c.inc
66
@@ -XXX,XX +XXX,XX @@
67
#define ALL_GENERAL_REGS 0xffffffffu
68
#define ALL_VECTOR_REGS 0xffffffff00000000ull
69
70
-#ifdef CONFIG_SOFTMMU
71
-#define ALL_QLOAD_REGS \
72
- (ALL_GENERAL_REGS & \
73
- ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | (1 << TCG_REG_R5)))
74
-#define ALL_QSTORE_REGS \
75
- (ALL_GENERAL_REGS & ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | \
76
- (1 << TCG_REG_R5) | (1 << TCG_REG_R6)))
77
-#else
78
-#define ALL_QLOAD_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_R3))
79
-#define ALL_QSTORE_REGS ALL_QLOAD_REGS
80
-#endif
81
-
82
TCGPowerISA have_isa;
83
static bool have_isel;
84
bool have_altivec;
85
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
86
87
case INDEX_op_qemu_ld_i32:
88
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
89
- ? C_O1_I1(r, L)
90
- : C_O1_I2(r, L, L));
91
+ ? C_O1_I1(r, r)
92
+ : C_O1_I2(r, r, r));
93
94
case INDEX_op_qemu_st_i32:
95
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
96
- ? C_O0_I2(S, S)
97
- : C_O0_I3(S, S, S));
98
+ ? C_O0_I2(r, r)
99
+ : C_O0_I3(r, r, r));
100
101
case INDEX_op_qemu_ld_i64:
102
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
103
- : TARGET_LONG_BITS == 32 ? C_O2_I1(L, L, L)
104
- : C_O2_I2(L, L, L, L));
105
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
106
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
107
+ : C_O2_I2(r, r, r, r));
108
109
case INDEX_op_qemu_st_i64:
110
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(S, S)
111
- : TARGET_LONG_BITS == 32 ? C_O0_I3(S, S, S)
112
- : C_O0_I4(S, S, S, S));
113
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(r, r)
114
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(r, r, r)
115
+ : C_O0_I4(r, r, r, r));
116
117
case INDEX_op_add_vec:
118
case INDEX_op_sub_vec:
119
--
120
2.34.1
121
122
New patch
1
These constraints have not been used for quite some time.
1
2
3
Fixes: 77b73de67632 ("Use rem/div[u]_i32 drop div[u]2_i32")
4
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
5
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/ppc/tcg-target-con-str.h | 4 ----
10
1 file changed, 4 deletions(-)
11
12
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/tcg/ppc/tcg-target-con-str.h
15
+++ b/tcg/ppc/tcg-target-con-str.h
16
@@ -XXX,XX +XXX,XX @@
17
*/
18
REGS('r', ALL_GENERAL_REGS)
19
REGS('v', ALL_VECTOR_REGS)
20
-REGS('A', 1u << TCG_REG_R3)
21
-REGS('B', 1u << TCG_REG_R4)
22
-REGS('C', 1u << TCG_REG_R5)
23
-REGS('D', 1u << TCG_REG_R6)
24
25
/*
26
* Define constraint letters for constants:
27
--
28
2.34.1
29
30
New patch
1
Never used since its introduction.
1
2
3
Fixes: 3d582c6179c ("tcg-ppc64: Rearrange integer constant constraints")
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/ppc/tcg-target-con-str.h | 1 -
8
tcg/ppc/tcg-target.c.inc | 3 ---
9
2 files changed, 4 deletions(-)
10
11
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
12
index XXXXXXX..XXXXXXX 100644
13
--- a/tcg/ppc/tcg-target-con-str.h
14
+++ b/tcg/ppc/tcg-target-con-str.h
15
@@ -XXX,XX +XXX,XX @@ REGS('v', ALL_VECTOR_REGS)
16
* CONST(letter, TCG_CT_CONST_* bit set)
17
*/
18
CONST('I', TCG_CT_CONST_S16)
19
-CONST('J', TCG_CT_CONST_U16)
20
CONST('M', TCG_CT_CONST_MONE)
21
CONST('T', TCG_CT_CONST_S32)
22
CONST('U', TCG_CT_CONST_U32)
23
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
24
index XXXXXXX..XXXXXXX 100644
25
--- a/tcg/ppc/tcg-target.c.inc
26
+++ b/tcg/ppc/tcg-target.c.inc
27
@@ -XXX,XX +XXX,XX @@
28
#define SZR (TCG_TARGET_REG_BITS / 8)
29
30
#define TCG_CT_CONST_S16 0x100
31
-#define TCG_CT_CONST_U16 0x200
32
#define TCG_CT_CONST_S32 0x400
33
#define TCG_CT_CONST_U32 0x800
34
#define TCG_CT_CONST_ZERO 0x1000
35
@@ -XXX,XX +XXX,XX @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
36
37
if ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) {
38
return 1;
39
- } else if ((ct & TCG_CT_CONST_U16) && val == (uint16_t)val) {
40
- return 1;
41
} else if ((ct & TCG_CT_CONST_S32) && val == (int32_t)val) {
42
return 1;
43
} else if ((ct & TCG_CT_CONST_U32) && val == (uint32_t)val) {
44
--
45
2.34.1
46
47
New patch
1
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
2
registers. Now that we handle overlap between inputs and helper arguments,
3
we can allow any allocatable reg.
1
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
---
9
tcg/riscv/tcg-target-con-set.h | 2 --
10
tcg/riscv/tcg-target-con-str.h | 1 -
11
tcg/riscv/tcg-target.c.inc | 16 +++-------------
12
3 files changed, 3 insertions(+), 16 deletions(-)
13
14
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
15
index XXXXXXX..XXXXXXX 100644
16
--- a/tcg/riscv/tcg-target-con-set.h
17
+++ b/tcg/riscv/tcg-target-con-set.h
18
@@ -XXX,XX +XXX,XX @@
19
* tcg-target-con-str.h; the constraint combination is inclusive or.
20
*/
21
C_O0_I1(r)
22
-C_O0_I2(LZ, L)
23
C_O0_I2(rZ, r)
24
C_O0_I2(rZ, rZ)
25
-C_O1_I1(r, L)
26
C_O1_I1(r, r)
27
C_O1_I2(r, r, ri)
28
C_O1_I2(r, r, rI)
29
diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
30
index XXXXXXX..XXXXXXX 100644
31
--- a/tcg/riscv/tcg-target-con-str.h
32
+++ b/tcg/riscv/tcg-target-con-str.h
33
@@ -XXX,XX +XXX,XX @@
34
* REGS(letter, register_mask)
35
*/
36
REGS('r', ALL_GENERAL_REGS)
37
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
38
39
/*
40
* Define constraint letters for constants:
41
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
42
index XXXXXXX..XXXXXXX 100644
43
--- a/tcg/riscv/tcg-target.c.inc
44
+++ b/tcg/riscv/tcg-target.c.inc
45
@@ -XXX,XX +XXX,XX @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
46
#define TCG_CT_CONST_N12 0x400
47
#define TCG_CT_CONST_M12 0x800
48
49
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
50
-/*
51
- * For softmmu, we need to avoid conflicts with the first 5
52
- * argument registers to call the helper. Some of these are
53
- * also used for the tlb lookup.
54
- */
55
-#ifdef CONFIG_SOFTMMU
56
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
57
-#else
58
-#define SOFTMMU_RESERVE_REGS 0
59
-#endif
60
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
61
62
#define sextreg sextract64
63
64
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
65
66
case INDEX_op_qemu_ld_i32:
67
case INDEX_op_qemu_ld_i64:
68
- return C_O1_I1(r, L);
69
+ return C_O1_I1(r, r);
70
case INDEX_op_qemu_st_i32:
71
case INDEX_op_qemu_st_i64:
72
- return C_O0_I2(LZ, L);
73
+ return C_O0_I2(rZ, r);
74
75
default:
76
g_assert_not_reached();
77
--
78
2.34.1
79
80
diff view generated by jsdifflib
New patch
1
Rather than zero-extend the guest address into a register,
2
use an add instruction which zero-extends the second input.
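As a rough illustration (not part of the patch; old_seq/new_seq are invented names), the two sequences compute the same host address, the add-with-zero-extension form simply folding the 32-bit extension into the add:

    #include <assert.h>
    #include <stdint.h>

    /* old: zero-extend the guest address into a scratch reg, use it as base */
    static uint64_t old_seq(uint64_t addend, uint32_t addr)
    {
        uint64_t base = addr;              /* separate zero-extension */
        return addend + base;
    }

    /* new: one add that zero-extends its 32-bit second input */
    static uint64_t new_seq(uint64_t addend, uint32_t addr)
    {
        return addend + (uint64_t)addr;
    }

    int main(void)
    {
        assert(old_seq(0x7f0000, 0xfffff000u) == new_seq(0x7f0000, 0xfffff000u));
        return 0;
    }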
1
3
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
7
tcg/s390x/tcg-target.c.inc | 8 +++++---
8
1 file changed, 5 insertions(+), 3 deletions(-)
9
10
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
11
index XXXXXXX..XXXXXXX 100644
12
--- a/tcg/s390x/tcg-target.c.inc
13
+++ b/tcg/s390x/tcg-target.c.inc
14
@@ -XXX,XX +XXX,XX @@ typedef enum S390Opcode {
15
RRE_ALGR = 0xb90a,
16
RRE_ALCR = 0xb998,
17
RRE_ALCGR = 0xb988,
18
+ RRE_ALGFR = 0xb91a,
19
RRE_CGR = 0xb920,
20
RRE_CLGR = 0xb921,
21
RRE_DLGR = 0xb987,
22
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
23
tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
24
offsetof(CPUTLBEntry, addend));
25
26
- h->base = addr_reg;
27
if (TARGET_LONG_BITS == 32) {
28
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
29
- h->base = TCG_REG_R3;
30
+ tcg_out_insn(s, RRE, ALGFR, h->index, addr_reg);
31
+ h->base = TCG_REG_NONE;
32
+ } else {
33
+ h->base = addr_reg;
34
}
35
h->disp = 0;
36
#else
37
--
38
2.34.1
39
40
diff view generated by jsdifflib
New patch
1
Adjust the softmmu tlb to use R0+R1, not any of the normally available
2
registers. Since we handle overlap between inputs and helper arguments,
3
we can allow any allocatable reg.
1
4
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
tcg/s390x/tcg-target-con-set.h | 2 --
9
tcg/s390x/tcg-target-con-str.h | 1 -
10
tcg/s390x/tcg-target.c.inc | 36 ++++++++++++----------------------
11
3 files changed, 12 insertions(+), 27 deletions(-)
12
13
diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
14
index XXXXXXX..XXXXXXX 100644
15
--- a/tcg/s390x/tcg-target-con-set.h
16
+++ b/tcg/s390x/tcg-target-con-set.h
17
@@ -XXX,XX +XXX,XX @@
18
* tcg-target-con-str.h; the constraint combination is inclusive or.
19
*/
20
C_O0_I1(r)
21
-C_O0_I2(L, L)
22
C_O0_I2(r, r)
23
C_O0_I2(r, ri)
24
C_O0_I2(r, rA)
25
C_O0_I2(v, r)
26
-C_O1_I1(r, L)
27
C_O1_I1(r, r)
28
C_O1_I1(v, r)
29
C_O1_I1(v, v)
30
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
31
index XXXXXXX..XXXXXXX 100644
32
--- a/tcg/s390x/tcg-target-con-str.h
33
+++ b/tcg/s390x/tcg-target-con-str.h
34
@@ -XXX,XX +XXX,XX @@
35
* REGS(letter, register_mask)
36
*/
37
REGS('r', ALL_GENERAL_REGS)
38
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
39
REGS('v', ALL_VECTOR_REGS)
40
REGS('o', 0xaaaa) /* odd numbered general regs */
41
42
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
43
index XXXXXXX..XXXXXXX 100644
44
--- a/tcg/s390x/tcg-target.c.inc
45
+++ b/tcg/s390x/tcg-target.c.inc
46
@@ -XXX,XX +XXX,XX @@
47
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16)
48
#define ALL_VECTOR_REGS MAKE_64BIT_MASK(32, 32)
49
50
-/*
51
- * For softmmu, we need to avoid conflicts with the first 3
52
- * argument registers to perform the tlb lookup, and to call
53
- * the helper function.
54
- */
55
-#ifdef CONFIG_SOFTMMU
56
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_R2, 3)
57
-#else
58
-#define SOFTMMU_RESERVE_REGS 0
59
-#endif
60
-
61
-
62
/* Several places within the instruction set 0 means "no register"
63
rather than TCG_REG_R0. */
64
#define TCG_REG_NONE 0
65
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
66
ldst->oi = oi;
67
ldst->addrlo_reg = addr_reg;
68
69
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
70
+ tcg_out_sh64(s, RSY_SRLG, TCG_TMP0, addr_reg, TCG_REG_NONE,
71
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
72
73
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
74
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
75
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
76
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
77
+ tcg_out_insn(s, RXY, NG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, mask_off);
78
+ tcg_out_insn(s, RXY, AG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, table_off);
79
80
/*
81
* For aligned accesses, we check the first byte and include the alignment
82
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
83
a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
84
tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
85
if (a_off == 0) {
86
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
87
+ tgen_andi_risbg(s, TCG_REG_R0, addr_reg, tlb_mask);
88
} else {
89
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
90
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
91
+ tcg_out_insn(s, RX, LA, TCG_REG_R0, addr_reg, TCG_REG_NONE, a_off);
92
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R0, tlb_mask);
93
}
94
95
if (is_ld) {
96
@@ -XXX,XX +XXX,XX @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
97
ofs = offsetof(CPUTLBEntry, addr_write);
98
}
99
if (TARGET_LONG_BITS == 32) {
100
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
101
+ tcg_out_insn(s, RX, C, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
102
} else {
103
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
104
+ tcg_out_insn(s, RXY, CG, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
105
}
106
107
tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
108
ldst->label_ptr[0] = s->code_ptr++;
109
110
- h->index = TCG_REG_R2;
111
- tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
112
+ h->index = TCG_TMP0;
113
+ tcg_out_insn(s, RXY, LG, h->index, TCG_TMP0, TCG_REG_NONE,
114
offsetof(CPUTLBEntry, addend));
115
116
if (TARGET_LONG_BITS == 32) {
117
@@ -XXX,XX +XXX,XX @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
118
119
case INDEX_op_qemu_ld_i32:
120
case INDEX_op_qemu_ld_i64:
121
- return C_O1_I1(r, L);
122
+ return C_O1_I1(r, r);
123
case INDEX_op_qemu_st_i64:
124
case INDEX_op_qemu_st_i32:
125
- return C_O0_I2(L, L);
126
+ return C_O0_I2(r, r);
127
128
case INDEX_op_deposit_i32:
129
case INDEX_op_deposit_i64:
130
--
131
2.34.1
132
133
New patch
1
These are atomic operations, so mark them as requiring alignment.
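Illustrative note (not part of the patch; the function name below is invented): MO_ALIGN requests alignment to the access size, and the paired words here are emulated as a single 64-bit access, so the effective address must be 8-byte aligned. A minimal sketch of that check:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* The ll/sc pair is emulated as one 64-bit access; MO_ALIGN therefore
     * asks for natural (8-byte) alignment of the effective address. */
    static bool paired_access_aligned(uint64_t vaddr)
    {
        return (vaddr & 7) == 0;
    }

    int main(void)
    {
        printf("%d %d\n", paired_access_aligned(0x1000), paired_access_aligned(0x1004));
        return 0;
    }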
1
2
3
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
---
5
target/mips/tcg/nanomips_translate.c.inc | 5 +++--
6
1 file changed, 3 insertions(+), 2 deletions(-)
7
8
diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
9
index XXXXXXX..XXXXXXX 100644
10
--- a/target/mips/tcg/nanomips_translate.c.inc
11
+++ b/target/mips/tcg/nanomips_translate.c.inc
12
@@ -XXX,XX +XXX,XX @@ static void gen_llwp(DisasContext *ctx, uint32_t base, int16_t offset,
13
TCGv tmp2 = tcg_temp_new();
14
15
gen_base_offset_addr(ctx, taddr, base, offset);
16
- tcg_gen_qemu_ld_i64(tval, taddr, ctx->mem_idx, MO_TEUQ);
17
+ tcg_gen_qemu_ld_i64(tval, taddr, ctx->mem_idx, MO_TEUQ | MO_ALIGN);
18
if (cpu_is_bigendian(ctx)) {
19
tcg_gen_extr_i64_tl(tmp2, tmp1, tval);
20
} else {
21
@@ -XXX,XX +XXX,XX @@ static void gen_scwp(DisasContext *ctx, uint32_t base, int16_t offset,
22
23
tcg_gen_ld_i64(llval, cpu_env, offsetof(CPUMIPSState, llval_wp));
24
tcg_gen_atomic_cmpxchg_i64(val, taddr, llval, tval,
25
- eva ? MIPS_HFLAG_UM : ctx->mem_idx, MO_64);
26
+ eva ? MIPS_HFLAG_UM : ctx->mem_idx,
27
+ MO_64 | MO_ALIGN);
28
if (reg1 != 0) {
29
tcg_gen_movi_tl(cpu_gpr[reg1], 1);
30
}
31
--
32
2.34.1
1
The value previously chosen overlaps GUSA_MASK.
1
Memory operations that are not already aligned, or otherwise
2
marked up, require addition of ctx->default_tcg_memop_mask.
2
3
3
Rename all DELAY_SLOT_* and GUSA_* defines to emphasize
4
that they are included in TB_FLAGs. Add aliases for the
5
FPSCR and SR bits that are included in TB_FLAGS, so that
6
we don't accidentally reassign those bits.
7
8
Fixes: 4da06fb3062 ("target/sh4: Implement prctl_unalign_sigbus")
9
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/856
10
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
---
5
---
13
target/sh4/cpu.h | 56 +++++++++++++------------
6
target/mips/tcg/mxu_translate.c | 3 ++-
14
linux-user/sh4/signal.c | 6 +--
7
target/mips/tcg/micromips_translate.c.inc | 24 ++++++++++++++--------
15
target/sh4/cpu.c | 6 +--
8
target/mips/tcg/mips16e_translate.c.inc | 18 ++++++++++------
16
target/sh4/helper.c | 6 +--
9
target/mips/tcg/nanomips_translate.c.inc | 25 +++++++++++------------
17
target/sh4/translate.c | 90 ++++++++++++++++++++++-------------------
10
4 files changed, 42 insertions(+), 28 deletions(-)
18
5 files changed, 88 insertions(+), 76 deletions(-)
19
11
20
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
12
diff --git a/target/mips/tcg/mxu_translate.c b/target/mips/tcg/mxu_translate.c
21
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
22
--- a/target/sh4/cpu.h
14
--- a/target/mips/tcg/mxu_translate.c
23
+++ b/target/sh4/cpu.h
15
+++ b/target/mips/tcg/mxu_translate.c
24
@@ -XXX,XX +XXX,XX @@
16
@@ -XXX,XX +XXX,XX @@ static void gen_mxu_s32ldd_s32lddr(DisasContext *ctx)
25
#define FPSCR_RM_NEAREST (0 << 0)
17
tcg_gen_ori_tl(t1, t1, 0xFFFFF000);
26
#define FPSCR_RM_ZERO (1 << 0)
18
}
27
19
tcg_gen_add_tl(t1, t0, t1);
28
-#define DELAY_SLOT_MASK 0x7
20
- tcg_gen_qemu_ld_tl(t1, t1, ctx->mem_idx, MO_TESL ^ (sel * MO_BSWAP));
29
-#define DELAY_SLOT (1 << 0)
21
+ tcg_gen_qemu_ld_tl(t1, t1, ctx->mem_idx, (MO_TESL ^ (sel * MO_BSWAP)) |
30
-#define DELAY_SLOT_CONDITIONAL (1 << 1)
22
+ ctx->default_tcg_memop_mask);
31
-#define DELAY_SLOT_RTE (1 << 2)
23
32
+#define TB_FLAG_DELAY_SLOT (1 << 0)
24
gen_store_mxu_gpr(t1, XRa);
33
+#define TB_FLAG_DELAY_SLOT_COND (1 << 1)
34
+#define TB_FLAG_DELAY_SLOT_RTE (1 << 2)
35
+#define TB_FLAG_PENDING_MOVCA (1 << 3)
36
+#define TB_FLAG_GUSA_SHIFT 4 /* [11:4] */
37
+#define TB_FLAG_GUSA_EXCLUSIVE (1 << 12)
38
+#define TB_FLAG_UNALIGN (1 << 13)
39
+#define TB_FLAG_SR_FD (1 << SR_FD) /* 15 */
40
+#define TB_FLAG_FPSCR_PR FPSCR_PR /* 19 */
41
+#define TB_FLAG_FPSCR_SZ FPSCR_SZ /* 20 */
42
+#define TB_FLAG_FPSCR_FR FPSCR_FR /* 21 */
43
+#define TB_FLAG_SR_RB (1 << SR_RB) /* 29 */
44
+#define TB_FLAG_SR_MD (1 << SR_MD) /* 30 */
45
46
-#define TB_FLAG_PENDING_MOVCA (1 << 3)
47
-#define TB_FLAG_UNALIGN (1 << 4)
48
-
49
-#define GUSA_SHIFT 4
50
-#ifdef CONFIG_USER_ONLY
51
-#define GUSA_EXCLUSIVE (1 << 12)
52
-#define GUSA_MASK ((0xff << GUSA_SHIFT) | GUSA_EXCLUSIVE)
53
-#else
54
-/* Provide dummy versions of the above to allow tests against tbflags
55
- to be elided while avoiding ifdefs. */
56
-#define GUSA_EXCLUSIVE 0
57
-#define GUSA_MASK 0
58
-#endif
59
-
60
-#define TB_FLAG_ENVFLAGS_MASK (DELAY_SLOT_MASK | GUSA_MASK)
61
+#define TB_FLAG_DELAY_SLOT_MASK (TB_FLAG_DELAY_SLOT | \
62
+ TB_FLAG_DELAY_SLOT_COND | \
63
+ TB_FLAG_DELAY_SLOT_RTE)
64
+#define TB_FLAG_GUSA_MASK ((0xff << TB_FLAG_GUSA_SHIFT) | \
65
+ TB_FLAG_GUSA_EXCLUSIVE)
66
+#define TB_FLAG_FPSCR_MASK (TB_FLAG_FPSCR_PR | \
67
+ TB_FLAG_FPSCR_SZ | \
68
+ TB_FLAG_FPSCR_FR)
69
+#define TB_FLAG_SR_MASK (TB_FLAG_SR_FD | \
70
+ TB_FLAG_SR_RB | \
71
+ TB_FLAG_SR_MD)
72
+#define TB_FLAG_ENVFLAGS_MASK (TB_FLAG_DELAY_SLOT_MASK | \
73
+ TB_FLAG_GUSA_MASK)
74
75
typedef struct tlb_t {
76
uint32_t vpn;        /* virtual page number */
77
@@ -XXX,XX +XXX,XX @@ static inline int cpu_mmu_index (CPUSH4State *env, bool ifetch)
78
{
79
/* The instruction in a RTE delay slot is fetched in privileged
80
mode, but executed in user mode. */
81
- if (ifetch && (env->flags & DELAY_SLOT_RTE)) {
82
+ if (ifetch && (env->flags & TB_FLAG_DELAY_SLOT_RTE)) {
83
return 0;
84
} else {
85
return (env->sr & (1u << SR_MD)) == 0 ? 1 : 0;
86
@@ -XXX,XX +XXX,XX @@ static inline void cpu_get_tb_cpu_state(CPUSH4State *env, target_ulong *pc,
87
{
88
*pc = env->pc;
89
/* For a gUSA region, notice the end of the region. */
90
- *cs_base = env->flags & GUSA_MASK ? env->gregs[0] : 0;
91
- *flags = env->flags /* TB_FLAG_ENVFLAGS_MASK: bits 0-2, 4-12 */
92
- | (env->fpscr & (FPSCR_FR | FPSCR_SZ | FPSCR_PR)) /* Bits 19-21 */
93
- | (env->sr & ((1u << SR_MD) | (1u << SR_RB))) /* Bits 29-30 */
94
- | (env->sr & (1u << SR_FD)) /* Bit 15 */
95
+ *cs_base = env->flags & TB_FLAG_GUSA_MASK ? env->gregs[0] : 0;
96
+ *flags = env->flags
97
+ | (env->fpscr & TB_FLAG_FPSCR_MASK)
98
+ | (env->sr & TB_FLAG_SR_MASK)
99
| (env->movcal_backup ? TB_FLAG_PENDING_MOVCA : 0); /* Bit 3 */
100
#ifdef CONFIG_USER_ONLY
101
*flags |= TB_FLAG_UNALIGN * !env_cpu(env)->prctl_unalign_sigbus;
102
diff --git a/linux-user/sh4/signal.c b/linux-user/sh4/signal.c
103
index XXXXXXX..XXXXXXX 100644
104
--- a/linux-user/sh4/signal.c
105
+++ b/linux-user/sh4/signal.c
106
@@ -XXX,XX +XXX,XX @@ static void restore_sigcontext(CPUSH4State *regs, struct target_sigcontext *sc)
107
__get_user(regs->fpul, &sc->sc_fpul);
108
109
regs->tra = -1; /* disable syscall checks */
110
- regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
111
+ regs->flags = 0;
112
}
25
}
113
26
diff --git a/target/mips/tcg/micromips_translate.c.inc b/target/mips/tcg/micromips_translate.c.inc
114
void setup_frame(int sig, struct target_sigaction *ka,
27
index XXXXXXX..XXXXXXX 100644
115
@@ -XXX,XX +XXX,XX @@ void setup_frame(int sig, struct target_sigaction *ka,
28
--- a/target/mips/tcg/micromips_translate.c.inc
116
regs->gregs[5] = 0;
29
+++ b/target/mips/tcg/micromips_translate.c.inc
117
regs->gregs[6] = frame_addr += offsetof(typeof(*frame), sc);
30
@@ -XXX,XX +XXX,XX @@ static void gen_ldst_pair(DisasContext *ctx, uint32_t opc, int rd,
118
regs->pc = (unsigned long) ka->_sa_handler;
31
gen_reserved_instruction(ctx);
119
- regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
120
+ regs->flags &= ~(TB_FLAG_DELAY_SLOT_MASK | TB_FLAG_GUSA_MASK);
121
122
unlock_user_struct(frame, frame_addr, 1);
123
return;
124
@@ -XXX,XX +XXX,XX @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
125
regs->gregs[5] = frame_addr + offsetof(typeof(*frame), info);
126
regs->gregs[6] = frame_addr + offsetof(typeof(*frame), uc);
127
regs->pc = (unsigned long) ka->_sa_handler;
128
- regs->flags &= ~(DELAY_SLOT_MASK | GUSA_MASK);
129
+ regs->flags &= ~(TB_FLAG_DELAY_SLOT_MASK | TB_FLAG_GUSA_MASK);
130
131
unlock_user_struct(frame, frame_addr, 1);
132
return;
133
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
134
index XXXXXXX..XXXXXXX 100644
135
--- a/target/sh4/cpu.c
136
+++ b/target/sh4/cpu.c
137
@@ -XXX,XX +XXX,XX @@ static void superh_cpu_synchronize_from_tb(CPUState *cs,
138
SuperHCPU *cpu = SUPERH_CPU(cs);
139
140
cpu->env.pc = tb_pc(tb);
141
- cpu->env.flags = tb->flags & TB_FLAG_ENVFLAGS_MASK;
142
+ cpu->env.flags = tb->flags;
143
}
144
145
#ifndef CONFIG_USER_ONLY
146
@@ -XXX,XX +XXX,XX @@ static bool superh_io_recompile_replay_branch(CPUState *cs,
147
SuperHCPU *cpu = SUPERH_CPU(cs);
148
CPUSH4State *env = &cpu->env;
149
150
- if ((env->flags & ((DELAY_SLOT | DELAY_SLOT_CONDITIONAL))) != 0
151
+ if ((env->flags & (TB_FLAG_DELAY_SLOT | TB_FLAG_DELAY_SLOT_COND))
152
&& env->pc != tb_pc(tb)) {
153
env->pc -= 2;
154
- env->flags &= ~(DELAY_SLOT | DELAY_SLOT_CONDITIONAL);
155
+ env->flags &= ~(TB_FLAG_DELAY_SLOT | TB_FLAG_DELAY_SLOT_COND);
156
return true;
157
}
158
return false;
159
diff --git a/target/sh4/helper.c b/target/sh4/helper.c
160
index XXXXXXX..XXXXXXX 100644
161
--- a/target/sh4/helper.c
162
+++ b/target/sh4/helper.c
163
@@ -XXX,XX +XXX,XX @@ void superh_cpu_do_interrupt(CPUState *cs)
164
env->sr |= (1u << SR_BL) | (1u << SR_MD) | (1u << SR_RB);
165
env->lock_addr = -1;
166
167
- if (env->flags & DELAY_SLOT_MASK) {
168
+ if (env->flags & TB_FLAG_DELAY_SLOT_MASK) {
169
/* Branch instruction should be executed again before delay slot. */
170
    env->spc -= 2;
171
    /* Clear flags for exception/interrupt routine. */
172
- env->flags &= ~DELAY_SLOT_MASK;
173
+ env->flags &= ~TB_FLAG_DELAY_SLOT_MASK;
174
}
175
176
if (do_exp) {
177
@@ -XXX,XX +XXX,XX @@ bool superh_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
178
CPUSH4State *env = &cpu->env;
179
180
/* Delay slots are indivisible, ignore interrupts */
181
- if (env->flags & DELAY_SLOT_MASK) {
182
+ if (env->flags & TB_FLAG_DELAY_SLOT_MASK) {
183
return false;
184
} else {
185
superh_cpu_do_interrupt(cs);
186
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
187
index XXXXXXX..XXXXXXX 100644
188
--- a/target/sh4/translate.c
189
+++ b/target/sh4/translate.c
190
@@ -XXX,XX +XXX,XX @@ void superh_cpu_dump_state(CPUState *cs, FILE *f, int flags)
191
         i, env->gregs[i], i + 1, env->gregs[i + 1],
192
         i + 2, env->gregs[i + 2], i + 3, env->gregs[i + 3]);
193
}
194
- if (env->flags & DELAY_SLOT) {
195
+ if (env->flags & TB_FLAG_DELAY_SLOT) {
196
qemu_printf("in delay slot (delayed_pc=0x%08x)\n",
197
         env->delayed_pc);
198
- } else if (env->flags & DELAY_SLOT_CONDITIONAL) {
199
+ } else if (env->flags & TB_FLAG_DELAY_SLOT_COND) {
200
qemu_printf("in conditional delay slot (delayed_pc=0x%08x)\n",
201
         env->delayed_pc);
202
- } else if (env->flags & DELAY_SLOT_RTE) {
203
+ } else if (env->flags & TB_FLAG_DELAY_SLOT_RTE) {
204
qemu_fprintf(f, "in rte delay slot (delayed_pc=0x%08x)\n",
205
env->delayed_pc);
206
}
207
@@ -XXX,XX +XXX,XX @@ static inline void gen_save_cpu_state(DisasContext *ctx, bool save_pc)
208
209
static inline bool use_exit_tb(DisasContext *ctx)
210
{
211
- return (ctx->tbflags & GUSA_EXCLUSIVE) != 0;
212
+ return (ctx->tbflags & TB_FLAG_GUSA_EXCLUSIVE) != 0;
213
}
214
215
static bool use_goto_tb(DisasContext *ctx, target_ulong dest)
216
@@ -XXX,XX +XXX,XX @@ static void gen_conditional_jump(DisasContext *ctx, target_ulong dest,
217
TCGLabel *l1 = gen_new_label();
218
TCGCond cond_not_taken = jump_if_true ? TCG_COND_EQ : TCG_COND_NE;
219
220
- if (ctx->tbflags & GUSA_EXCLUSIVE) {
221
+ if (ctx->tbflags & TB_FLAG_GUSA_EXCLUSIVE) {
222
/* When in an exclusive region, we must continue to the end.
223
Therefore, exit the region on a taken branch, but otherwise
224
fall through to the next instruction. */
225
tcg_gen_brcondi_i32(cond_not_taken, cpu_sr_t, 0, l1);
226
- tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
227
+ tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~TB_FLAG_GUSA_MASK);
228
/* Note that this won't actually use a goto_tb opcode because we
229
disallow it in use_goto_tb, but it handles exit + singlestep. */
230
gen_goto_tb(ctx, 0, dest);
231
@@ -XXX,XX +XXX,XX @@ static void gen_delayed_conditional_jump(DisasContext * ctx)
232
tcg_gen_mov_i32(ds, cpu_delayed_cond);
233
tcg_gen_discard_i32(cpu_delayed_cond);
234
235
- if (ctx->tbflags & GUSA_EXCLUSIVE) {
236
+ if (ctx->tbflags & TB_FLAG_GUSA_EXCLUSIVE) {
237
/* When in an exclusive region, we must continue to the end.
238
Therefore, exit the region on a taken branch, but otherwise
239
fall through to the next instruction. */
240
tcg_gen_brcondi_i32(TCG_COND_EQ, ds, 0, l1);
241
242
/* Leave the gUSA region. */
243
- tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~GUSA_MASK);
244
+ tcg_gen_movi_i32(cpu_flags, ctx->envflags & ~TB_FLAG_GUSA_MASK);
245
gen_jump(ctx);
246
247
gen_set_label(l1);
248
@@ -XXX,XX +XXX,XX @@ static inline void gen_store_fpr64(DisasContext *ctx, TCGv_i64 t, int reg)
249
#define XHACK(x) ((((x) & 1 ) << 4) | ((x) & 0xe))
250
251
#define CHECK_NOT_DELAY_SLOT \
252
- if (ctx->envflags & DELAY_SLOT_MASK) { \
253
- goto do_illegal_slot; \
254
+ if (ctx->envflags & TB_FLAG_DELAY_SLOT_MASK) { \
255
+ goto do_illegal_slot; \
256
}
257
258
#define CHECK_PRIVILEGED \
259
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
260
case 0x000b:        /* rts */
261
    CHECK_NOT_DELAY_SLOT
262
    tcg_gen_mov_i32(cpu_delayed_pc, cpu_pr);
263
- ctx->envflags |= DELAY_SLOT;
264
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
265
    ctx->delayed_pc = (uint32_t) - 1;
266
    return;
267
case 0x0028:        /* clrmac */
268
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
269
    CHECK_NOT_DELAY_SLOT
270
gen_write_sr(cpu_ssr);
271
    tcg_gen_mov_i32(cpu_delayed_pc, cpu_spc);
272
- ctx->envflags |= DELAY_SLOT_RTE;
273
+ ctx->envflags |= TB_FLAG_DELAY_SLOT_RTE;
274
    ctx->delayed_pc = (uint32_t) - 1;
275
ctx->base.is_jmp = DISAS_STOP;
276
    return;
277
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
278
    return;
279
case 0xe000:        /* mov #imm,Rn */
280
#ifdef CONFIG_USER_ONLY
281
- /* Detect the start of a gUSA region. If so, update envflags
282
- and end the TB. This will allow us to see the end of the
283
- region (stored in R0) in the next TB. */
284
+ /*
285
+ * Detect the start of a gUSA region (mov #-n, r15).
286
+ * If so, update envflags and end the TB. This will allow us
287
+ * to see the end of the region (stored in R0) in the next TB.
288
+ */
289
if (B11_8 == 15 && B7_0s < 0 &&
290
(tb_cflags(ctx->base.tb) & CF_PARALLEL)) {
291
- ctx->envflags = deposit32(ctx->envflags, GUSA_SHIFT, 8, B7_0s);
292
+ ctx->envflags =
293
+ deposit32(ctx->envflags, TB_FLAG_GUSA_SHIFT, 8, B7_0s);
294
ctx->base.is_jmp = DISAS_STOP;
295
}
296
#endif
297
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
298
case 0xa000:        /* bra disp */
299
    CHECK_NOT_DELAY_SLOT
300
ctx->delayed_pc = ctx->base.pc_next + 4 + B11_0s * 2;
301
- ctx->envflags |= DELAY_SLOT;
302
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
303
    return;
304
case 0xb000:        /* bsr disp */
305
    CHECK_NOT_DELAY_SLOT
306
tcg_gen_movi_i32(cpu_pr, ctx->base.pc_next + 4);
307
ctx->delayed_pc = ctx->base.pc_next + 4 + B11_0s * 2;
308
- ctx->envflags |= DELAY_SLOT;
309
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
310
    return;
311
}
312
313
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
314
    CHECK_NOT_DELAY_SLOT
315
tcg_gen_xori_i32(cpu_delayed_cond, cpu_sr_t, 1);
316
ctx->delayed_pc = ctx->base.pc_next + 4 + B7_0s * 2;
317
- ctx->envflags |= DELAY_SLOT_CONDITIONAL;
318
+ ctx->envflags |= TB_FLAG_DELAY_SLOT_COND;
319
    return;
320
case 0x8900:        /* bt label */
321
    CHECK_NOT_DELAY_SLOT
322
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
323
    CHECK_NOT_DELAY_SLOT
324
tcg_gen_mov_i32(cpu_delayed_cond, cpu_sr_t);
325
ctx->delayed_pc = ctx->base.pc_next + 4 + B7_0s * 2;
326
- ctx->envflags |= DELAY_SLOT_CONDITIONAL;
327
+ ctx->envflags |= TB_FLAG_DELAY_SLOT_COND;
328
    return;
329
case 0x8800:        /* cmp/eq #imm,R0 */
330
tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, REG(0), B7_0s);
331
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
332
case 0x0023:        /* braf Rn */
333
    CHECK_NOT_DELAY_SLOT
334
tcg_gen_addi_i32(cpu_delayed_pc, REG(B11_8), ctx->base.pc_next + 4);
335
- ctx->envflags |= DELAY_SLOT;
336
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
337
    ctx->delayed_pc = (uint32_t) - 1;
338
    return;
339
case 0x0003:        /* bsrf Rn */
340
    CHECK_NOT_DELAY_SLOT
341
tcg_gen_movi_i32(cpu_pr, ctx->base.pc_next + 4);
342
    tcg_gen_add_i32(cpu_delayed_pc, REG(B11_8), cpu_pr);
343
- ctx->envflags |= DELAY_SLOT;
344
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
345
    ctx->delayed_pc = (uint32_t) - 1;
346
    return;
347
case 0x4015:        /* cmp/pl Rn */
348
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
349
case 0x402b:        /* jmp @Rn */
350
    CHECK_NOT_DELAY_SLOT
351
    tcg_gen_mov_i32(cpu_delayed_pc, REG(B11_8));
352
- ctx->envflags |= DELAY_SLOT;
353
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
354
    ctx->delayed_pc = (uint32_t) - 1;
355
    return;
356
case 0x400b:        /* jsr @Rn */
357
    CHECK_NOT_DELAY_SLOT
358
tcg_gen_movi_i32(cpu_pr, ctx->base.pc_next + 4);
359
    tcg_gen_mov_i32(cpu_delayed_pc, REG(B11_8));
360
- ctx->envflags |= DELAY_SLOT;
361
+ ctx->envflags |= TB_FLAG_DELAY_SLOT;
362
    ctx->delayed_pc = (uint32_t) - 1;
363
    return;
364
case 0x400e:        /* ldc Rm,SR */
365
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
366
fflush(stderr);
367
#endif
368
do_illegal:
369
- if (ctx->envflags & DELAY_SLOT_MASK) {
370
+ if (ctx->envflags & TB_FLAG_DELAY_SLOT_MASK) {
371
do_illegal_slot:
372
gen_save_cpu_state(ctx, true);
373
gen_helper_raise_slot_illegal_instruction(cpu_env);
374
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
375
376
do_fpu_disabled:
377
gen_save_cpu_state(ctx, true);
378
- if (ctx->envflags & DELAY_SLOT_MASK) {
379
+ if (ctx->envflags & TB_FLAG_DELAY_SLOT_MASK) {
380
gen_helper_raise_slot_fpu_disable(cpu_env);
381
} else {
382
gen_helper_raise_fpu_disable(cpu_env);
383
@@ -XXX,XX +XXX,XX @@ static void decode_opc(DisasContext * ctx)
384
385
_decode_opc(ctx);
386
387
- if (old_flags & DELAY_SLOT_MASK) {
388
+ if (old_flags & TB_FLAG_DELAY_SLOT_MASK) {
389
/* go out of the delay slot */
390
- ctx->envflags &= ~DELAY_SLOT_MASK;
391
+ ctx->envflags &= ~TB_FLAG_DELAY_SLOT_MASK;
392
393
/* When in an exclusive region, we must continue to the end
394
for conditional branches. */
395
- if (ctx->tbflags & GUSA_EXCLUSIVE
396
- && old_flags & DELAY_SLOT_CONDITIONAL) {
397
+ if (ctx->tbflags & TB_FLAG_GUSA_EXCLUSIVE
398
+ && old_flags & TB_FLAG_DELAY_SLOT_COND) {
399
gen_delayed_conditional_jump(ctx);
400
return;
32
return;
401
}
33
}
402
/* Otherwise this is probably an invalid gUSA region.
34
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL);
403
Drop the GUSA bits so the next TB doesn't see them. */
35
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL |
404
- ctx->envflags &= ~GUSA_MASK;
36
+ ctx->default_tcg_memop_mask);
405
+ ctx->envflags &= ~TB_FLAG_GUSA_MASK;
37
gen_store_gpr(t1, rd);
406
38
tcg_gen_movi_tl(t1, 4);
407
tcg_gen_movi_i32(cpu_flags, ctx->envflags);
39
gen_op_addr_add(ctx, t0, t0, t1);
408
- if (old_flags & DELAY_SLOT_CONDITIONAL) {
40
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL);
409
+ if (old_flags & TB_FLAG_DELAY_SLOT_COND) {
41
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL |
410
     gen_delayed_conditional_jump(ctx);
42
+ ctx->default_tcg_memop_mask);
411
} else {
43
gen_store_gpr(t1, rd + 1);
412
gen_jump(ctx);
44
break;
413
@@ -XXX,XX +XXX,XX @@ static void decode_gusa(DisasContext *ctx, CPUSH4State *env)
45
case SWP:
414
}
46
gen_load_gpr(t1, rd);
415
47
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
416
/* The entire region has been translated. */
48
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
417
- ctx->envflags &= ~GUSA_MASK;
49
+ ctx->default_tcg_memop_mask);
418
+ ctx->envflags &= ~TB_FLAG_GUSA_MASK;
50
tcg_gen_movi_tl(t1, 4);
419
ctx->base.pc_next = pc_end;
51
gen_op_addr_add(ctx, t0, t0, t1);
420
ctx->base.num_insns += max_insns - 1;
52
gen_load_gpr(t1, rd + 1);
421
return;
53
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
422
@@ -XXX,XX +XXX,XX @@ static void decode_gusa(DisasContext *ctx, CPUSH4State *env)
54
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
423
55
+ ctx->default_tcg_memop_mask);
424
/* Restart with the EXCLUSIVE bit set, within a TB run via
56
break;
425
cpu_exec_step_atomic holding the exclusive lock. */
57
#ifdef TARGET_MIPS64
426
- ctx->envflags |= GUSA_EXCLUSIVE;
58
case LDP:
427
+ ctx->envflags |= TB_FLAG_GUSA_EXCLUSIVE;
59
@@ -XXX,XX +XXX,XX @@ static void gen_ldst_pair(DisasContext *ctx, uint32_t opc, int rd,
428
gen_save_cpu_state(ctx, false);
60
gen_reserved_instruction(ctx);
429
gen_helper_exclusive(cpu_env);
430
ctx->base.is_jmp = DISAS_NORETURN;
431
@@ -XXX,XX +XXX,XX @@ static void sh4_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
432
(tbflags & (1 << SR_RB))) * 0x10;
433
ctx->fbank = tbflags & FPSCR_FR ? 0x10 : 0;
434
435
- if (tbflags & GUSA_MASK) {
436
+#ifdef CONFIG_USER_ONLY
437
+ if (tbflags & TB_FLAG_GUSA_MASK) {
438
+ /* In gUSA exclusive region. */
439
uint32_t pc = ctx->base.pc_next;
440
uint32_t pc_end = ctx->base.tb->cs_base;
441
- int backup = sextract32(ctx->tbflags, GUSA_SHIFT, 8);
442
+ int backup = sextract32(ctx->tbflags, TB_FLAG_GUSA_SHIFT, 8);
443
int max_insns = (pc_end - pc) / 2;
444
445
if (pc != pc_end + backup || max_insns < 2) {
446
/* This is a malformed gUSA region. Don't do anything special,
447
since the interpreter is likely to get confused. */
448
- ctx->envflags &= ~GUSA_MASK;
449
- } else if (tbflags & GUSA_EXCLUSIVE) {
450
+ ctx->envflags &= ~TB_FLAG_GUSA_MASK;
451
+ } else if (tbflags & TB_FLAG_GUSA_EXCLUSIVE) {
452
/* Regardless of single-stepping or the end of the page,
453
we must complete execution of the gUSA region while
454
holding the exclusive lock. */
455
@@ -XXX,XX +XXX,XX @@ static void sh4_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
456
return;
61
return;
457
}
62
}
63
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
64
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
65
+ ctx->default_tcg_memop_mask);
66
gen_store_gpr(t1, rd);
67
tcg_gen_movi_tl(t1, 8);
68
gen_op_addr_add(ctx, t0, t0, t1);
69
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
70
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
71
+ ctx->default_tcg_memop_mask);
72
gen_store_gpr(t1, rd + 1);
73
break;
74
case SDP:
75
gen_load_gpr(t1, rd);
76
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
77
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
78
+ ctx->default_tcg_memop_mask);
79
tcg_gen_movi_tl(t1, 8);
80
gen_op_addr_add(ctx, t0, t0, t1);
81
gen_load_gpr(t1, rd + 1);
82
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ);
83
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUQ |
84
+ ctx->default_tcg_memop_mask);
85
break;
86
#endif
458
}
87
}
459
+#endif
88
diff --git a/target/mips/tcg/mips16e_translate.c.inc b/target/mips/tcg/mips16e_translate.c.inc
460
89
index XXXXXXX..XXXXXXX 100644
461
/* Since the ISA is fixed-width, we can bound by the number
90
--- a/target/mips/tcg/mips16e_translate.c.inc
462
of instructions remaining on the page. */
91
+++ b/target/mips/tcg/mips16e_translate.c.inc
463
@@ -XXX,XX +XXX,XX @@ static void sh4_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
92
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_save(DisasContext *ctx,
464
DisasContext *ctx = container_of(dcbase, DisasContext, base);
93
case 4:
465
94
gen_base_offset_addr(ctx, t0, 29, 12);
466
#ifdef CONFIG_USER_ONLY
95
gen_load_gpr(t1, 7);
467
- if (unlikely(ctx->envflags & GUSA_MASK)
96
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
468
- && !(ctx->envflags & GUSA_EXCLUSIVE)) {
97
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
469
+ if (unlikely(ctx->envflags & TB_FLAG_GUSA_MASK)
98
+ ctx->default_tcg_memop_mask);
470
+ && !(ctx->envflags & TB_FLAG_GUSA_EXCLUSIVE)) {
99
/* Fall through */
471
/* We're in an gUSA region, and we have not already fallen
100
case 3:
472
back on using an exclusive region. Attempt to parse the
101
gen_base_offset_addr(ctx, t0, 29, 8);
473
region into a single supported atomic operation. Failure
102
gen_load_gpr(t1, 6);
474
@@ -XXX,XX +XXX,XX @@ static void sh4_tr_tb_stop(DisasContextBase *dcbase, CPUState *cs)
103
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
475
{
104
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
476
DisasContext *ctx = container_of(dcbase, DisasContext, base);
105
+ ctx->default_tcg_memop_mask);
477
106
/* Fall through */
478
- if (ctx->tbflags & GUSA_EXCLUSIVE) {
107
case 2:
479
+ if (ctx->tbflags & TB_FLAG_GUSA_EXCLUSIVE) {
108
gen_base_offset_addr(ctx, t0, 29, 4);
480
/* Ending the region of exclusivity. Clear the bits. */
109
gen_load_gpr(t1, 5);
481
- ctx->envflags &= ~GUSA_MASK;
110
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
482
+ ctx->envflags &= ~TB_FLAG_GUSA_MASK;
111
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
112
+ ctx->default_tcg_memop_mask);
113
/* Fall through */
114
case 1:
115
gen_base_offset_addr(ctx, t0, 29, 0);
116
gen_load_gpr(t1, 4);
117
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
118
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL |
119
+ ctx->default_tcg_memop_mask);
483
}
120
}
484
121
485
switch (ctx->base.is_jmp) {
122
gen_load_gpr(t0, 29);
123
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_save(DisasContext *ctx,
124
tcg_gen_movi_tl(t2, -4); \
125
gen_op_addr_add(ctx, t0, t0, t2); \
126
gen_load_gpr(t1, reg); \
127
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL); \
128
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL | \
129
+ ctx->default_tcg_memop_mask); \
130
} while (0)
131
132
if (do_ra) {
133
@@ -XXX,XX +XXX,XX @@ static void gen_mips16_restore(DisasContext *ctx,
134
#define DECR_AND_LOAD(reg) do { \
135
tcg_gen_movi_tl(t2, -4); \
136
gen_op_addr_add(ctx, t0, t0, t2); \
137
- tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL); \
138
+ tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, MO_TESL | \
139
+ ctx->default_tcg_memop_mask); \
140
gen_store_gpr(t1, reg); \
141
} while (0)
142
143
diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
144
index XXXXXXX..XXXXXXX 100644
145
--- a/target/mips/tcg/nanomips_translate.c.inc
146
+++ b/target/mips/tcg/nanomips_translate.c.inc
147
@@ -XXX,XX +XXX,XX @@ static void gen_p_lsx(DisasContext *ctx, int rd, int rs, int rt)
148
149
switch (extract32(ctx->opcode, 7, 4)) {
150
case NM_LBX:
151
- tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
152
- MO_SB);
153
+ tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_SB);
154
gen_store_gpr(t0, rd);
155
break;
156
case NM_LHX:
157
/*case NM_LHXS:*/
158
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
159
- MO_TESW);
160
+ MO_TESW | ctx->default_tcg_memop_mask);
161
gen_store_gpr(t0, rd);
162
break;
163
case NM_LWX:
164
/*case NM_LWXS:*/
165
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
166
- MO_TESL);
167
+ MO_TESL | ctx->default_tcg_memop_mask);
168
gen_store_gpr(t0, rd);
169
break;
170
case NM_LBUX:
171
- tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
172
- MO_UB);
173
+ tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx, MO_UB);
174
gen_store_gpr(t0, rd);
175
break;
176
case NM_LHUX:
177
/*case NM_LHUXS:*/
178
tcg_gen_qemu_ld_tl(t0, t0, ctx->mem_idx,
179
- MO_TEUW);
180
+ MO_TEUW | ctx->default_tcg_memop_mask);
181
gen_store_gpr(t0, rd);
182
break;
183
case NM_SBX:
184
check_nms(ctx);
185
gen_load_gpr(t1, rd);
186
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
187
- MO_8);
188
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_8);
189
break;
190
case NM_SHX:
191
/*case NM_SHXS:*/
192
check_nms(ctx);
193
gen_load_gpr(t1, rd);
194
tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
195
- MO_TEUW);
196
+ MO_TEUW | ctx->default_tcg_memop_mask);
197
break;
198
case NM_SWX:
199
/*case NM_SWXS:*/
200
check_nms(ctx);
201
gen_load_gpr(t1, rd);
202
tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
203
- MO_TEUL);
204
+ MO_TEUL | ctx->default_tcg_memop_mask);
205
break;
206
case NM_LWC1X:
207
/*case NM_LWC1XS:*/
208
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
209
addr_off);
210
211
tcg_gen_movi_tl(t0, addr);
212
- tcg_gen_qemu_ld_tl(cpu_gpr[rt], t0, ctx->mem_idx, MO_TESL);
213
+ tcg_gen_qemu_ld_tl(cpu_gpr[rt], t0, ctx->mem_idx,
214
+ MO_TESL | ctx->default_tcg_memop_mask);
215
}
216
break;
217
case NM_SWPC48:
218
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
219
tcg_gen_movi_tl(t0, addr);
220
gen_load_gpr(t1, rt);
221
222
- tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx, MO_TEUL);
223
+ tcg_gen_qemu_st_tl(t1, t0, ctx->mem_idx,
224
+ MO_TEUL | ctx->default_tcg_memop_mask);
225
}
226
break;
227
default:
486
--
228
--
487
2.34.1
229
2.34.1
New patch
1
The opposite of MO_UNALN is MO_ALIGN.
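To spell out the consequence (a sketch only; "unaligned_encoding" is a stand-in for the NM_P_LS_UAWM check in the hunk below): once MO_ALIGN is no longer an implicit target-wide default, the aligned case has to be requested explicitly rather than left as 0.

    /* Sketch: choose the alignment flag explicitly.  Once MO_UNALN
     * becomes 0 for all targets, a bare 0 would silently mean
     * "unaligned", so the aligned leg must say MO_ALIGN. */
    MemOp memop = unaligned_encoding ? MO_UNALN : MO_ALIGN;
    tcg_gen_qemu_ld_tl(t0, addr, ctx->mem_idx, MO_TEUL | memop);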
1
2
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
---
6
target/mips/tcg/nanomips_translate.c.inc | 2 +-
7
1 file changed, 1 insertion(+), 1 deletion(-)
8
9
diff --git a/target/mips/tcg/nanomips_translate.c.inc b/target/mips/tcg/nanomips_translate.c.inc
10
index XXXXXXX..XXXXXXX 100644
11
--- a/target/mips/tcg/nanomips_translate.c.inc
12
+++ b/target/mips/tcg/nanomips_translate.c.inc
13
@@ -XXX,XX +XXX,XX @@ static int decode_nanomips_32_48_opc(CPUMIPSState *env, DisasContext *ctx)
14
TCGv va = tcg_temp_new();
15
TCGv t1 = tcg_temp_new();
16
MemOp memop = (extract32(ctx->opcode, 8, 3)) ==
17
- NM_P_LS_UAWM ? MO_UNALN : 0;
18
+ NM_P_LS_UAWM ? MO_UNALN : MO_ALIGN;
19
20
count = (count == 0) ? 8 : count;
21
while (counter != count) {
22
--
23
2.34.1
24
25
New patch
1
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2
---
3
configs/targets/mips-linux-user.mak | 1 -
4
configs/targets/mips-softmmu.mak | 1 -
5
configs/targets/mips64-linux-user.mak | 1 -
6
configs/targets/mips64-softmmu.mak | 1 -
7
configs/targets/mips64el-linux-user.mak | 1 -
8
configs/targets/mips64el-softmmu.mak | 1 -
9
configs/targets/mipsel-linux-user.mak | 1 -
10
configs/targets/mipsel-softmmu.mak | 1 -
11
configs/targets/mipsn32-linux-user.mak | 1 -
12
configs/targets/mipsn32el-linux-user.mak | 1 -
13
10 files changed, 10 deletions(-)
1
14
15
diff --git a/configs/targets/mips-linux-user.mak b/configs/targets/mips-linux-user.mak
16
index XXXXXXX..XXXXXXX 100644
17
--- a/configs/targets/mips-linux-user.mak
18
+++ b/configs/targets/mips-linux-user.mak
19
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=mips
20
TARGET_ABI_MIPSO32=y
21
TARGET_SYSTBL_ABI=o32
22
TARGET_SYSTBL=syscall_o32.tbl
23
-TARGET_ALIGNED_ONLY=y
24
TARGET_BIG_ENDIAN=y
25
diff --git a/configs/targets/mips-softmmu.mak b/configs/targets/mips-softmmu.mak
26
index XXXXXXX..XXXXXXX 100644
27
--- a/configs/targets/mips-softmmu.mak
28
+++ b/configs/targets/mips-softmmu.mak
29
@@ -XXX,XX +XXX,XX @@
30
TARGET_ARCH=mips
31
-TARGET_ALIGNED_ONLY=y
32
TARGET_BIG_ENDIAN=y
33
TARGET_SUPPORTS_MTTCG=y
34
diff --git a/configs/targets/mips64-linux-user.mak b/configs/targets/mips64-linux-user.mak
35
index XXXXXXX..XXXXXXX 100644
36
--- a/configs/targets/mips64-linux-user.mak
37
+++ b/configs/targets/mips64-linux-user.mak
38
@@ -XXX,XX +XXX,XX @@ TARGET_ABI_MIPSN64=y
39
TARGET_BASE_ARCH=mips
40
TARGET_SYSTBL_ABI=n64
41
TARGET_SYSTBL=syscall_n64.tbl
42
-TARGET_ALIGNED_ONLY=y
43
TARGET_BIG_ENDIAN=y
44
diff --git a/configs/targets/mips64-softmmu.mak b/configs/targets/mips64-softmmu.mak
45
index XXXXXXX..XXXXXXX 100644
46
--- a/configs/targets/mips64-softmmu.mak
47
+++ b/configs/targets/mips64-softmmu.mak
48
@@ -XXX,XX +XXX,XX @@
49
TARGET_ARCH=mips64
50
TARGET_BASE_ARCH=mips
51
-TARGET_ALIGNED_ONLY=y
52
TARGET_BIG_ENDIAN=y
53
diff --git a/configs/targets/mips64el-linux-user.mak b/configs/targets/mips64el-linux-user.mak
54
index XXXXXXX..XXXXXXX 100644
55
--- a/configs/targets/mips64el-linux-user.mak
56
+++ b/configs/targets/mips64el-linux-user.mak
57
@@ -XXX,XX +XXX,XX @@ TARGET_ABI_MIPSN64=y
58
TARGET_BASE_ARCH=mips
59
TARGET_SYSTBL_ABI=n64
60
TARGET_SYSTBL=syscall_n64.tbl
61
-TARGET_ALIGNED_ONLY=y
62
diff --git a/configs/targets/mips64el-softmmu.mak b/configs/targets/mips64el-softmmu.mak
63
index XXXXXXX..XXXXXXX 100644
64
--- a/configs/targets/mips64el-softmmu.mak
65
+++ b/configs/targets/mips64el-softmmu.mak
66
@@ -XXX,XX +XXX,XX @@
67
TARGET_ARCH=mips64
68
TARGET_BASE_ARCH=mips
69
-TARGET_ALIGNED_ONLY=y
70
TARGET_NEED_FDT=y
71
diff --git a/configs/targets/mipsel-linux-user.mak b/configs/targets/mipsel-linux-user.mak
72
index XXXXXXX..XXXXXXX 100644
73
--- a/configs/targets/mipsel-linux-user.mak
74
+++ b/configs/targets/mipsel-linux-user.mak
75
@@ -XXX,XX +XXX,XX @@ TARGET_ARCH=mips
76
TARGET_ABI_MIPSO32=y
77
TARGET_SYSTBL_ABI=o32
78
TARGET_SYSTBL=syscall_o32.tbl
79
-TARGET_ALIGNED_ONLY=y
80
diff --git a/configs/targets/mipsel-softmmu.mak b/configs/targets/mipsel-softmmu.mak
81
index XXXXXXX..XXXXXXX 100644
82
--- a/configs/targets/mipsel-softmmu.mak
83
+++ b/configs/targets/mipsel-softmmu.mak
84
@@ -XXX,XX +XXX,XX @@
85
TARGET_ARCH=mips
86
-TARGET_ALIGNED_ONLY=y
87
TARGET_SUPPORTS_MTTCG=y
88
diff --git a/configs/targets/mipsn32-linux-user.mak b/configs/targets/mipsn32-linux-user.mak
89
index XXXXXXX..XXXXXXX 100644
90
--- a/configs/targets/mipsn32-linux-user.mak
91
+++ b/configs/targets/mipsn32-linux-user.mak
92
@@ -XXX,XX +XXX,XX @@ TARGET_ABI32=y
93
TARGET_BASE_ARCH=mips
94
TARGET_SYSTBL_ABI=n32
95
TARGET_SYSTBL=syscall_n32.tbl
96
-TARGET_ALIGNED_ONLY=y
97
TARGET_BIG_ENDIAN=y
98
diff --git a/configs/targets/mipsn32el-linux-user.mak b/configs/targets/mipsn32el-linux-user.mak
99
index XXXXXXX..XXXXXXX 100644
100
--- a/configs/targets/mipsn32el-linux-user.mak
101
+++ b/configs/targets/mipsn32el-linux-user.mak
102
@@ -XXX,XX +XXX,XX @@ TARGET_ABI32=y
103
TARGET_BASE_ARCH=mips
104
TARGET_SYSTBL_ABI=n32
105
TARGET_SYSTBL=syscall_n32.tbl
106
-TARGET_ALIGNED_ONLY=y
107
--
108
2.34.1
1
From: Alex Bennée <alex.bennee@linaro.org>
1
In gen_ldx/gen_stx, the only two locations for memory operations,
2
mark the operation as either aligned (softmmu) or unaligned
3
(user-only, as if emulated by the kernel).
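The policy, condensed into a sketch of the pattern used in the gen_ldx/gen_stx hunks below ('flags', 'data', 'addr' and 'dc' come from the surrounding translator function):

    #ifdef CONFIG_USER_ONLY
        /* user-only: let unaligned accesses through, as the kernel
         * would emulate them */
        flags |= MO_UNALN;
    #else
        /* softmmu: enforce natural alignment and trap otherwise */
        flags |= MO_ALIGN;
    #endif
        tcg_gen_qemu_ld_tl(data, addr, dc->mem_idx, flags);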
2
4
3
Before: 35.912 s ± 0.168 s
5
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
After: 35.565 s ± 0.087 s
5
6
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
7
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
8
Message-Id: <20220811151413.3350684-5-alex.bennee@linaro.org>
9
Signed-off-by: Cédric Le Goater <clg@kaod.org>
10
Message-Id: <20220923084803.498337-5-clg@kaod.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
12
---
7
---
13
accel/tcg/cputlb.c | 15 ++++++---------
8
configs/targets/nios2-softmmu.mak | 1 -
14
1 file changed, 6 insertions(+), 9 deletions(-)
9
target/nios2/translate.c | 10 ++++++++++
10
2 files changed, 10 insertions(+), 1 deletion(-)
15
11
16
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
12
diff --git a/configs/targets/nios2-softmmu.mak b/configs/targets/nios2-softmmu.mak
17
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
18
--- a/accel/tcg/cputlb.c
14
--- a/configs/targets/nios2-softmmu.mak
19
+++ b/accel/tcg/cputlb.c
15
+++ b/configs/targets/nios2-softmmu.mak
20
@@ -XXX,XX +XXX,XX @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
16
@@ -XXX,XX +XXX,XX @@
21
static void tlb_fill(CPUState *cpu, target_ulong addr, int size,
17
TARGET_ARCH=nios2
22
MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
18
-TARGET_ALIGNED_ONLY=y
23
{
19
TARGET_NEED_FDT=y
24
- CPUClass *cc = CPU_GET_CLASS(cpu);
20
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
25
bool ok;
21
index XXXXXXX..XXXXXXX 100644
26
22
--- a/target/nios2/translate.c
27
/*
23
+++ b/target/nios2/translate.c
28
* This is not a probe, so only valid return is success; failure
24
@@ -XXX,XX +XXX,XX @@ static void gen_ldx(DisasContext *dc, uint32_t code, uint32_t flags)
29
* should result in exception + longjmp to the cpu loop.
25
TCGv data = dest_gpr(dc, instr.b);
30
*/
26
31
- ok = cc->tcg_ops->tlb_fill(cpu, addr, size,
27
tcg_gen_addi_tl(addr, load_gpr(dc, instr.a), instr.imm16.s);
32
- access_type, mmu_idx, false, retaddr);
28
+#ifdef CONFIG_USER_ONLY
33
+ ok = cpu->cc->tcg_ops->tlb_fill(cpu, addr, size,
29
+ flags |= MO_UNALN;
34
+ access_type, mmu_idx, false, retaddr);
30
+#else
35
assert(ok);
31
+ flags |= MO_ALIGN;
32
+#endif
33
tcg_gen_qemu_ld_tl(data, addr, dc->mem_idx, flags);
36
}
34
}
37
35
38
@@ -XXX,XX +XXX,XX @@ static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
36
@@ -XXX,XX +XXX,XX @@ static void gen_stx(DisasContext *dc, uint32_t code, uint32_t flags)
39
MMUAccessType access_type,
37
40
int mmu_idx, uintptr_t retaddr)
38
TCGv addr = tcg_temp_new();
41
{
39
tcg_gen_addi_tl(addr, load_gpr(dc, instr.a), instr.imm16.s);
42
- CPUClass *cc = CPU_GET_CLASS(cpu);
40
+#ifdef CONFIG_USER_ONLY
43
-
41
+ flags |= MO_UNALN;
44
- cc->tcg_ops->do_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr);
42
+#else
45
+ cpu->cc->tcg_ops->do_unaligned_access(cpu, addr, access_type,
43
+ flags |= MO_ALIGN;
46
+ mmu_idx, retaddr);
44
+#endif
45
tcg_gen_qemu_st_tl(val, addr, dc->mem_idx, flags);
47
}
46
}
48
47
49
static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr,
50
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
51
if (!tlb_hit_page(tlb_addr, page_addr)) {
52
if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
53
CPUState *cs = env_cpu(env);
54
- CPUClass *cc = CPU_GET_CLASS(cs);
55
56
- if (!cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
57
- mmu_idx, nonfault, retaddr)) {
58
+ if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
59
+ mmu_idx, nonfault, retaddr)) {
60
/* Non-faulting page table read failed. */
61
*phost = NULL;
62
return TLB_INVALID_MASK;
63
--
48
--
64
2.34.1
49
2.34.1
65
50
66
51
New patch
1
Mark all memory operations that are not already marked with UNALIGN.
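Put differently, sh4 previously relied on TARGET_ALIGNED_ONLY to make the alignment requirement implicit; after this change each access states it. A representative before/after, lifted from the hunks below:

    /* before: MO_ALIGN was implied by TARGET_ALIGNED_ONLY
     *     tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL);
     * after: the alignment requirement is explicit per access */
    tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL | MO_ALIGN);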
1
2
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
5
---
6
target/sh4/translate.c | 102 ++++++++++++++++++++++++++---------------
7
1 file changed, 66 insertions(+), 36 deletions(-)
8
9
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
10
index XXXXXXX..XXXXXXX 100644
11
--- a/target/sh4/translate.c
12
+++ b/target/sh4/translate.c
13
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
14
case 0x9000:        /* mov.w @(disp,PC),Rn */
15
    {
16
TCGv addr = tcg_constant_i32(ctx->base.pc_next + 4 + B7_0 * 2);
17
- tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESW);
18
+ tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
19
+ MO_TESW | MO_ALIGN);
20
    }
21
    return;
22
case 0xd000:        /* mov.l @(disp,PC),Rn */
23
    {
24
TCGv addr = tcg_constant_i32((ctx->base.pc_next + 4 + B7_0 * 4) & ~3);
25
- tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESL);
26
+ tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx,
27
+ MO_TESL | MO_ALIGN);
28
    }
29
    return;
30
case 0x7000:        /* add #imm,Rn */
31
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
32
    {
33
     TCGv arg0, arg1;
34
     arg0 = tcg_temp_new();
35
- tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, MO_TESL);
36
+ tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
37
+ MO_TESL | MO_ALIGN);
38
     arg1 = tcg_temp_new();
39
- tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, MO_TESL);
40
+ tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
41
+ MO_TESL | MO_ALIGN);
42
gen_helper_macl(cpu_env, arg0, arg1);
43
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
44
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
45
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
46
    {
47
     TCGv arg0, arg1;
48
     arg0 = tcg_temp_new();
49
- tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx, MO_TESL);
50
+ tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
51
+ MO_TESL | MO_ALIGN);
52
     arg1 = tcg_temp_new();
53
- tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx, MO_TESL);
54
+ tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
55
+ MO_TESL | MO_ALIGN);
56
gen_helper_macw(cpu_env, arg0, arg1);
57
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2);
58
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);
59
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
60
if (ctx->tbflags & FPSCR_SZ) {
61
TCGv_i64 fp = tcg_temp_new_i64();
62
gen_load_fpr64(ctx, fp, XHACK(B7_4));
63
- tcg_gen_qemu_st_i64(fp, REG(B11_8), ctx->memidx, MO_TEUQ);
64
+ tcg_gen_qemu_st_i64(fp, REG(B11_8), ctx->memidx,
65
+ MO_TEUQ | MO_ALIGN);
66
    } else {
67
- tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx, MO_TEUL);
68
+ tcg_gen_qemu_st_i32(FREG(B7_4), REG(B11_8), ctx->memidx,
69
+ MO_TEUL | MO_ALIGN);
70
    }
71
    return;
72
case 0xf008: /* fmov @Rm,{F,D,X}Rn - FPSCR: Nothing */
73
    CHECK_FPU_ENABLED
74
if (ctx->tbflags & FPSCR_SZ) {
75
TCGv_i64 fp = tcg_temp_new_i64();
76
- tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEUQ);
77
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx,
78
+ MO_TEUQ | MO_ALIGN);
79
gen_store_fpr64(ctx, fp, XHACK(B11_8));
80
    } else {
81
- tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
82
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx,
83
+ MO_TEUL | MO_ALIGN);
84
    }
85
    return;
86
case 0xf009: /* fmov @Rm+,{F,D,X}Rn - FPSCR: Nothing */
87
    CHECK_FPU_ENABLED
88
if (ctx->tbflags & FPSCR_SZ) {
89
TCGv_i64 fp = tcg_temp_new_i64();
90
- tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx, MO_TEUQ);
91
+ tcg_gen_qemu_ld_i64(fp, REG(B7_4), ctx->memidx,
92
+ MO_TEUQ | MO_ALIGN);
93
gen_store_fpr64(ctx, fp, XHACK(B11_8));
94
tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 8);
95
    } else {
96
- tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx, MO_TEUL);
97
+ tcg_gen_qemu_ld_i32(FREG(B11_8), REG(B7_4), ctx->memidx,
98
+ MO_TEUL | MO_ALIGN);
99
     tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 4);
100
    }
101
    return;
102
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
103
TCGv_i64 fp = tcg_temp_new_i64();
104
gen_load_fpr64(ctx, fp, XHACK(B7_4));
105
tcg_gen_subi_i32(addr, REG(B11_8), 8);
106
- tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEUQ);
107
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx,
108
+ MO_TEUQ | MO_ALIGN);
109
} else {
110
tcg_gen_subi_i32(addr, REG(B11_8), 4);
111
- tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
112
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx,
113
+ MO_TEUL | MO_ALIGN);
114
}
115
tcg_gen_mov_i32(REG(B11_8), addr);
116
}
117
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
118
     tcg_gen_add_i32(addr, REG(B7_4), REG(0));
119
if (ctx->tbflags & FPSCR_SZ) {
120
TCGv_i64 fp = tcg_temp_new_i64();
121
- tcg_gen_qemu_ld_i64(fp, addr, ctx->memidx, MO_TEUQ);
122
+ tcg_gen_qemu_ld_i64(fp, addr, ctx->memidx,
123
+ MO_TEUQ | MO_ALIGN);
124
gen_store_fpr64(ctx, fp, XHACK(B11_8));
125
     } else {
126
- tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx, MO_TEUL);
127
+ tcg_gen_qemu_ld_i32(FREG(B11_8), addr, ctx->memidx,
128
+ MO_TEUL | MO_ALIGN);
129
     }
130
    }
131
    return;
132
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
133
if (ctx->tbflags & FPSCR_SZ) {
134
TCGv_i64 fp = tcg_temp_new_i64();
135
gen_load_fpr64(ctx, fp, XHACK(B7_4));
136
- tcg_gen_qemu_st_i64(fp, addr, ctx->memidx, MO_TEUQ);
137
+ tcg_gen_qemu_st_i64(fp, addr, ctx->memidx,
138
+ MO_TEUQ | MO_ALIGN);
139
     } else {
140
- tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx, MO_TEUL);
141
+ tcg_gen_qemu_st_i32(FREG(B7_4), addr, ctx->memidx,
142
+ MO_TEUL | MO_ALIGN);
143
     }
144
    }
145
    return;
146
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
147
    {
148
     TCGv addr = tcg_temp_new();
149
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 2);
150
- tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESW);
151
+ tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESW | MO_ALIGN);
152
    }
153
    return;
154
case 0xc600:        /* mov.l @(disp,GBR),R0 */
155
    {
156
     TCGv addr = tcg_temp_new();
157
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 4);
158
- tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL);
159
+ tcg_gen_qemu_ld_i32(REG(0), addr, ctx->memidx, MO_TESL | MO_ALIGN);
160
    }
161
    return;
162
case 0xc000:        /* mov.b R0,@(disp,GBR) */
163
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
164
    {
165
     TCGv addr = tcg_temp_new();
166
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 2);
167
- tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUW);
168
+ tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUW | MO_ALIGN);
169
    }
170
    return;
171
case 0xc200:        /* mov.l R0,@(disp,GBR) */
172
    {
173
     TCGv addr = tcg_temp_new();
174
     tcg_gen_addi_i32(addr, cpu_gbr, B7_0 * 4);
175
- tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUL);
176
+ tcg_gen_qemu_st_i32(REG(0), addr, ctx->memidx, MO_TEUL | MO_ALIGN);
177
    }
178
    return;
179
case 0x8000:        /* mov.b R0,@(disp,Rn) */
180
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
181
    return;
182
case 0x4087:        /* ldc.l @Rm+,Rn_BANK */
183
    CHECK_PRIVILEGED
184
- tcg_gen_qemu_ld_i32(ALTREG(B6_4), REG(B11_8), ctx->memidx, MO_TESL);
185
+ tcg_gen_qemu_ld_i32(ALTREG(B6_4), REG(B11_8), ctx->memidx,
186
+ MO_TESL | MO_ALIGN);
187
    tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
188
    return;
189
case 0x0082:        /* stc Rm_BANK,Rn */
190
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
191
    {
192
     TCGv addr = tcg_temp_new();
193
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
194
- tcg_gen_qemu_st_i32(ALTREG(B6_4), addr, ctx->memidx, MO_TEUL);
195
+ tcg_gen_qemu_st_i32(ALTREG(B6_4), addr, ctx->memidx,
196
+ MO_TEUL | MO_ALIGN);
197
     tcg_gen_mov_i32(REG(B11_8), addr);
198
    }
199
    return;
200
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
201
    CHECK_PRIVILEGED
202
    {
203
     TCGv val = tcg_temp_new();
204
- tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx, MO_TESL);
205
+ tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx,
206
+ MO_TESL | MO_ALIGN);
207
tcg_gen_andi_i32(val, val, 0x700083f3);
208
gen_write_sr(val);
209
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
210
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
211
TCGv val = tcg_temp_new();
212
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
213
gen_read_sr(val);
214
- tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL);
215
+ tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL | MO_ALIGN);
216
     tcg_gen_mov_i32(REG(B11_8), addr);
217
    }
218
    return;
219
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
220
return;                            \
221
case ldpnum:                            \
222
prechk                             \
223
- tcg_gen_qemu_ld_i32(cpu_##reg, REG(B11_8), ctx->memidx, MO_TESL); \
224
+ tcg_gen_qemu_ld_i32(cpu_##reg, REG(B11_8), ctx->memidx, \
225
+ MO_TESL | MO_ALIGN); \
226
tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);        \
227
return;
228
#define ST(reg,stnum,stpnum,prechk)        \
229
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
230
{                                \
231
    TCGv addr = tcg_temp_new();                \
232
    tcg_gen_subi_i32(addr, REG(B11_8), 4);            \
233
- tcg_gen_qemu_st_i32(cpu_##reg, addr, ctx->memidx, MO_TEUL); \
234
+ tcg_gen_qemu_st_i32(cpu_##reg, addr, ctx->memidx, \
235
+ MO_TEUL | MO_ALIGN); \
236
    tcg_gen_mov_i32(REG(B11_8), addr);            \
237
}                                \
238
return;
239
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
240
    CHECK_FPU_ENABLED
241
    {
242
     TCGv addr = tcg_temp_new();
243
- tcg_gen_qemu_ld_i32(addr, REG(B11_8), ctx->memidx, MO_TESL);
244
+ tcg_gen_qemu_ld_i32(addr, REG(B11_8), ctx->memidx,
245
+ MO_TESL | MO_ALIGN);
246
     tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 4);
247
gen_helper_ld_fpscr(cpu_env, addr);
248
ctx->base.is_jmp = DISAS_STOP;
249
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
250
     tcg_gen_andi_i32(val, cpu_fpscr, 0x003fffff);
251
     addr = tcg_temp_new();
252
     tcg_gen_subi_i32(addr, REG(B11_8), 4);
253
- tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL);
254
+ tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_TEUL | MO_ALIGN);
255
     tcg_gen_mov_i32(REG(B11_8), addr);
256
    }
257
    return;
258
case 0x00c3:        /* movca.l R0,@Rm */
259
{
260
TCGv val = tcg_temp_new();
261
- tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx, MO_TEUL);
262
+ tcg_gen_qemu_ld_i32(val, REG(B11_8), ctx->memidx,
263
+ MO_TEUL | MO_ALIGN);
264
gen_helper_movcal(cpu_env, REG(B11_8), val);
265
- tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx, MO_TEUL);
266
+ tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx,
267
+ MO_TEUL | MO_ALIGN);
268
}
269
ctx->has_movcal = 1;
270
    return;
271
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
272
cpu_lock_addr, fail);
273
tmp = tcg_temp_new();
274
tcg_gen_atomic_cmpxchg_i32(tmp, REG(B11_8), cpu_lock_value,
275
- REG(0), ctx->memidx, MO_TEUL);
276
+ REG(0), ctx->memidx,
277
+ MO_TEUL | MO_ALIGN);
278
tcg_gen_setcond_i32(TCG_COND_EQ, cpu_sr_t, tmp, cpu_lock_value);
279
} else {
280
tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_lock_addr, -1, fail);
281
- tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx, MO_TEUL);
282
+ tcg_gen_qemu_st_i32(REG(0), REG(B11_8), ctx->memidx,
283
+ MO_TEUL | MO_ALIGN);
284
tcg_gen_movi_i32(cpu_sr_t, 1);
285
}
286
tcg_gen_br(done);
287
@@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx)
288
if ((tb_cflags(ctx->base.tb) & CF_PARALLEL)) {
289
TCGv tmp = tcg_temp_new();
290
tcg_gen_mov_i32(tmp, REG(B11_8));
291
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
292
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
293
+ MO_TESL | MO_ALIGN);
294
tcg_gen_mov_i32(cpu_lock_value, REG(0));
295
tcg_gen_mov_i32(cpu_lock_addr, tmp);
296
} else {
297
- tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx, MO_TESL);
298
+ tcg_gen_qemu_ld_i32(REG(0), REG(B11_8), ctx->memidx,
299
+ MO_TESL | MO_ALIGN);
300
tcg_gen_movi_i32(cpu_lock_addr, 0);
301
}
302
return;
303
--
304
2.34.1
305
306
New patch
1
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
3
---
4
configs/targets/sh4-linux-user.mak | 1 -
5
configs/targets/sh4-softmmu.mak | 1 -
6
configs/targets/sh4eb-linux-user.mak | 1 -
7
configs/targets/sh4eb-softmmu.mak | 1 -
8
4 files changed, 4 deletions(-)
1
9
10
diff --git a/configs/targets/sh4-linux-user.mak b/configs/targets/sh4-linux-user.mak
11
index XXXXXXX..XXXXXXX 100644
12
--- a/configs/targets/sh4-linux-user.mak
13
+++ b/configs/targets/sh4-linux-user.mak
14
@@ -XXX,XX +XXX,XX @@
15
TARGET_ARCH=sh4
16
TARGET_SYSTBL_ABI=common
17
TARGET_SYSTBL=syscall.tbl
18
-TARGET_ALIGNED_ONLY=y
19
TARGET_HAS_BFLT=y
20
diff --git a/configs/targets/sh4-softmmu.mak b/configs/targets/sh4-softmmu.mak
21
index XXXXXXX..XXXXXXX 100644
22
--- a/configs/targets/sh4-softmmu.mak
23
+++ b/configs/targets/sh4-softmmu.mak
24
@@ -1,2 +1 @@
25
TARGET_ARCH=sh4
26
-TARGET_ALIGNED_ONLY=y
27
diff --git a/configs/targets/sh4eb-linux-user.mak b/configs/targets/sh4eb-linux-user.mak
28
index XXXXXXX..XXXXXXX 100644
29
--- a/configs/targets/sh4eb-linux-user.mak
30
+++ b/configs/targets/sh4eb-linux-user.mak
31
@@ -XXX,XX +XXX,XX @@
32
TARGET_ARCH=sh4
33
TARGET_SYSTBL_ABI=common
34
TARGET_SYSTBL=syscall.tbl
35
-TARGET_ALIGNED_ONLY=y
36
TARGET_BIG_ENDIAN=y
37
TARGET_HAS_BFLT=y
38
diff --git a/configs/targets/sh4eb-softmmu.mak b/configs/targets/sh4eb-softmmu.mak
39
index XXXXXXX..XXXXXXX 100644
40
--- a/configs/targets/sh4eb-softmmu.mak
41
+++ b/configs/targets/sh4eb-softmmu.mak
42
@@ -XXX,XX +XXX,XX @@
43
TARGET_ARCH=sh4
44
-TARGET_ALIGNED_ONLY=y
45
TARGET_BIG_ENDIAN=y
46
--
47
2.34.1
48
49
1
From: Alex Bennée <alex.bennee@linaro.org>
1
All uses have now been expunged.
2
2
3
This is a heavily used function, so let's avoid the cost of
3
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
4
CPU_GET_CLASS. On the romulus-bmc run it has a modest effect:
5
6
Before: 36.812 s ± 0.506 s
7
After: 35.912 s ± 0.168 s
8
9
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
11
Message-Id: <20220811151413.3350684-4-alex.bennee@linaro.org>
12
Signed-off-by: Cédric Le Goater <clg@kaod.org>
13
Message-Id: <20220923084803.498337-4-clg@kaod.org>
14
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
4
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
15
---
5
---
16
hw/core/cpu-sysemu.c | 5 ++---
6
include/exec/memop.h | 13 ++-----------
17
1 file changed, 2 insertions(+), 3 deletions(-)
7
include/exec/poison.h | 1 -
8
tcg/tcg.c | 5 -----
9
3 files changed, 2 insertions(+), 17 deletions(-)
18
10
19
diff --git a/hw/core/cpu-sysemu.c b/hw/core/cpu-sysemu.c
11
diff --git a/include/exec/memop.h b/include/exec/memop.h
20
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
21
--- a/hw/core/cpu-sysemu.c
13
--- a/include/exec/memop.h
22
+++ b/hw/core/cpu-sysemu.c
14
+++ b/include/exec/memop.h
23
@@ -XXX,XX +XXX,XX @@ hwaddr cpu_get_phys_page_debug(CPUState *cpu, vaddr addr)
15
@@ -XXX,XX +XXX,XX @@ typedef enum MemOp {
24
16
* MO_UNALN accesses are never checked for alignment.
25
int cpu_asidx_from_attrs(CPUState *cpu, MemTxAttrs attrs)
17
* MO_ALIGN accesses will result in a call to the CPU's
26
{
18
* do_unaligned_access hook if the guest address is not aligned.
27
- CPUClass *cc = CPU_GET_CLASS(cpu);
19
- * The default depends on whether the target CPU defines
28
int ret = 0;
20
- * TARGET_ALIGNED_ONLY.
29
21
*
30
- if (cc->sysemu_ops->asidx_from_attrs) {
22
* Some architectures (e.g. ARMv8) need the address which is aligned
31
- ret = cc->sysemu_ops->asidx_from_attrs(cpu, attrs);
23
* to a size more than the size of the memory access.
32
+ if (cpu->cc->sysemu_ops->asidx_from_attrs) {
24
@@ -XXX,XX +XXX,XX @@ typedef enum MemOp {
33
+ ret = cpu->cc->sysemu_ops->asidx_from_attrs(cpu, attrs);
25
*/
34
assert(ret < cpu->num_ases && ret >= 0);
26
MO_ASHIFT = 5,
35
}
27
MO_AMASK = 0x7 << MO_ASHIFT,
36
return ret;
28
-#ifdef NEED_CPU_H
29
-#ifdef TARGET_ALIGNED_ONLY
30
- MO_ALIGN = 0,
31
- MO_UNALN = MO_AMASK,
32
-#else
33
- MO_ALIGN = MO_AMASK,
34
- MO_UNALN = 0,
35
-#endif
36
-#endif
37
+ MO_UNALN = 0,
38
MO_ALIGN_2 = 1 << MO_ASHIFT,
39
MO_ALIGN_4 = 2 << MO_ASHIFT,
40
MO_ALIGN_8 = 3 << MO_ASHIFT,
41
MO_ALIGN_16 = 4 << MO_ASHIFT,
42
MO_ALIGN_32 = 5 << MO_ASHIFT,
43
MO_ALIGN_64 = 6 << MO_ASHIFT,
44
+ MO_ALIGN = MO_AMASK,
45
46
/* Combinations of the above, for ease of use. */
47
MO_UB = MO_8,
48
diff --git a/include/exec/poison.h b/include/exec/poison.h
49
index XXXXXXX..XXXXXXX 100644
50
--- a/include/exec/poison.h
51
+++ b/include/exec/poison.h
52
@@ -XXX,XX +XXX,XX @@
53
#pragma GCC poison TARGET_TRICORE
54
#pragma GCC poison TARGET_XTENSA
55
56
-#pragma GCC poison TARGET_ALIGNED_ONLY
57
#pragma GCC poison TARGET_HAS_BFLT
58
#pragma GCC poison TARGET_NAME
59
#pragma GCC poison TARGET_SUPPORTS_MTTCG
60
diff --git a/tcg/tcg.c b/tcg/tcg.c
61
index XXXXXXX..XXXXXXX 100644
62
--- a/tcg/tcg.c
63
+++ b/tcg/tcg.c
64
@@ -XXX,XX +XXX,XX @@ static const char * const ldst_name[] =
65
};
66
67
static const char * const alignment_name[(MO_AMASK >> MO_ASHIFT) + 1] = {
68
-#ifdef TARGET_ALIGNED_ONLY
69
[MO_UNALN >> MO_ASHIFT] = "un+",
70
- [MO_ALIGN >> MO_ASHIFT] = "",
71
-#else
72
- [MO_UNALN >> MO_ASHIFT] = "",
73
[MO_ALIGN >> MO_ASHIFT] = "al+",
74
-#endif
75
[MO_ALIGN_2 >> MO_ASHIFT] = "al2+",
76
[MO_ALIGN_4 >> MO_ASHIFT] = "al4+",
77
[MO_ALIGN_8 >> MO_ASHIFT] = "al8+",
37
--
78
--
38
2.34.1
79
2.34.1
39
80
40
81
Prepare for targets to be able to produce TBs that can
run in more than one virtual context.

Like cpu_in_exclusive_context, but also true if
there is no other cpu against which we could race.

Use it in tb_flush as a direct replacement.
Use it in cpu_loop_exit_atomic to ensure that there
is no loop against cpu_exec_step_atomic.
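Gathered into one place for readability (a summary sketch of the hunks below, with tb_flush abbreviated):

    /* True when no other vCPU can race with us: either this is not a
     * parallel (CF_PARALLEL) configuration, or we already hold
     * exclusive access to the machine. */
    static inline bool cpu_in_serial_context(CPUState *cs)
    {
        return !(cs->tcg_cflags & CF_PARALLEL) || cpu_in_exclusive_context(cs);
    }

    void tb_flush(CPUState *cpu)                    /* abbreviated */
    {
        if (tcg_enabled()) {
            unsigned tb_flush_count = qatomic_read(&tb_ctx.tb_flush_count);

            if (cpu_in_serial_context(cpu)) {
                /* nothing can race: flush synchronously */
                do_tb_flush(cpu, RUN_ON_CPU_HOST_INT(tb_flush_count));
            } else {
                async_safe_run_on_cpu(cpu, do_tb_flush,
                                      RUN_ON_CPU_HOST_INT(tb_flush_count));
            }
        }
    }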
3
7
4
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
8
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
9
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
10
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
11
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
---
12
---
7
accel/tcg/internal.h | 4 +++
13
accel/tcg/internal.h | 9 +++++++++
8
accel/tcg/tb-jmp-cache.h | 41 +++++++++++++++++++++++++
14
accel/tcg/cpu-exec-common.c | 3 +++
9
include/exec/cpu-defs.h | 3 ++
15
accel/tcg/tb-maint.c | 2 +-
10
include/exec/exec-all.h | 32 ++++++++++++++++++--
16
3 files changed, 13 insertions(+), 1 deletion(-)
11
accel/tcg/cpu-exec.c | 16 ++++++----
12
accel/tcg/translate-all.c | 64 ++++++++++++++++++++++++++-------------
13
6 files changed, 131 insertions(+), 29 deletions(-)
14
17
15
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
18
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
16
index XXXXXXX..XXXXXXX 100644
19
index XXXXXXX..XXXXXXX 100644
17
--- a/accel/tcg/internal.h
20
--- a/accel/tcg/internal.h
18
+++ b/accel/tcg/internal.h
21
+++ b/accel/tcg/internal.h
19
@@ -XXX,XX +XXX,XX @@ void tb_htable_init(void);
22
@@ -XXX,XX +XXX,XX @@ static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
20
/* Return the current PC from CPU, which may be cached in TB. */
23
}
21
static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
22
{
23
+#if TARGET_TB_PCREL
24
+ return cpu->cc->get_pc(cpu);
25
+#else
26
return tb_pc(tb);
27
+#endif
28
}
24
}
29
25
30
#endif /* ACCEL_TCG_INTERNAL_H */
26
+/*
31
diff --git a/accel/tcg/tb-jmp-cache.h b/accel/tcg/tb-jmp-cache.h
27
+ * Return true if CS is not running in parallel with other cpus, either
32
index XXXXXXX..XXXXXXX 100644
28
+ * because there are no other cpus or we are within an exclusive context.
33
--- a/accel/tcg/tb-jmp-cache.h
29
+ */
34
+++ b/accel/tcg/tb-jmp-cache.h
30
+static inline bool cpu_in_serial_context(CPUState *cs)
35
@@ -XXX,XX +XXX,XX @@
36
37
/*
38
* Accessed in parallel; all accesses to 'tb' must be atomic.
39
+ * For TARGET_TB_PCREL, accesses to 'pc' must be protected by
40
+ * a load_acquire/store_release to 'tb'.
41
*/
42
struct CPUJumpCache {
43
struct {
44
TranslationBlock *tb;
45
+#if TARGET_TB_PCREL
46
+ target_ulong pc;
47
+#endif
48
} array[TB_JMP_CACHE_SIZE];
49
};
50
51
+static inline TranslationBlock *
52
+tb_jmp_cache_get_tb(CPUJumpCache *jc, uint32_t hash)
53
+{
31
+{
54
+#if TARGET_TB_PCREL
32
+ return !(cs->tcg_cflags & CF_PARALLEL) || cpu_in_exclusive_context(cs);
55
+ /* Use acquire to ensure current load of pc from jc. */
56
+ return qatomic_load_acquire(&jc->array[hash].tb);
57
+#else
58
+ /* Use rcu_read to ensure current load of pc from *tb. */
59
+ return qatomic_rcu_read(&jc->array[hash].tb);
60
+#endif
61
+}
33
+}
62
+
34
+
63
+static inline target_ulong
35
extern int64_t max_delay;
64
+tb_jmp_cache_get_pc(CPUJumpCache *jc, uint32_t hash, TranslationBlock *tb)
36
extern int64_t max_advance;
65
+{
37
66
+#if TARGET_TB_PCREL
38
diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
67
+ return jc->array[hash].pc;
68
+#else
69
+ return tb_pc(tb);
70
+#endif
71
+}
72
+
73
+static inline void
74
+tb_jmp_cache_set(CPUJumpCache *jc, uint32_t hash,
75
+ TranslationBlock *tb, target_ulong pc)
76
+{
77
+#if TARGET_TB_PCREL
78
+ jc->array[hash].pc = pc;
79
+ /* Use store_release on tb to ensure pc is written first. */
80
+ qatomic_store_release(&jc->array[hash].tb, tb);
81
+#else
82
+ /* Use the pc value already stored in tb->pc. */
83
+ qatomic_set(&jc->array[hash].tb, tb);
84
+#endif
85
+}
86
+
87
#endif /* ACCEL_TCG_TB_JMP_CACHE_H */
88
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
89
index XXXXXXX..XXXXXXX 100644
39
index XXXXXXX..XXXXXXX 100644
90
--- a/include/exec/cpu-defs.h
40
--- a/accel/tcg/cpu-exec-common.c
91
+++ b/include/exec/cpu-defs.h
41
+++ b/accel/tcg/cpu-exec-common.c
92
@@ -XXX,XX +XXX,XX @@
42
@@ -XXX,XX +XXX,XX @@
93
# error TARGET_PAGE_BITS must be defined in cpu-param.h
43
#include "sysemu/tcg.h"
94
# endif
44
#include "exec/exec-all.h"
95
#endif
45
#include "qemu/plugin.h"
96
+#ifndef TARGET_TB_PCREL
46
+#include "internal.h"
97
+# define TARGET_TB_PCREL 0
47
98
+#endif
48
bool tcg_allowed;
99
49
100
#define TARGET_LONG_SIZE (TARGET_LONG_BITS / 8)
50
@@ -XXX,XX +XXX,XX @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
101
51
102
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
52
void cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc)
53
{
54
+ /* Prevent looping if already executing in a serial context. */
55
+ g_assert(!cpu_in_serial_context(cpu));
56
cpu->exception_index = EXCP_ATOMIC;
57
cpu_loop_exit_restore(cpu, pc);
58
}
59
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
103
index XXXXXXX..XXXXXXX 100644
60
index XXXXXXX..XXXXXXX 100644
104
--- a/include/exec/exec-all.h
61
--- a/accel/tcg/tb-maint.c
105
+++ b/include/exec/exec-all.h
62
+++ b/accel/tcg/tb-maint.c
106
@@ -XXX,XX +XXX,XX @@ struct tb_tc {
63
@@ -XXX,XX +XXX,XX @@ void tb_flush(CPUState *cpu)
107
};
64
if (tcg_enabled()) {
108
65
unsigned tb_flush_count = qatomic_read(&tb_ctx.tb_flush_count);
109
struct TranslationBlock {
66
110
- target_ulong pc; /* simulated PC corresponding to this block (EIP + CS base) */
67
- if (cpu_in_exclusive_context(cpu)) {
111
- target_ulong cs_base; /* CS base for this block */
68
+ if (cpu_in_serial_context(cpu)) {
112
+#if !TARGET_TB_PCREL
69
do_tb_flush(cpu, RUN_ON_CPU_HOST_INT(tb_flush_count));
113
+ /*
114
+ * Guest PC corresponding to this block. This must be the true
115
+ * virtual address. Therefore e.g. x86 stores EIP + CS_BASE, and
116
+ * targets like Arm, MIPS, HP-PA, which reuse low bits for ISA or
117
+ * privilege, must store those bits elsewhere.
118
+ *
119
+ * If TARGET_TB_PCREL, the opcodes for the TranslationBlock are
120
+ * written such that the TB is associated only with the physical
121
+ * page and may be run in any virtual address context. In this case,
122
+ * PC must always be taken from ENV in a target-specific manner.
123
+ * Unwind information is taken as offsets from the page, to be
124
+ * deposited into the "current" PC.
125
+ */
126
+ target_ulong pc;
127
+#endif
128
+
129
+ /*
130
+ * Target-specific data associated with the TranslationBlock, e.g.:
131
+ * x86: the original user, the Code Segment virtual base,
132
+ * arm: an extension of tb->flags,
133
+ * s390x: instruction data for EXECUTE,
134
+ * sparc: the next pc of the instruction queue (for delay slots).
135
+ */
136
+ target_ulong cs_base;
137
+
138
uint32_t flags; /* flags defining in which context the code was generated */
139
uint32_t cflags; /* compile flags */
140
141
@@ -XXX,XX +XXX,XX @@ struct TranslationBlock {
142
/* Hide the read to avoid ifdefs for TARGET_TB_PCREL. */
143
static inline target_ulong tb_pc(const TranslationBlock *tb)
144
{
145
+#if TARGET_TB_PCREL
146
+ qemu_build_not_reached();
147
+#else
148
return tb->pc;
149
+#endif
150
}
151
152
/* Hide the qatomic_read to make code a little easier on the eyes */
153
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
154
index XXXXXXX..XXXXXXX 100644
155
--- a/accel/tcg/cpu-exec.c
156
+++ b/accel/tcg/cpu-exec.c
157
@@ -XXX,XX +XXX,XX @@ static bool tb_lookup_cmp(const void *p, const void *d)
158
const TranslationBlock *tb = p;
159
const struct tb_desc *desc = d;
160
161
- if (tb_pc(tb) == desc->pc &&
162
+ if ((TARGET_TB_PCREL || tb_pc(tb) == desc->pc) &&
163
tb->page_addr[0] == desc->page_addr0 &&
164
tb->cs_base == desc->cs_base &&
165
tb->flags == desc->flags &&
166
@@ -XXX,XX +XXX,XX @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
167
return NULL;
168
}
169
desc.page_addr0 = phys_pc;
170
- h = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
171
+ h = tb_hash_func(phys_pc, (TARGET_TB_PCREL ? 0 : pc),
172
+ flags, cflags, *cpu->trace_dstate);
173
return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
174
}
175
176
@@ -XXX,XX +XXX,XX @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
177
uint32_t flags, uint32_t cflags)
178
{
179
TranslationBlock *tb;
180
+ CPUJumpCache *jc;
181
uint32_t hash;
182
183
/* we should never be trying to look up an INVALID tb */
184
tcg_debug_assert(!(cflags & CF_INVALID));
185
186
hash = tb_jmp_cache_hash_func(pc);
187
- tb = qatomic_rcu_read(&cpu->tb_jmp_cache->array[hash].tb);
188
+ jc = cpu->tb_jmp_cache;
189
+ tb = tb_jmp_cache_get_tb(jc, hash);
190
191
if (likely(tb &&
192
- tb->pc == pc &&
193
+ tb_jmp_cache_get_pc(jc, hash, tb) == pc &&
194
tb->cs_base == cs_base &&
195
tb->flags == flags &&
196
tb->trace_vcpu_dstate == *cpu->trace_dstate &&
197
@@ -XXX,XX +XXX,XX @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
198
if (tb == NULL) {
199
return NULL;
200
}
201
- qatomic_set(&cpu->tb_jmp_cache->array[hash].tb, tb);
202
+ tb_jmp_cache_set(jc, hash, tb, pc);
203
return tb;
204
}
205
206
@@ -XXX,XX +XXX,XX @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int *tb_exit)
207
if (cc->tcg_ops->synchronize_from_tb) {
208
cc->tcg_ops->synchronize_from_tb(cpu, last_tb);
209
} else {
70
} else {
210
+ assert(!TARGET_TB_PCREL);
71
async_safe_run_on_cpu(cpu, do_tb_flush,
211
assert(cc->set_pc);
212
cc->set_pc(cpu, tb_pc(last_tb));
213
}
214
@@ -XXX,XX +XXX,XX @@ int cpu_exec(CPUState *cpu)
215
* for the fast lookup
216
*/
217
h = tb_jmp_cache_hash_func(pc);
218
- qatomic_set(&cpu->tb_jmp_cache->array[h].tb, tb);
219
+ tb_jmp_cache_set(cpu->tb_jmp_cache, h, tb, pc);
220
}
221
222
#ifndef CONFIG_USER_ONLY
223
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
224
index XXXXXXX..XXXXXXX 100644
225
--- a/accel/tcg/translate-all.c
226
+++ b/accel/tcg/translate-all.c
227
@@ -XXX,XX +XXX,XX @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
228
229
for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
230
if (i == 0) {
231
- prev = (j == 0 ? tb_pc(tb) : 0);
232
+ prev = (!TARGET_TB_PCREL && j == 0 ? tb_pc(tb) : 0);
233
} else {
234
prev = tcg_ctx->gen_insn_data[i - 1][j];
235
}
236
@@ -XXX,XX +XXX,XX @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
237
static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
238
uintptr_t searched_pc, bool reset_icount)
239
{
240
- target_ulong data[TARGET_INSN_START_WORDS] = { tb_pc(tb) };
241
+ target_ulong data[TARGET_INSN_START_WORDS];
242
uintptr_t host_pc = (uintptr_t)tb->tc.ptr;
243
CPUArchState *env = cpu->env_ptr;
244
const uint8_t *p = tb->tc.ptr + tb->tc.size;
245
@@ -XXX,XX +XXX,XX @@ static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
246
return -1;
247
}
248
249
+ memset(data, 0, sizeof(data));
250
+ if (!TARGET_TB_PCREL) {
251
+ data[0] = tb_pc(tb);
252
+ }
253
+
254
/* Reconstruct the stored insn data while looking for the point at
255
which the end of the insn exceeds the searched_pc. */
256
for (i = 0; i < num_insns; ++i) {
257
@@ -XXX,XX +XXX,XX @@ static bool tb_cmp(const void *ap, const void *bp)
258
const TranslationBlock *a = ap;
259
const TranslationBlock *b = bp;
260
261
- return tb_pc(a) == tb_pc(b) &&
262
- a->cs_base == b->cs_base &&
263
- a->flags == b->flags &&
264
- (tb_cflags(a) & ~CF_INVALID) == (tb_cflags(b) & ~CF_INVALID) &&
265
- a->trace_vcpu_dstate == b->trace_vcpu_dstate &&
266
- a->page_addr[0] == b->page_addr[0] &&
267
- a->page_addr[1] == b->page_addr[1];
268
+ return ((TARGET_TB_PCREL || tb_pc(a) == tb_pc(b)) &&
269
+ a->cs_base == b->cs_base &&
270
+ a->flags == b->flags &&
271
+ (tb_cflags(a) & ~CF_INVALID) == (tb_cflags(b) & ~CF_INVALID) &&
272
+ a->trace_vcpu_dstate == b->trace_vcpu_dstate &&
273
+ a->page_addr[0] == b->page_addr[0] &&
274
+ a->page_addr[1] == b->page_addr[1]);
275
}
276
277
void tb_htable_init(void)
278
@@ -XXX,XX +XXX,XX @@ static inline void tb_jmp_unlink(TranslationBlock *dest)
279
qemu_spin_unlock(&dest->jmp_lock);
280
}
281
282
+static void tb_jmp_cache_inval_tb(TranslationBlock *tb)
283
+{
284
+ CPUState *cpu;
285
+
286
+ if (TARGET_TB_PCREL) {
287
+ /* A TB may be at any virtual address */
288
+ CPU_FOREACH(cpu) {
289
+ tcg_flush_jmp_cache(cpu);
290
+ }
291
+ } else {
292
+ uint32_t h = tb_jmp_cache_hash_func(tb_pc(tb));
293
+
294
+ CPU_FOREACH(cpu) {
295
+ CPUJumpCache *jc = cpu->tb_jmp_cache;
296
+
297
+ if (qatomic_read(&jc->array[h].tb) == tb) {
298
+ qatomic_set(&jc->array[h].tb, NULL);
299
+ }
300
+ }
301
+ }
302
+}
303
+
304
/*
305
* In user-mode, call with mmap_lock held.
306
* In !user-mode, if @rm_from_page_list is set, call with the TB's pages'
307
@@ -XXX,XX +XXX,XX @@ static inline void tb_jmp_unlink(TranslationBlock *dest)
308
*/
309
static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
310
{
311
- CPUState *cpu;
312
PageDesc *p;
313
uint32_t h;
314
tb_page_addr_t phys_pc;
315
@@ -XXX,XX +XXX,XX @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
316
317
/* remove the TB from the hash list */
318
phys_pc = tb->page_addr[0];
319
- h = tb_hash_func(phys_pc, tb_pc(tb), tb->flags, orig_cflags,
320
- tb->trace_vcpu_dstate);
321
+ h = tb_hash_func(phys_pc, (TARGET_TB_PCREL ? 0 : tb_pc(tb)),
322
+ tb->flags, orig_cflags, tb->trace_vcpu_dstate);
323
if (!qht_remove(&tb_ctx.htable, tb, h)) {
324
return;
325
}
326
@@ -XXX,XX +XXX,XX @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
327
}
328
329
/* remove the TB from the hash list */
330
- h = tb_jmp_cache_hash_func(tb->pc);
331
- CPU_FOREACH(cpu) {
332
- CPUJumpCache *jc = cpu->tb_jmp_cache;
333
- if (qatomic_read(&jc->array[h].tb) == tb) {
334
- qatomic_set(&jc->array[h].tb, NULL);
335
- }
336
- }
337
+ tb_jmp_cache_inval_tb(tb);
338
339
/* suppress this TB from the two jump lists */
340
tb_remove_from_jmp_list(tb, 0);
341
@@ -XXX,XX +XXX,XX @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
342
}
343
344
/* add in the hash table */
345
- h = tb_hash_func(phys_pc, tb_pc(tb), tb->flags, tb->cflags,
346
- tb->trace_vcpu_dstate);
347
+ h = tb_hash_func(phys_pc, (TARGET_TB_PCREL ? 0 : tb_pc(tb)),
348
+ tb->flags, tb->cflags, tb->trace_vcpu_dstate);
349
qht_insert(&tb_ctx.htable, tb, h, &existing_tb);
350
351
/* remove TB from the page(s) if we couldn't insert it */
352
@@ -XXX,XX +XXX,XX @@ TranslationBlock *tb_gen_code(CPUState *cpu,
353
354
gen_code_buf = tcg_ctx->code_gen_ptr;
355
tb->tc.ptr = tcg_splitwx_to_rx(gen_code_buf);
356
+#if !TARGET_TB_PCREL
357
tb->pc = pc;
358
+#endif
359
tb->cs_base = cs_base;
360
tb->flags = flags;
361
tb->cflags = cflags;
362
--
72
--
363
2.34.1
73
2.34.1
364
74
365
75
Add an interface to return the CPUTLBEntryFull struct
that goes with the lookup. The result is not intended
to be valid across multiple lookups, so the user must
use the results immediately.

Instead of playing with offsetof in various places, use
MMUAccessType to index an array. This is easily defined
instead of the previous dummy padding array in the union.
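A usage sketch based on the prototypes in the hunks below ('env', 'addr', 'entry', 'mmu_idx' and 'retaddr' come from the surrounding helper in the real code):

    /* Select a TLB comparator by access type instead of offsetof(). */
    target_ulong cmp = tlb_read_idx(entry, MMU_DATA_LOAD);  /* addr_read */

    /* Probe and consume the CPUTLBEntryFull immediately; it is only
     * valid until the next lookup can resize or refill the TLB. */
    CPUTLBEntryFull *full;
    void *host;
    int flags = probe_access_full(env, addr, MMU_DATA_LOAD, mmu_idx,
                                  true, &host, &full, retaddr);
    if (!(flags & TLB_INVALID_MASK)) {
        /* use 'full' and 'host' here, before any further TLB activity */
    }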
5
4
6
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
6
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
7
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
8
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
9
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
8
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
10
---
9
---
11
include/exec/exec-all.h | 15 +++++++++++++
10
include/exec/cpu-defs.h | 7 ++-
12
include/qemu/typedefs.h | 1 +
11
include/exec/cpu_ldst.h | 26 ++++++++--
13
accel/tcg/cputlb.c | 47 +++++++++++++++++++++++++----------------
12
accel/tcg/cputlb.c | 104 +++++++++++++---------------------------
14
3 files changed, 45 insertions(+), 18 deletions(-)
13
3 files changed, 59 insertions(+), 78 deletions(-)
15
14
16
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
15
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
17
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
18
--- a/include/exec/exec-all.h
17
--- a/include/exec/cpu-defs.h
19
+++ b/include/exec/exec-all.h
18
+++ b/include/exec/cpu-defs.h
20
@@ -XXX,XX +XXX,XX @@ int probe_access_flags(CPUArchState *env, target_ulong addr,
19
@@ -XXX,XX +XXX,XX @@ typedef struct CPUTLBEntry {
21
MMUAccessType access_type, int mmu_idx,
20
use the corresponding iotlb value. */
22
bool nonfault, void **phost, uintptr_t retaddr);
21
uintptr_t addend;
23
22
};
24
+#ifndef CONFIG_USER_ONLY
23
- /* padding to get a power of two size */
25
+/**
24
- uint8_t dummy[1 << CPU_TLB_ENTRY_BITS];
26
+ * probe_access_full:
25
+ /*
27
+ * Like probe_access_flags, except also return into @pfull.
26
+ * Padding to get a power of two size, as well as index
28
+ *
27
+ * access to addr_{read,write,code}.
29
+ * The CPUTLBEntryFull structure returned via @pfull is transient
28
+ */
30
+ * and must be consumed or copied immediately, before any further
29
+ target_ulong addr_idx[(1 << CPU_TLB_ENTRY_BITS) / TARGET_LONG_SIZE];
31
+ * access or changes to TLB @mmu_idx.
30
};
32
+ */
31
} CPUTLBEntry;
33
+int probe_access_full(CPUArchState *env, target_ulong addr,
32
34
+ MMUAccessType access_type, int mmu_idx,
33
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
35
+ bool nonfault, void **phost,
34
index XXXXXXX..XXXXXXX 100644
36
+ CPUTLBEntryFull **pfull, uintptr_t retaddr);
35
--- a/include/exec/cpu_ldst.h
36
+++ b/include/exec/cpu_ldst.h
37
@@ -XXX,XX +XXX,XX @@ static inline void clear_helper_retaddr(void)
38
/* Needed for TCG_OVERSIZED_GUEST */
39
#include "tcg/tcg.h"
40
41
+static inline target_ulong tlb_read_idx(const CPUTLBEntry *entry,
42
+ MMUAccessType access_type)
43
+{
44
+ /* Do not rearrange the CPUTLBEntry structure members. */
45
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_read) !=
46
+ MMU_DATA_LOAD * TARGET_LONG_SIZE);
47
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
48
+ MMU_DATA_STORE * TARGET_LONG_SIZE);
49
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
50
+ MMU_INST_FETCH * TARGET_LONG_SIZE);
51
+
52
+ const target_ulong *ptr = &entry->addr_idx[access_type];
53
+#if TCG_OVERSIZED_GUEST
54
+ return *ptr;
55
+#else
56
+ /* ofs might correspond to .addr_write, so use qatomic_read */
57
+ return qatomic_read(ptr);
37
+#endif
58
+#endif
59
+}
38
+
60
+
39
#define CODE_GEN_ALIGN 16 /* must be >= of the size of a icache line */
61
static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
40
62
{
41
/* Estimated block size for TB allocation. */
63
-#if TCG_OVERSIZED_GUEST
42
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
64
- return entry->addr_write;
43
index XXXXXXX..XXXXXXX 100644
65
-#else
44
--- a/include/qemu/typedefs.h
66
- return qatomic_read(&entry->addr_write);
45
+++ b/include/qemu/typedefs.h
67
-#endif
46
@@ -XXX,XX +XXX,XX @@ typedef struct ConfidentialGuestSupport ConfidentialGuestSupport;
68
+ return tlb_read_idx(entry, MMU_DATA_STORE);
47
typedef struct CPUAddressSpace CPUAddressSpace;
69
}
48
typedef struct CPUArchState CPUArchState;
70
49
typedef struct CPUState CPUState;
71
/* Find the TLB index corresponding to the mmu_idx + address pair. */
50
+typedef struct CPUTLBEntryFull CPUTLBEntryFull;
51
typedef struct DeviceListener DeviceListener;
52
typedef struct DeviceState DeviceState;
53
typedef struct DirtyBitmapSnapshot DirtyBitmapSnapshot;
54
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
72
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
55
index XXXXXXX..XXXXXXX 100644
73
index XXXXXXX..XXXXXXX 100644
56
--- a/accel/tcg/cputlb.c
74
--- a/accel/tcg/cputlb.c
57
+++ b/accel/tcg/cputlb.c
75
+++ b/accel/tcg/cputlb.c
58
@@ -XXX,XX +XXX,XX @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
76
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
59
static int probe_access_internal(CPUArchState *env, target_ulong addr,
77
}
60
int fault_size, MMUAccessType access_type,
78
}
61
int mmu_idx, bool nonfault,
79
62
- void **phost, uintptr_t retaddr)
80
-static inline target_ulong tlb_read_ofs(CPUTLBEntry *entry, size_t ofs)
63
+ void **phost, CPUTLBEntryFull **pfull,
81
-{
64
+ uintptr_t retaddr)
82
-#if TCG_OVERSIZED_GUEST
83
- return *(target_ulong *)((uintptr_t)entry + ofs);
84
-#else
85
- /* ofs might correspond to .addr_write, so use qatomic_read */
86
- return qatomic_read((target_ulong *)((uintptr_t)entry + ofs));
87
-#endif
88
-}
89
-
90
/* Return true if ADDR is present in the victim tlb, and has been copied
91
back to the main tlb. */
92
static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
93
- size_t elt_ofs, target_ulong page)
94
+ MMUAccessType access_type, target_ulong page)
95
{
96
size_t vidx;
97
98
assert_cpu_is_self(env_cpu(env));
99
for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
100
CPUTLBEntry *vtlb = &env_tlb(env)->d[mmu_idx].vtable[vidx];
101
- target_ulong cmp;
102
-
103
- /* elt_ofs might correspond to .addr_write, so use qatomic_read */
104
-#if TCG_OVERSIZED_GUEST
105
- cmp = *(target_ulong *)((uintptr_t)vtlb + elt_ofs);
106
-#else
107
- cmp = qatomic_read((target_ulong *)((uintptr_t)vtlb + elt_ofs));
108
-#endif
109
+ target_ulong cmp = tlb_read_idx(vtlb, access_type);
110
111
if (cmp == page) {
112
/* Found entry in victim tlb, swap tlb and iotlb. */
113
@@ -XXX,XX +XXX,XX @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
114
return false;
115
}
116
117
-/* Macro to call the above, with local variables from the use context. */
118
-#define VICTIM_TLB_HIT(TY, ADDR) \
119
- victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
120
- (ADDR) & TARGET_PAGE_MASK)
121
-
122
static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
123
CPUTLBEntryFull *full, uintptr_t retaddr)
124
{
125
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
65
{
126
{
66
uintptr_t index = tlb_index(env, mmu_idx, addr);
127
uintptr_t index = tlb_index(env, mmu_idx, addr);
67
CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
128
CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
129
- target_ulong tlb_addr, page_addr;
130
- size_t elt_ofs;
131
- int flags;
132
+ target_ulong tlb_addr = tlb_read_idx(entry, access_type);
133
+ target_ulong page_addr = addr & TARGET_PAGE_MASK;
134
+ int flags = TLB_FLAGS_MASK;
135
136
- switch (access_type) {
137
- case MMU_DATA_LOAD:
138
- elt_ofs = offsetof(CPUTLBEntry, addr_read);
139
- break;
140
- case MMU_DATA_STORE:
141
- elt_ofs = offsetof(CPUTLBEntry, addr_write);
142
- break;
143
- case MMU_INST_FETCH:
144
- elt_ofs = offsetof(CPUTLBEntry, addr_code);
145
- break;
146
- default:
147
- g_assert_not_reached();
148
- }
149
- tlb_addr = tlb_read_ofs(entry, elt_ofs);
150
-
151
- flags = TLB_FLAGS_MASK;
152
- page_addr = addr & TARGET_PAGE_MASK;
153
if (!tlb_hit_page(tlb_addr, page_addr)) {
154
- if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
155
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type, page_addr)) {
156
CPUState *cs = env_cpu(env);
157
158
if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
68
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
159
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
69
mmu_idx, nonfault, retaddr)) {
160
*/
70
/* Non-faulting page table read failed. */
161
flags &= ~TLB_INVALID_MASK;
71
*phost = NULL;
162
}
72
+ *pfull = NULL;
163
- tlb_addr = tlb_read_ofs(entry, elt_ofs);
73
return TLB_INVALID_MASK;
164
+ tlb_addr = tlb_read_idx(entry, access_type);
74
}
75
76
/* TLB resize via tlb_fill may have moved the entry. */
77
+ index = tlb_index(env, mmu_idx, addr);
78
entry = tlb_entry(env, mmu_idx, addr);
79
80
/*
81
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
82
}
165
}
83
flags &= tlb_addr;
166
flags &= tlb_addr;
84
167
85
+ *pfull = &env_tlb(env)->d[mmu_idx].fulltlb[index];
168
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
86
+
169
if (prot & PAGE_WRITE) {
87
/* Fold all "mmio-like" bits into TLB_MMIO. This is not RAM. */
170
tlb_addr = tlb_addr_write(tlbe);
88
if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY))) {
171
if (!tlb_hit(tlb_addr, addr)) {
89
*phost = NULL;
172
- if (!VICTIM_TLB_HIT(addr_write, addr)) {
90
@@ -XXX,XX +XXX,XX @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
173
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
91
return flags;
174
+ addr & TARGET_PAGE_MASK)) {
92
}
175
tlb_fill(env_cpu(env), addr, size,
93
176
MMU_DATA_STORE, mmu_idx, retaddr);
94
-int probe_access_flags(CPUArchState *env, target_ulong addr,
177
index = tlb_index(env, mmu_idx, addr);
95
- MMUAccessType access_type, int mmu_idx,
178
@@ -XXX,XX +XXX,XX @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
96
- bool nonfault, void **phost, uintptr_t retaddr)
179
} else /* if (prot & PAGE_READ) */ {
97
+int probe_access_full(CPUArchState *env, target_ulong addr,
180
tlb_addr = tlbe->addr_read;
98
+ MMUAccessType access_type, int mmu_idx,
181
if (!tlb_hit(tlb_addr, addr)) {
99
+ bool nonfault, void **phost, CPUTLBEntryFull **pfull,
182
- if (!VICTIM_TLB_HIT(addr_read, addr)) {
100
+ uintptr_t retaddr)
183
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_LOAD,
101
{
184
+ addr & TARGET_PAGE_MASK)) {
102
- int flags;
185
tlb_fill(env_cpu(env), addr, size,
103
-
186
MMU_DATA_LOAD, mmu_idx, retaddr);
104
- flags = probe_access_internal(env, addr, 0, access_type, mmu_idx,
187
index = tlb_index(env, mmu_idx, addr);
105
- nonfault, phost, retaddr);
188
@@ -XXX,XX +XXX,XX @@ load_memop(const void *haddr, MemOp op)
106
+ int flags = probe_access_internal(env, addr, 0, access_type, mmu_idx,
189
107
+ nonfault, phost, pfull, retaddr);
190
static inline uint64_t QEMU_ALWAYS_INLINE
108
191
load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
109
/* Handle clean RAM pages. */
192
- uintptr_t retaddr, MemOp op, bool code_read,
110
if (unlikely(flags & TLB_NOTDIRTY)) {
193
+ uintptr_t retaddr, MemOp op, MMUAccessType access_type,
111
- uintptr_t index = tlb_index(env, mmu_idx, addr);
194
FullLoadHelper *full_load)
112
- CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
195
{
113
-
196
- const size_t tlb_off = code_read ?
114
- notdirty_write(env_cpu(env), addr, 1, full, retaddr);
197
- offsetof(CPUTLBEntry, addr_code) : offsetof(CPUTLBEntry, addr_read);
115
+ notdirty_write(env_cpu(env), addr, 1, *pfull, retaddr);
198
- const MMUAccessType access_type =
116
flags &= ~TLB_NOTDIRTY;
199
- code_read ? MMU_INST_FETCH : MMU_DATA_LOAD;
200
const unsigned a_bits = get_alignment_bits(get_memop(oi));
201
const size_t size = memop_size(op);
202
uintptr_t mmu_idx = get_mmuidx(oi);
203
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
204
205
index = tlb_index(env, mmu_idx, addr);
206
entry = tlb_entry(env, mmu_idx, addr);
207
- tlb_addr = code_read ? entry->addr_code : entry->addr_read;
208
+ tlb_addr = tlb_read_idx(entry, access_type);
209
210
/* If the TLB entry is for a different page, reload and try again. */
211
if (!tlb_hit(tlb_addr, addr)) {
212
- if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
213
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type,
214
addr & TARGET_PAGE_MASK)) {
215
tlb_fill(env_cpu(env), addr, size,
216
access_type, mmu_idx, retaddr);
217
index = tlb_index(env, mmu_idx, addr);
218
entry = tlb_entry(env, mmu_idx, addr);
219
}
220
- tlb_addr = code_read ? entry->addr_code : entry->addr_read;
221
+ tlb_addr = tlb_read_idx(entry, access_type);
222
tlb_addr &= ~TLB_INVALID_MASK;
117
}
223
}
118
224
119
return flags;
225
@@ -XXX,XX +XXX,XX @@ static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
120
}
226
MemOpIdx oi, uintptr_t retaddr)
121
227
{
122
+int probe_access_flags(CPUArchState *env, target_ulong addr,
228
validate_memop(oi, MO_UB);
123
+ MMUAccessType access_type, int mmu_idx,
229
- return load_helper(env, addr, oi, retaddr, MO_UB, false, full_ldub_mmu);
124
+ bool nonfault, void **phost, uintptr_t retaddr)
230
+ return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
125
+{
231
+ full_ldub_mmu);
126
+ CPUTLBEntryFull *full;
232
}
127
+
233
128
+ return probe_access_full(env, addr, access_type, mmu_idx,
234
tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
129
+ nonfault, phost, &full, retaddr);
235
@@ -XXX,XX +XXX,XX @@ static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
130
+}
236
MemOpIdx oi, uintptr_t retaddr)
131
+
237
{
132
void *probe_access(CPUArchState *env, target_ulong addr, int size,
238
validate_memop(oi, MO_LEUW);
133
MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
239
- return load_helper(env, addr, oi, retaddr, MO_LEUW, false,
134
{
240
+ return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
135
+ CPUTLBEntryFull *full;
241
full_le_lduw_mmu);
136
void *host;
242
}
137
int flags;
243
138
244
@@ -XXX,XX +XXX,XX @@ static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
139
g_assert(-(addr | TARGET_PAGE_MASK) >= size);
245
MemOpIdx oi, uintptr_t retaddr)
140
246
{
141
flags = probe_access_internal(env, addr, size, access_type, mmu_idx,
247
validate_memop(oi, MO_BEUW);
142
- false, &host, retaddr);
248
- return load_helper(env, addr, oi, retaddr, MO_BEUW, false,
143
+ false, &host, &full, retaddr);
249
+ return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
144
250
full_be_lduw_mmu);
145
/* Per the interface, size == 0 merely faults the access. */
251
}
146
if (size == 0) {
252
147
@@ -XXX,XX +XXX,XX @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
253
@@ -XXX,XX +XXX,XX @@ static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
148
}
254
MemOpIdx oi, uintptr_t retaddr)
149
255
{
150
if (unlikely(flags & (TLB_NOTDIRTY | TLB_WATCHPOINT))) {
256
validate_memop(oi, MO_LEUL);
151
- uintptr_t index = tlb_index(env, mmu_idx, addr);
257
- return load_helper(env, addr, oi, retaddr, MO_LEUL, false,
152
- CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
258
+ return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
153
-
259
full_le_ldul_mmu);
154
/* Handle watchpoints. */
260
}
155
if (flags & TLB_WATCHPOINT) {
261
156
int wp_access = (access_type == MMU_DATA_STORE
262
@@ -XXX,XX +XXX,XX @@ static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
157
@@ -XXX,XX +XXX,XX @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
263
MemOpIdx oi, uintptr_t retaddr)
158
void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
264
{
159
MMUAccessType access_type, int mmu_idx)
265
validate_memop(oi, MO_BEUL);
160
{
266
- return load_helper(env, addr, oi, retaddr, MO_BEUL, false,
161
+ CPUTLBEntryFull *full;
267
+ return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
162
void *host;
268
full_be_ldul_mmu);
163
int flags;
269
}
164
270
165
flags = probe_access_internal(env, addr, 0, access_type,
271
@@ -XXX,XX +XXX,XX @@ uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
166
- mmu_idx, true, &host, 0);
272
MemOpIdx oi, uintptr_t retaddr)
167
+ mmu_idx, true, &host, &full, 0);
273
{
168
274
validate_memop(oi, MO_LEUQ);
169
/* No combination of flags are expected by the caller. */
275
- return load_helper(env, addr, oi, retaddr, MO_LEUQ, false,
170
return flags ? NULL : host;
276
+ return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
171
@@ -XXX,XX +XXX,XX @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
277
helper_le_ldq_mmu);
172
tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
278
}
173
void **hostp)
279
174
{
280
@@ -XXX,XX +XXX,XX @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
175
+ CPUTLBEntryFull *full;
281
MemOpIdx oi, uintptr_t retaddr)
176
void *p;
282
{
177
283
validate_memop(oi, MO_BEUQ);
178
(void)probe_access_internal(env, addr, 1, MMU_INST_FETCH,
284
- return load_helper(env, addr, oi, retaddr, MO_BEUQ, false,
179
- cpu_mmu_index(env, true), false, &p, 0);
285
+ return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
180
+ cpu_mmu_index(env, true), false, &p, &full, 0);
286
helper_be_ldq_mmu);
181
if (p == NULL) {
287
}
182
return -1;
288
183
}
289
@@ -XXX,XX +XXX,XX @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
290
uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
291
bool big_endian)
292
{
293
- const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
294
uintptr_t index, index2;
295
CPUTLBEntry *entry, *entry2;
296
target_ulong page1, page2, tlb_addr, tlb_addr2;
297
@@ -XXX,XX +XXX,XX @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
298
299
tlb_addr2 = tlb_addr_write(entry2);
300
if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
301
- if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
302
+ if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
303
tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
304
mmu_idx, retaddr);
305
index2 = tlb_index(env, mmu_idx, page2);
306
@@ -XXX,XX +XXX,XX @@ static inline void QEMU_ALWAYS_INLINE
307
store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
308
MemOpIdx oi, uintptr_t retaddr, MemOp op)
309
{
310
- const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
311
const unsigned a_bits = get_alignment_bits(get_memop(oi));
312
const size_t size = memop_size(op);
313
uintptr_t mmu_idx = get_mmuidx(oi);
314
@@ -XXX,XX +XXX,XX @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
315
316
/* If the TLB entry is for a different page, reload and try again. */
317
if (!tlb_hit(tlb_addr, addr)) {
318
- if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
319
+ if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
320
addr & TARGET_PAGE_MASK)) {
321
tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
322
mmu_idx, retaddr);
323
@@ -XXX,XX +XXX,XX @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
324
static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr,
325
MemOpIdx oi, uintptr_t retaddr)
326
{
327
- return load_helper(env, addr, oi, retaddr, MO_8, true, full_ldub_code);
328
+ return load_helper(env, addr, oi, retaddr, MO_8,
329
+ MMU_INST_FETCH, full_ldub_code);
330
}
331
332
uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
333
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
334
static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr,
335
MemOpIdx oi, uintptr_t retaddr)
336
{
337
- return load_helper(env, addr, oi, retaddr, MO_TEUW, true, full_lduw_code);
338
+ return load_helper(env, addr, oi, retaddr, MO_TEUW,
339
+ MMU_INST_FETCH, full_lduw_code);
340
}
341
342
uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
343
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
344
static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr,
345
MemOpIdx oi, uintptr_t retaddr)
346
{
347
- return load_helper(env, addr, oi, retaddr, MO_TEUL, true, full_ldl_code);
348
+ return load_helper(env, addr, oi, retaddr, MO_TEUL,
349
+ MMU_INST_FETCH, full_ldl_code);
350
}
351
352
uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
353
@@ -XXX,XX +XXX,XX @@ uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
354
static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr,
355
MemOpIdx oi, uintptr_t retaddr)
356
{
357
- return load_helper(env, addr, oi, retaddr, MO_TEUQ, true, full_ldq_code);
358
+ return load_helper(env, addr, oi, retaddr, MO_TEUQ,
359
+ MMU_INST_FETCH, full_ldq_code);
360
}
361
362
uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
184
--
363
--
185
2.34.1
364
2.34.1
186
365
187
366
1
This field is only written, not read; remove it.
1
Instead of trying to unify all operations on uint64_t, pull out
2
mmu_lookup() to perform the basic tlb hit and resolution.
3
Create individual functions to handle access by size.
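As a rough illustration of the structure this introduces (one shared lookup, then one helper per access size, with crosspage accesses combined byte-wise), here is a minimal, self-contained C sketch. The types and names (PageLookup, Lookup, load_2) are invented for the example and are not the helpers added by the patch; the real mmu_lookup()/do_ld_* code in the diff below carries much more state (flags, MMIO, watchpoints).

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    /* Invented, simplified stand-ins for the real lookup structures. */
    typedef struct {
        const uint8_t *haddr;   /* host address for this page */
        int size;               /* bytes of the access that land on this page */
    } PageLookup;

    typedef struct {
        PageLookup page[2];     /* page[1].size != 0 means the access crosses a page */
        bool big_endian;
    } Lookup;

    /* One helper per access size; only the 2-byte case is sketched here. */
    static uint16_t load_2(const Lookup *l)
    {
        if (l->page[1].size == 0) {
            uint16_t v;
            /* Single-page fast path: host-order load (byte swap omitted). */
            memcpy(&v, l->page[0].haddr, 2);
            return v;
        }
        /* Crosspage: fetch one byte from each page and combine. */
        uint8_t a = l->page[0].haddr[0];
        uint8_t b = l->page[1].haddr[0];
        return l->big_endian ? (uint16_t)((a << 8) | b)
                             : (uint16_t)((b << 8) | a);
    }

    int main(void)
    {
        uint8_t pg0[1] = { 0x12 }, pg1[1] = { 0x34 };
        Lookup l = { .page = { { pg0, 1 }, { pg1, 1 } }, .big_endian = true };
        return load_2(&l) == 0x1234 ? 0 : 1;   /* big-endian crosspage combine */
    }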
2
4
3
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
5
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
4
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
6
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
5
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
7
---
8
---
8
include/hw/core/cpu.h | 1 -
9
accel/tcg/cputlb.c | 645 +++++++++++++++++++++++++++++----------------
9
accel/tcg/cputlb.c | 7 +++----
10
1 file changed, 424 insertions(+), 221 deletions(-)
10
2 files changed, 3 insertions(+), 5 deletions(-)
11
11
12
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
13
index XXXXXXX..XXXXXXX 100644
14
--- a/include/hw/core/cpu.h
15
+++ b/include/hw/core/cpu.h
16
@@ -XXX,XX +XXX,XX @@ struct CPUWatchpoint {
17
* the memory regions get moved around by io_writex.
18
*/
19
typedef struct SavedIOTLB {
20
- hwaddr addr;
21
MemoryRegionSection *section;
22
hwaddr mr_offset;
23
} SavedIOTLB;
24
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
12
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
25
index XXXXXXX..XXXXXXX 100644
13
index XXXXXXX..XXXXXXX 100644
26
--- a/accel/tcg/cputlb.c
14
--- a/accel/tcg/cputlb.c
27
+++ b/accel/tcg/cputlb.c
15
+++ b/accel/tcg/cputlb.c
28
@@ -XXX,XX +XXX,XX @@ static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full,
16
@@ -XXX,XX +XXX,XX @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong addr, int mmu_idx,
29
* This is read by tlb_plugin_lookup if the fulltlb entry doesn't match
17
30
* because of the side effect of io_writex changing memory layout.
18
#endif
19
20
+/*
21
+ * Probe for a load/store operation.
22
+ * Return the host address and flags.
23
+ */
24
+
25
+typedef struct MMULookupPageData {
26
+ CPUTLBEntryFull *full;
27
+ void *haddr;
28
+ target_ulong addr;
29
+ int flags;
30
+ int size;
31
+} MMULookupPageData;
32
+
33
+typedef struct MMULookupLocals {
34
+ MMULookupPageData page[2];
35
+ MemOp memop;
36
+ int mmu_idx;
37
+} MMULookupLocals;
38
+
39
+/**
40
+ * mmu_lookup1: translate one page
41
+ * @env: cpu context
42
+ * @data: lookup parameters
43
+ * @mmu_idx: virtual address context
44
+ * @access_type: load/store/code
45
+ * @ra: return address into tcg generated code, or 0
46
+ *
47
+ * Resolve the translation for the one page at @data.addr, filling in
48
+ * the rest of @data with the results. If the translation fails,
49
+ * tlb_fill will longjmp out. Return true if the softmmu tlb for
50
+ * @mmu_idx may have resized.
51
+ */
52
+static bool mmu_lookup1(CPUArchState *env, MMULookupPageData *data,
53
+ int mmu_idx, MMUAccessType access_type, uintptr_t ra)
54
+{
55
+ target_ulong addr = data->addr;
56
+ uintptr_t index = tlb_index(env, mmu_idx, addr);
57
+ CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
58
+ target_ulong tlb_addr = tlb_read_idx(entry, access_type);
59
+ bool maybe_resized = false;
60
+
61
+ /* If the TLB entry is for a different page, reload and try again. */
62
+ if (!tlb_hit(tlb_addr, addr)) {
63
+ if (!victim_tlb_hit(env, mmu_idx, index, access_type,
64
+ addr & TARGET_PAGE_MASK)) {
65
+ tlb_fill(env_cpu(env), addr, data->size, access_type, mmu_idx, ra);
66
+ maybe_resized = true;
67
+ index = tlb_index(env, mmu_idx, addr);
68
+ entry = tlb_entry(env, mmu_idx, addr);
69
+ }
70
+ tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
71
+ }
72
+
73
+ data->flags = tlb_addr & TLB_FLAGS_MASK;
74
+ data->full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
75
+ /* Compute haddr speculatively; depending on flags it might be invalid. */
76
+ data->haddr = (void *)((uintptr_t)addr + entry->addend);
77
+
78
+ return maybe_resized;
79
+}
80
+
81
+/**
82
+ * mmu_watch_or_dirty
83
+ * @env: cpu context
84
+ * @data: lookup parameters
85
+ * @access_type: load/store/code
86
+ * @ra: return address into tcg generated code, or 0
87
+ *
88
+ * Trigger watchpoints for @data.addr:@data.size;
89
+ * record writes to protected clean pages.
90
+ */
91
+static void mmu_watch_or_dirty(CPUArchState *env, MMULookupPageData *data,
92
+ MMUAccessType access_type, uintptr_t ra)
93
+{
94
+ CPUTLBEntryFull *full = data->full;
95
+ target_ulong addr = data->addr;
96
+ int flags = data->flags;
97
+ int size = data->size;
98
+
99
+ /* On watchpoint hit, this will longjmp out. */
100
+ if (flags & TLB_WATCHPOINT) {
101
+ int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ;
102
+ cpu_check_watchpoint(env_cpu(env), addr, size, full->attrs, wp, ra);
103
+ flags &= ~TLB_WATCHPOINT;
104
+ }
105
+
106
+ /* Note that notdirty is only set for writes. */
107
+ if (flags & TLB_NOTDIRTY) {
108
+ notdirty_write(env_cpu(env), addr, size, full, ra);
109
+ flags &= ~TLB_NOTDIRTY;
110
+ }
111
+ data->flags = flags;
112
+}
113
+
114
+/**
115
+ * mmu_lookup: translate page(s)
116
+ * @env: cpu context
117
+ * @addr: virtual address
118
+ * @oi: combined mmu_idx and MemOp
119
+ * @ra: return address into tcg generated code, or 0
120
+ * @access_type: load/store/code
121
+ * @l: output result
122
+ *
123
+ * Resolve the translation for the page(s) beginning at @addr, for MemOp.size
124
+ * bytes. Return true if the lookup crosses a page boundary.
125
+ */
126
+static bool mmu_lookup(CPUArchState *env, target_ulong addr, MemOpIdx oi,
127
+ uintptr_t ra, MMUAccessType type, MMULookupLocals *l)
128
+{
129
+ unsigned a_bits;
130
+ bool crosspage;
131
+ int flags;
132
+
133
+ l->memop = get_memop(oi);
134
+ l->mmu_idx = get_mmuidx(oi);
135
+
136
+ tcg_debug_assert(l->mmu_idx < NB_MMU_MODES);
137
+
138
+ /* Handle CPU specific unaligned behaviour */
139
+ a_bits = get_alignment_bits(l->memop);
140
+ if (addr & ((1 << a_bits) - 1)) {
141
+ cpu_unaligned_access(env_cpu(env), addr, type, l->mmu_idx, ra);
142
+ }
143
+
144
+ l->page[0].addr = addr;
145
+ l->page[0].size = memop_size(l->memop);
146
+ l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK;
147
+ l->page[1].size = 0;
148
+ crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK;
149
+
150
+ if (likely(!crosspage)) {
151
+ mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra);
152
+
153
+ flags = l->page[0].flags;
154
+ if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
155
+ mmu_watch_or_dirty(env, &l->page[0], type, ra);
156
+ }
157
+ if (unlikely(flags & TLB_BSWAP)) {
158
+ l->memop ^= MO_BSWAP;
159
+ }
160
+ } else {
161
+ /* Finish compute of page crossing. */
162
+ int size0 = l->page[1].addr - addr;
163
+ l->page[1].size = l->page[0].size - size0;
164
+ l->page[0].size = size0;
165
+
166
+ /*
167
+ * Lookup both pages, recognizing exceptions from either. If the
168
+ * second lookup potentially resized, refresh first CPUTLBEntryFull.
169
+ */
170
+ mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra);
171
+ if (mmu_lookup1(env, &l->page[1], l->mmu_idx, type, ra)) {
172
+ uintptr_t index = tlb_index(env, l->mmu_idx, addr);
173
+ l->page[0].full = &env_tlb(env)->d[l->mmu_idx].fulltlb[index];
174
+ }
175
+
176
+ flags = l->page[0].flags | l->page[1].flags;
177
+ if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
178
+ mmu_watch_or_dirty(env, &l->page[0], type, ra);
179
+ mmu_watch_or_dirty(env, &l->page[1], type, ra);
180
+ }
181
+
182
+ /*
183
+ * Since target/sparc is the only user of TLB_BSWAP, and all
184
+ * Sparc accesses are aligned, any treatment across two pages
185
+ * would be arbitrary. Refuse it until there's a use.
186
+ */
187
+ tcg_debug_assert((flags & TLB_BSWAP) == 0);
188
+ }
189
+
190
+ return crosspage;
191
+}
192
+
193
/*
194
* Probe for an atomic operation. Do not allow unaligned operations,
195
* or io operations to proceed. Return the host address.
196
@@ -XXX,XX +XXX,XX @@ load_memop(const void *haddr, MemOp op)
197
}
198
}
199
200
-static inline uint64_t QEMU_ALWAYS_INLINE
201
-load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
202
- uintptr_t retaddr, MemOp op, MMUAccessType access_type,
203
- FullLoadHelper *full_load)
204
-{
205
- const unsigned a_bits = get_alignment_bits(get_memop(oi));
206
- const size_t size = memop_size(op);
207
- uintptr_t mmu_idx = get_mmuidx(oi);
208
- uintptr_t index;
209
- CPUTLBEntry *entry;
210
- target_ulong tlb_addr;
211
- void *haddr;
212
- uint64_t res;
213
-
214
- tcg_debug_assert(mmu_idx < NB_MMU_MODES);
215
-
216
- /* Handle CPU specific unaligned behaviour */
217
- if (addr & ((1 << a_bits) - 1)) {
218
- cpu_unaligned_access(env_cpu(env), addr, access_type,
219
- mmu_idx, retaddr);
220
- }
221
-
222
- index = tlb_index(env, mmu_idx, addr);
223
- entry = tlb_entry(env, mmu_idx, addr);
224
- tlb_addr = tlb_read_idx(entry, access_type);
225
-
226
- /* If the TLB entry is for a different page, reload and try again. */
227
- if (!tlb_hit(tlb_addr, addr)) {
228
- if (!victim_tlb_hit(env, mmu_idx, index, access_type,
229
- addr & TARGET_PAGE_MASK)) {
230
- tlb_fill(env_cpu(env), addr, size,
231
- access_type, mmu_idx, retaddr);
232
- index = tlb_index(env, mmu_idx, addr);
233
- entry = tlb_entry(env, mmu_idx, addr);
234
- }
235
- tlb_addr = tlb_read_idx(entry, access_type);
236
- tlb_addr &= ~TLB_INVALID_MASK;
237
- }
238
-
239
- /* Handle anything that isn't just a straight memory access. */
240
- if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
241
- CPUTLBEntryFull *full;
242
- bool need_swap;
243
-
244
- /* For anything that is unaligned, recurse through full_load. */
245
- if ((addr & (size - 1)) != 0) {
246
- goto do_unaligned_access;
247
- }
248
-
249
- full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
250
-
251
- /* Handle watchpoints. */
252
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
253
- /* On watchpoint hit, this will longjmp out. */
254
- cpu_check_watchpoint(env_cpu(env), addr, size,
255
- full->attrs, BP_MEM_READ, retaddr);
256
- }
257
-
258
- need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
259
-
260
- /* Handle I/O access. */
261
- if (likely(tlb_addr & TLB_MMIO)) {
262
- return io_readx(env, full, mmu_idx, addr, retaddr,
263
- access_type, op ^ (need_swap * MO_BSWAP));
264
- }
265
-
266
- haddr = (void *)((uintptr_t)addr + entry->addend);
267
-
268
- /*
269
- * Keep these two load_memop separate to ensure that the compiler
270
- * is able to fold the entire function to a single instruction.
271
- * There is a build-time assert inside to remind you of this. ;-)
272
- */
273
- if (unlikely(need_swap)) {
274
- return load_memop(haddr, op ^ MO_BSWAP);
275
- }
276
- return load_memop(haddr, op);
277
- }
278
-
279
- /* Handle slow unaligned access (it spans two pages or IO). */
280
- if (size > 1
281
- && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
282
- >= TARGET_PAGE_SIZE)) {
283
- target_ulong addr1, addr2;
284
- uint64_t r1, r2;
285
- unsigned shift;
286
- do_unaligned_access:
287
- addr1 = addr & ~((target_ulong)size - 1);
288
- addr2 = addr1 + size;
289
- r1 = full_load(env, addr1, oi, retaddr);
290
- r2 = full_load(env, addr2, oi, retaddr);
291
- shift = (addr & (size - 1)) * 8;
292
-
293
- if (memop_big_endian(op)) {
294
- /* Big-endian combine. */
295
- res = (r1 << shift) | (r2 >> ((size * 8) - shift));
296
- } else {
297
- /* Little-endian combine. */
298
- res = (r1 >> shift) | (r2 << ((size * 8) - shift));
299
- }
300
- return res & MAKE_64BIT_MASK(0, size * 8);
301
- }
302
-
303
- haddr = (void *)((uintptr_t)addr + entry->addend);
304
- return load_memop(haddr, op);
305
-}
306
-
307
/*
308
* For the benefit of TCG generated code, we want to avoid the
309
* complication of ABI-specific return type promotion and always
310
@@ -XXX,XX +XXX,XX @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
311
* We don't bother with this widened value for SOFTMMU_CODE_ACCESS.
31
*/
312
*/
32
-static void save_iotlb_data(CPUState *cs, hwaddr addr,
313
33
- MemoryRegionSection *section, hwaddr mr_offset)
314
-static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
34
+static void save_iotlb_data(CPUState *cs, MemoryRegionSection *section,
315
- MemOpIdx oi, uintptr_t retaddr)
35
+ hwaddr mr_offset)
316
+/**
36
{
317
+ * do_ld_mmio_beN:
37
#ifdef CONFIG_PLUGIN
318
+ * @env: cpu context
38
SavedIOTLB *saved = &cs->saved_iotlb;
319
+ * @p: translation parameters
39
- saved->addr = addr;
320
+ * @ret_be: accumulated data
40
saved->section = section;
321
+ * @mmu_idx: virtual address context
41
saved->mr_offset = mr_offset;
322
+ * @ra: return address into tcg generated code, or 0
42
#endif
323
+ *
43
@@ -XXX,XX +XXX,XX @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
324
+ * Load @p->size bytes from @p->addr, which is memory-mapped i/o.
44
* The memory_region_dispatch may trigger a flush/resize
325
+ * The bytes are concatenated in big-endian order with @ret_be.
45
* so for plugins we save the iotlb_data just in case.
326
+ */
46
*/
327
+static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p,
47
- save_iotlb_data(cpu, full->xlat_section, section, mr_offset);
328
+ uint64_t ret_be, int mmu_idx,
48
+ save_iotlb_data(cpu, section, mr_offset);
329
+ MMUAccessType type, uintptr_t ra)
49
330
{
50
if (!qemu_mutex_iothread_locked()) {
331
- validate_memop(oi, MO_UB);
51
qemu_mutex_lock_iothread();
332
- return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
333
- full_ldub_mmu);
334
+ CPUTLBEntryFull *full = p->full;
335
+ target_ulong addr = p->addr;
336
+ int i, size = p->size;
337
+
338
+ QEMU_IOTHREAD_LOCK_GUARD();
339
+ for (i = 0; i < size; i++) {
340
+ uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
341
+ ret_be = (ret_be << 8) | x;
342
+ }
343
+ return ret_be;
344
+}
345
+
346
+/**
347
+ * do_ld_bytes_beN
348
+ * @p: translation parameters
349
+ * @ret_be: accumulated data
350
+ *
351
+ * Load @p->size bytes from @p->haddr, which is RAM.
352
+ * The bytes are concatenated in big-endian order with @ret_be.
353
+ */
354
+static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
355
+{
356
+ uint8_t *haddr = p->haddr;
357
+ int i, size = p->size;
358
+
359
+ for (i = 0; i < size; i++) {
360
+ ret_be = (ret_be << 8) | haddr[i];
361
+ }
362
+ return ret_be;
363
+}
364
+
365
+/*
366
+ * Wrapper for the above.
367
+ */
368
+static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p,
369
+ uint64_t ret_be, int mmu_idx,
370
+ MMUAccessType type, uintptr_t ra)
371
+{
372
+ if (unlikely(p->flags & TLB_MMIO)) {
373
+ return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra);
374
+ } else {
375
+ return do_ld_bytes_beN(p, ret_be);
376
+ }
377
+}
378
+
379
+static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
380
+ MMUAccessType type, uintptr_t ra)
381
+{
382
+ if (unlikely(p->flags & TLB_MMIO)) {
383
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, MO_UB);
384
+ } else {
385
+ return *(uint8_t *)p->haddr;
386
+ }
387
+}
388
+
389
+static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
390
+ MMUAccessType type, MemOp memop, uintptr_t ra)
391
+{
392
+ uint64_t ret;
393
+
394
+ if (unlikely(p->flags & TLB_MMIO)) {
395
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
396
+ }
397
+
398
+ /* Perform the load host endian, then swap if necessary. */
399
+ ret = load_memop(p->haddr, MO_UW);
400
+ if (memop & MO_BSWAP) {
401
+ ret = bswap16(ret);
402
+ }
403
+ return ret;
404
+}
405
+
406
+static uint32_t do_ld_4(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
407
+ MMUAccessType type, MemOp memop, uintptr_t ra)
408
+{
409
+ uint32_t ret;
410
+
411
+ if (unlikely(p->flags & TLB_MMIO)) {
412
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
413
+ }
414
+
415
+ /* Perform the load host endian. */
416
+ ret = load_memop(p->haddr, MO_UL);
417
+ if (memop & MO_BSWAP) {
418
+ ret = bswap32(ret);
419
+ }
420
+ return ret;
421
+}
422
+
423
+static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
424
+ MMUAccessType type, MemOp memop, uintptr_t ra)
425
+{
426
+ uint64_t ret;
427
+
428
+ if (unlikely(p->flags & TLB_MMIO)) {
429
+ return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
430
+ }
431
+
432
+ /* Perform the load host endian. */
433
+ ret = load_memop(p->haddr, MO_UQ);
434
+ if (memop & MO_BSWAP) {
435
+ ret = bswap64(ret);
436
+ }
437
+ return ret;
438
+}
439
+
440
+static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
441
+ uintptr_t ra, MMUAccessType access_type)
442
+{
443
+ MMULookupLocals l;
444
+ bool crosspage;
445
+
446
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
447
+ tcg_debug_assert(!crosspage);
448
+
449
+ return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra);
450
}
451
452
tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
453
MemOpIdx oi, uintptr_t retaddr)
454
{
455
- return full_ldub_mmu(env, addr, oi, retaddr);
456
+ validate_memop(oi, MO_UB);
457
+ return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
458
}
459
460
-static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
461
- MemOpIdx oi, uintptr_t retaddr)
462
+static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
463
+ uintptr_t ra, MMUAccessType access_type)
464
{
465
- validate_memop(oi, MO_LEUW);
466
- return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
467
- full_le_lduw_mmu);
468
+ MMULookupLocals l;
469
+ bool crosspage;
470
+ uint16_t ret;
471
+ uint8_t a, b;
472
+
473
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
474
+ if (likely(!crosspage)) {
475
+ return do_ld_2(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
476
+ }
477
+
478
+ a = do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra);
479
+ b = do_ld_1(env, &l.page[1], l.mmu_idx, access_type, ra);
480
+
481
+ if ((l.memop & MO_BSWAP) == MO_LE) {
482
+ ret = a | (b << 8);
483
+ } else {
484
+ ret = b | (a << 8);
485
+ }
486
+ return ret;
487
}
488
489
tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr,
490
MemOpIdx oi, uintptr_t retaddr)
491
{
492
- return full_le_lduw_mmu(env, addr, oi, retaddr);
493
-}
494
-
495
-static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
496
- MemOpIdx oi, uintptr_t retaddr)
497
-{
498
- validate_memop(oi, MO_BEUW);
499
- return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
500
- full_be_lduw_mmu);
501
+ validate_memop(oi, MO_LEUW);
502
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
503
}
504
505
tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr,
506
MemOpIdx oi, uintptr_t retaddr)
507
{
508
- return full_be_lduw_mmu(env, addr, oi, retaddr);
509
+ validate_memop(oi, MO_BEUW);
510
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
511
}
512
513
-static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
514
- MemOpIdx oi, uintptr_t retaddr)
515
+static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
516
+ uintptr_t ra, MMUAccessType access_type)
517
{
518
- validate_memop(oi, MO_LEUL);
519
- return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
520
- full_le_ldul_mmu);
521
+ MMULookupLocals l;
522
+ bool crosspage;
523
+ uint32_t ret;
524
+
525
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
526
+ if (likely(!crosspage)) {
527
+ return do_ld_4(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
528
+ }
529
+
530
+ ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra);
531
+ ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra);
532
+ if ((l.memop & MO_BSWAP) == MO_LE) {
533
+ ret = bswap32(ret);
534
+ }
535
+ return ret;
536
}
537
538
tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
539
MemOpIdx oi, uintptr_t retaddr)
540
{
541
- return full_le_ldul_mmu(env, addr, oi, retaddr);
542
-}
543
-
544
-static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
545
- MemOpIdx oi, uintptr_t retaddr)
546
-{
547
- validate_memop(oi, MO_BEUL);
548
- return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
549
- full_be_ldul_mmu);
550
+ validate_memop(oi, MO_LEUL);
551
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
552
}
553
554
tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr,
555
MemOpIdx oi, uintptr_t retaddr)
556
{
557
- return full_be_ldul_mmu(env, addr, oi, retaddr);
558
+ validate_memop(oi, MO_BEUL);
559
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
560
+}
561
+
562
+static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi,
563
+ uintptr_t ra, MMUAccessType access_type)
564
+{
565
+ MMULookupLocals l;
566
+ bool crosspage;
567
+ uint64_t ret;
568
+
569
+ crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l);
570
+ if (likely(!crosspage)) {
571
+ return do_ld_8(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra);
572
+ }
573
+
574
+ ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra);
575
+ ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra);
576
+ if ((l.memop & MO_BSWAP) == MO_LE) {
577
+ ret = bswap64(ret);
578
+ }
579
+ return ret;
580
}
581
582
uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
583
MemOpIdx oi, uintptr_t retaddr)
584
{
585
validate_memop(oi, MO_LEUQ);
586
- return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
587
- helper_le_ldq_mmu);
588
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
589
}
590
591
uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
592
MemOpIdx oi, uintptr_t retaddr)
593
{
594
validate_memop(oi, MO_BEUQ);
595
- return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
596
- helper_be_ldq_mmu);
597
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD);
598
}
599
600
/*
601
@@ -XXX,XX +XXX,XX @@ tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr,
602
* Load helpers for cpu_ldst.h.
603
*/
604
605
-static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
606
- MemOpIdx oi, uintptr_t retaddr,
607
- FullLoadHelper *full_load)
608
+static void plugin_load_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
609
{
610
- uint64_t ret;
611
-
612
- ret = full_load(env, addr, oi, retaddr);
613
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
614
- return ret;
615
}
616
617
uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra)
618
{
619
- return cpu_load_helper(env, addr, oi, ra, full_ldub_mmu);
620
+ uint8_t ret;
621
+
622
+ validate_memop(oi, MO_UB);
623
+ ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
624
+ plugin_load_cb(env, addr, oi);
625
+ return ret;
626
}
627
628
uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr,
629
MemOpIdx oi, uintptr_t ra)
630
{
631
- return cpu_load_helper(env, addr, oi, ra, full_be_lduw_mmu);
632
+ uint16_t ret;
633
+
634
+ validate_memop(oi, MO_BEUW);
635
+ ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
636
+ plugin_load_cb(env, addr, oi);
637
+ return ret;
638
}
639
640
uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr,
641
MemOpIdx oi, uintptr_t ra)
642
{
643
- return cpu_load_helper(env, addr, oi, ra, full_be_ldul_mmu);
644
+ uint32_t ret;
645
+
646
+ validate_memop(oi, MO_BEUL);
647
+ ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
648
+ plugin_load_cb(env, addr, oi);
649
+ return ret;
650
}
651
652
uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr,
653
MemOpIdx oi, uintptr_t ra)
654
{
655
- return cpu_load_helper(env, addr, oi, ra, helper_be_ldq_mmu);
656
+ uint64_t ret;
657
+
658
+ validate_memop(oi, MO_BEUQ);
659
+ ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
660
+ plugin_load_cb(env, addr, oi);
661
+ return ret;
662
}
663
664
uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr,
665
MemOpIdx oi, uintptr_t ra)
666
{
667
- return cpu_load_helper(env, addr, oi, ra, full_le_lduw_mmu);
668
+ uint16_t ret;
669
+
670
+ validate_memop(oi, MO_LEUW);
671
+ ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
672
+ plugin_load_cb(env, addr, oi);
673
+ return ret;
674
}
675
676
uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr,
677
MemOpIdx oi, uintptr_t ra)
678
{
679
- return cpu_load_helper(env, addr, oi, ra, full_le_ldul_mmu);
680
+ uint32_t ret;
681
+
682
+ validate_memop(oi, MO_LEUL);
683
+ ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
684
+ plugin_load_cb(env, addr, oi);
685
+ return ret;
686
}
687
688
uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr,
689
MemOpIdx oi, uintptr_t ra)
690
{
691
- return cpu_load_helper(env, addr, oi, ra, helper_le_ldq_mmu);
692
+ uint64_t ret;
693
+
694
+ validate_memop(oi, MO_LEUQ);
695
+ ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD);
696
+ plugin_load_cb(env, addr, oi);
697
+ return ret;
698
}
699
700
Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr,
701
@@ -XXX,XX +XXX,XX @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
702
703
/* Code access functions. */
704
705
-static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr,
706
- MemOpIdx oi, uintptr_t retaddr)
707
-{
708
- return load_helper(env, addr, oi, retaddr, MO_8,
709
- MMU_INST_FETCH, full_ldub_code);
710
-}
711
-
712
uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
713
{
714
MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(env, true));
715
- return full_ldub_code(env, addr, oi, 0);
716
-}
717
-
718
-static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr,
719
- MemOpIdx oi, uintptr_t retaddr)
720
-{
721
- return load_helper(env, addr, oi, retaddr, MO_TEUW,
722
- MMU_INST_FETCH, full_lduw_code);
723
+ return do_ld1_mmu(env, addr, oi, 0, MMU_INST_FETCH);
724
}
725
726
uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
727
{
728
MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(env, true));
729
- return full_lduw_code(env, addr, oi, 0);
730
-}
731
-
732
-static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr,
733
- MemOpIdx oi, uintptr_t retaddr)
734
-{
735
- return load_helper(env, addr, oi, retaddr, MO_TEUL,
736
- MMU_INST_FETCH, full_ldl_code);
737
+ return do_ld2_mmu(env, addr, oi, 0, MMU_INST_FETCH);
738
}
739
740
uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
741
{
742
MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(env, true));
743
- return full_ldl_code(env, addr, oi, 0);
744
-}
745
-
746
-static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr,
747
- MemOpIdx oi, uintptr_t retaddr)
748
-{
749
- return load_helper(env, addr, oi, retaddr, MO_TEUQ,
750
- MMU_INST_FETCH, full_ldq_code);
751
+ return do_ld4_mmu(env, addr, oi, 0, MMU_INST_FETCH);
752
}
753
754
uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
755
{
756
MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(env, true));
757
- return full_ldq_code(env, addr, oi, 0);
758
+ return do_ld8_mmu(env, addr, oi, 0, MMU_INST_FETCH);
759
}
760
761
uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
762
MemOpIdx oi, uintptr_t retaddr)
763
{
764
- return full_ldub_code(env, addr, oi, retaddr);
765
+ return do_ld1_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
766
}
767
768
uint16_t cpu_ldw_code_mmu(CPUArchState *env, abi_ptr addr,
769
MemOpIdx oi, uintptr_t retaddr)
770
{
771
- MemOp mop = get_memop(oi);
772
- int idx = get_mmuidx(oi);
773
- uint16_t ret;
774
-
775
- ret = full_lduw_code(env, addr, make_memop_idx(MO_TEUW, idx), retaddr);
776
- if ((mop & MO_BSWAP) != MO_TE) {
777
- ret = bswap16(ret);
778
- }
779
- return ret;
780
+ return do_ld2_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
781
}
782
783
uint32_t cpu_ldl_code_mmu(CPUArchState *env, abi_ptr addr,
784
MemOpIdx oi, uintptr_t retaddr)
785
{
786
- MemOp mop = get_memop(oi);
787
- int idx = get_mmuidx(oi);
788
- uint32_t ret;
789
-
790
- ret = full_ldl_code(env, addr, make_memop_idx(MO_TEUL, idx), retaddr);
791
- if ((mop & MO_BSWAP) != MO_TE) {
792
- ret = bswap32(ret);
793
- }
794
- return ret;
795
+ return do_ld4_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
796
}
797
798
uint64_t cpu_ldq_code_mmu(CPUArchState *env, abi_ptr addr,
799
MemOpIdx oi, uintptr_t retaddr)
800
{
801
- MemOp mop = get_memop(oi);
802
- int idx = get_mmuidx(oi);
803
- uint64_t ret;
804
-
805
- ret = full_ldq_code(env, addr, make_memop_idx(MO_TEUQ, idx), retaddr);
806
- if ((mop & MO_BSWAP) != MO_TE) {
807
- ret = bswap64(ret);
808
- }
809
- return ret;
810
+ return do_ld8_mmu(env, addr, oi, retaddr, MMU_INST_FETCH);
811
}
52
--
812
--
53
2.34.1
813
2.34.1
54
814
55
815
1
From: Leandro Lupori <leandro.lupori@eldorado.org.br>
1
Instead of trying to unify all operations on uint64_t, use
2
mmu_lookup() to perform the basic tlb hit and resolution.
3
Create individual functions to handle access by size.
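The store-side diff below documents a convention worth calling out: the value is first swapped to little-endian, each per-page step consumes its bytes from the low end, and the unstored remainder is returned so it can be handed to the second page (see do_st_leN). A tiny self-contained sketch of just that convention, with an invented helper name (store_bytes_le) and assuming nothing else from the patch:

    #include <stdint.h>
    #include <stdio.h>

    /*
     * Store the low 'size' bytes of val_le into buf, least-significant
     * byte first, and return the unstored remainder. Illustrative only.
     */
    static uint64_t store_bytes_le(uint8_t *buf, int size, uint64_t val_le)
    {
        for (int i = 0; i < size; i++, val_le >>= 8) {
            buf[i] = (uint8_t)val_le;
        }
        return val_le;
    }

    int main(void)
    {
        /* A 4-byte store of 0x11223344 split 3 bytes / 1 byte across "pages". */
        uint8_t page0[3], page1[1];
        uint64_t rest = store_bytes_le(page0, 3, 0x11223344u);
        store_bytes_le(page1, 1, rest);
        /* page0 = {0x44, 0x33, 0x22}, page1 = {0x11} */
        printf("%02x %02x %02x | %02x\n", page0[0], page0[1], page0[2], page1[0]);
        return 0;
    }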
2
4
3
PowerPC64 processors handle direct branches better than indirect
5
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
4
ones, resulting in fewer stalled cycles and branch misses.
5
6
However, PPC's tb_target_set_jmp_target() was only using direct
7
branches for 16-bit jumps, while PowerPC64's unconditional branch
8
instructions are able to handle displacements of up to 26 bits.
9
To take advantage of this, jumps whose displacements fit
10
between 17 and 26 bits are also converted to direct branches.
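For orientation, the decision between the two encodings boils down to a signed-range check on the byte displacement; the backend's own in_range_b() does the real test, and the sketch below is only an illustration with invented names (fits_signed, can_use_direct_branch). PPC's unconditional branch encodes a 24-bit field shifted left by two, i.e. a 26-bit signed, 4-byte-aligned byte displacement.

    #include <stdint.h>
    #include <stdbool.h>

    /* Does a byte displacement fit an N-bit signed field? Illustrative only. */
    static bool fits_signed(int64_t disp, unsigned bits)
    {
        int64_t lim = (int64_t)1 << (bits - 1);
        return disp >= -lim && disp < lim;
    }

    static bool can_use_direct_branch(int64_t disp)
    {
        /* 26-bit signed range, low two bits must be zero. */
        return (disp & 3) == 0 && fits_signed(disp, 26);
    }

    int main(void)
    {
        /* 0x12344 (~74 KiB) needs the wide form; 0x4000 also fits in 16 bits. */
        return can_use_direct_branch(0x12344) && fits_signed(0x4000, 16) ? 0 : 1;
    }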
11
12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
13
Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
14
[rth: Expanded some commentary.]
15
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
6
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
16
---
7
---
17
tcg/ppc/tcg-target.c.inc | 119 +++++++++++++++++++++++++++++----------
8
accel/tcg/cputlb.c | 408 +++++++++++++++++++++------------------------
18
1 file changed, 88 insertions(+), 31 deletions(-)
9
1 file changed, 193 insertions(+), 215 deletions(-)
19
10
20
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
11
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
21
index XXXXXXX..XXXXXXX 100644
12
index XXXXXXX..XXXXXXX 100644
22
--- a/tcg/ppc/tcg-target.c.inc
13
--- a/accel/tcg/cputlb.c
23
+++ b/tcg/ppc/tcg-target.c.inc
14
+++ b/accel/tcg/cputlb.c
24
@@ -XXX,XX +XXX,XX @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
15
@@ -XXX,XX +XXX,XX @@ store_memop(void *haddr, uint64_t val, MemOp op)
25
tcg_out32(s, insn);
16
}
26
}
17
}
27
18
28
+static inline uint64_t make_pair(tcg_insn_unit i1, tcg_insn_unit i2)
19
-static void full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
20
- MemOpIdx oi, uintptr_t retaddr);
21
-
22
-static void __attribute__((noinline))
23
-store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
24
- uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
25
- bool big_endian)
26
+/**
27
+ * do_st_mmio_leN:
28
+ * @env: cpu context
29
+ * @p: translation parameters
30
+ * @val_le: data to store
31
+ * @mmu_idx: virtual address context
32
+ * @ra: return address into tcg generated code, or 0
33
+ *
34
+ * Store @p->size bytes at @p->addr, which is memory-mapped i/o.
35
+ * The bytes to store are extracted in little-endian order from @val_le;
36
+ * return the bytes of @val_le beyond @p->size that have not been stored.
37
+ */
38
+static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p,
39
+ uint64_t val_le, int mmu_idx, uintptr_t ra)
40
{
41
- uintptr_t index, index2;
42
- CPUTLBEntry *entry, *entry2;
43
- target_ulong page1, page2, tlb_addr, tlb_addr2;
44
- MemOpIdx oi;
45
- size_t size2;
46
- int i;
47
+ CPUTLBEntryFull *full = p->full;
48
+ target_ulong addr = p->addr;
49
+ int i, size = p->size;
50
51
- /*
52
- * Ensure the second page is in the TLB. Note that the first page
53
- * is already guaranteed to be filled, and that the second page
54
- * cannot evict the first. An exception to this rule is PAGE_WRITE_INV
55
- * handling: the first page could have evicted itself.
56
- */
57
- page1 = addr & TARGET_PAGE_MASK;
58
- page2 = (addr + size) & TARGET_PAGE_MASK;
59
- size2 = (addr + size) & ~TARGET_PAGE_MASK;
60
- index2 = tlb_index(env, mmu_idx, page2);
61
- entry2 = tlb_entry(env, mmu_idx, page2);
62
-
63
- tlb_addr2 = tlb_addr_write(entry2);
64
- if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
65
- if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
66
- tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
67
- mmu_idx, retaddr);
68
- index2 = tlb_index(env, mmu_idx, page2);
69
- entry2 = tlb_entry(env, mmu_idx, page2);
70
- }
71
- tlb_addr2 = tlb_addr_write(entry2);
72
+ QEMU_IOTHREAD_LOCK_GUARD();
73
+ for (i = 0; i < size; i++, val_le >>= 8) {
74
+ io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
75
}
76
+ return val_le;
77
+}
78
79
- index = tlb_index(env, mmu_idx, addr);
80
- entry = tlb_entry(env, mmu_idx, addr);
81
- tlb_addr = tlb_addr_write(entry);
82
+/**
83
+ * do_st_bytes_leN:
84
+ * @p: translation parameters
85
+ * @val_le: data to store
86
+ *
87
+ * Store @p->size bytes at @p->haddr, which is RAM.
88
+ * The bytes to store are extracted in little-endian order from @val_le;
89
+ * return the bytes of @val_le beyond @p->size that have not been stored.
90
+ */
91
+static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le)
29
+{
92
+{
30
+ if (HOST_BIG_ENDIAN) {
93
+ uint8_t *haddr = p->haddr;
31
+ return (uint64_t)i1 << 32 | i2;
94
+ int i, size = p->size;
32
+ }
95
33
+ return (uint64_t)i2 << 32 | i1;
96
- /*
97
- * Handle watchpoints. Since this may trap, all checks
98
- * must happen before any store.
99
- */
100
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
101
- cpu_check_watchpoint(env_cpu(env), addr, size - size2,
102
- env_tlb(env)->d[mmu_idx].fulltlb[index].attrs,
103
- BP_MEM_WRITE, retaddr);
104
- }
105
- if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
106
- cpu_check_watchpoint(env_cpu(env), page2, size2,
107
- env_tlb(env)->d[mmu_idx].fulltlb[index2].attrs,
108
- BP_MEM_WRITE, retaddr);
109
+ for (i = 0; i < size; i++, val_le >>= 8) {
110
+ haddr[i] = val_le;
111
}
112
+ return val_le;
34
+}
113
+}
35
+
114
36
+static inline void ppc64_replace2(uintptr_t rx, uintptr_t rw,
115
- /*
37
+ tcg_insn_unit i0, tcg_insn_unit i1)
116
- * XXX: not efficient, but simple.
117
- * This loop must go in the forward direction to avoid issues
118
- * with self-modifying code in Windows 64-bit.
119
- */
120
- oi = make_memop_idx(MO_UB, mmu_idx);
121
- if (big_endian) {
122
- for (i = 0; i < size; ++i) {
123
- /* Big-endian extract. */
124
- uint8_t val8 = val >> (((size - 1) * 8) - (i * 8));
125
- full_stb_mmu(env, addr + i, val8, oi, retaddr);
126
- }
127
+/*
128
+ * Wrapper for the above.
129
+ */
130
+static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p,
131
+ uint64_t val_le, int mmu_idx, uintptr_t ra)
38
+{
132
+{
39
+#if TCG_TARGET_REG_BITS == 64
133
+ if (unlikely(p->flags & TLB_MMIO)) {
40
+ qatomic_set((uint64_t *)rw, make_pair(i0, i1));
134
+ return do_st_mmio_leN(env, p, val_le, mmu_idx, ra);
41
+ flush_idcache_range(rx, rw, 8);
135
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
42
+#else
136
+ return val_le >> (p->size * 8);
43
+ qemu_build_not_reached();
137
} else {
44
+#endif
138
- for (i = 0; i < size; ++i) {
139
- /* Little-endian extract. */
140
- uint8_t val8 = val >> (i * 8);
141
- full_stb_mmu(env, addr + i, val8, oi, retaddr);
142
- }
143
+ return do_st_bytes_leN(p, val_le);
144
}
145
}
146
147
-static inline void QEMU_ALWAYS_INLINE
148
-store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
149
- MemOpIdx oi, uintptr_t retaddr, MemOp op)
150
+static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val,
151
+ int mmu_idx, uintptr_t ra)
152
{
153
- const unsigned a_bits = get_alignment_bits(get_memop(oi));
154
- const size_t size = memop_size(op);
155
- uintptr_t mmu_idx = get_mmuidx(oi);
156
- uintptr_t index;
157
- CPUTLBEntry *entry;
158
- target_ulong tlb_addr;
159
- void *haddr;
160
-
161
- tcg_debug_assert(mmu_idx < NB_MMU_MODES);
162
-
163
- /* Handle CPU specific unaligned behaviour */
164
- if (addr & ((1 << a_bits) - 1)) {
165
- cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE,
166
- mmu_idx, retaddr);
167
+ if (unlikely(p->flags & TLB_MMIO)) {
168
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, MO_UB);
169
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
170
+ /* nothing */
171
+ } else {
172
+ *(uint8_t *)p->haddr = val;
173
}
174
-
175
- index = tlb_index(env, mmu_idx, addr);
176
- entry = tlb_entry(env, mmu_idx, addr);
177
- tlb_addr = tlb_addr_write(entry);
178
-
179
- /* If the TLB entry is for a different page, reload and try again. */
180
- if (!tlb_hit(tlb_addr, addr)) {
181
- if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
182
- addr & TARGET_PAGE_MASK)) {
183
- tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
184
- mmu_idx, retaddr);
185
- index = tlb_index(env, mmu_idx, addr);
186
- entry = tlb_entry(env, mmu_idx, addr);
187
- }
188
- tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK;
189
- }
190
-
191
- /* Handle anything that isn't just a straight memory access. */
192
- if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
193
- CPUTLBEntryFull *full;
194
- bool need_swap;
195
-
196
- /* For anything that is unaligned, recurse through byte stores. */
197
- if ((addr & (size - 1)) != 0) {
198
- goto do_unaligned_access;
199
- }
200
-
201
- full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
202
-
203
- /* Handle watchpoints. */
204
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
205
- /* On watchpoint hit, this will longjmp out. */
206
- cpu_check_watchpoint(env_cpu(env), addr, size,
207
- full->attrs, BP_MEM_WRITE, retaddr);
208
- }
209
-
210
- need_swap = size > 1 && (tlb_addr & TLB_BSWAP);
211
-
212
- /* Handle I/O access. */
213
- if (tlb_addr & TLB_MMIO) {
214
- io_writex(env, full, mmu_idx, val, addr, retaddr,
215
- op ^ (need_swap * MO_BSWAP));
216
- return;
217
- }
218
-
219
- /* Ignore writes to ROM. */
220
- if (unlikely(tlb_addr & TLB_DISCARD_WRITE)) {
221
- return;
222
- }
223
-
224
- /* Handle clean RAM pages. */
225
- if (tlb_addr & TLB_NOTDIRTY) {
226
- notdirty_write(env_cpu(env), addr, size, full, retaddr);
227
- }
228
-
229
- haddr = (void *)((uintptr_t)addr + entry->addend);
230
-
231
- /*
232
- * Keep these two store_memop separate to ensure that the compiler
233
- * is able to fold the entire function to a single instruction.
234
- * There is a build-time assert inside to remind you of this. ;-)
235
- */
236
- if (unlikely(need_swap)) {
237
- store_memop(haddr, val, op ^ MO_BSWAP);
238
- } else {
239
- store_memop(haddr, val, op);
240
- }
241
- return;
242
- }
243
-
244
- /* Handle slow unaligned access (it spans two pages or IO). */
245
- if (size > 1
246
- && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
247
- >= TARGET_PAGE_SIZE)) {
248
- do_unaligned_access:
249
- store_helper_unaligned(env, addr, val, retaddr, size,
250
- mmu_idx, memop_big_endian(op));
251
- return;
252
- }
253
-
254
- haddr = (void *)((uintptr_t)addr + entry->addend);
255
- store_memop(haddr, val, op);
256
}
257
258
-static void __attribute__((noinline))
259
-full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
260
- MemOpIdx oi, uintptr_t retaddr)
261
+static void do_st_2(CPUArchState *env, MMULookupPageData *p, uint16_t val,
262
+ int mmu_idx, MemOp memop, uintptr_t ra)
263
{
264
- validate_memop(oi, MO_UB);
265
- store_helper(env, addr, val, oi, retaddr, MO_UB);
266
+ if (unlikely(p->flags & TLB_MMIO)) {
267
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
268
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
269
+ /* nothing */
270
+ } else {
271
+ /* Swap to host endian if necessary, then store. */
272
+ if (memop & MO_BSWAP) {
273
+ val = bswap16(val);
274
+ }
275
+ store_memop(p->haddr, val, MO_UW);
276
+ }
45
+}
277
+}
46
+
278
+
47
+static inline void ppc64_replace4(uintptr_t rx, uintptr_t rw,
279
+static void do_st_4(CPUArchState *env, MMULookupPageData *p, uint32_t val,
48
+ tcg_insn_unit i0, tcg_insn_unit i1,
280
+ int mmu_idx, MemOp memop, uintptr_t ra)
49
+ tcg_insn_unit i2, tcg_insn_unit i3)
50
+{
281
+{
51
+ uint64_t p[2];
282
+ if (unlikely(p->flags & TLB_MMIO)) {
52
+
283
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
53
+ p[!HOST_BIG_ENDIAN] = make_pair(i0, i1);
284
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
54
+ p[HOST_BIG_ENDIAN] = make_pair(i2, i3);
285
+ /* nothing */
55
+
286
+ } else {
56
+ /*
287
+ /* Swap to host endian if necessary, then store. */
57
+ * There's no convenient way to get the compiler to allocate a pair
288
+ if (memop & MO_BSWAP) {
58
+ * of registers at an even index, so copy into r6/r7 and clobber.
289
+ val = bswap32(val);
59
+ */
290
+ }
60
+ asm("mr %%r6, %1\n\t"
291
+ store_memop(p->haddr, val, MO_UL);
61
+ "mr %%r7, %2\n\t"
292
+ }
62
+ "stq %%r6, %0"
63
+ : "=Q"(*(__int128 *)rw) : "r"(p[0]), "r"(p[1]) : "r6", "r7");
64
+ flush_idcache_range(rx, rw, 16);
65
+}
293
+}
66
+
294
+
67
void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
295
+static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val,
68
uintptr_t jmp_rw, uintptr_t addr)
296
+ int mmu_idx, MemOp memop, uintptr_t ra)
69
{
297
+{
70
- if (TCG_TARGET_REG_BITS == 64) {
298
+ if (unlikely(p->flags & TLB_MMIO)) {
71
- tcg_insn_unit i1, i2;
299
+ io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
72
- intptr_t tb_diff = addr - tc_ptr;
300
+ } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
73
- intptr_t br_diff = addr - (jmp_rx + 4);
301
+ /* nothing */
74
- uint64_t pair;
302
+ } else {
75
+ tcg_insn_unit i0, i1, i2, i3;
303
+ /* Swap to host endian if necessary, then store. */
76
+ intptr_t tb_diff = addr - tc_ptr;
304
+ if (memop & MO_BSWAP) {
77
+ intptr_t br_diff = addr - (jmp_rx + 4);
305
+ val = bswap64(val);
78
+ intptr_t lo, hi;
306
+ }
79
307
+ store_memop(p->haddr, val, MO_UQ);
80
- /* This does not exercise the range of the branch, but we do
308
+ }
81
- still need to be able to load the new value of TCG_REG_TB.
309
}
82
- But this does still happen quite often. */
310
83
- if (tb_diff == (int16_t)tb_diff) {
311
void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
84
- i1 = ADDI | TAI(TCG_REG_TB, TCG_REG_TB, tb_diff);
312
- MemOpIdx oi, uintptr_t retaddr)
85
- i2 = B | (br_diff & 0x3fffffc);
313
+ MemOpIdx oi, uintptr_t ra)
86
- } else {
314
{
87
- intptr_t lo = (int16_t)tb_diff;
315
- full_stb_mmu(env, addr, val, oi, retaddr);
88
- intptr_t hi = (int32_t)(tb_diff - lo);
316
+ MMULookupLocals l;
89
- assert(tb_diff == hi + lo);
317
+ bool crosspage;
90
- i1 = ADDIS | TAI(TCG_REG_TB, TCG_REG_TB, hi >> 16);
318
+
91
- i2 = ADDI | TAI(TCG_REG_TB, TCG_REG_TB, lo);
319
+ validate_memop(oi, MO_UB);
92
- }
320
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
93
-#if HOST_BIG_ENDIAN
321
+ tcg_debug_assert(!crosspage);
94
- pair = (uint64_t)i1 << 32 | i2;
322
+
95
-#else
323
+ do_st_1(env, &l.page[0], val, l.mmu_idx, ra);
96
- pair = (uint64_t)i2 << 32 | i1;
324
}
97
-#endif
325
98
-
326
-static void full_le_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
99
- /* As per the enclosing if, this is ppc64. Avoid the _Static_assert
327
- MemOpIdx oi, uintptr_t retaddr)
100
- within qatomic_set that would fail to build a ppc32 host. */
328
+static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
101
- qatomic_set__nocheck((uint64_t *)jmp_rw, pair);
329
+ MemOpIdx oi, uintptr_t ra)
102
- flush_idcache_range(jmp_rx, jmp_rw, 8);
330
{
103
- } else {
331
- validate_memop(oi, MO_LEUW);
104
+ if (TCG_TARGET_REG_BITS == 32) {
332
- store_helper(env, addr, val, oi, retaddr, MO_LEUW);
105
intptr_t diff = addr - jmp_rx;
333
+ MMULookupLocals l;
106
tcg_debug_assert(in_range_b(diff));
334
+ bool crosspage;
107
qatomic_set((uint32_t *)jmp_rw, B | (diff & 0x3fffffc));
335
+ uint8_t a, b;
108
flush_idcache_range(jmp_rx, jmp_rw, 4);
336
+
337
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
338
+ if (likely(!crosspage)) {
339
+ do_st_2(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
109
+ return;
340
+ return;
110
}
341
+ }
111
+
342
+
112
+ /*
343
+ if ((l.memop & MO_BSWAP) == MO_LE) {
113
+ * For 16-bit displacements, we can use a single add + branch.
344
+ a = val, b = val >> 8;
114
+ * This happens quite often.
345
+ } else {
115
+ */
346
+ b = val, a = val >> 8;
116
+ if (tb_diff == (int16_t)tb_diff) {
347
+ }
117
+ i0 = ADDI | TAI(TCG_REG_TB, TCG_REG_TB, tb_diff);
348
+ do_st_1(env, &l.page[0], a, l.mmu_idx, ra);
118
+ i1 = B | (br_diff & 0x3fffffc);
349
+ do_st_1(env, &l.page[1], b, l.mmu_idx, ra);
119
+ ppc64_replace2(jmp_rx, jmp_rw, i0, i1);
350
}
351
352
void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
353
MemOpIdx oi, uintptr_t retaddr)
354
{
355
- full_le_stw_mmu(env, addr, val, oi, retaddr);
356
-}
357
-
358
-static void full_be_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
359
- MemOpIdx oi, uintptr_t retaddr)
360
-{
361
- validate_memop(oi, MO_BEUW);
362
- store_helper(env, addr, val, oi, retaddr, MO_BEUW);
363
+ validate_memop(oi, MO_LEUW);
364
+ do_st2_mmu(env, addr, val, oi, retaddr);
365
}
366
367
void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
368
MemOpIdx oi, uintptr_t retaddr)
369
{
370
- full_be_stw_mmu(env, addr, val, oi, retaddr);
371
+ validate_memop(oi, MO_BEUW);
372
+ do_st2_mmu(env, addr, val, oi, retaddr);
373
}
374
375
-static void full_le_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
376
- MemOpIdx oi, uintptr_t retaddr)
377
+static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
378
+ MemOpIdx oi, uintptr_t ra)
379
{
380
- validate_memop(oi, MO_LEUL);
381
- store_helper(env, addr, val, oi, retaddr, MO_LEUL);
382
+ MMULookupLocals l;
383
+ bool crosspage;
384
+
385
+ crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
386
+    if (likely(!crosspage)) {
+        do_st_4(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
+        return;
+    }
+
+    /* Swap to little endian for simplicity, then store by bytes. */
+    if ((l.memop & MO_BSWAP) != MO_LE) {
+        val = bswap32(val);
+    }
+    val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra);
+    (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra);
}

void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
                       MemOpIdx oi, uintptr_t retaddr)
{
-    full_le_stl_mmu(env, addr, val, oi, retaddr);
-}
-
-static void full_be_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
-                            MemOpIdx oi, uintptr_t retaddr)
-{
-    validate_memop(oi, MO_BEUL);
-    store_helper(env, addr, val, oi, retaddr, MO_BEUL);
+    validate_memop(oi, MO_LEUL);
+    do_st4_mmu(env, addr, val, oi, retaddr);
}

void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
                       MemOpIdx oi, uintptr_t retaddr)
{
-    full_be_stl_mmu(env, addr, val, oi, retaddr);
+    validate_memop(oi, MO_BEUL);
+    do_st4_mmu(env, addr, val, oi, retaddr);
+}
+
+static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+                       MemOpIdx oi, uintptr_t ra)
+{
+    MMULookupLocals l;
+    bool crosspage;
+
+    crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l);
+    if (likely(!crosspage)) {
+        do_st_8(env, &l.page[0], val, l.mmu_idx, l.memop, ra);
+        return;
+    }
+
+    /* Swap to little endian for simplicity, then store by bytes. */
+    if ((l.memop & MO_BSWAP) != MO_LE) {
+        val = bswap64(val);
+    }
+    val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra);
+    (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra);
}

void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
                       MemOpIdx oi, uintptr_t retaddr)
{
    validate_memop(oi, MO_LEUQ);
-    store_helper(env, addr, val, oi, retaddr, MO_LEUQ);
+    do_st8_mmu(env, addr, val, oi, retaddr);
}

void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
                       MemOpIdx oi, uintptr_t retaddr)
{
    validate_memop(oi, MO_BEUQ);
-    store_helper(env, addr, val, oi, retaddr, MO_BEUQ);
+    do_st8_mmu(env, addr, val, oi, retaddr);
}

/*
 * Store Helpers for cpu_ldst.h
 */

-typedef void FullStoreHelper(CPUArchState *env, target_ulong addr,
-                             uint64_t val, MemOpIdx oi, uintptr_t retaddr);
-
-static inline void cpu_store_helper(CPUArchState *env, target_ulong addr,
-                                    uint64_t val, MemOpIdx oi, uintptr_t ra,
-                                    FullStoreHelper *full_store)
+static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
{
-    full_store(env, addr, val, oi, ra);
    qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
}

void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
                 MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, full_stb_mmu);
+    helper_ret_stb_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, full_be_stw_mmu);
+    helper_be_stw_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, full_be_stl_mmu);
+    helper_be_stl_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, helper_be_stq_mmu);
+    helper_be_stq_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, full_le_stw_mmu);
+    helper_le_stw_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, full_le_stl_mmu);
+    helper_le_stl_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
                    MemOpIdx oi, uintptr_t retaddr)
{
-    cpu_store_helper(env, addr, val, oi, retaddr, helper_le_stq_mmu);
+    helper_le_stq_mmu(env, addr, val, oi, retaddr);
+    plugin_store_cb(env, addr, oi);
}

void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
--
2.34.1
diff view generated by jsdifflib
+        return;
+    }
+
+    lo = (int16_t)tb_diff;
+    hi = (int32_t)(tb_diff - lo);
+    assert(tb_diff == hi + lo);
+    i0 = ADDIS | TAI(TCG_REG_TB, TCG_REG_TB, hi >> 16);
+    i1 = ADDI | TAI(TCG_REG_TB, TCG_REG_TB, lo);
+
+    /*
+     * Without stq from 2.07, we can only update two insns,
+     * and those must be the ones that load the target address.
+     */
+    if (!have_isa_2_07) {
+        ppc64_replace2(jmp_rx, jmp_rw, i0, i1);
+        return;
+    }
+
+    /*
+     * For 26-bit displacements, we can use a direct branch.
+     * Otherwise we still need the indirect branch, which we
+     * must restore after a potential direct branch write.
+     */
+    br_diff -= 4;
+    if (in_range_b(br_diff)) {
+        i2 = B | (br_diff & 0x3fffffc);
+        i3 = NOP;
+    } else {
+        i2 = MTSPR | RS(TCG_REG_TB) | CTR;
+        i3 = BCCTR | BO_ALWAYS;
+    }
+    ppc64_replace4(jmp_rx, jmp_rw, i0, i1, i2, i3);
}

static void tcg_out_call_int(TCGContext *s, int lk,
@@ -XXX,XX +XXX,XX @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
        if (s->tb_jmp_insn_offset) {
            /* Direct jump. */
            if (TCG_TARGET_REG_BITS == 64) {
-                /* Ensure the next insns are 8-byte aligned. */
-                if ((uintptr_t)s->code_ptr & 7) {
+                /* Ensure the next insns are 8 or 16-byte aligned. */
+                while ((uintptr_t)s->code_ptr & (have_isa_2_07 ? 15 : 7)) {
                    tcg_out32(s, NOP);
                }
                s->tb_jmp_insn_offset[args[0]] = tcg_current_code_size(s);
--
2.34.1
diff view generated by jsdifflib
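A note on the tcg/ppc jump-patching hunk above: whether a direct branch can be patched in depends on the displacement fitting the I-form branch's signed 26-bit, word-aligned field (roughly +/-32 MiB), which is what in_range_b() guards before building B | (br_diff & 0x3fffffc). The standalone sketch below illustrates that range check and encoding. It is not QEMU code; the names (branch_reaches, encode_b, OPCD_B, LI_MASK) are invented for the example.

/* Illustrative sketch only -- not the QEMU implementation.  It mirrors the
 * idea behind in_range_b() and "B | (br_diff & 0x3fffffc)" in the hunk
 * above: a PowerPC I-form branch holds a signed 26-bit, word-aligned
 * displacement, so a direct branch reaches roughly +/-32 MiB. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define OPCD_B   (18u << 26)        /* primary opcode 18: unconditional branch */
#define LI_MASK  0x03fffffcu        /* displacement bits 6..29; low two bits zero */

static bool branch_reaches(intptr_t diff)
{
    /* Displacement must be word aligned and within the signed 26-bit range. */
    return (diff & 3) == 0 && diff >= -0x2000000 && diff < 0x2000000;
}

static uint32_t encode_b(intptr_t diff)
{
    assert(branch_reaches(diff));
    return OPCD_B | ((uint32_t)diff & LI_MASK);
}

int main(void)
{
    intptr_t near = 0x123450;              /* fits: patch in a direct branch */
    intptr_t far  = (intptr_t)1 << 28;     /* too far: keep the indirect path */

    printf("near: %s, insn 0x%08x\n",
           branch_reaches(near) ? "direct" : "indirect", encode_b(near));
    printf("far:  %s\n", branch_reaches(far) ? "direct" : "indirect");
    return 0;
}

When the target is out of range, the hunk instead keeps the indirect path (MTSPR to CTR plus BCCTR), and updating all four instructions in one shot is why it needs the 16-byte alignment and the ISA 2.07 stq mentioned in its comments.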
New patch
This header is supposed to be private to tcg and in fact
does not need to be included here at all.

Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/loongarch/csr_helper.c   | 1 -
 target/loongarch/iocsr_helper.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/target/loongarch/csr_helper.c b/target/loongarch/csr_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/csr_helper.c
+++ b/target/loongarch/csr_helper.c
@@ -XXX,XX +XXX,XX @@
#include "exec/cpu_ldst.h"
#include "hw/irq.h"
#include "cpu-csr.h"
-#include "tcg/tcg-ldst.h"

target_ulong helper_csrrd_pgd(CPULoongArchState *env)
{
diff --git a/target/loongarch/iocsr_helper.c b/target/loongarch/iocsr_helper.c
index XXXXXXX..XXXXXXX 100644
--- a/target/loongarch/iocsr_helper.c
+++ b/target/loongarch/iocsr_helper.c
@@ -XXX,XX +XXX,XX @@
#include "exec/helper-proto.h"
#include "exec/exec-all.h"
#include "exec/cpu_ldst.h"
-#include "tcg/tcg-ldst.h"

#define GET_MEMTXATTRS(cas) \
    ((MemTxAttrs){.requester_id = env_cpu(cas)->cpu_index})
--
2.34.1
diff view generated by jsdifflib
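Looking back at the accel/tcg store hunks earlier in this digest: when a store crosses a page boundary, do_st4_mmu/do_st8_mmu first normalize the value to little-endian and then let do_st_leN write it out byte by byte, low bytes to the first page and the remainder to the second. The sketch below only illustrates that splitting idea under hypothetical names and buffers; it is not the QEMU helpers.

/* Illustrative sketch only -- not the QEMU implementation.  Shows the
 * "swap to little endian, then store by bytes across the page split"
 * pattern from the do_st4_mmu/do_st8_mmu hunks. */
#include <stdint.h>
#include <stdio.h>

/* Store the low 'n' bytes of a little-endian value and return the rest. */
static uint64_t store_le_bytes(uint8_t *dst, uint64_t le_val, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        dst[i] = (uint8_t)le_val;
        le_val >>= 8;
    }
    return le_val;
}

int main(void)
{
    uint8_t page0[4096] = { 0 }, page1[4096] = { 0 };
    uint64_t val = 0x1122334455667788ull;   /* value already byte-swapped to LE */
    unsigned first = 3;                     /* bytes remaining on the first page */

    /* Mirrors: val = do_st_leN(&page[0], ...); (void) do_st_leN(&page[1], ...); */
    uint64_t rest = store_le_bytes(page0 + 4096 - first, val, first);
    (void) store_le_bytes(page1, rest, 8 - first);

    printf("page0 tail: %02x %02x %02x\n",
           page0[4093], page0[4094], page0[4095]);
    printf("page1 head: %02x %02x %02x %02x %02x\n",
           page1[0], page1[1], page1[2], page1[3], page1[4]);
    return 0;
}

Because the value is shifted right as each byte is consumed, the same loop serves both pages no matter where the split falls.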